Re: Help me with Bio::DB::GenBank

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Re: Help me with Bio::DB::GenBank

Jason Stajich-3
What have  you tried in terms of coding? I would suggest looking at the
HOWTOs as this is an exact problem we've documented there.
Please try asking your question on the mailing list in the future.

Jason Stajich
[hidden email]
http://bioperl.org/wiki/User:Jason
http://twitter.com/hyphaltip


On Mon, Jun 9, 2014 at 9:12 PM, menghaowei <[hidden email]> wrote:

> Hi, I am a college student and a beginner in Bioperl. Now I have a
> question.
>
> I need to download some seqs from NCBI database with Entrez terms.
>
> At this time , I can use Bioperl to acquire query acc num, but I dont know
> how to get CDS seq from downloaded seq .
>
> Or are there any other methods to download a query seq then get CDS seq
> very easily?
>
> can you give me a hand ?
>
> I'll really appreciate it.
>
>
>
>
>
> menghaowei
_______________________________________________
Bioperl-l mailing list
[hidden email]
http://lists.open-bio.org/mailman/listinfo/bioperl-l
Reply | Threaded
Open this post in threaded view
|

Re: Help me with Bio::DB::GenBank

Logust Yu
>
> On Mon, Jun 9, 2014 at 9:12 PM, menghaowei <[hidden email]> wrote:
>
>> Hi, I am a college student and a beginner in Bioperl. Now I have a
>> question.
>>
>> I need to download some seqs from NCBI database with Entrez terms.
>>
>> At this time , I can use Bioperl to acquire query acc num, but I dont know
>> how to get CDS seq from downloaded seq .
>>
>> Or are there any other methods to download a query seq then get CDS seq
>> very easily?
>>
>> can you give me a hand ?
>>
>> I'll really appreciate it.
>>
>>
>>
>>
>>
>> menghaowei

Hi Menghaowei,

I happened to have written some code that does this. Most of the code was simply copy and paste from Bioperl's HowTo

use 5.16.1;
use warnings;
use Data::Dumper;
use Autodie;
use Bio::DB::EUtilities;
use Bio::SeqIO;

my $factory = Bio::DB::EUtilities->new(-eutil  => 'esearch',
-db     => 'gene',
-term   => '(marA[Gene Name]) AND Escherichia coli str. K-12 substr. MG1655[Organism]',
-email  => '[hidden email]',
-retmax => 1);

# query terms are mapped; what's the actual query?
print "Query translation: ",$factory->get_query_translation,"\n";
# query hits
print "Count = ",$factory->get_count,"\n";
# UIDs
my @ids = $factory->get_ids;



my $eutil   = Bio::DB::EUtilities->new(-eutil => 'esummary',
-email => '[hidden email]',
-db    => 'gene',
-id    => \@ids);

my $fetcher = Bio::DB::EUtilities->new(-eutil   => 'efetch',
-email => '[hidden email]',
-db      => 'nucleotide',
-rettype => 'gb');

while (my $docsum = $eutil->next_DocSum) {
    # to ensure we grab the right ChrStart information, we grab the Item above
    # it in the Item hierarchy (visible via print_all from the eutil instance)
    my ($item) = $docsum->get_Items_by_name('GenomicInfoType');
   
    my %item_data = map {$_ => 0} qw(ChrAccVer ChrStart ChrStop);
   
    while (my $sub_item = $item->next_subItem) {
        if (exists $item_data{$sub_item->get_name}) {
            $item_data{$sub_item->get_name} = $sub_item->get_content;
        }
    }
    # check to make sure everything is set
    for my $check (qw(ChrAccVer ChrStart ChrStop)) {
        die "$check not set" unless $item_data{$check};
    }
   
    my $strand = $item_data{ChrStart} > $item_data{ChrStop} ? 2 : 1;
    printf("Retrieving %s, from %d-%d, strand %d\n", $item_data{ChrAccVer},
    $item_data{ChrStart},
    $item_data{ChrStop},
    $strand
    );
   
    $fetcher->set_parameters(-id => $item_data{ChrAccVer},
    -seq_start => $item_data{ChrStart} + 1,
    -seq_stop  => $item_data{ChrStop} + 1,
    -strand    => $strand);
    print $fetcher->get_Response->content;
}

Hope this helps
Jing


_______________________________________________
Bioperl-l mailing list
[hidden email]
http://lists.open-bio.org/mailman/listinfo/bioperl-l