Parse GenBank for gene name/ID

Dear All,

I am trying to parse a GenBank summary file to extract the following fields:

sequence, description, division, GI number, gene ID/name, version, and organism.

I adapted a script from the BioPerl website to parse these fields.

I am having trouble parsing the gene ID/name, version, CDS information, and organism.

Could you please help me with this?

Below is the code I am using:

use strict;
use warnings;
use Bio::SeqIO;
use Bio::Seq;

my $file = "NM_000040.summary";

# Open the GenBank file as a stream of sequence objects.
my $seqio = Bio::SeqIO->new (-format => 'GenBank',
                             -file   => $file);
print ref($seqio), "\n";

# Print the basic fields of every record in the file.
while (my $seqobj = $seqio->next_seq ()) {
    printf "Sequence:    %s\n",$seqobj->seq;
    printf "Display ID:  %s\n",$seqobj->display_id;
    printf "Description: %s\n",$seqobj->desc;
    printf "Division:    %s\n",$seqobj->division;
    printf "Accession:   %s\n",$seqobj->accession_number;
    printf "GI number:   %s\n",$seqobj->primary_id;
    printf "Version:     %s\n",$seqobj->seq_version;
}

Any help is greatly appreciated.
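
For readers with the same question: in a GenBank record the gene name, gene ID and CDS are normally stored in the feature table rather than in the header, so they are usually reached through the sequence features, while the organism comes from the species object. The following is only a minimal sketch under those assumptions, not a definitive solution. The helper name feature_summary and its return structure are hypothetical, and it assumes the gene ID is recorded as a /db_xref qualifier of the form "GeneID:<number>", which may not hold for every record.

```perl
use strict;
use warnings;
use Bio::SeqIO;

# Hypothetical helper: collect gene names, GeneIDs, CDS coordinates and
# the organism name from one sequence object.
sub feature_summary {
    my ($seqobj) = @_;
    my (%genes, %gene_ids, @cds);

    for my $feat ($seqobj->get_SeqFeatures) {
        # Gene symbols are stored in the /gene qualifier.
        if ($feat->has_tag('gene')) {
            $genes{$_} = 1 for $feat->get_tag_values('gene');
        }
        # Assumption: the gene ID appears as /db_xref="GeneID:<number>".
        if ($feat->has_tag('db_xref')) {
            for my $xref ($feat->get_tag_values('db_xref')) {
                $gene_ids{$1} = 1 if $xref =~ /^GeneID:(\d+)$/;
            }
        }
        # Overall span of each CDS; a split location is reported as one range.
        if ($feat->primary_tag eq 'CDS') {
            push @cds, $feat->start . '..' . $feat->end;
        }
    }

    my $species = $seqobj->species;    # undef if the record has no organism
    return {
        organism => $species ? $species->binomial : 'unknown',
        genes    => [sort keys %genes],
        gene_ids => [sort keys %gene_ids],
        cds      => \@cds,
    };
}

my $file = 'NM_000040.summary';        # file name taken from the post
if (-e $file) {
    my $seqio = Bio::SeqIO->new(-format => 'genbank', -file => $file);
    while (my $seqobj = $seqio->next_seq) {
        my $info = feature_summary($seqobj);
        printf "Organism:    %s\n", $info->{organism};
        printf "Gene name:   %s\n", join(', ', @{ $info->{genes} });
        printf "Gene ID:     %s\n", join(', ', @{ $info->{gene_ids} });
        printf "CDS:         %s\n", join(', ', @{ $info->{cds} });
    }
}
```

Keeping the feature walk in a separate subroutine means it can be reused inside the existing while loop next to the header fields already printed there.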

Bioperl-l mailing list