barcode split a paired-end fastq file


barcode split a paired-end fastq file

mattcel
Are there any BioPerl modules that would help to split a barcoded fastq
file?

I have tried FASTX-toolkit, but it does not work on paired-end data.

The file I have been given is paired-end but arrives as a single file, and
I have a list of 8 or so 4-letter barcodes.


Matthew

_______________________________________________
Bioperl-l mailing list
[hidden email]
http://mailman.open-bio.org/mailman/listinfo/bioperl-l

Re: barcode split a paired-end fastq file

Paul Cantalupo
Hi Matthew,

I'm not sure about any BioPerl modules (hopefully somebody else will chime in), but there are several routes for help on this matter.

1. Reply to this message showing enough of your sequence file and your list of 8 barcodes so we can provide further advice.
2. Post your question to http://stackoverflow.com/. If you do, make sure to follow the site's guidelines on how to ask a good question.
3. Post your question to https://www.biostars.org/. They probably have similar guidance on how to ask a good question, but I don't have the links for them.

Paul


Paul Cantalupo
University of Pittsburgh


On Wed, May 25, 2016 at 4:08 PM, Matthew <[hidden email]> wrote:
[snip]

Re: barcode split a paired-end fastq file

mattcel
Thank you for your reply, Paul.

Here are the first 18 lines of my fastq file (strangely enough, it seems to be missing the quality line for the first sequence, yet the last line of the file is a lone quality line), and below them are the barcodes.

@M02850:116:000000000-ANY8B:1:1101:12375:1762 1:N:0:1
CATGTAATTCAAAATCCAATAAATCAAAAATAAAAAAAATCAAAAAACAAAAAACTAATCAACATATTAAACTTTCCAAATTCATTAACCAAAAACTAATCTCACTTCTCACAAACCACATCAACTTTCTTCATTCTTCAAT
+
FFFGFGGGGGGGGGHHHGHHHHHHCFHHHHHHGHHHGGGHHHHGHHGGHHHGHGGHHHHHHHGHHHHHGHHHFHHHHHHHHHHHHHHHHHFHHHFEFHHHGHHHHHHGHHHBGHHGGHGHHHHHHHHHGHHHHHHHGHHGHF
@M02850:116:000000000-ANY8B:1:1101:12375:1762 2:N:0:1
CATGGTGTAAATAGTTTAAAGTATTGTTATTATGTTTATTTTTGTATGGTTTTTGGAAATTGAGAAAGAGGAGAATTTAAAGAGGATGTGTGAGAGAGATAGTTTGATTTTTTGATTGAAGAATGAAGAAAGTTGATGTGGT
+
FFFGGGGGGGFGGHGHHHHHHDGGHHFHFHHHHHHGGFHHHHHCGHHHHHHHGGGGFHHHHGFHFHHFFGHHGEGHHHHHFFGGHFHHHFHHBFGFFEFHHHHHHGHHHHFHGGAGHHHHHFGHHHHGHGHHFHFHHFGHFG
@M02850:116:000000000-ANY8B:1:1101:18019:1825 1:N:0:1
ATTCGAGAATCTCATATATTCTTTATCGAAACCCATACATCTTTCCGTCGAAAATCTCATATATACCTTATCCCATTCAACATTCATACGAACGCCGCTCTAGAATTTTTACTTTTCGCCATTAATCCAAATACTATTTAAT
+
FFFGGGGGGGGGGHHHHHHHHHHHHHHHHHGGHGGHHHGHHHHHHHGHHGGGGGHHHHHHGHHHHHGHHHHHHHHHHHHHHHHHHHHHHHGHGGGGGGGGHHHHHHHHHHHHHHHHHGGGGGHHHHHHHHHHHHHHHHHHHH
@M02850:116:000000000-ANY8B:1:1101:18019:1825 2:N:0:1
TGGAAAAAATAATAATAATTTGATTGTTAGTATTTTATAAATCGATAAATCGTAAGAAGAAAAATATAAAAATAATATTAAAGTTGTGTGCTAAAAGCAATTTTAAATAATTAAATAGTATTTGGATTAATGGCGAAAAGTA
+
FFFFGGGGGGGGGHHHGHHHHHHHHHGHHHHHHHHHHHHHHHGEHAGGHHHGFHHHGHHHHHHGHHHGHGHHHHHHHHHHHGHHHHHGHHFHHHHHHHHHHHGHHHHHHHHHHHHHHHHHHHHHGGHHHHHHGGFGGGFHHH
@M02850:116:000000000-ANY8B:1:1101:16147:1845 1:N:0:1
AATATATTAATATTAAAGAGTTATGGGTTGGAGTTTATATATTTTTTCGTCGAGAATTTTATATATATTTTATTTTATTTAATATTTATACGAGCGTCGTTTTAGGGTTTTTGTTTTTCGTTATTGGTTTAAGTGCTATTTG


FWABisF0    TATA
FWABisF1    CATA
FWABisF2    GATA
FWABisF3    GGTA
FWABisF4    CGTA
FWABisF5    AATA
FWABisF6    AGTA
FWABisR1    ATTC
FWABisR2    ACTC
FWABisR3    CATC
FWABisR4    CCTC
FWABisR5    TATC
FWABisR6    TCTC
WRKYBisF1    CAAG
WRKYBisF2    GTAG
WRKYBisF3    GGAG
WRKYBisR1    ACTG
WRKYBisR2    CATG
WRKYBisR3    TATG

Matthew
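
For readers following the thread, here is a minimal sketch of one way to approach the split. It is written in Python rather than BioPerl, and it rests on several assumptions that the thread does not confirm: the two mates of each pair sit next to each other in the file (as they do in the excerpt above), the barcode is the first four bases of each read, and only exact matches count. The function names read_records, read_pairs and demultiplex are hypothetical, and the toy input is just the first 10 bases of the first two pairs from the excerpt, with the read names shortened.

```python
import io
from itertools import islice

# Barcode table copied from the message above (primer name -> 4-base barcode).
BARCODES = {
    "FWABisF0": "TATA", "FWABisF1": "CATA", "FWABisF2": "GATA",
    "FWABisF3": "GGTA", "FWABisF4": "CGTA", "FWABisF5": "AATA",
    "FWABisF6": "AGTA", "FWABisR1": "ATTC", "FWABisR2": "ACTC",
    "FWABisR3": "CATC", "FWABisR4": "CCTC", "FWABisR5": "TATC",
    "FWABisR6": "TCTC", "WRKYBisF1": "CAAG", "WRKYBisF2": "GTAG",
    "WRKYBisF3": "GGAG", "WRKYBisR1": "ACTG", "WRKYBisR2": "CATG",
    "WRKYBisR3": "TATG",
}
NAME_BY_BARCODE = {barcode: name for name, barcode in BARCODES.items()}


def read_records(handle):
    """Yield (header, sequence, separator, quality) from 4-line FASTQ records."""
    while True:
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if not record:
            return
        if (len(record) < 4 or not record[0].startswith("@")
                or not record[2].startswith("+")):
            raise ValueError(f"malformed or truncated record near {record[0]!r}")
        yield tuple(record)


def read_pairs(handle):
    """Yield (read1, read2) from an interleaved file, checking the names agree."""
    records = read_records(handle)
    for read1 in records:
        read2 = next(records, None)
        # Assumption: mates share the header text before the first space.
        if read2 is None or read1[0].split()[0] != read2[0].split()[0]:
            raise ValueError(f"no mate found for {read1[0]!r}")
        yield read1, read2


def demultiplex(handle, barcode_length=4):
    """Group read pairs by the primer names matched at the start of each mate."""
    bins = {}
    for read1, read2 in read_pairs(handle):
        # Assumption: the barcode is the leading bases of the read, exact match.
        key = tuple(
            NAME_BY_BARCODE.get(read[1][:barcode_length], "unmatched")
            for read in (read1, read2)
        )
        bins.setdefault(key, []).append((read1, read2))
    return bins


# Toy input: first 10 bases of the first two pairs in the excerpt above.
TOY = """\
@pair1 1:N:0:1
CATGTAATTC
+
FFFGFGGGGG
@pair1 2:N:0:1
CATGGTGTAA
+
FFFGGGGGGG
@pair2 1:N:0:1
ATTCGAGAAT
+
FFFGGGGGGG
@pair2 2:N:0:1
TGGAAAAAAT
+
FFFFGGGGGG
"""

bins = demultiplex(io.StringIO(TOY))
for key, pairs in bins.items():
    print(key, len(pairs))
```

On this toy input the first pair is keyed as (WRKYBisR2, WRKYBisR2), and the second as (FWABisR1, unmatched), because TGGA is not in the barcode list. Matching is kept exact because several barcodes differ at a single position (CATA, CATC, CATG), so tolerating mismatches would make assignments ambiguous. The record check raises an error on a record with a missing line, which may help locate the missing quality line mentioned above. Writing each bin to its own pair of output files is left out of the sketch.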

On 5/25/2016 7:32 PM, Paul Cantalupo wrote:
[snip]