Perl Script 2: Convert a Multi-FASTA File into a Single-Line FASTA File
Update : 7.11.18
You can also use AWK to solve this problem:
awk '/^>/ {printf("\n%s\n",$0);next; } { printf("%s",$0);} END {printf("\n");}' < file.fa > result
Question :
I have a multi-FASTA file with several unique names (e.g. >Cryptococcus gattii), and I need to generate another file in which each header is on one line and its whole sequence is on a single line below it.
>Cryptococcus gattii
MGIKGLTGLLSENAPKCMKDHEMKTLFGRKVAIDASMSIYQFLIAVRQQDGQMLMNESGDVTSHLMGFFYRTIRMVDHGIKPCYIFDGKPPELKGSVLAKRFARREEAKEGEEEAKETGTAEDVDKLARRQVRVTREHNEECKKLLSLMGIPVVTAPGEAEAQCAELARAGKVYAAGSEDMDTLTFHSPILLRHLTFSEAKKMPISEIHLDVALRDLEMSMDQFIELCILLGCDYLEPCKGIGPKTALKLMREHGTLGKVVEHIRGKMAEKAEEIKAAADEEAEAEAEAEKYDSDPENEEGGETMINSDGEEVPAPSKPKSPKKKAPAKKKKIASSGMQIPEFWPWEEAKQLFLKPDVVNGDDLVLEWKQPDTEGLVEFLCRDKGFNEDRVRAGAAKLSKMLAAKQQGRLDGFFTVKPKEPAAKDAGKGKGKDTKGEKRKAEEKGAAKKKTKK
>Daphnia pulex
MGIKGLTQVIGDTAPTAIKENEIKNYFGRKVAIDASMSIYQFLIAVRSEGAMLTSADGETTSHLMGIFYRTIRMVDNGIKPVYVFDGKPPDMKGGELTKRAEKREEASKQLVLATDAGDAVEMEKMNKRLVKVNKGHTDECKQLLTLMGIPYVEAPCEAEAQCAALVKAGKVYATATEDMDSLTFGSNVLLRYLTYSEAKKMPIKEFHLDKILDGLSYTMDEFIDLCIMLGCDYCDTIKGIGAKRAKELIDKHRCIEKVIENLDTKKYTVPENWPYQEARRLFKTPDVADAETLDLKWTQPDEEGLVKFMCGDKNFNEERIRSGAKKLCKAKTGQTQGRLDSFFKVLPSSKPSTPSTPASKRKVGCIIYLFLYF
Result will be like this:
>Cryptococcus gattii
MGIKGLTGLLSENAPKCMKDHEMKTLFGRKVAIDASMSIYQFLIAVRQQDGQMLMNESGDVTSHLMGFFYRTIRMVDHGIKPCYIFDGKPPELKGSVLAKRFARREEAKEGEEEAKETGTAEDVDKLARRQVRVTREHNEECKKLLSLMGIPVVTAPGEAEAQCAELARAGKVYAAGSEDMDTLTFHSPILLRHLTFSEAKKMPISEIHLDVALRDLEMSMDQFIELCILLGCDYLEPCKGIGPKTALKLMREHGTLGKVVEHIRGKMAEKAEEIKAAADEEAEAEAEAEKYDSDPENEEGGETMINSDGEEVPAPSKPKSPKKKAPAKKKKIASSGMQIPEFWPWEEAKQLFLKPDVVNGDDLVLEWKQPDTEGLVEFLCRDKGFNEDRVRAGAAKLSKMLAAKQQGRLDGFFTVKPKEPAAKDAGKGKGKDTKGEKRKAEEKGAAKKKTKK
>Daphnia pulex
MGIKGLTQVIGDTAPTAIKENEIKNYFGRKVAIDASMSIYQFLIAVRSEGAMLTSADGETTSHLMGIFYRTIRMVDNGIKPVYVFDGKPPDMKGGELTKRAEKREEASKQLVLATDAGDAVEMEKMNKRLVKVNKGHTDECKQLLTLMGIPYVEAPCEAEAQCAALVKAGKVYATATEDMDSLTFGSNVLLRYLTYSEAKKMPIKEFHLDKILDGLSYTMDEFIDLCIMLGCDYCDTIKGIGAKRAKELIDKHRCIEKVIENLDTKKYTVPENWPYQEARRLFKTPDVADAETLDLKWTQPDEEGLVKFMCGDKNFNEERIRSGAKKLCKAKTGQTQGRLDSFFKVLPSSKPSTPSTPASKRKVGCIIYLFLYF
Answer
Multi-FASTA, or multi-line FASTA, is a common file format in which each FASTA header is followed by its sequence wrapped over several lines instead of a single line. Some software, however, only accepts a sequence written on a single line, so the wrapped lines have to be joined into one. The following Perl script does that.
Script name | Download |
---|---|
singleline.pl |
Assuming the sequences are stored in 'input.txt', the result should be written to 'output.txt', and the Perl script is saved on your desktop as 'singleline.pl', you can run the script as follows:
perl singleline.pl input.txt > output.txt
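If the Perl script is not at hand, the same joining step can be sketched in AWK. This is only a minimal sketch, not the singleline.pl script itself: the function name fasta_to_single_line and the short test records are made up for illustration. Unlike the AWK one-liner in the update, it does not print an empty line before the first header.

```shell
#!/bin/sh
# Hypothetical helper (not part of singleline.pl):
# reads FASTA on stdin and writes single-line FASTA on stdout.
fasta_to_single_line() {
    awk '
        # Header line: close the sequence line in progress (if any),
        # then print the header unchanged.
        /^>/ { if (pending) { printf "\n"; pending = 0 } print; next }
        # Sequence line: append it to the current line without a newline.
             { printf "%s", $0; pending = 1 }
        # End of input: close the last sequence line.
        END  { if (pending) printf "\n" }
    '
}

# Small made-up input with wrapped sequences, fed through the helper.
result=$(fasta_to_single_line <<'EOF'
>seq1 example
MGIKGLTG
LLSENAPK
CMK
>seq2 example
MGIKGLTQ
VIGD
EOF
)

printf '%s\n' "$result"
```

With real files the helper would be called as fasta_to_single_line < input.txt > output.txt, using the same file names as in the command above.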
In the result, each header stays on its own line and is followed by its complete sequence on a single line.
Labels:
Bioinformatics resources,
PERL,
Perl Script
I noticed a few possible errors with your script (only by attempting to run it naively myself), and I had to make a few small changes for it to actually work. I realized the issue arises from the way this website handles the '<' and '>' characters. This effectively removes the filehandles from your script, so it does not work as posted. You might want to consider hosting it elsewhere or finding a way around this. Either way, thank you for posting this; I too have needed to convert sequence files to single-line format.
Hi jordyn,
Thanks for your suggestion. I will consider it in future.
Thanks. It's very useful.
Thanks, the template helped me a lot.
Thank you. Your blog has been very useful. This script works very well.
Hi Chucao,
Always welcome. Pleased to learn that the Perl script worked for you.
No such file or directory at ./singleline.pl line 8.
I am getting this error while running the script to convert a multi-FASTA file to single-line FASTA. Could you please help me to resolve this error?
Thank you
No such file or directory at ./singleline.pl line 8.
I am getting this error while converting a multi-FASTA file to single-line FASTA. Could you please help me to fix this error?
Thank you
Thanks for using this script. The error says that your input file is not present. Please make sure to give the proper path. Thanks