Perl Script 2: Convert a Multi-FASTA File into a Single-Line FASTA File

Question:

Update: 7.11.18

You can also use AWK to solve this problem:
awk '/^>/ { if (NR > 1) printf("\n"); print; next } { printf("%s", $0) } END { printf("\n") }' file.fa > result

I have a multi-FASTA file with several unique names (e.g. >Cryptococcus gattii), and I need to generate another file in which each record is a header line followed by its whole sequence on a single line, as below.

>Cryptococcus gattii
MGIKGLTGLLSENAPKCMKDHEMKTLFGRKVAIDASMSIYQFLIAVRQQDGQMLMNESGDVTSHLMGFFYRTIRMVDHGIKPCYIFDGKPPELKGSVLAKRFARREEAKEGEEEAKETGTAEDVDKLARRQVRVTREHNEECKKLLSLMGIPVVTAPGEAEAQCAELARAGKVYAAGSEDMDTLTFHSPILLRHLTFSEAKKMPISEIHLDVALRDLEMSMDQFIELCILLGCDYLEPCKGIGPKTALKLMREHGTLGKVVEHIRGKMAEKAEEIKAAADEEAEAEAEAEKYDSDPENEEGGETMINSDGEEVPAPSKPKSPKKKAPAKKKKIASSGMQIPEFWPWEEAKQLFLKPDVVNGDDLVLEWKQPDTEGLVEFLCRDKGFNEDRVRAGAAKLSKMLAAKQQGRLDGFFTVKPKEPAAKDAGKGKGKDTKGEKRKAEEKGAAKKKTKK
>Daphnia pulex
MGIKGLTQVIGDTAPTAIKENEIKNYFGRKVAIDASMSIYQFLIAVRSEGAMLTSADGETTSHLMGIFYRTIRMVDNGIKPVYVFDGKPPDMKGGELTKRAEKREEASKQLVLATDAGDAVEMEKMNKRLVKVNKGHTDECKQLLTLMGIPYVEAPCEAEAQCAALVKAGKVYATATEDMDSLTFGSNVLLRYLTYSEAKKMPIKEFHLDKILDGLSYTMDEFIDLCIMLGCDYCDTIKGIGAKRAKELIDKHRCIEKVIENLDTKKYTVPENWPYQEARRLFKTPDVADAETLDLKWTQPDEEGLVKFMCGDKNFNEERIRSGAKKLCKAKTGQTQGRLDSFFKVLPSSKPSTPSTPASKRKVGCIIYLFLYF

Answer
Multi-FASTA, or multi-line FASTA, is a useful file format in which each FASTA header is followed by its sequence wrapped over several lines instead of a single line. Some software, however, only accepts each sequence on a single line, so the wrapped lines need to be joined into one. The following Perl script can help to do that:
singleline.pl



Assuming the sequences are stored in 'input.txt', the result should be written to 'output.txt', and the Perl script is saved on your desktop as 'singleline.pl', you can run the script as follows:


perl singleline.pl input.txt > output.txt



In the result, each header line is followed by its complete sequence on a single line.
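
A quick sanity check for the output, sketched here with standard grep: in a single-line FASTA file the number of non-header lines equals the number of headers. The file name check_sample.txt is a stand-in made up for this example; point the two grep commands at your own output.txt instead. The check assumes the file contains no blank lines, since those would be counted as sequence lines.

```shell
# Hypothetical stand-in for output.txt: two records, one sequence line each.
printf '>seq1\nMGIKGLTG\n>seq2\nMGIKGLTQVI\n' > check_sample.txt

headers=$(grep -c '^>' check_sample.txt)    # lines starting with '>'
seqlines=$(grep -vc '^>' check_sample.txt)  # all other lines

# Equal counts mean every header has exactly one sequence line.
if [ "$headers" -eq "$seqlines" ]; then
    status="single-line"
else
    status="not single-line"
fi
echo "$headers headers, $seqlines sequence lines: $status"
```

If the two counts differ, some sequences are still wrapped over several lines (or a record has no sequence at all).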

9 comments:

  1. I noticed a few possible errors in your script (only from naively trying to run it myself) and had to make a few small changes to get it to work. I realized the issue arises from the way this website handles the '<' and '>' characters: it effectively removes the filehandles from your script, so it does not work. You might want to consider hosting the script elsewhere or finding a way around this. Either way, thank you for posting this; I too have needed to convert sequence files to single-line format.

  2. Hi jordyn,
    Thanks for your suggestion. I will consider it in future.

  3. Thanks. It's very useful

  4. Thanks, the template helped me a lot.

  5. Thank you. Your blog has been very useful. This script works very well

    Replies
    1. Hi Chucao,
      Always welcome. Pleased to learn that the Perl script worked for you.

  6. No such file or directory at ./singleline.pl line 8.
    I am getting this error while running the script to convert multi-FASTA to single-line FASTA. Could you please help me resolve it?

    Thank you

  7. No such file or directory at ./singleline.pl line 8.
    I am getting this error while converting multi-FASTA to single-line FASTA. Could you please help me fix it?

    Thank you

    Replies
    1. Thanks for using this script. The error means that your input file was not found. Please make sure to give the correct path to it. Thanks

