How to Remove All Empty FASTA Sequences from a File Using a Perl Script

Suppose you have a multi-FASTA file containing several hundred sequences, but a few of the records consist only of a header line with no sequence after it, and you want to get rid of those headers. Deleting them by hand is tedious and error-prone. In this post, I share a Perl script that removes every FASTA header without a sequence from a multi-FASTA file. An example input file is given below.




input.txt
>I-am without-sequence1

>I-am without-sequence1
CTTTACGTAGCGGAAAATTAGATACGGACAGATAAATGTTAGAAGAATTAAATATCGATC
TATCTAGCTTAAAAGCAAATAAAGTAGATAACAGAATATTAGGGGTTGTTNNTTTGACGA
TCTCCGCTCAAAAGAAAAAGAAGAGATCATTCAATGTGTTTACAATGT

>I-am without-sequence2

>I-am without-sequence3

>I-am without-sequence4
>Throat_LANL_73_orf00002 begin=91 end=222 rf=1 score=0.18
GATCGTCAAANNAACAACCCCTAATATTCTGTTATCTACTTTATTTGCTTTTAAGCTAGA
TAGATCGATATTTAATTCTTCTAACATTTATCTGTCCGTATCTAATTTTCCGCTACGTAA
AGCGTCAAGTAA
>I-am without-sequence4

>I-am without-sequence5
>I-am without-sequence2
GATCGTCAAANNAACAACCCCTAATATTCTGTTATCTACTTTATTTGCTTTTAAGCTAGA
TAGATCGATATTTAATTCTTCTAACATTTATCTGTCCGTATCTAATTTTCCGCTACGTAA
AGCGTCAAGTAA
>I-am without-sequence6


Script

Script name: remove-blank-fasta.pl
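
The script file itself is not reproduced in this post. As a rough illustration of the idea, here is a minimal sketch of how such a filter could be written in Perl; it is not necessarily how remove-blank-fasta.pl works. The subroutine name remove_empty_records is made up for this example, and the sketch assumes a record counts as empty when no non-blank line follows its header before the next header or the end of the file. It also drops blank lines and any sequence lines that appear before the first header.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not taken from the original script): return the
# FASTA text with every header-only record removed. Blank lines are dropped.
sub remove_empty_records {
    my ($text) = @_;
    my $out = '';
    my ($header, @seq);

    # Emit the record collected so far, but only if it has sequence lines.
    my $flush = sub {
        $out .= join("\n", $header, @seq) . "\n" if defined $header && @seq;
    };

    for my $line (split /\r?\n/, $text) {
        next if $line !~ /\S/;          # skip blank or whitespace-only lines
        if ($line =~ /^>/) {            # a new header closes the previous record
            $flush->();
            ($header, @seq) = ($line);  # start a new record with no sequence yet
        }
        elsif (defined $header) {
            push @seq, $line;           # sequence line of the current record
        }
    }
    $flush->();                         # the last record has no header after it
    return $out;
}

# Runs only when a file name is given on the command line, e.g.
#   perl remove-blank-fasta.pl input.txt > result.txt
if (@ARGV) {
    local $/;                           # slurp mode: read the whole file at once
    my $text = <>;
    print remove_empty_records(defined $text ? $text : '');
}
```

Run on the example input above, this sketch should keep only the three records that actually carry sequence lines and drop every header that stands alone. It decides by content, not by header text, so a header named "without-sequence" is kept if sequence lines follow it.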

Usage
perl remove-blank-fasta.pl input.txt > result.txt




