Perl Script 1: Store Multi-FASTA Headers in a Separate File

Question : 
I have a FASTA file with several unique names (e.g. >Chlorella), and I need to generate a tab-delimited file with just those names in a continuous column, without any sequences.




Answer : 


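The Perl scripts below can be used for this purpose. Suppose the sequences are in the file 'input.txt'. Both scripts write the names to 'result.txt'; the one-liner at the end writes them to 'output.txt'. Each name is the text between '>' and the first whitespace character of a header line.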


Script 1
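#!/usr/bin/perl
use strict;
use warnings;
# Downloaded from http://www.bioinformatics-made-simple.com

# Read the FASTA file and write one header name per line.
open(my $in, '<', 'input.txt') or die "Couldn't open input.txt: $!\n";
open(my $out, '>', 'result.txt') or die "Couldn't open result.txt: $!\n";

while (my $line = <$in>) {
    # Capture the text between '>' and the first whitespace character.
    if ($line =~ /^>(.+?)\s/) {
        print $out "$1\n";
    }
}

close($in);
close($out) or die "Couldn't close result.txt: $!\n";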



Result: 'result.txt' contains one name per line.



or

Script 2

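#!/usr/bin/perl
use strict;
use warnings;
# Downloaded from http://www.bioinformatics-made-simple.com

# Read the FASTA file and write all header names on one tab-separated line.
open(my $in, '<', 'input.txt') or die "Couldn't open input.txt: $!\n";
open(my $out, '>', 'result.txt') or die "Couldn't open result.txt: $!\n";

while (my $line = <$in>) {
    # Capture the text between '>' and the first whitespace character.
    if ($line =~ /^>(.+?)\s/) {
        print $out "$1\t";    # each name is followed by a tab
    }
}

print $out "\n";    # end the single output line
close($in);
close($out) or die "Couldn't close result.txt: $!\n";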

Result: 'result.txt' contains all names on a single line, separated by tabs.







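This job can also be done with a Perl one-liner. As quoted here it is meant for the Windows command prompt; in a Unix shell the double quotes would let the shell expand $1 before Perl runs.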


 perl -ne "while(/^>(.+?)\s/g){print \"$1\t\"}" input.txt >output.txt


Source: Protocol Online

