How to rename fasta headers according to a matching name list
FaBox has several utilities for manipulating FASTA sequences. I wanted to replace the FASTA headers with new headers or descriptions that are saved in a separate file. FaBox can do this, but it becomes difficult to use when the number of sequences is very large. This Perl script renames the FASTA sequences according to the names stored in another file.
If you are working on a Unix-based system, the AWK one-liner given further down will also be very useful.
Header
The old header and the new FASTA header should be separated by a TAB
M54089d protein1
M54089c protein2
M54089b protein3
M54089a protein4
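Because the list has to be tab-separated, it can be worth checking it before renaming anything. The sketch below is only an illustration: `check_header_list` is a hypothetical helper name, not part of the original script, and it assumes every line holds exactly one old ID and one new header.

```shell
# Hypothetical helper: report lines of the header list that are not "ID<TAB>new header".
check_header_list() {
    # $1 = header list file
    awk -F '\t' '
        NF != 2 || $1 == "" || $2 == "" {
            printf "line %d: expected 2 tab-separated fields, found %d\n", FNR, NF
            bad = 1
        }
        END { exit bad + 0 }    # exit status 1 if any line was malformed
    ' "$1"
}
```

Calling `check_header_list header_list.txt` prints nothing and returns 0 when every line is well formed. A list that was saved with spaces instead of tabs is reported line by line.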
Sequence
Each sequence in the FASTA file should be on a single line.
If your sequences are wrapped over several lines, first convert the multi-line FASTA file into a single-line FASTA file.
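One possible way to do that conversion with AWK is sketched below. It is not the script from the linked post. `linearize_fasta` is a hypothetical helper name, and the sketch assumes that header lines start with `>` and that every other line is sequence.

```shell
# Hypothetical helper: join wrapped sequence lines so that each record
# becomes one header line followed by one sequence line.
linearize_fasta() {
    # $1 = multi-line FASTA file
    awk '
        /^>/ {                          # header line
            if (seq != "") print seq    # flush the previous sequence
            print                       # print the header unchanged
            seq = ""
            next
        }
        { seq = seq $0 }                # sequence line: append to the current record
        END { if (seq != "") print seq }    # flush the last record
    ' "$1"
}
```

Usage: `linearize_fasta wrapped.fasta > sequence.fasta`, where `wrapped.fasta` is a placeholder file name.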
>M54089d
MEQCRQGSRQNGSVTSGKGLALRAGHGGPSPEPVGCRWTARAAPAARAGRRVPAGGRTGNGSFGGLPRASHSQLRTGTDKGNPTV
>M54089c
MINFDHLFACLHGHYGEVENKLKCILHYFGRICSSMPLGYVSFERKVLSLECTPSCIPYPKEKAWSQSNISLCPIEITISGLIEDQSREAIEVDFANMYLGGGALVRGCVQQEEIRFMINPELIAGMLFLPCMADNEAVEIVGTERFSSYTGRLTKHFVASWINSSVISINSFSKMMASWDFNMIKMLKTPVEGPLLIFCRLVILQLHLKKLRKHRKTS
>M54089b
MIGRADIEGSKSNVAMNAWLHKPVIPVVTFLTPLASNSEGLKIVRPRFHGSYSYWKSESNELLPSVPHEISVRVELILGHLRYLLTDVPPQPNSPPDNVFRRIGLQASLGSKKRGSAPLPLHGISKITLEVVVFHFRLSAPTYTTPLKSFTKSD
>M54089a
MNGLTRFHCPCLLSSETTAKGTGLAESAGKEDPVELDSSRLCEMT
Script
This Perl script asks for the header list and the FASTA sequences (file formats given above) and saves the FASTA file with the new headers in result.fasta.
If you are working on a Unix-based system, this AWK one-liner will be very useful:
awk 'FNR==NR{ a[">"$1]=$2;next}$1 in a{ sub(/>/,">"a[$1]"|",$1)}1' header_list.txt sequence.fasta
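Note that the one-liner above keeps the old ID after the new name, so `>M54089d` becomes `>protein1|M54089d`. Because it splits on any whitespace, only the first word of a multi-word new header is used. The variant below is a sketch of a different behavior, not the original script: it replaces the whole header and splits the list on tabs only. `rename_fasta_headers` is a hypothetical helper name, and the ID is assumed to be the text between `>` and the first blank.

```shell
# Hypothetical helper: replace each FASTA header with the name mapped to its ID.
rename_fasta_headers() {
    # $1 = tab-separated list (old ID, new header), $2 = FASTA file
    awk -F '\t' '
        NR == FNR { map[$1] = $2; next }    # first file: remember old ID -> new header
        /^>/ {                              # second file: header lines only
            id = substr($0, 2)              # drop the leading ">"
            sub(/[ \t].*/, "", id)          # keep the ID up to the first blank
            if (id in map) { print ">" map[id]; next }
        }
        { print }                           # sequences and unmatched headers pass through
    ' "$1" "$2"
}
```

Usage: `rename_fasta_headers header_list.txt sequence.fasta > result.fasta`. Headers whose ID is not in the list are left unchanged. The sketch assumes the list file is not empty.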