KEGG Sequence Downloader: retrieve gene sequences in FASTA format from the KEGG database
I wanted to download the gene sequences of tobacco from NCBI. Since NCBI also includes isoforms and other unwanted entries, I chose to get them from KEGG instead. KEGGREST is a wonderful R package for retrieving data from KEGG, but it limits how many entries can be retrieved at once. The following bash script downloads thousands of sequences in a single run without that limit. This is a crude solution, and there is probably a more efficient way to do it, but it worked for me. The script works in three steps:
- Split the IDs into chunks of a given size
- Download the FASTA sequences for each chunk as an HTML file
- Clean the HTML files and save the result
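To make the three steps concrete, here is a minimal sketch of how they could look in bash. It is not the actual KEGG_sequence_downloader.sh: the function names, the `chunk_` file prefix, and the `KEGG_URL_PREFIX` variable are all hypothetical, and I am assuming the query file holds one KEGG gene ID per line and that the IDs of a chunk are joined with `+` in the request. The download endpoint is left for you to supply, since the one used by the script is not shown here.

```shell
#!/usr/bin/env bash
# Sketch only, not the original KEGG_sequence_downloader.sh.
# Assumptions: one KEGG gene ID per line in the query file; IDs of one
# chunk are joined with '+'; KEGG_URL_PREFIX is a hypothetical variable
# holding the download URL up to (but not including) the ID list.

# Step 1: split the ID list into files of at most chunk_size lines each.
split_ids() {
    local query_file=$1 chunk_size=$2 out_dir=$3
    mkdir -p "$out_dir"
    split -l "$chunk_size" "$query_file" "$out_dir/chunk_"
}

# Helper: join the IDs of one chunk file into a single '+'-separated string.
join_ids() {
    paste -sd+ "$1"
}

# Step 2: fetch one chunk as HTML. Fails with a message if the URL prefix is unset.
download_chunk() {
    curl -fsS "${KEGG_URL_PREFIX:?set KEGG_URL_PREFIX first}$(join_ids "$1")"
}

# Step 3: strip HTML tags, decode the entities seen in FASTA headers,
# and drop the blank lines left behind. Tags spanning several lines
# are not handled by this line-based approach.
clean_html() {
    sed -e 's/<[^>]*>//g' -e 's/&gt;/>/g' -e 's/&lt;/</g' -e 's/&amp;/\&/g' \
        | grep -v '^[[:space:]]*$'
}

# Glue: run the three steps and write the cleaned FASTA to stdout.
run_all() {
    local query_file=$1 chunk_size=$2 work_dir=${3:-kegg_chunks}
    local chunk
    split_ids "$query_file" "$chunk_size" "$work_dir"
    for chunk in "$work_dir"/chunk_*; do
        download_chunk "$chunk" | clean_html
        sleep 1   # pause between requests to be gentle on the server
    done
}
```

With the functions loaded and `KEGG_URL_PREFIX` set, something like `run_all query_file 100 > sequences.fasta` would write all cleaned sequences to one file. Keeping each step in its own function makes it easy to test the splitting and cleaning without touching the network.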
Usage
bash KEGG_sequence_downloader.sh query_file number_of_sequence
See also: How to download only Viridiplantae miRNA from miRBase
Script
Script name | Download |
---|---|
KEGG_sequence_downloader.sh |