How to Predict Genes from Multiple Sequences
Gene prediction tools, or ORF finders, are indispensable for both molecular biologists and bioinformaticians, so there are many programs and servers available for predicting genes in genomic DNA sequences. Some of these tools are trained to predict genes in a specific genome, while others work ab initio. The problem with most gene prediction servers is their output. It is fine if your input is a single gene or a few genes, but it becomes hard to handle when the input file is very large. In other words, gene prediction for multiple sequences is difficult if you don't know any programming language.
In this post, let's discuss a server that can be used to predict genes from multiple sequences.
ORF FIND is hosted on GreenGene, University of Massachusetts, Lowell. Its simple interface is really easy to use. This ORF finder at the GreenGene server finds ORFs in a multiple-sequence DNA file by using GLIMMER to find the ORF coordinates and EMBOSS to extract the amino acid sequences from the predicted ORF DNA sequences.
Steps in gene prediction from multiple sequences by ORF finder
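To make the two stages described above more concrete (locate ORF coordinates, then translate each ORF into protein), here is a minimal Python sketch. It is not what GLIMMER or EMBOSS actually do: GLIMMER scores candidate genes with interpolated Markov models, whereas this toy version simply scans the three forward reading frames for an ATG followed by an in-frame stop codon. The function names `find_orfs` and `translate` and the `min_codons` parameter are made up for illustration, and the standard genetic code is assumed.

```python
# Illustrative only: a naive forward-strand ORF scan, NOT GLIMMER or EMBOSS.
# Assumes the standard genetic code (NCBI translation table 1).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))
STOP_CODONS = {codon for codon, aa in CODON_TABLE.items() if aa == "*"}


def find_orfs(seq, min_codons=1):
    """Return (start, end) pairs, 0-based and end-exclusive, stop codon included."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                # keep the ORF if it has enough codons before the stop
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs


def translate(dna):
    """Translate DNA codon by codon; unknown codons (e.g. with N) become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


if __name__ == "__main__":
    sequence = "CCATGAAATAGCC"  # toy input, not real data
    for start, end in find_orfs(sequence):
        protein = translate(sequence[start:end]).rstrip("*")
        print(start, end, protein)
```

A real gene finder also has to handle the reverse strand, alternative start codons, and overlapping candidates, which is why a trained tool is used on the server.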
Finally, the results of gene prediction for the submitted sequences will appear in a temporary folder, where the predicted ORFs, the predicted proteins, and the input file can easily be found.
ORF Finder result folder
Note that the input file format is important for successful prediction from multiple sequences. In your multi-FASTA file, each sequence should always be on a single line after its '>sequence description' line. See the examples below:
> Correct Format
CCTCCTCCTGTTTTTCCCTCAATACAACCTCATTGGATTATTCAATTCACCATCCTGCCCTTGTTCCTTCCATTATACAGCTGTCTTTGCCCTCTCCTTCTCTCGCTGGACTGTTCACCAACTCTCAGCCCGCGATCCCAATTTCCAGACAACCCATCTTATCAGCTTGGCCACGGCCTCGACCCGAACAGACCGGCGTCCAGCGAGAAGAGCGTCGCCTCGACGCCTCTGCTTGACCGCACCTTGATGCTCAAGACTTATCGCGATGCCAAGAAGCGTCTCATCATGTTCGACTACGA
> Wrong Format
CGAAACGGGCACCTATACAACGATTGAAACCATTATTCAAGCTCAGCAAGCGTCTATGC
TAGCGGTTATTGCGAGCACTTCAGCGGTTGCTACTACGACTACTACTTGATAAATGAAA
CGGCTATAAAAGAGGCTGGGGCAAAAGTATGTTAGTTGAAGGGTGACCTGAACGATGAA
TCGGTCGAATTTTTTATTGGCAGAGGGAAGGTAGGTTTACTCAATTTAGTTACTTCTAG
CCGTTGATTGGAGGAGCGCAAGCGACGAGGAGGCTCATCGGCCGCCCGCGGAAAGCGTA
GTCTTACACGGAAATCAACGGCGGTGTCATAAGCGAG
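If your file is in the wrapped ("wrong") layout, you can convert it before uploading. The following is a small Python sketch that joins each record's sequence onto one line. The function names `linearize_fasta` and `convert_file`, and the file names in the usage note, are hypothetical and not part of the server.

```python
def linearize_fasta(lines):
    """Yield FASTA lines with each record's sequence joined onto a single line."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                # emit the previous record before starting a new one
                yield header
                yield "".join(chunks)
            header, chunks = line, []
        elif header is not None:
            chunks.append(line)  # sequence text belonging to the current record
    if header is not None:
        # emit the last record in the file
        yield header
        yield "".join(chunks)


def convert_file(src_path, dst_path):
    """Read a multi-FASTA file and write a single-line-per-sequence copy."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in linearize_fasta(src):
            dst.write(line + "\n")
```

For example, `convert_file("input.fasta", "input_single_line.fasta")` would write the converted copy next to the original, leaving the original file untouched.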
Labels:
Bioinformatics resources,
HOW TO,
Sequence analysis,
Software