Perl Script 4: How to BLAST multiple sequences against the NCBI database using a Perl script

Have you ever wanted to BLAST one or many sequences against the NCBI database without going to the BLAST web page? NCBI itself provides a Perl script that submits multiple sequences to the NCBI database from the command line. You can find that Perl script below.





NCBI PERL Script : 

Script name Download
web_blast.pl

Usage: run the NCBI BLAST Perl script as follows


web_blast.pl program database query 

web_blast.pl blastp nr input.fasta
web_blast.pl rpsblast cdd input.fasta
web_blast.pl megablast nt input.fasta
  • web_blast.pl - the name under which I saved the Perl script
  • program - one of megablast, blastn, blastp, rpsblast, blastx, tblastn, tblastx
  • database - nr, or any other NCBI database you want to search
  • query - the name of the file that contains your query sequences

I hope you find this Perl script useful. If you run into any problem, feel free to ask in the comments :)
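The usage pattern above can be sketched as a small helper that assembles the command line before running it. This is only an illustration in Python, not part of the NCBI script: the function name `build_command` and the `PROGRAMS` set are assumptions made for this example, and the program names are the ones listed in this post.

```python
import shlex

# Program names taken from the list in this post.
PROGRAMS = {"megablast", "blastn", "blastp", "rpsblast", "blastx", "tblastn", "tblastx"}


def build_command(program, database, query_files, script="web_blast.pl"):
    """Return the argument list: perl <script> <program> <database> <query files...>."""
    if program not in PROGRAMS:
        raise ValueError(f"unknown BLAST program: {program!r}")
    if not query_files:
        raise ValueError("at least one query file is required")
    return ["perl", script, program, database, *query_files]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be checked first.
    print(shlex.join(build_command("blastp", "nr", ["input.fasta"])))
```

Passing the resulting list to `subprocess.run` with `stdout` pointed at an open file would then capture the BLAST report in a file.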

    148 comments:

    1. Search expired when I tried this first, will try again might be a busy server time..

      ReplyDelete
    2. Hi Robert,
      It seems that either your internet connection is slow or you are submitting more than 50 sequences in one go. If the problem persists, please let me know.

      ReplyDelete
    3. Hi EduGirl,

      I have thousands of sequences to be analysed with tblastx. Can you please suggest a way to BLAST all the sequences at once? I need to analyse the sequences as soon as I can.
      If you have any idea to bring all the sequences in the separate fasta files and arrange them in one fasta file, please let me know.
      Thank you for your kind suggestions.
      Manoj

      ReplyDelete
    4. what should I do if I have a few thousand query sequences that I would like to search against nr database? Seems NCBI would kill the job 2 hr after submission...

      Chifat

      ReplyDelete
    5. @Anonymous
      If the query dataset is too big then, in my opinion, it is better to run the BLAST job offline, because if you BLAST too many sequences against the NCBI database, they will blacklist your IP address. So it would be better to download the NCBI database to your own system and then run BLAST locally.

      ReplyDelete
    6. @manoj
      Can you elaborate on your problem a little bit? :( You want to 'bring all the sequences in the separate fasta files': which sequences, the queries or the results? And 'arrange them in one fasta file': again, queries or results?

      ReplyDelete
    7. Hi EduGirl
      How to download nr database for a particular organism ? I have few thousands of sequences from which I want to do blastx and separate out the coding and non-coding sequences. pl. suggest me some solution.

      ReplyDelete
    8. Thank you for the illustration and I could download the sequences I want. I request kindly give solution for my second problem also. I once again explain I want to do localblastx and separate the sequences which are non-coding. i.e, i would like to make the coding sequences as one file and non-coding sequences as another file.

      ReplyDelete
      Replies
      1. Why do you want to use BLASTX for this purpose? Why don't you use gene prediction tools instead?

        Delete
      2. My idea is if there is hit with blastx that the sequence is coding for a protien and can be treated as coding region.Just i want confirm the sequence cannot code for any protien. is it correct? And what are the gene prediction tools i can use in windows environment ? kindly let me know.

        Delete
      3. hi,
        I don't think that blastx is designed for this purpose. blastx will show similarity to any sequence, whether it is coding or not. Can you tell me something about your organism? Gene prediction tools are usually trained for a specific organism or group of organisms to increase sensitivity. I will post some good, easy-to-use gene prediction tools in the near future.

        Delete
      4. I am dealing with plant sequences.

        Delete
      5. hi,
        pl. post the information about gene prediction tools that can by used on desktop and windows environment. I am dealing with lot of plant genome sequences. as u said based on gene prediction tool if I could know exon , intron , 5'utr , 3' utr / intergenic intragenic regions that would be very useful to me for further processing of my sequences.

        Delete
      6. thank u , let me explore and get back to you

        Delete
    9. How to install and use the NCBI BLAST offline commands?

      ReplyDelete
    10. I am waiting for your reply about parsing the FGENESH output file. Please suggest a way to turn the output into a readable tabular form.

      ReplyDelete
      Replies
      1. http://www.bioinformatics-made-simple.com/2012/07/fgenesh-parser-for-parse-gene.html

        Delete
    11. Replies
      1. Thank you very much for considering my request. I tried the programme It worked well with the part of the first sequence output and thrown warnings which I have given at the end of my message.
        Pl. look into the warnings and solve my problem.

        2.And other query is there is output for some of the sequences like below whether the FGENESH parser can handle this type of output also ?
        //
        FGENESH 2.6 Prediction of potential genes in Homo_sapiens genomic DNA
        Time : Thu Jul 05 14:55:07 2012
        Seq name: AASG02054176
        Length of sequence: 181
        no reliable predictions
        //
        ______________________________________________________________

        C:\Perl>perl fgenesh.pl result.txt > result.gff

        --------------------- WARNING ---------------------
        MSG: seq doesn't validate, mismatch is //,2,6,_,:,0512
        596,24:,+,11,,13,42:,+,21,,21,:,1,1,,:575,203223,1,924
        0292,108615701,11467,6,892,123521,132,1,12928,1322413,
        993+,18163,9,193+1,19259,2020658,8419259,202069483+,20
        8,246959,9524218,246944774,2,32652,327425,1932654,3274



        C:\Perl>perl fgenesh.pl result.txt > result.gff

        --------------------- WARNING ---------------------
        MSG: seq doesn't validate, mismatch is //,2,6,_,:,051
        596,24:,+,11,,13,42:,+,21,,21,:,1,1,,:575,203223,1,92
        0292,108615701,11467,6,892,123521,132,1,12928,1322413
        993+,18163,9,193+1,19259,2020658,8419259,202069483+,2
        8,246959,9524218,246944774,2,32652,327425,1932654,327

        9-6.5924+1CDSf183022-1830999.80183022-1830997824+2CDSi183326-1834360.68
        343611124+3CDSl186415-18782472.10186415-187824141024+PolA188049-5.47Pre
        tein(s):] which does not look healthy
        STACK Bio::PrimarySeq::seq C:/Perl/site/lib/Bio/PrimarySeq.pm:283
        STACK Bio::PrimarySeq::new C:/Perl/site/lib/Bio/PrimarySeq.pm:234
        STACK Bio::Tools::Fgenesh::next_prediction C:/Perl/site/lib/Bio/Tools/Fge
        18
        STACK toplevel fgenesh.pl:19
        -----------------------------------

        Delete
      2. 1. Can you email me your input file? I will check it, because the FGENESH parser Perl script is working fine for me.

        2. That output shows that either your sequence is too short or it contains too many stop codons to predict a protein of reasonable size. So I don't think you need to parse those gene prediction results.

        Delete
      3. Thank you for the reply. My out put file contains mixed output of both 1 and 2 and actually I want to separate 2 type output as another file for further use.
        what I means is from the same parser perl script can we generate another file for the 2 type out output or a list of sequences where there is no reliable predictions.

        Delete
      4. give your E-Mail id.

        Delete
      5. In my opinion, it would be a better approach to set aside the sequences whose prediction is not reliable before further processing. I mean, first make a list of the sequences whose predictions are not reliable and then continue with those whose predictions are good. I may post a Perl script to do this job in the near future. My email is prp291@gmail.com.

        Delete
      6. Yes you are right when there are few sequences but if they are in thousands it is difficult. for the time being i will do the same thing. Your mail id is noted.

        Delete
      7. I hope u received e-mail with input file for FGENESH Parser.

        Delete
      8. Hi Edugirl
        pl. help me to extract the sequences using FGENESH.pl it is extracting if file contains only one sequence output if we put sequences of more sequences it is giving error. I request you pl. look into it and help me .

        Delete
      9. I need one more help. I have list of sequences ids and the want to extract those sequences from a multifasta file. Kindly give some suggestion for extracting
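For the request above (pulling a list of sequence IDs out of a multi-FASTA file), a minimal Python sketch could look like the following. It assumes that the ID is the first whitespace-delimited word after `>` on each header line; the function name `extract_by_id` is made up for this illustration and is not part of any script from this blog.

```python
def extract_by_id(fasta_lines, wanted_ids):
    """Yield the lines of every FASTA record whose ID is in wanted_ids.

    Assumption: the ID is the first word after '>' on the header line.
    """
    wanted = set(wanted_ids)
    keep = False
    for line in fasta_lines:
        if line.startswith(">"):
            words = line[1:].split()
            keep = bool(words) and words[0] in wanted
        if keep:
            yield line


# Example use with files (the file names are placeholders):
# with open("ids.txt") as ids, open("all.fasta") as fasta, open("subset.fasta", "w") as out:
#     out.writelines(extract_by_id(fasta, (i.strip() for i in ids if i.strip())))
```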

        Delete
    12. thanks 4 nice post

      ReplyDelete
    13. Hi,
      I need help, could you send me the input file from FGENESH.pl. My email is ornatus30@gmail.com, thank you very much.

      ReplyDelete
      Replies
      1. Hi Julian,
        What kind of input file you are looking for

        Delete
    14. Hi EduGirl,

      Thanks for this - what if I have 1 multifasta (~500 sequences, ~22k sequence altogether) and 1 other sequences I want blasted against this multifasta file?

      I've used formatdb to format the multifasta and as a sanity check I've used 1 sequence from this file to blast against the database but it comes up with 'No hits found' and I'm not sure why,

      Thanks,

      K

      ReplyDelete
      Replies
      1. Hi K,
        can you send me your input files? Theoretically it is not possible

        Delete
    15. Sure, what address should I use? How is it not possible?

      ReplyDelete
    16. Hi edugirl,

      Nice post.

      I really need help for blasting. I have few thousands sequence. I have separated them into one file per sequence. So, I have few thousands files in one directory now. I want to do blastx for all of my sequence into nr database. I have installed ncbi-blast+ and also download nr database.

      I wonder how to make shell script (or any script) to do blastx for my multiple file and the result will have multiple file with the same name in .xml. Please help me.
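One way to approach the request above is a short loop that runs the local `blastx` program once per file and names each XML result after its query file. This is a sketch in Python, not a tested pipeline: it assumes NCBI BLAST+ is installed with `blastx` on the PATH, that the local database is named `nr`, and that the query files end in `.fasta`. The function names and the `results` folder are placeholders chosen for this example.

```python
import subprocess
from pathlib import Path


def blastx_command(query_path, db="nr", out_dir="."):
    """Build the blastx call for one query file; -outfmt 5 requests XML output."""
    query = Path(query_path)
    out_file = Path(out_dir) / (query.stem + ".xml")
    return ["blastx", "-query", str(query), "-db", db,
            "-outfmt", "5", "-out", str(out_file)]


def run_all(query_dir, db="nr", out_dir="results", pattern="*.fasta"):
    """Run blastx on every matching file in query_dir, one result file per query."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for query in sorted(Path(query_dir).glob(pattern)):
        # check=True stops the loop if blastx reports an error.
        subprocess.run(blastx_command(query, db, out_dir), check=True)


# Example (directory name is a placeholder):
# run_all("my_sequences")
```

With this naming, `gene1.fasta` would produce `results/gene1.xml`.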

      ReplyDelete
    18. Hi EduGirl

      Nice post.

      I need your help. I have few thousands fasta files. Each file contain single fasta sequence. I want to do blastx all of my files. I want the result have same name with the query but with .xml format.
      Can you help me to guide to make simple script (shell sript or any script you familiar with)?
      I've installed ncbi-blast+ and also nr database from ncbi in my computer.

      Thanks,
      ZAT

      ReplyDelete
      Replies
      1. Hi ZAt,
        I don't know your final goal but, in my opinion, it would be better to combine all your sequence files into a single input file, BLAST them together, and finally parse the results for the best hit if you need it. If this approach suits you, I can help you further.
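Combining many single-sequence files into one input file, as suggested in this reply, can be sketched in a few lines of Python. The function name `combine_fasta` and the file names in the example are assumptions for illustration only.

```python
from pathlib import Path


def combine_fasta(input_paths, output_path):
    """Concatenate FASTA files into one file and return the number of records written."""
    count = 0
    with open(output_path, "w") as out:
        for path in input_paths:
            text = Path(path).read_text()
            if not text.strip():
                continue  # skip empty files
            count += sum(1 for line in text.splitlines() if line.startswith(">"))
            # Make sure each file ends with a newline so records do not run together.
            out.write(text if text.endswith("\n") else text + "\n")
    return count


# Example (folder and file names are placeholders):
# n = combine_fasta(sorted(Path("my_sequences").glob("*.fasta")), "all_sequences.fasta")
```

The trailing-newline check matters because a file that ends without a newline would otherwise glue its last sequence line to the next header.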

        Delete
    19. hello mam/sir which platform does this code executes- i meant whether windows or linux...

      ReplyDelete
      Replies
      1. Hi, I have tested it on the Windows platform.

        Delete
      2. ma'am can i know what command should be used for blasting protein sequence -> Blastp , i'd executed the program on command prompt by giving
        :/filename.pl blastp
        and i'm not getting the output since i'm from Bio background i'm feeling difficult in executing this code, please help me

        Delete
      3. Hi,
        Please read the post carefully; I have already listed the different commands. For 'blastp', use this command without the quotes: 'web_blast.pl blastp nr input.fasta'

        Delete
      4. Hi EduGirl,
        the above code has to be excuted on perl s/w or bioperl..

        Delete
    20. HELLO admin..
      can you guide me where should the files be extracted..

      ReplyDelete
    21. Hello ma'am where should the nr database located...to execute the code..

      ReplyDelete
      Replies
      1. Hi,
        This BLAST Perl script doesn't require any database stored locally on your computer. It uses the nr database on the NCBI servers. If you want to run BLAST locally, then follow these posts:

        Part I : How to install NCBI BLAST on window 7
        Part II : How to install NCBI BLAST on window 7

        Delete
    22. hi i was made aware the the execution of this program require nr database, i have the nr database dowwnloaded but i am not sure where to paste or extract the compressed nr db file..The Zhang lab website from where i got the db file doesnt say anythin about the extraction location..

      Please let me know the extraction/pasting location of the nr database file and also pls tell me the exact sequence of the command that need to be used to get the output of the Blast_multiple_fasta.pl program, i dont know the proper appearance of the file name, web_blast.pl, blastp, web_blast.pl blastp nr input.fasta... pls help... i use perl s/w and not bioperl..

      Thank u
      -Edward

      ReplyDelete
      Replies
      1. Hi dinesh,
        This BLAST Perl script doesn't require any database stored locally on your computer. It uses the nr database on the NCBI servers. If you want to run BLAST locally, then follow these posts:

        Part I : How to install NCBI BLAST on window 7
        Part II : How to install NCBI BLAST on window 7

        Delete
    24. hi i was made aware that the program Blast_multiple_fasta.pl needs nr database, i have downloaded it from zhang lab site, they didnt mention the location where the nr file need to be pasted or extracted.. please tell me where the Blast_multiple_fasta.pl, web_blast.pl and the nr database files need to placed.. and also please advise what is the exact sequense of command need to be used to get the output of the program.. i am using perl s/w and not bioperl.. also when i went through the Blast_multiple_fasta.pl program it mentioned about the querring the database, so what exactly are we querrying here...???

      ReplyDelete
    25. Respected ma'am even 'm facing same difficulty, please help me out by replying for the query posted by dinesh..


      hi i was made aware that the program Blast_multiple_fasta.pl needs nr database, i have downloaded it from zhang lab site, they didnt mention the location where the nr file need to be pasted or extracted.. please tell me where the Blast_multiple_fasta.pl, web_blast.pl and the nr database files need to placed.. and also please advise what is the exact sequense of command need to be used to get the output of the program.. i am using perl s/w and not bioperl.. also when i went through the Blast_multiple_fasta.pl program it mentioned about the querring the database, so what exactly are we querrying here...???

      ReplyDelete
      Replies
      1. Hi Paru,
        I have replied to Dinesh's query. This script BLASTs your sequences against the NCBI database online, not against a locally stored database, so you don't need anything except the Perl script and your query sequences. If you want to use protein sequences as the query, then use this command line:

        web_blast.pl blastp nr input.fasta



        where
        web_blast.pl - name of my Perl script
        blastp - program name
        nr - database
        input.fasta - name of the file where the sequences are stored



        Delete
    26. Respected ma'am can you please share the link for downloading NCBI BLAST for windowsXP

      ReplyDelete
    27. Replies
      1. thankyou :)
        so ma'am u meant above NCBI BLAST works also on WindowsXP..

        Delete
      2. Yes, it is supposed to work on Windows XP

        Delete
      3. thanks again:)

        Delete
    28. hello Priyanka Paul,
      i've tried to execute this code, on using this command "web_blast.pl blastp nr input.fasta " as mentioned above so can you please suggest what does "input.fasta " mean and how fasta files should be framed into "input.fasta "
      kindly please brief out what and how should it be executed...
      Thankyou..

      ReplyDelete
      Replies
      1. 'input.fasta' stands for the name of your sequence file. If you are using Windows, you can simply execute this Perl script from your command prompt. Hope this helps.

        Delete
      2. i have few questns on this:

        1st: u mentioned "input.fasta" is the name of the sequence file.
        so does it mean we need to write the serquence in a text file and rename it as "input.fasta"....????

        2nd: is it the full sequesnce of the protein or just the header....?????

        3rd: after running "web_blast.pl blastp nr input.fasta" in cmd prompt shoud it open the http://www.ncbi.nlm.nih.gov/blast/Blast.cgi link....???

        Because i heard forom my frnd that after running this cmd it will open this pg and it is here where enter the sequences to blast, is it so....?????

        4th: Also i have a excell sheet containing 1600 sequesnce of proteins to be blasted is there a way to blast them all at once instead of doing them manually.....?????

        Delete
      3. oh thanks and
        1 - ma'am after running "web_blast.pl blastp nr input.fasta" in cmd prompt shoud it open the http://www.ncbi.nlm.nih.gov/blast/Blast.cgi link??
        and
        2 - so does it mean we need to write the serquence and rename it as "input.fasta"....????

        Delete
      4. ma'am so sequence file refers to list of fasta sequences. and list of fasta files should be listed n saved in notepad or excel sheet? if possible ma'am please share the format or you can mail me the sample file, so that i'l get an idea about it..
        snehaswan56@gmail.com
        thankyou in advance..

        Delete
      5. Hi Sneha,
        Please look in to my answer to Dinesh. Hope this will help you

        Delete
      6. Hi Sneha,
        yes you are right about file format.

        Delete
    29. Hi

      Thank you very much for providing this script. It work perfect according my needs. However, I would be grateful to you in case you have some similar script which can be used to BLAST (stand alone version) against query database. I am using Blast and Blast+ for my work. Please suggest some thing. I would be great.

      Thank you
      Harry

      ReplyDelete
      Replies
      1. Hi Harry,
        The NCBI standalone version can be used for this. There are commands such as blastall (please read the documentation) for batch BLAST against your own database.

        Delete
    30. Hi Dinesh,

      1. You don't have to rename anything. If your sequences are in input.txt, then type input.txt in the command instead of input.fasta.

      2. Use the full sequence, not just the header.

      3. Use this command:

      web_blast.pl blastp nr input.fasta >result.txt

      Your BLAST results will be saved in the result.txt file.

      4. Follow this post: 5 server for batch BLAST against NCBI

      ReplyDelete
      Replies
      1. Hi ya i tried as per your pocedures..
        I tried to download the fasta file from the site and used it with web_blast.pl balstp nr sequence.fasta>results.txt

        but the the results.txt file is empty, there arent any contents in the file, it is always 0bytes...

        Getting more worried now... pls help...

        Delete
      3. hi,
        can you send me your sequence file to prp291@gmail.com?

        Delete
    31. thanks for the above replied i'm half cleared

      ma'am 'm confused with the extensions like
      if its okay with .txt extension -> notepad version
      and
      whats all about .fasta format?

      and please ma'am could you just share sample file it would really help me lot because if i've to blast 1000's of sequences at a time and saving those sequences in notepad on execution wont it be confused to what and how the results(scores) have to b matched with original sequences..

      snehaswan56@gmail.com

      ReplyDelete
      Replies
      1. Hi Sneha,
        The file extension is just part of the file name, so either .txt or .fasta will work. I did not fully understand your question :(. Can you explain what exactly you want to do? I have sent the sample file to your e-mail.
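Since the extension does not matter, what counts is the content of the file. As a rough illustration (in Python, and not part of the BLAST script), the check below treats a text as FASTA when its first non-empty line is a header starting with `>` and counts the records. The function name `count_fasta_records` is made up for this example.

```python
def count_fasta_records(text):
    """Return the number of FASTA records in text.

    Raises ValueError if the first non-empty line is not a '>' header line.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("not FASTA: the first non-empty line must start with '>'")
    return sum(1 for line in lines if line.startswith(">"))


# Example (file name is a placeholder):
# with open("input.txt") as handle:
#     print(count_fasta_records(handle.read()))
```

Because each result is reported under its query header, keeping a distinct header line for every sequence is what lets you match scores back to the original sequences.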

        Delete
      2. ma'am thanks for d sample file, i got a clear view how should a input file look like :) :)
        thanks again...

        Delete
      3. hello,
        can you just attach a snapshot of how should a output results look like...

        Delete
    32. ma'am i worked according to your guidelines , yea its executing but again im facing difficulty with input.txt part like the command prompt on window blinks for seconds and stops- after crosschecking i gotta know that 0bytes has been utilized during retrieving the input sequence from input.txt file so i'm unaware of what really went wrong so please guide me..

      ReplyDelete
    33. yes ma'am can u plz attach a o/p results sheet/ snapshot so that many n most doubts will b cleared..

      ReplyDelete
      Replies
      1. Hi Sneha,
        Actually result sheet will be just like the web version. So I don't think you need any snapshot.

        Delete
      2. oh okay ma'am :)

        Delete
      3. "This code is for example purposes only.
        #
        # Please refer to http://www.ncbi.nlm.nih.gov/blast/Doc/urlapi.html
        # for a complete list of allowed parameters"

        ma'am as per d above statement gotta small doubt should parameters be changed or modified on execution??

        actually ma'am only thing i need to retrieve is % of identity from results/web page , so should i alter the parameters?

        Delete
      4. Hi Sneha,
        Of course, you can modify the query parameters, but I have never tested that. Instead of modifying the query, I always do this job in two steps.

        1. BLAST your sequences using this Perl script.
        2. Parse the NCBI BLAST result using the method given in this post:

        NCBI blast parser

        Delete
    34. hello prinyanka mam,
      i knw my question seems to b silly but i've doubt in which perl s/w did you use to execute the above code, their are like- strawberry perl, active perl , so please guide which one is preferable ...

      ReplyDelete
      Replies
      1. I have tested this BLAST Perl script with ActivePerl

        Delete
    35. hello ma'am tis s sneha again
      is this the same code that need to be used for execution..?? coz asper the comment in the code its given as example purpose.. so got doubt

      ReplyDelete
      Replies
      1. The command can be divided into the following parts:

        [perl script name] [program name] [database name] [query file name]

        Delete
    36. I have emailed u the sequence.


      I tried to execute this code on both windows xp and 7.

      and i tried C:\Users\kdinesh\Desktop\NCBI\blast-2.2.28+\bin>web_blast.pl blastp nr input.txt>results.txt

      input.txt contains few seq and after execution the results.txt is still empty and the size of the resilts file is 0bytes... :(

      like in my previous comment i said i have also downloaded the fasta file from the website and executed but the results file was 0bytes.. :(

      if possilble could you tell me what the results fil will contain or look like after the execution...???????



      pls help

      ReplyDelete
    37. ya after reading snehas's cmnt

      ""This code is for example purposes only.
      #
      # Please refer to http://www.ncbi.nlm.nih.gov/blast/Doc/urlapi.html
      # for a complete list of allowed parameters"
      "

      i got a doubt


      do u make any kinda changes to the code in order to make the code working...???, because i'm messing up my mind a lot..

      i use the cmd web_blast.pl blastp nr input.txt>results.txt

      where, input.txt contains only full sequence like u said, but after execution of the cmd the results.txt is empty always it is 0byte thought the file is created, and the input.txt is intact(i know ther wont be any modifications made to theis input.txt)


      also sneha mentioned "should parameters be changed or modified on execution??" what parameters are they and where do v change those parameters, is it in the code...???? parameters in the http://www.ncbi.nlm.nih.gov/blast/Doc/urlapi.html are some kinda links and how do use these parameters, where do v use these parameters...????

      pls assist..

      ReplyDelete
      Replies
      1. Hi Dinesh,

        This script uses the following URL as its base URL:
        http://www.ncbi.nlm.nih.gov/blast/Blast.cgi

        You can modify the request parameters to suit your requirements. For example, if I want only the top two hits from my BLAST search, I use this URL:

        http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?CMD=Put&HITLIST_SIZE=2
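To show how such parameters fit together, here is a small Python sketch that builds a request URL from the base URL given in this reply. It is an illustration only: `CMD`, `PROGRAM`, `DATABASE`, `QUERY` and `HITLIST_SIZE` are parameter names from the NCBI URL API documentation mentioned in the script's comments, while the function name `put_request_url` is an assumption, and the real script may send its parameters differently (for example in the request body).

```python
from urllib.parse import urlencode

# Base URL as given in the reply above.
BASE_URL = "http://www.ncbi.nlm.nih.gov/blast/Blast.cgi"


def put_request_url(program, database, query, hitlist_size=None):
    """Build a 'Put' (submit search) URL; hitlist_size limits the number of hits."""
    params = {"CMD": "Put", "PROGRAM": program, "DATABASE": database, "QUERY": query}
    if hitlist_size is not None:
        params["HITLIST_SIZE"] = hitlist_size
    # urlencode escapes characters such as '>' and newlines in the query.
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(put_request_url("blastp", "nr", "MKT", hitlist_size=2))
```

Leaving `hitlist_size` unset simply omits the parameter, so the server default applies.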

        Delete
    38. Mam unfortunately i wonder why it is still not working in my PC, the results.txt is still empty i dont no why this is the case in my PC...

      The web_blast.pl and input.txt (seq file) are in the path "C:\Users\kdinesh\Desktop\NCBI\blast-2.2.28+\bin" i.e in the bin folder of NCBI which inturn is on desktop...

      ReplyDelete
    39. Hi,
      Are your sequence file and the Perl script in the same folder/directory?

      ReplyDelete
    40. yes mam, both seq file and web_blast.pl script are in same folder....

      ReplyDelete
    41. hi, ma'am should input sequence file and perl script, i'd placed in ncbi/bin folder- but still it didnt work...
      solution please........

      ReplyDelete
    42. ma'am i've placed perl script and sequence input file in ncbi/bin folder
      C:\Documents and Settings\sneha\Desktop\NCBI\blast-2.2.28+\bin

      C:\Documents and Settings\jaga\Desktop\NCBI\blast-2.2.28+\bin>web_blast.pl blastp nr input.txt>result.txt
      , but still it didnt work...

      ReplyDelete
      Replies
      1. Sneha,
        Do you have Perl installed on your computer?

        Delete
      2. yes mam :) but didnt work , i followed all your guidelines :(

        Delete
      3. yea i've installed Active perl..+ Part I : NCBI BLAST in my lappy
        btw ma'am 'm using windows xp..
        everything is perfect except the output file :(

        Delete
    43. hi any fix for this issue, any solutions and suggestions, because i have my project deadline comming through in few days and still this doesnt works, i would be glad if im able to blast atlease one sequest and show the results in results.txt ............

      ReplyDelete
      Replies
      1. Hi Dinesh,
        I really have no idea why it is not working. Can you try it on another computer?

        Delete
      2. ya i have tried in a different computer and same results :( ..
        the results.txt file is empty...

        Delete
    44. ya i tried in a different computer same results.. :(
      No data in results.txt file....

      ReplyDelete
    45. ma'am plz if u dont mind post d path which and wer your code got executed on your system maybe i'm stucked with this...

      C:\Documents and Settings\sneha\Desktop\NCBI\blast-2.2.28+\bin

      C:\Documents and Settings\sneha\Desktop\NCBI\blast-2.2.28+\bin>web_blast.pl blastp nr input.txt>result.txt
      was my path

      i'm happy with the coding 'cuz its perfect for me , but the problem lies ->
      result.txt file remains empty wen ever i execute the code!!!

      ReplyDelete
    46. My script is on the desktop, so I first change to the desktop directory and then run the same command you gave, i.e. web_blast.pl blastp nr input.txt>result.txt.


      Remember that you should not put more than 50 sequences in one query file. I have also tested it on Windows XP, so the OS is not the problem.
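To stay under the 50-sequence limit mentioned here, a larger FASTA file can be split into batches before submission. The following is a minimal Python sketch, not a tested tool: the function names, the `batch_001.fasta` style output names, and the assumption that every record starts with a `>` header line are all choices made for this example.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, seq = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(seq)
            header, seq = line[1:], []
        elif header is not None:
            seq.append(line)
    if header is not None:
        yield header, "".join(seq)


def batches(records, size=50):
    """Group records into lists of at most `size` items."""
    batch = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def write_batches(fasta_path, prefix="batch", size=50):
    """Split fasta_path into files named <prefix>_001.fasta, <prefix>_002.fasta, ..."""
    paths = []
    with open(fasta_path) as handle:
        for i, batch in enumerate(batches(read_fasta(handle), size), start=1):
            out_path = f"{prefix}_{i:03d}.fasta"
            with open(out_path, "w") as out:
                for header, seq in batch:
                    out.write(f">{header}\n{seq}\n")
            paths.append(out_path)
    return paths
```

Each resulting file can then be submitted on its own, ideally with a pause between submissions so the NCBI servers are not overloaded.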

      ReplyDelete
      Replies
      1. Guess i & sneha have the same issue.

        I have tried the same- code,seq and result.txt are in appropriate locations,after executing the results.txt file is always 0byte.

        I uses the same seq file that you once emailed me, even that didnt the contents in results.txt file.

        i have tried both in windows xp and 7, but still no solution, im screwed now and no other go pls help...

        Delete
      2. ma'am where should perl script file be placed is it in ncbi bin folder or in some other file- maybe this is the one i'm tackled with...

        Delete
      3. sneha,
        The location of the Perl script is not important as long as your sequence file and the Perl script are in the same folder and perl has been added to the PATH environment variable. This BLAST Perl script doesn't depend on a local NCBI BLAST installation, because it works over the internet and uses the NCBI online database.

        Delete
      4. dinesh,
        Sorry for the delayed response. I tried everything I could to get the Perl script working for you, without success. I don't know the objective of your project, but if it is allowed, I can BLAST the sequences for you :)

        Delete
      5. well i'm glad you can blast the seq for me, but i need to show its execution (working blast) before my project guide and coordinator in person during my project externals which is few days away, and the code is not working on nun of the systems.

        i have done all modiications to environmental variables, changed computers, Operating systems, checked internet etc but nothing worked..

        i'l be glad if atleast one seq i am able to blast in from my lecturers..

        pls suggest any alternate solution....

        Delete
      6. Ok, so what exactly you want to do?

        Delete
      7. Well i want2 show the execution of the script before my lecturers in college i.e i should be able to blast atleast one protein seq in front of them, this is all i want.. i've been struggling day/night just to have this script working... i've made all kinda possible setups but nun worked...

        Delete
      8. while i execute the script i dont get any error message r warnings, hope it executed well, but is that still the result.txt file empty, it is always 0byte..??? there are no contents..

        Delete
      9. the environmental variables are set by default to PATH and is pointing to ncbi's bin of C: drive, but still y is this failure..??? is there any thing else i need to change in the script..???

        Delete
    47. Hi i'm glad and very happy to inform u that the new blast code u sent me worked well, thank u so much.. i have my presentation in few days and u were being very helpful.. thank u 1s again......

      ReplyDelete
    48. It's really awesome, thanks. However, it seems that all the results are combined. I tried to find a way to separate the results for the different FASTA files when I use multiple inputs, but it seems that once the result file is created there is no way to do it. I would also like to display only Homo sapiens results. Is there a way to do this, or should I write or use a different existing script for these tasks?
      Thanks

      ReplyDelete
      Replies
      1. Hi Peter,
        Use the PERL script linked below and choose Homo sapiens as the organism to get hits from human only. I am afraid that with this PERL script the results will still be combined; however, you can use the NCBI BLAST parser to simplify your downstream task.


        How to BLAST multiple sequences against NCBI database using PERL script - II

        NCBI BLAST parser : Extract query and best hits
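
        A combined report can also be split after the fact. The following is a minimal sketch in Python (not part of the NCBI script), assuming the output is the plain-text BLAST report and that each query's section starts with a line beginning with `Query=`. The function names are hypothetical, and the organism filter is a crude substring match on hit lines, not a real taxonomy filter.

```python
import re
from typing import Dict, List


def split_by_query(report_text: str) -> Dict[str, str]:
    """Split a combined BLAST text report into one chunk per query.

    Assumption: every query section starts with a line beginning 'Query='.
    If two queries share the same ID, the later section overwrites the earlier one.
    """
    sections: Dict[str, str] = {}
    current_id = None
    buffer: List[str] = []
    for line in report_text.splitlines():
        match = re.match(r"Query=\s*(\S+)", line)
        if match:
            # A new query starts: store the section collected so far.
            if current_id is not None:
                sections[current_id] = "\n".join(buffer)
            current_id = match.group(1)
            buffer = [line]
        elif current_id is not None:
            buffer.append(line)
    if current_id is not None:
        sections[current_id] = "\n".join(buffer)
    return sections


def keep_lines_mentioning(section: str, organism: str) -> List[str]:
    """Return the lines of one section that mention the organism (case-insensitive)."""
    wanted = organism.lower()
    return [line for line in section.splitlines() if wanted in line.lower()]
```

        Each value returned by `split_by_query` can then be written to its own file, and `keep_lines_mentioning(section, "Homo sapiens")` gives a quick look at the human hits in one section.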

        Delete
      2. hi,
        Thanks for the hint. I will try the other ncbi blast script too.

        Delete
    49. Hi, thanks, it's really awesome. I tried to find a way to separate the results by the different input FASTA files, but from the output file there seems to be no way to tell them apart. Is there a way, or should I find or write another script to separate the combined result?

      ReplyDelete
    50. Hi Gayan,
      Sorry for the delayed response. You can use this command to do a batch megablast:
      web_blast.pl megablast nt file1.fasta file2.fasta

      ReplyDelete
    51. Hi Priyanka,

      Great post and amazing effort at replying. Thanks for that.

      I'm running the script from Terminal with:

      perl web_blast.pl blastn nr ../pathtomyfile/myfile.FASTA

      and getting the following error:

      ```
      Can't locate URI/Escape.pm in @INC (@INC contains: /sw/lib/perl5 /sw/lib/perl5/darwin /opt/local/lib/perl5/site_perl/5.12.3/darwin-multi-2level /opt/local/lib/perl5/site_perl/5.12.3 /opt/local/lib/perl5/vendor_perl/5.12.3/darwin-multi-2level /opt/local/lib/perl5/vendor_perl/5.12.3 /opt/local/lib/perl5/5.12.3/darwin-multi-2level /opt/local/lib/perl5/5.12.3 /opt/local/lib/perl5/site_perl /opt/local/lib/perl5/vendor_perl .) at web_blast.pl line 52.
      BEGIN failed--compilation aborted at web_blast.pl line 52.
      ```

      Any clues what's going on? Am I missing a module or something? Help very much appreciated. I'm using v5.12.3 built for darwin-multi-2level on a MacBook Pro.

      Thanks,


      ReplyDelete
      Replies
      1. Hi Ticatla,
        Thanks for the words of appreciation. I haven't tried this PERL script on a Mac, but the error message says that Perl cannot find the URI::Escape module, which the script loads at line 52. URI::Escape is part of the URI distribution on CPAN, so please install that first. Please also make sure that you have installed BioPerl on your machine.

        Delete
    52. Hi Priyanka,

      I have several accession numbers that I want to run in TBLASTX. I tried CLC Genomics and Geneious, but those programs need the FASTA sequence for each of those accessions, which is just too much for me to look up. Can you suggest a way to run TBLASTX for multiple accessions at a time? Perhaps a PERL script or something else?

      Thanks,
      Vin

      ReplyDelete
    54. Hi Priyanka,

      I am trying to run TBLASTX for multiple accession numbers (~14000 accessions) for which I do not have the sequences in FASTA format. Can you suggest a way to do multiple TBLASTX searches using these accessions?

      Thank you,
      Sam

      ReplyDelete
      Replies
      1. Hi Sameer.
        Happy to see you here. Which database are your accession numbers from? Assuming they are NCBI accession numbers, you can do your analysis in a few steps:
        1. Download the sequences from NCBI

        2. Batch BLAST your sequences against the NCBI database

        3. Parse your results with the help of the NCBI BLAST parser
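
        The first step above (downloading sequences by accession) can be sketched in Python with the NCBI E-utilities `efetch` endpoint. This is a hedged sketch, not the script from the linked post: the helper names, the batch size of 100 and the one-second pause are assumptions, and `nuccore` is assumed as the database because TBLASTX takes nucleotide queries.

```python
import time
from typing import Iterator, List
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def batches(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def efetch_fasta_url(accessions: List[str], db: str = "nuccore") -> str:
    """Build an efetch URL that returns the given accessions as FASTA text."""
    params = {
        "db": db,
        "id": ",".join(accessions),
        "rettype": "fasta",
        "retmode": "text",
    }
    return EFETCH_URL + "?" + urlencode(params)


def download_fasta(accessions: List[str], out_path: str, db: str = "nuccore",
                   batch_size: int = 100, pause: float = 1.0) -> None:
    """Fetch the accessions in batches and append them to one FASTA file.

    batch_size and pause are illustrative values chosen to keep the
    request rate low; they are not limits taken from the original post.
    """
    with open(out_path, "w") as out:
        for batch in batches(accessions, batch_size):
            with urlopen(efetch_fasta_url(batch, db)) as response:
                out.write(response.read().decode("utf-8"))
            time.sleep(pause)
```

        With the accessions read one per line from a text file, a call such as `download_fasta(accessions, "queries.fasta")` would produce a single multi-sequence FASTA file (the file name is only an example) that can then be used as the query in step 2.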


        Hope this will help. Please let me know if you face any problem

        Delete
    55. Ma'am, I have the same problem as Mr. Dinesh. What should I do? Please help me.

      ReplyDelete
      Replies
      1. Hi Deepti,
        We have discussed Dinesh's problem in the comment section. Please go through those steps, and if the problem is still not solved then do let me know.

        Delete
    56. How will this work for querying the SRA database with a specific SRX number?

      ReplyDelete
      Replies
      1. Hi Steven,
        This PERL script will not work for Sequence Read Archive nucleotide BLAST. However, you can use the NCBI web interface for a Sequence Read Archive nucleotide batch BLAST.

        Delete
    57. Hi Priyanka,
      I have a problem! I need to run a batch BLAST of several proteins against pdb at NCBI, but I can't access greengene! Is it just me? I'm a student and I'm not used to this at all. My professor suggested using greengene, but I can't even access the site.
      Please help me!
      Thanks

      ReplyDelete
      Replies
      1. Hi Serena,
        For some reason, the Greengene server is down. You may try again after a few days. Alternatively, you can use the PERL script given above for NCBI batch BLAST if you have installed BioPerl on your machine.

        Delete
    58. Could you clarify how to use blastn/megablast? In the example for megablast, is input1.fasta the query sequence and input2.fasta the target sequence?

      web_blast.pl megablast nt input1.fasta input2.fasta

      Would it be possible to just use one .fasta file with both sequences?

      ReplyDelete
      Replies
      1. Hi David,
        Thanks for stopping by. It was a typo; I have corrected it in the post. You can BLAST a single file containing several sequences. Let me know if you need any more help.
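
        If the sequences currently sit in separate FASTA files, they can be combined into one multi-sequence file first. A minimal Python sketch under the assumption that every input is a plain FASTA file whose records start with a `>` header line; the function names are hypothetical.

```python
from pathlib import Path
from typing import Iterable


def merge_fasta_records(texts: Iterable[str]) -> str:
    """Concatenate FASTA texts, dropping blank lines and stray whitespace."""
    lines = []
    for text in texts:
        for line in text.splitlines():
            line = line.strip()
            if line:
                lines.append(line)
    return "\n".join(lines) + "\n"


def count_records(fasta_text: str) -> int:
    """Count records by counting header lines that start with '>'."""
    return sum(1 for line in fasta_text.splitlines() if line.startswith(">"))


def merge_fasta_files(paths: Iterable[str], out_path: str) -> int:
    """Merge several FASTA files into out_path and return the record count."""
    merged = merge_fasta_records(Path(p).read_text() for p in paths)
    Path(out_path).write_text(merged)
    return count_records(merged)
```

        The merged file is then passed to the script as the single query argument, in the same way as `input.fasta` in the usage examples in the post.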

        Delete
