MPI-Blast is a program from LANL that parallelizes the NCBI Blast algorithms using the Message Passing Interface (MPI) library. The version of MPI-Blast included with Rocks is v1.4.0-devel.
MPI-Blast is used in a similar manner to NCBI-Blast, and it uses the same variables that are available for NCBI-Blast.
There are three steps to running MPI-Blast.
Download a FASTA database to $BLASTDB. For this example we will download the ecoli nucleotide database.
    [nostromo@rocks-168 ~]$ sudo su - biouser
    -bash-3.00$ cd $BLASTDB
    -bash-3.00$ wget ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/ecoli.nt.gz
    --17:06:23--  ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/ecoli.nt.gz
               => `ecoli.nt.gz'
    Resolving ftp.ncbi.nlm.nih.gov... 165.112.7.10
    Connecting to ftp.ncbi.nlm.nih.gov|165.112.7.10|:21... connected.
    Logging in as anonymous ... Logged in!
    ==> SYST ... done.    ==> PWD ... done.
    ==> TYPE I ... done.  ==> CWD /blast/db/FASTA ... done.
    ==> PASV ... done.    ==> RETR ecoli.nt.gz ... done.
    Length: 1,438,199 (1.4M) (unauthoritative)

    100%[========================================================>] 1,438,199    610.14K/s

    17:06:27 (607.91 KB/s) - `ecoli.nt.gz' saved [1438199]
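The download step can also be scripted. The following is a minimal sketch, not part of the Rocks distribution: `fetch_db` is a hypothetical helper name, and the sketch assumes that `wget` and `gzip` are on the path. It skips the download when the archive is already present and checks the archive for corruption before it is unpacked.

```shell
#!/bin/sh
# fetch_db URL DEST_DIR   (hypothetical helper, not shipped with Rocks)
# Downloads URL into DEST_DIR unless the file is already there, then
# verifies the gzip archive. Returns non-zero on any failure.
fetch_db() {
    url=$1
    dest_dir=$2
    file="$dest_dir/$(basename "$url")"
    if [ ! -e "$file" ]; then
        # Remove a partial download so that a later run retries it.
        wget -q -O "$file" "$url" || { rm -f "$file"; return 1; }
    fi
    # gzip -t tests archive integrity without writing any output file.
    gzip -t "$file"
}
```

For the example above, it would be called as the biouser account with `fetch_db ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/ecoli.nt.gz "$BLASTDB"`.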
Format the database using mpiformatdb. A good rule is to format the database for at least 4 processors, as in the following example, which splits it into 4 fragments.
    -bash-3.00$ gunzip ecoli.nt.gz
    -bash-3.00$ ls
    ecoli.nt
    -bash-3.00$ mpiformatdb --nfrags=4 -i ecoli.nt -pF --quiet
    Reading input file
    Done, read 58882 lines
    Reordering 400 sequence entries
    Breaking ecoli.nt into 4 fragments
    Executing: formatdb -p F -i /tmp/reorderUDz97K -N 4 -n /share/bio/ncbi/db/ecoli.nt -o T
    Removed /tmp/reorderUDz97K
    Created 4 fragments.
    -bash-3.00$ ls
    ecoli.nt          ecoli.nt.000.nsq  ecoli.nt.001.nsq  ecoli.nt.002.nsq  ecoli.nt.003.nsq
    ecoli.nt.000.nhr  ecoli.nt.001.nhr  ecoli.nt.002.nhr  ecoli.nt.003.nhr  ecoli.nt.mbf
    ecoli.nt.000.nin  ecoli.nt.001.nin  ecoli.nt.002.nin  ecoli.nt.003.nin  ecoli.nt.nal
    ecoli.nt.000.nnd  ecoli.nt.001.nnd  ecoli.nt.002.nnd  ecoli.nt.003.nnd  formatdb.log
    ecoli.nt.000.nni  ecoli.nt.001.nni  ecoli.nt.002.nni  ecoli.nt.003.nni
    ecoli.nt.000.nsd  ecoli.nt.001.nsd  ecoli.nt.002.nsd  ecoli.nt.003.nsd
    ecoli.nt.000.nsi  ecoli.nt.001.nsi  ecoli.nt.002.nsi  ecoli.nt.003.nsi
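To confirm that formatting produced the expected number of fragments, the fragment files can be counted. This is a small sketch under one assumption taken from the listing above: each fragment has a sequence file named `<database>.NNN.nsq`. The function name `count_fragments` is hypothetical.

```shell
#!/bin/sh
# count_fragments DB_DIR DB_NAME   (hypothetical helper)
# Prints the number of fragment sequence files DB_NAME.NNN.nsq in DB_DIR.
count_fragments() {
    db_dir=$1
    db_name=$2
    count=0
    for f in "$db_dir/$db_name".[0-9][0-9][0-9].nsq; do
        # If nothing matches, the pattern is left unexpanded; skip it.
        [ -e "$f" ] || continue
        count=$((count + 1))
    done
    echo "$count"
}

# Example call (commented out): count_fragments "$BLASTDB" ecoli.nt
```

For the database formatted above with `--nfrags=4`, the count should match the four `.nsq` files shown in the listing.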
Now, as a normal user, create a test sequence file and run mpiblast on the sequence against the formatted database.
    [nostromo@rocks-168 ~]$ cat > test.txt
    >Test
    AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC
    TTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGACTTAGGTCACTAAATACTTTAACCAA
    TATAGGCATAGCGCACAGACAGATAAAAATTACAGAGTACACAACATCCATGAAACGCATTAGCACCACC
    ATTACCACCACCATCACCATTACCACAGGTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAG
    CCCGCACCTGACAGTGCGGGCTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAA
    GTTCGGCGGTACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAAGCAATGCC
    AGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTGGCGATGATTG
    AAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAACGTATTTTTGCCGAACTTTT
    [nostromo@rocks-168 mpiblast]$ /opt/mpich/gnu/bin/mpirun -np 4 /opt/Bio/mpiblast/bin/mpiblast -d ecoli.nt -i /home/nostromo/test.txt -p blastn > result.txt
After mpirun terminates, result.txt contains the result of your computation.
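For repeated runs, the mpirun invocation can be wrapped in a small script. The sketch below reuses the paths and options from the example above (`-np`, `-d`, `-i`, `-p blastn`). The function names `mpiblast_cmd` and `run_mpiblast` are hypothetical, and the paths are assumed to match your installation.

```shell
#!/bin/sh
# Paths as used in the example above; adjust if your installation differs.
MPIRUN=/opt/mpich/gnu/bin/mpirun
MPIBLAST=/opt/Bio/mpiblast/bin/mpiblast

# mpiblast_cmd NPROCS DATABASE QUERY_FILE   (hypothetical helper)
# Prints the command line that run_mpiblast would execute.
mpiblast_cmd() {
    echo "$MPIRUN -np $1 $MPIBLAST -d $2 -i $3 -p blastn"
}

# run_mpiblast NPROCS DATABASE QUERY_FILE OUTPUT_FILE   (hypothetical helper)
# Refuses to start if the query file cannot be read, then runs a blastn
# search and writes the report to OUTPUT_FILE.
run_mpiblast() {
    [ -r "$3" ] || { echo "query file not readable: $3" >&2; return 1; }
    "$MPIRUN" -np "$1" "$MPIBLAST" -d "$2" -i "$3" -p blastn > "$4"
}
```

With these definitions, `run_mpiblast 4 ecoli.nt /home/nostromo/test.txt result.txt` corresponds to the command shown in the example, and `mpiblast_cmd` can be used to print the command for inspection before running it.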
Further information about using mpiblast can be found at the MPI-Blast home page.
For support, please join the mpiblast mailing list.