Module NCBIStandalone
Code for calling standalone BLAST and parsing plain text output
(OBSOLETE).
Rather than parsing the human-readable plain text BLAST output (which
seems to change with every update to BLAST), we and the NCBI recommend
that you parse the XML output instead. The plain text parser in this
module still works at the time of writing, but it is considered obsolete,
and updating it to cope with the latest versions of BLAST is not a
priority for us.
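As a rough illustration of why the XML output is easier to consume, the sketch below pulls a few fields out of a BLAST XML report using only Python's standard xml.etree.ElementTree. The embedded report is a hand-trimmed, hypothetical fragment (real reports carry many more elements), and summarize_hits is an illustrative helper name, not part of Biopython. In practice Bio.Blast.NCBIXML is the usual route.

```python
import xml.etree.ElementTree as ET

# Hand-trimmed, hypothetical fragment of a BLAST XML report (assumption:
# only the elements used below are shown; real reports contain far more).
SAMPLE_XML = """<?xml version="1.0"?>
<BlastOutput>
  <BlastOutput_program>blastp</BlastOutput_program>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_id>gi|0000|example</Hit_id>
          <Hit_def>hypothetical protein</Hit_def>
          <Hit_hsps>
            <Hsp>
              <Hsp_bit-score>52.4</Hsp_bit-score>
              <Hsp_evalue>1e-10</Hsp_evalue>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>
"""


def summarize_hits(xml_text):
    """Return a list of (hit id, best e-value) pairs from a BLAST XML report."""
    root = ET.fromstring(xml_text)
    summary = []
    for hit in root.iter("Hit"):
        # Each hit may carry several HSPs; keep the smallest e-value.
        evalues = [float(e.text) for e in hit.iter("Hsp_evalue")]
        best = min(evalues) if evalues else None
        summary.append((hit.findtext("Hit_id"), best))
    return summary


hits = summarize_hits(SAMPLE_XML)
```

Because the fields are addressed by element name, this kind of code does not depend on column positions or wording in the report, which is what tends to break plain text parsers.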
This module also provides code to work with the "legacy" standalone
version of NCBI BLAST (the tools blastall, rpsblast and blastpgp) via
three helper functions of the same names. These functions are very
limited when you want the output written to files rather than returned
as handles; for that, the wrappers in Bio.Blast.Applications are
preferred. Furthermore, the NCBI themselves regard these command line
tools as "legacy" and encourage using the new BLAST+ tools instead.
Biopython has wrappers for these under Bio.Blast.Applications (see the
tutorial).
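For orientation, here is a minimal sketch of what a BLAST+ invocation looks like when assembled by hand as an argument list. It does not use Bio.Blast.Applications; build_blast_plus_command is a hypothetical helper, and the file and database names are placeholders. The options shown (-query, -db, -outfmt 5 for XML, -out, -evalue) are standard BLAST+ options, but check them against your installed version.

```python
import shlex


def build_blast_plus_command(program, query, db, out, evalue=None):
    """Build an argument list for a BLAST+ search that writes XML output.

    Hypothetical helper for illustration; not part of Biopython.
    """
    # "-outfmt 5" asks BLAST+ for XML, the format this module recommends parsing.
    args = [program, "-query", query, "-db", db, "-outfmt", "5", "-out", out]
    if evalue is not None:
        args += ["-evalue", str(evalue)]
    return args


# Placeholder file and database names.
cmd = build_blast_plus_command("blastp", "query.fasta", "nr", "result.xml",
                               evalue=0.001)
print(" ".join(shlex.quote(arg) for arg in cmd))
# To run it (requires BLAST+ on the PATH): subprocess.run(cmd, check=True)
```

Passing the arguments as a list, rather than as one shell string, means no shell is involved and parameter values cannot be interpreted as extra commands.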
Classes:
LowQualityBlastError     Exception that indicates low quality query sequences.
BlastParser              Parses output from blast.
BlastErrorParser         Parses output and tries to diagnose possible errors.
PSIBlastParser           Parses output from psi-blast.
Iterator                 Iterates over a file of blast results.
_Scanner                 Scans output from standalone BLAST.
_BlastConsumer           Consumes output from blast.
_PSIBlastConsumer        Consumes output from psi-blast.
_HeaderConsumer          Consumes header information.
_DescriptionConsumer     Consumes description information.
_AlignmentConsumer       Consumes alignment information.
_HSPConsumer             Consumes hsp information.
_DatabaseReportConsumer  Consumes database report information.
_ParametersConsumer      Consumes parameters information.
Functions:
blastall                 Execute blastall (OBSOLETE).
blastpgp                 Execute blastpgp (OBSOLETE).
rpsblast                 Execute rpsblast (OBSOLETE).
For calling the BLAST command line tools, we encourage you to use the
command line wrappers in Bio.Blast.Applications - the three functions
blastall, blastpgp and rpsblast are considered to be obsolete now, and
are likely to be deprecated and then removed in future releases.
blastall(blastcmd, program, database, infile, align_view='7', **keywds)
    Execute and retrieve data from standalone BLASTALL as handles (OBSOLETE).
blastpgp(blastcmd, database, infile, align_view='7', **keywds)
    Execute and retrieve data from standalone BLASTPGP as handles (OBSOLETE).
rpsblast(blastcmd, database, infile, align_view='7', **keywds)
    Execute and retrieve data from standalone RPS-BLAST as handles (OBSOLETE).
_get_cols(line, cols_to_get, ncols=None, expected={})
StringTypes = (<type 'str'>, <type 'unicode'>)
__package__ = 'Bio.Blast'
xml_support = 1
blastall(blastcmd, program, database, infile, align_view='7', **keywds)
Execute and retrieve data from standalone BLASTALL as handles (OBSOLETE).
NOTE - This function is obsolete; you are encouraged to use the command
line wrapper Bio.Blast.Applications.BlastallCommandline instead.
Execute and retrieve data from blastall. blastcmd is the command
used to launch the 'blastall' executable. program is the blast program
to use, e.g. 'blastp', 'blastn', etc. database is the path to the database
to search against. infile is the path to the file containing
the sequence to search with.
The return values are two handles, for standard output and standard error.
You may pass more parameters to **keywds to change the behavior of
the search. Otherwise, optional values will be chosen by blastall.
The Blast output is by default in XML format. Use the align_view keyword
for output in a different format.
Scoring
    matrix              Matrix to use.
    gap_open            Gap open penalty.
    gap_extend          Gap extension penalty.
    nuc_match           Nucleotide match reward. (BLASTN)
    nuc_mismatch        Nucleotide mismatch penalty. (BLASTN)
    query_genetic_code  Genetic code for Query.
    db_genetic_code     Genetic code for database. (TBLAST[NX])
Algorithm
    gapped              Whether to do a gapped alignment. T/F (not for TBLASTX)
    expectation         Expectation value cutoff.
    wordsize            Word size.
    strands             Query strands to search against database. ([T]BLAST[NX])
    keep_hits           Number of best hits from a region to keep.
    xdrop               Dropoff value (bits) for gapped alignments.
    hit_extend          Threshold for extending hits.
    region_length       Length of region used to judge hits.
    db_length           Effective database length.
    search_length       Effective length of search space.
Processing
    filter              Filter query sequence for low complexity (with SEG)? T/F
    believe_query       Believe the query defline. T/F
    restrict_gi         Restrict search to these GI's.
    nprocessors         Number of processors to use.
    oldengine           Force use of old engine. T/F
Formatting
    html                Produce HTML output? T/F
    descriptions        Number of one-line descriptions.
    alignments          Number of alignments.
    align_view          Alignment view. Integer 0-11,
                        passed as a string or integer.
    show_gi             Show GI's in deflines? T/F
    seqalign_file       seqalign file to output.
    outfile             Output file for report. Filename to write to; if
                        omitted, standard output is used (which you can
                        access from the returned handles).
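To make the keyword handling concrete, the sketch below shows one way such keywords could be turned into a legacy blastall argument list. It is not the module's implementation: legacy_blastall_args and the LEGACY_FLAGS table are hypothetical, the table covers only a handful of the keywords listed above, and the flag letters are given as I understand the legacy blastall options, so verify them against the help output of your own installation.

```python
# Partial keyword-to-flag table. This mapping is an assumption of the sketch;
# check it against the usage message printed by your blastall binary.
LEGACY_FLAGS = {
    "matrix": "-M",
    "gap_open": "-G",
    "gap_extend": "-E",
    "expectation": "-e",
    "wordsize": "-W",
    "descriptions": "-v",
    "alignments": "-b",
    "nprocessors": "-a",
    "outfile": "-o",
}


def legacy_blastall_args(blastcmd, program, database, infile,
                         align_view="7", **keywds):
    """Build an argument list for legacy blastall (illustrative only)."""
    # align_view may be given as a string or an integer, so normalise it.
    args = [blastcmd, "-p", program, "-d", database, "-i", infile,
            "-m", str(align_view)]
    # Sort the keywords so the resulting command line is reproducible.
    for name, value in sorted(keywds.items()):
        if name not in LEGACY_FLAGS:
            raise ValueError("Keyword not covered by this sketch: %s" % name)
        args += [LEGACY_FLAGS[name], str(value)]
    return args


example = legacy_blastall_args("blastall", "blastp", "nr", "query.fasta",
                               align_view=7, expectation=0.001,
                               matrix="BLOSUM62")
```

Keywords that are left out simply do not appear on the command line, which is how blastall ends up choosing its own defaults for them.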
blastpgp(blastcmd, database, infile, align_view='7', **keywds)
Execute and retrieve data from standalone BLASTPGP as handles (OBSOLETE).
NOTE - This function is obsolete; you are encouraged to use the command
line wrapper Bio.Blast.Applications.BlastpgpCommandline instead.
Execute and retrieve data from blastpgp. blastcmd is the command
used to launch the 'blastpgp' executable. database is the path to the
database to search against. infile is the path to the file containing
the sequence to search with.
The return values are two handles, for standard output and standard error.
You may pass more parameters to **keywds to change the behavior of
the search. Otherwise, optional values will be chosen by blastpgp.
The Blast output is by default in XML format. Use the align_view keyword
for output in a different format.
Scoring
    matrix              Matrix to use.
    gap_open            Gap open penalty.
    gap_extend          Gap extension penalty.
    window_size         Multiple hits window size.
    npasses             Number of passes.
    passes              Hits/passes. Integer 0-2.
Algorithm
    gapped              Whether to do a gapped alignment. T/F
    expectation         Expectation value cutoff.
    wordsize            Word size.
    keep_hits           Number of best hits from a region to keep.
    xdrop               Dropoff value (bits) for gapped alignments.
    hit_extend          Threshold for extending hits.
    region_length       Length of region used to judge hits.
    db_length           Effective database length.
    search_length       Effective length of search space.
    nbits_gapping       Number of bits to trigger gapping.
    pseudocounts        Pseudocounts constants for multiple passes.
    xdrop_final         X dropoff for final gapped alignment.
    xdrop_extension     Dropoff for blast extensions.
    model_threshold     E-value threshold to include in multipass model.
    required_start      Start of required region in query.
    required_end        End of required region in query.
Processing
    XXX should document default values
    program             The blast program to use. (PHI-BLAST)
    filter              Filter query sequence for low complexity (with SEG)? T/F
    believe_query       Believe the query defline? T/F
    nprocessors         Number of processors to use.
Formatting
    html                Produce HTML output? T/F
    descriptions        Number of one-line descriptions.
    alignments          Number of alignments.
    align_view          Alignment view. Integer 0-11,
                        passed as a string or integer.
    show_gi             Show GI's in deflines? T/F
    seqalign_file       seqalign file to output.
    checkpoint_outfile  Output file for PSI-BLAST checkpointing.
    restart_infile      Input file for PSI-BLAST restart.
    hit_infile          Hit file for PHI-BLAST.
    matrix_outfile      Output file for PSI-BLAST matrix in ASCII.
    align_outfile       Output file for alignment. Filename to write to; if
                        omitted, standard output is used (which you can
                        access from the returned handles).
    align_infile        Input alignment file for PSI-BLAST restart.
rpsblast(blastcmd, database, infile, align_view='7', **keywds)
Execute and retrieve data from standalone RPS-BLAST as handles (OBSOLETE).
NOTE - This function is obsolete; you are encouraged to use the command
line wrapper Bio.Blast.Applications.RpsBlastCommandline instead.
Execute and retrieve data from standalone RPS-BLAST. blastcmd is the
command used to launch the 'rpsblast' executable. database is the path
to the database to search against. infile is the path to the file
containing the sequence to search with.
The return values are two handles, for standard output and standard error.
You may pass more parameters to **keywds to change the behavior of
the search. Otherwise, optional values will be chosen by rpsblast.
Please note that this function will give XML output by default, by
setting align_view to seven (i.e. command line option -m 7).
You should use the NCBIXML.parse() function to read the resulting output.
This is because NCBIStandalone.BlastParser() does not understand the
plain text output format from rpsblast.
WARNING - The following text and associated parameter handling has not
received extensive testing. Please report any errors we might have made...
Algorithm/Scoring
    gapped              Whether to do a gapped alignment. T/F
    multihit            0 for multiple hit (default), 1 for single hit
    expectation         Expectation value cutoff.
    range_restriction   Range restriction on query sequence
                        (Format: start,stop), blastp only.
                        0 in 'start' refers to the beginning of the sequence
                        0 in 'stop' refers to the end of the sequence
                        Default = 0,0
    xdrop               Dropoff value (bits) for gapped alignments.
    xdrop_final         X dropoff for final gapped alignment (in bits).
    xdrop_extension     Dropoff for blast extensions (in bits).
    search_length       Effective length of search space.
    nbits_gapping       Number of bits to trigger gapping.
    protein             Query sequence is protein. T/F
    db_length           Effective database length.
Processing
    filter              Filter query sequence for low complexity? T/F
    case_filter         Use lower case filtering of FASTA sequence. T/F, default F
    believe_query       Believe the query defline. T/F
    nprocessors         Number of processors to use.
    logfile             Name of log file to use, default rpsblast.log
Formatting
    html                Produce HTML output? T/F
    descriptions        Number of one-line descriptions.
    alignments          Number of alignments.
    align_view          Alignment view. Integer 0-11,
                        passed as a string or integer.
    show_gi             Show GI's in deflines? T/F
    seqalign_file       seqalign file to output.
    align_outfile       Output file for alignment. Filename to write to; if
                        omitted, standard output is used (which you can
                        access from the returned handles).
Starts BLAST and returns handles for stdout and stderr (PRIVATE).
Expects a command line wrapper object from Bio.Blast.Applications.
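The idea of starting a process and handing back its two output handles can be sketched with the standard subprocess module. This is only an illustration under stated assumptions: run_and_get_handles is a hypothetical name, it takes a plain argument list rather than a Bio.Blast.Applications wrapper object, and a small Python child process stands in for BLAST so that the example runs without BLAST installed.

```python
import subprocess
import sys


def run_and_get_handles(args):
    """Start a child process and return (stdout handle, stderr handle)."""
    child = subprocess.Popen(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # text handles rather than bytes
    )
    # A fuller version would also keep `child` around so the caller can
    # wait() on it and inspect the return code.
    return child.stdout, child.stderr


# Stand-in command (assumption): a tiny Python script instead of BLAST.
out_handle, err_handle = run_and_get_handles(
    [sys.executable, "-c", "print('fake BLAST output')"]
)
output = out_handle.read()
errors = err_handle.read()
out_handle.close()
err_handle.close()
```

Reading one pipe to the end before touching the other is fine for small outputs like this one, but it can block if the child fills the unread pipe, which is one reason writing results to a file is often more robust for large searches.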
Look for any attempt to insert a command into a parameter.
e.g. blastall(..., matrix='IDENTITY -F 0; rm -rf /etc/passwd')
Looks for ";" or "&&" in the strings (Unix and Windows syntax for
appending a command line), or ">", "<" or "|" (redirection), and
raises an exception if any are found.
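A check of this kind can be sketched in a few lines. The function name check_parameters_for_injection and the choice of ValueError are assumptions made for this example; the description above only says that an exception is raised.

```python
def check_parameters_for_injection(param_dict):
    """Raise ValueError if a string parameter contains a shell control sequence.

    Illustrative sketch only; the name and exception type are assumptions.
    """
    # Command separators (";", "&&") and redirection or pipe characters.
    bad_sequences = (";", "&&", ">", "<", "|")
    for key, value in param_dict.items():
        if not isinstance(value, str):
            continue  # numbers and other non-string values cannot carry a command
        for bad in bad_sequences:
            if bad in value:
                raise ValueError(
                    "Rejecting suspicious value for %r: %r" % (key, value)
                )


# A harmless set of parameters passes silently.
check_parameters_for_injection({"matrix": "BLOSUM62", "wordsize": 3})
```

Calling it with matrix='IDENTITY -F 0; rm -rf /etc/passwd' raises ValueError because of the ";". A blacklist like this is a safety net rather than a guarantee; passing arguments as a list without a shell avoids the problem at its source.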