Bio::Tools::BPlite HSP
SummaryIncluded librariesPackage variablesSynopsisDescriptionGeneral documentationMethods
Summary
Bio::Tools::BPlite::HSP - Blast report High Scoring Pair (HSP)
Package variables
No package variables defined.
Included modules
Bio::SeqFeature::Similarity
Bio::SeqFeature::SimilarityPair
Inherit
Bio::SeqFeature::SimilarityPair
Synopsis
 use Bio::Tools::BPlite;
 my $report = new Bio::Tools::BPlite(-fh=>\*STDIN);
 {
    while(my $sbjct = $report->nextSbjct) {
	while (my $hsp = $sbjct->nextHSP) {
	    $hsp->score;
	    $hsp->bits;
	    $hsp->percent;
	    $hsp->P;
	    $hsp->match;
	    $hsp->positive;
	    $hsp->length;
	    $hsp->querySeq;
	    $hsp->sbjctSeq;
	    $hsp->homologySeq;
	    $hsp->query->start;
	    $hsp->query->end;
	    $hsp->hit->start;
	    $hsp->hit->end;
	    $hsp->hit->seqname;
	    $hsp->hit->overlaps($exon);
	}
    }

    # the following line takes you to the next report in the stream/file
    # it will return 0 if that report is empty,
    # but that is valid for an empty blast report.
    # Returns -1 for EOF.

    last if ($report->_parseHeader == -1));

 redo
 }
Description
This object handles the High Scoring Pair data for a Blast report.
This is where the percent identity, query and hit sequence length,
P value, etc are stored and where most of the necessary information is located when building logic around parsing a Blast report.
See Bio::Tools::BPlite for more detailed information on the entire
BPlite Blast parsing system.
Methods
new
No description
Code
_overload
No description
Code
report_typeDescriptionCode
PDescriptionCode
percentDescriptionCode
matchDescriptionCode
hsplengthDescriptionCode
positiveDescriptionCode
gapsDescriptionCode
querySeqDescriptionCode
sbjctSeqDescriptionCode
homologySeqDescriptionCode
qsDescriptionCode
ssDescriptionCode
hsDescriptionCode
frame
No description
Code
Methods description
report_typecode    nextTop
 Title    : report_type
 Usage    : $type = $sbjct->report_type()
 Function : Returns the type of report from which this hit was obtained.
            This usually pertains only to BLAST and friends reports, for which
            the report type denotes what type of sequence was aligned against
            what (BLASTN: dna-dna, BLASTP prt-prt, BLASTX translated dna-prt, 
            TBLASTN prt-translated dna, TBLASTX translated dna-translated dna).
 Example  : 
 Returns  : A string (BLASTN, BLASTP, BLASTX, TBLASTN, TBLASTX, UNKNOWN)
 Args     : a string on set (you should know what you are doing)
PcodeprevnextTop
 Title    : P
 Usage    : $hsp->P();
 Function : returns the P (significance) value for a HSP 
 Example  : 
 Returns  : (double) significance value
 Args     :
percentcodeprevnextTop
 Title    : percent
 Usage    : $hsp->percent();
 Function : returns the percent matching 
 Returns  : (double) percent matching
 Args     : none
matchcodeprevnextTop
 Title    : match
 Usage    : $hsp->match();
 Function : returns the match
 Example  : 
 Returns  : (double) frac_identical 
 Args     :
hsplengthcodeprevnextTop
 Title    : hsplength
 Usage    : $hsp->hsplength();
 Function : returns the HSP length (including gaps)
 Returns  : (integer) HSP length
 Args     : none
positivecodeprevnextTop
 Title    : positive
 Usage    : $hsp->positive();
 Function : returns the number of positive matches (symbols in the alignment
            with a positive score)
 Returns  : (int) number of positive matches in the alignment
 Args     : none
gapscodeprevnextTop
 Title    : gaps
 Usage    : $hsp->gaps();
 Function : returns the number of gaps or 0 if none
 Returns  : (int) number of gaps or 0 if none
 Args     : none
querySeqcodeprevnextTop
 Title    : querySeq
 Usage    : $hsp->querySeq();
 Function : returns the query sequence
 Returns  : (string) the Query Sequence 
 Args     : none
sbjctSeqcodeprevnextTop
 Title    : sbjctSeq
 Usage    : $hsp->sbjctSeq();
 Function : returns the Sbjct sequence 
 Returns  : (string) the Sbjct Sequence 
 Args     : none
homologySeqcodeprevnextTop
 Title    : homologySeq
 Usage    : $hsp->homologySeq();
 Function : returns the homologous sequence 
 Returns  : (string) homologous sequence 
 Args     : none
qscodeprevnextTop
 Title    : qs
 Usage    : $hsp->qs();
 Function : returns the Query Sequence (same as querySeq)
 Returns  : (string) query Sequence 
 Args     : none
sscodeprevnextTop
 Title    : ss
 Usage    : $hsp->ss();
 Function : returns the subject sequence ( same as sbjctSeq) 
 Returns  : (string) Sbjct Sequence
 Args     : none
hscodeprevnextTop
 Title    : hs
 Usage    : $hsp->hs();
 Function : returns the Homologous Sequence (same as homologySeq ) 
 Returns  : (string) Homologous Sequence
 Args     : none
Methods code
newdescriptionprevnextTop
sub new {
    my ($class, @args) = @_;

    # workaround to make sure frame is not set before strand is
# interpreted from query/hit info
# this workaround removes the key from the hash
# so the superclass does not try and work with it
# we'll take care of setting it in this module later on
my %newargs = @args; foreach ( keys %newargs ) { if( /frame$/i ) { delete $newargs{$_}; } } # done with workaround
my $self = $class->SUPER::new(%newargs); my ($score,$bits,$match,$hsplength,$positive,$gaps,$p,$qb,$qe,$sb,$se,$qs, $ss,$hs,$qname,$sname,$qlength,$slength,$qframe,$sframe,$blasttype) = $self->_rearrange([qw(SCORE BITS MATCH HSPLENGTH POSITIVE GAPS P QUERYBEGIN QUERYEND SBJCTBEGIN SBJCTEND QUERYSEQ SBJCTSEQ HOMOLOGYSEQ QUERYNAME SBJCTNAME QUERYLENGTH SBJCTLENGTH QUERYFRAME SBJCTFRAME BLASTTYPE )],@args); $blasttype = 'UNKNOWN' unless $blasttype; $self->report_type($blasttype); # Determine strand meanings
my ($queryfactor, $sbjctfactor) = (1,0); # default
if ($blasttype eq 'BLASTP' || $blasttype eq 'TBLASTN' ) { $queryfactor = 0; } if ($blasttype eq 'TBLASTN' || $blasttype eq 'TBLASTX' || $blasttype eq 'BLASTN' ) { $sbjctfactor = 1; } # Set BLAST type
$self->{'BLAST_TYPE'} = $blasttype; # Store the aligned query as sequence feature
my $strand; if ($qe > $qb) { # normal query: start < end
if ($queryfactor) { $strand = 1; } else { $strand = undef; } $self->query( Bio::SeqFeature::Similarity->new (-start=>$qb, -end=>$qe, -strand=>$strand, -source=>"BLAST" ) ) } else { # reverse query (i dont know if this is possible, but feel free to correct)
if ($queryfactor) { $strand = -1; } else { $strand = undef; } $self->query( Bio::SeqFeature::Similarity->new (-start=>$qe, -end=>$qb, -strand=>$strand, -source=>"BLAST" ) ) } # store the aligned hit as sequence feature
if ($se > $sb) { # normal hit
if ($sbjctfactor) { $strand = 1; } else { $strand = undef; } $self->hit( Bio::SeqFeature::Similarity->new (-start=>$sb, -end=>$se, -strand=>$strand, -source=>"BLAST" ) ) } else { # reverse hit: start bigger than end
if ($sbjctfactor) { $strand = -1; } else { $strand = undef; } $self->hit( Bio::SeqFeature::Similarity->new (-start=>$se, -end=>$sb, -strand=>$strand, -source=>"BLAST" ) ) } # name the sequences
$self->query->seqname($qname); # query name
$self->hit->seqname($sname); # hit name
# set lengths
$self->query->seqlength($qlength); # query length
$self->hit->seqlength($slength); # hit length
# set object vars
$self->score($score); $self->bits($bits); $self->significance($p); $self->query->frac_identical($match); $self->hit->frac_identical($match); $self->{'HSPLENGTH'} = $hsplength; $self->{'PERCENT'} = int((1000 * $match)/$hsplength)/10; $self->{'POSITIVE'} = $positive; $self->{'GAPS'} = $gaps; $self->{'QS'} = $qs; $self->{'SS'} = $ss; $self->{'HS'} = $hs; $self->frame($qframe, $sframe); return $self; # success - we hope!
}
_overloaddescriptionprevnextTop
sub _overload {
	my $self = shift;
	return $self->start."..".$self->end." ".$self->bits;
}
report_typedescriptionprevnextTop
sub report_type {
    my ($self, $rpt) = @_;
    if($rpt) {
	$self->{'_report_type'} = $rpt;
    }
    return $self->{'_report_type'};
}
PdescriptionprevnextTop
sub P {
	my ($self, @args) = @_;
	my $float = $self->significance(@args);
	my $match = '([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?'; # Perl Cookbook 2.1
if ($float =~ /^$match$/) { # Is a C float
return $float; } elsif ("1$float" =~ /^$match$/) { # Almost C float, Jitterbug 974
return "1$float"; } else { $self->warn("[HSP::P()] '$float' is not a known number format. Returning zero (0) instead."); return 0; }
}
percentdescriptionprevnextTop
sub percent {
shift->{'PERCENT'}
}
matchdescriptionprevnextTop
sub match {
shift->query->frac_identical(@_)
}
hsplengthdescriptionprevnextTop
sub hsplength {
shift->{'HSPLENGTH'}
}
positivedescriptionprevnextTop
sub positive {
shift->{'POSITIVE'}
}
gapsdescriptionprevnextTop
sub gaps {
shift->{'GAPS'}
}
querySeqdescriptionprevnextTop
sub querySeq {
shift->{'QS'}
}
sbjctSeqdescriptionprevnextTop
sub sbjctSeq {
shift->{'SS'}
}
homologySeqdescriptionprevnextTop
sub homologySeq {
shift->{'HS'}
}
qsdescriptionprevnextTop
sub qs {
shift->{'QS'}
}
ssdescriptionprevnextTop
sub ss {
shift->{'SS'}
}
hsdescriptionprevnextTop
sub hs {
shift->{'HS'}
}
framedescriptionprevnextTop
sub frame {
    my ($self, $qframe, $sframe) = @_;
    if( defined $qframe ) {
	if( $qframe == 0 ) {
	    $qframe = undef;
	} elsif( $qframe !~ /^([+-])?([1-3])/ ) {	    
	    $self->warn("Specifying an invalid query frame ($qframe)");
	    $qframe = undef;
	} else { 
	    if( ($1 eq '-' && $self->query->strand >= 0) || 
		($1 eq '+' && $self->query->strand <= 0) ) {
		$self->warn("Query frame ($qframe) did not match strand of query (". $self->query->strand() . ")");
	    }
	    # Set frame to GFF [0-2]
$qframe = $2 - 1; } $self->{'QFRAME'} = $qframe; } if( defined $sframe ) { if( $sframe == 0 ) { $sframe = undef; } elsif( $sframe !~ /^([+-])?([1-3])/ ) { $self->warn("Specifying an invalid hit frame ($sframe)"); $sframe = undef; } else { if( ($1 eq '-' && $self->hit->strand >= 0) || ($1 eq '+' && $self->hit->strand <= 0) ) { $self->warn("Hit frame ($sframe) did not match strand of hit (". $self->hit->strand() . ")"); } # Set frame to GFF [0-2]
$sframe = $2 - 1; } $self->{'SFRAME'} = $sframe; } (defined $qframe && $self->SUPER::frame($qframe) && ($self->{'FRAME'} = $qframe)) || (defined $sframe && $self->SUPER::frame($sframe) && ($self->{'FRAME'} = $sframe)); if (wantarray() && $self->{'BLAST_TYPE'} eq 'TBLASTX') { return ($self->{'QFRAME'}, $self->{'SFRAME'}); } elsif (wantarray()) { (defined $self->{'QFRAME'} && return ($self->{'QFRAME'}, undef)) || (defined $self->{'SFRAME'} && return (undef, $self->{'SFRAME'})); } else { (defined $self->{'QFRAME'} && return $self->{'QFRAME'}) || (defined $self->{'SFRAME'} && return $self->{'SFRAME'}); }
}
General documentation
FEEDBACKTop
Mailing ListsTop
User feedback is an integral part of the evolution of this and other
Bioperl modules. Send your comments and suggestions preferably to
the Bioperl mailing list. Your participation is much appreciated.
  bioperl-l@bioperl.org            - General discussion
http://bioperl.org/MailList.shtml  - About the mailing lists
Reporting BugsTop
Report bugs to the Bioperl bug tracking system to help us keep track
of the bugs and their resolution. Bug reports can be submitted via
email or the web:
  bioperl-bugs@bioperl.org
  http://bioperl.org/bioperl-bugs/
AUTHOR - Peter SchattnerTop
Email: schattner@alum.mit.edu
CONTRIBUTORSTop
Jason Stajich, jason@bioperl.org
APPENDIXTop
The rest of the documentation details each of the object methods.
Internal methods are usually preceded with a _