Activity Number: 272
Type: Topic Contributed
Date/Time: Wednesday, August 14, 2002 : 8:30 AM to 10:20 AM
Sponsor: Section on Statistical Computing*

Abstract - #301374

Title: HMMSTR: A Hidden Markov Model for Protein Local Structure Motifs
Author(s): Christopher Bystroff*+ and Vesteinn Thorsson and David Baker
Affiliation(s): Rensselaer Polytechnic Institute and University of Washington and University of Washington
Address: Troy, New York, 12180, USA
Keywords: protein folding ; machine learning ; peptide structure ; bioinformatics ; sequence grammar ; gene finding

Abstract:
HMMSTR is a hidden Markov model for general protein sequence based on the I-sites library of sequence-structure motifs. Unlike profile HMMs, used to model individual protein families, HMMSTR has a highly branched topology. Markov state pathways capture recurrent local features of protein sequences and structures that transcend protein family boundaries. The model extends the I-sites library by describing the adjacencies of different sequence-structure motifs as observed in the database, and achieves a great reduction in parameters by representing overlapping motifs in a much more compact form. The HMM attributes a considerably higher probability to coding sequence than does an equivalent dipeptide model; predicts secondary structure with an accuracy of 74.6% and backbone torsion angles better than any previously reported method; and predicts the structural context of beta strands and turns with an accuracy that should be useful for tertiary structure prediction.
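The comparison in the abstract between the probability an HMM assigns to a sequence and the probability assigned by a dipeptide model can be illustrated with a small sketch. The code below is not HMMSTR: the two-state topology, the two-letter alphabet (H for hydrophobic, P for polar), and every parameter value are hypothetical toy choices, and reading "dipeptide model" as a first-order Markov chain over residues is an assumption. It shows only how the forward algorithm yields log P(sequence) under an HMM and how that value can be set against a first-order chain.

```python
import math

def hmm_log_prob(seq, start, trans, emit):
    """Scaled forward algorithm: returns log P(seq) summed over all state paths."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    states = range(len(start))
    # Initial step: start probability times emission of the first symbol.
    alpha = [start[s] * emit[s][seq[0]] for s in states]
    log_p = 0.0
    for t in range(len(seq)):
        if t > 0:
            # Propagate through transitions, then emit the current symbol.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in states) * emit[s][seq[t]]
                for s in states
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        # Accumulate the log of each scaling factor; rescale to avoid underflow.
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_p

def markov_log_prob(seq, initial, cond):
    """First-order chain: log of P(x1) * product of P(x_t | x_{t-1})."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    log_p = math.log(initial[seq[0]])
    for prev, cur in zip(seq, seq[1:]):
        log_p += math.log(cond[prev][cur])
    return log_p

# Hypothetical toy parameters (not taken from HMMSTR or the I-sites library).
START = [0.5, 0.5]
TRANS = [[0.9, 0.1], [0.2, 0.8]]
EMIT = [{"H": 0.8, "P": 0.2}, {"H": 0.3, "P": 0.7}]
INITIAL = {"H": 0.5, "P": 0.5}
COND = {"H": {"H": 0.6, "P": 0.4}, "P": {"H": 0.4, "P": 0.6}}

def log_odds(seq):
    """Positive when the toy HMM assigns seq a higher probability than the chain."""
    return hmm_log_prob(seq, START, TRANS, EMIT) - markov_log_prob(seq, INITIAL, COND)
```

Calling log_odds("HHHPPP") returns the log-ratio of the two toy models for that sequence; a positive value means the HMM attributes the higher probability. Both functions define proper distributions over sequences of a fixed length, so summing exp(log P) over all sequences of that length gives 1 for each model.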