Supporting Website for the Paper:
Discriminative Motif Finding for Subcellular Localization based on Profile Hidden Markov Models

Tien-ho Lin, Robert F. Murphy, and Ziv Bar-Joseph

Compartments and HMM motifs

Abstract

Knowing the subcellular location of proteins is important for understanding their functions. Many methods have been described to predict subcellular location from sequence information. However, most of these methods either rely on global sequence properties or use a set of known protein targeting motifs to predict protein localization. Here we develop and test a novel method that identifies potential targeting motifs using a discriminative approach based on hidden Markov models (discriminative HMMs). These models search for motifs that are present in a compartment but absent in other, nearby compartments by utilizing a hierarchical structure that mimics the protein sorting mechanism. We show that both discriminative motif finding and the hierarchical structure improve localization prediction on a benchmark dataset of yeast proteins. The motifs identified can be mapped to known targeting motifs, and they are more conserved than the average protein sequence. Using our motif-based predictions, we can identify what we believe are annotation errors in public databases for the location of some of the proteins.

Prediction of all yeast proteins

Software download

Please email Tien-ho Lin with any questions.