ProtoNet method

From Proteinfunction.net

Given a set of proteins (taken from UniProt), ProtoNet aims to organize them into a hierarchy of trees, each tree representing a biologically related group of proteins and its division into functional subgroups.
To construct ProtoNet, Noam Kaplan et al. used three phases: (1) all-against-all BLAST, (2) hierarchical agglomerative clustering, and (3) pruning.
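
The clustering phase can be illustrated with a minimal sketch. It assumes that the all-against-all BLAST results have already been turned into a pairwise dissimilarity (smaller means more similar) and uses plain average linkage as the merging rule. Both are assumptions made for illustration, since this page does not state which merging rule ProtoNet applies, and the pruning phase is left out. The protein names and scores are hypothetical.

```python
from itertools import combinations


def average_linkage(a, b, dist):
    """Mean pairwise dissimilarity between the members of clusters a and b."""
    total = sum(dist[frozenset((x, y))] for x in a for y in b)
    return total / (len(a) * len(b))


def agglomerate(items, dist):
    """Repeatedly merge the two closest clusters; return the merges in order."""
    clusters = [frozenset([item]) for item in items]
    merges = []
    while len(clusters) > 1:
        # Pick the pair of current clusters with the smallest linkage value.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], dist),
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        merges.append((a, b))
    return merges


# Hypothetical dissimilarities standing in for scores derived from BLAST.
toy_dist = {
    frozenset(("P1", "P2")): 0.1,
    frozenset(("P3", "P4")): 0.2,
    frozenset(("P1", "P3")): 0.8,
    frozenset(("P1", "P4")): 0.9,
    frozenset(("P2", "P3")): 0.7,
    frozenset(("P2", "P4")): 0.9,
}

merges = agglomerate(["P1", "P2", "P3", "P4"], toy_dist)
for left, right in merges:
    print(sorted(left), "+", sorted(right))
```

Each merge corresponds to an internal node of the tree: the two merged clusters are its children, and the final merge is the root.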


Figure. Graphical representation of the BLAST E-value matrix data of ProtoNet cluster A429475

[How to use ProtoNet to infer annotation]
A new sequence is first localized to an existing ProtoNet cluster. Once it is localized, its function can be inferred from its relative position in the hierarchy.
For this to work, functional annotations must be assigned to each cluster in advance. The authors assigned to each cluster those annotations of its member proteins that satisfy the following two conditions:
(1) the annotation is shared by at least 75% of the proteins in the cluster and (2) the annotation achieves a P-value < 0.001 under the assumption that the annotations are distributed hypergeometrically. Once the clusters have been assigned annotations, the new sequence receives the annotations of the cluster to which it belongs and the annotations of all of that cluster's parents in the hierarchy.
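
The two conditions and the inheritance step can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the P-value is taken to be the upper tail of a hypergeometric distribution whose population is every protein in the data set, and the cluster names, annotation terms and protein identifiers are all hypothetical.

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """P(X >= k) when n proteins are drawn without replacement from N,
    of which K carry the annotation."""
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return hits / comb(N, n)


def cluster_annotations(members, annotations, background,
                        min_share=0.75, max_p=0.001):
    """Return the annotations that satisfy both conditions for one cluster."""
    N, n = len(background), len(members)
    kept = set()
    for term in {t for m in members for t in annotations.get(m, ())}:
        k = sum(term in annotations.get(m, ()) for m in members)
        K = sum(term in annotations.get(p, ()) for p in background)
        # Condition (1): share in the cluster; condition (2): enrichment.
        if k / n >= min_share and hypergeom_tail(k, n, K, N) < max_p:
            kept.add(term)
    return kept


def inherited_annotations(cluster, parent, cluster_terms):
    """Union of the annotations of a cluster and of all its ancestors."""
    terms = set()
    while cluster is not None:
        terms |= cluster_terms.get(cluster, set())
        cluster = parent.get(cluster)
    return terms


# Hypothetical data set: 40 proteins, the first 8 form the cluster.
background = [f"P{i}" for i in range(40)]
members = background[:8]
annotations = {p: set() for p in background}
for i in list(range(7)) + [8]:                   # 7 of 8 members, 8 overall
    annotations[f"P{i}"].add("kinase")
for i in list(range(6)) + list(range(8, 22)):    # 6 of 8 members, 20 overall
    annotations[f"P{i}"].add("membrane")

print(cluster_annotations(members, annotations, background))

# Hypothetical hierarchy: leaf -> mid -> root.
parent = {"leaf": "mid", "mid": "root", "root": None}
cluster_terms = {"leaf": {"kinase"}, "mid": {"transferase"}, "root": set()}
print(sorted(inherited_annotations("leaf", parent, cluster_terms)))
```

In this toy data both terms meet the 75% threshold, but only "kinase" is rare enough in the background to pass the P-value cutoff; "membrane" is carried by half of all proteins, so finding it in 6 of 8 members is not significant.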

References

  1. Ori Sasson, Noam Kaplan, and Michal Linial. Functional annotation prediction: All for one and one for all. Protein Sci 2006, 15: 1557-1562
  2. Noam Kaplan, Moriah Friedlich, Menachem Fromer, and Michal Linial. A functional hierarchical organization of the protein sequence space. BMC Bioinformatics 2004, 5: 196
  3. Noam Kaplan et al. ProtoNet 4.0: A hierarchical classification of one million protein sequences. Nucleic Acids Research 2005, 33 (Database issue): D216-D218