Protein function prediction via graph kernels

From Proteinfunction.net

Computational approaches to protein function prediction usually infer the function of a protein by finding proteins with similar sequence, structure, surface clefts, chemical properties, amino acid motifs, interaction partners or phylogenetic profiles. Combining several of these information sources, rather than relying on only one or two, can be expected to improve prediction performance.
Here, Karsten M. Borgwardt et al. present a method that encodes a protein as a graph using sequence, structural and chemical information.



They model proteins as attributed, undirected graphs. Nodes represent secondary structure elements within the protein structure. An edge connects two nodes if the corresponding elements are neighbors along the amino acid sequence or neighbors in space within the protein structure. Each node carries a type label, stating whether it represents a helix, sheet or turn, as well as physical and chemical information, namely the hydrophobicity, van der Waals volume, polarity and polarizability of the secondary structure element it represents.
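The graph model described above can be sketched as a small data structure. This is a minimal illustration, not the authors' implementation: the class names `SSENode` and `ProteinGraph`, the edge kinds `"sequential"` and `"structural"`, and all numeric attribute values are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

SSE_TYPES = ("helix", "sheet", "turn")
EDGE_KINDS = ("sequential", "structural")  # hypothetical names for the two edge criteria


@dataclass(frozen=True)
class SSENode:
    """One secondary structure element with its type label and attributes."""
    sse_type: str
    hydrophobicity: float
    vdw_volume: float
    polarity: float
    polarizability: float

    def __post_init__(self):
        if self.sse_type not in SSE_TYPES:
            raise ValueError(f"unknown SSE type: {self.sse_type}")


@dataclass
class ProteinGraph:
    """Attributed, undirected graph over secondary structure elements."""
    nodes: List[SSENode] = field(default_factory=list)
    # Undirected edges keyed by (smaller index, larger index); a pair of
    # nodes may be neighbors in sequence, in space, or both.
    edges: Dict[Tuple[int, int], Set[str]] = field(default_factory=dict)

    def add_node(self, node: SSENode) -> int:
        self.nodes.append(node)
        return len(self.nodes) - 1

    def add_edge(self, u: int, v: int, kind: str) -> None:
        if kind not in EDGE_KINDS:
            raise ValueError(f"unknown edge kind: {kind}")
        if u == v:
            raise ValueError("self-loops are not allowed")
        key = (min(u, v), max(u, v))
        self.edges.setdefault(key, set()).add(kind)

    def neighbors(self, u: int) -> List[int]:
        out = set()
        for a, b in self.edges:
            if a == u:
                out.add(b)
            elif b == u:
                out.add(a)
        return sorted(out)


# Toy protein with three elements; attribute values are made up.
g = ProteinGraph()
h1 = g.add_node(SSENode("helix", 0.4, 95.0, 8.1, 0.15))
s1 = g.add_node(SSENode("sheet", 0.7, 88.0, 7.2, 0.12))
t1 = g.add_node(SSENode("turn", 0.1, 60.0, 9.5, 0.08))
g.add_edge(h1, s1, "sequential")  # adjacent along the sequence
g.add_edge(s1, t1, "sequential")
g.add_edge(h1, t1, "structural")  # close in 3D space
```

Storing each undirected edge under a sorted index pair means that adding the same edge in either direction updates a single entry, which keeps the graph undirected by construction.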

Once the graphs are generated, a kernel is needed to measure the similarity between two protein graphs. They used the random walk graph kernel as a starting point and created a modified version called the "protein graph kernel". The protein graph kernel is a combination of step kernels, and each step kernel consists of several sub-kernels, such as a type kernel, a length kernel and a node labels kernel.
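To make the idea of a walk-based kernel built from step kernels concrete, here is a simplified sketch. It is not the exact kernel defined by the authors. The following are assumptions made for illustration: the specific forms of the sub-kernels (an exact-match type kernel, a clipped linear length kernel, a Gaussian node labels kernel), their combination by multiplication, the `decay` weighting of longer walks, the interpretation of "length" as the number of residues in an element, and all function names and toy values.

```python
import math


def type_kernel(a, b):
    # 1 if both elements have the same type (helix, sheet, turn), else 0.
    return 1.0 if a == b else 0.0


def length_kernel(a, b, c=10.0):
    # Assumed form: similarity falls linearly with the length difference
    # and is clipped at 0 once the difference reaches c residues.
    return max(0.0, 1.0 - abs(a - b) / c)


def label_kernel(x, y, sigma=1.0):
    # Gaussian kernel on the physico-chemical attribute vectors.
    sq = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))


def node_kernel(n1, n2):
    return (type_kernel(n1["type"], n2["type"])
            * length_kernel(n1["length"], n2["length"])
            * label_kernel(n1["labels"], n2["labels"]))


def walk_kernel(g1, g2, max_steps=3, decay=0.5):
    """Sum, over all pairs of walks with 1..max_steps steps, of the product
    of step kernels; a step kernel compares the start nodes and the end
    nodes of one step in each graph."""
    n1, n2 = len(g1["nodes"]), len(g2["nodes"])
    kn = [[node_kernel(a, b) for b in g2["nodes"]] for a in g1["nodes"]]
    # w[u][v]: accumulated similarity of walk pairs starting at (u, v).
    w = [[1.0] * n2 for _ in range(n1)]
    total = 0.0
    for step in range(1, max_steps + 1):
        new_w = [[0.0] * n2 for _ in range(n1)]
        for u in range(n1):
            for v in range(n2):
                if kn[u][v] == 0.0:
                    continue
                acc = 0.0
                for u2 in g1["adj"][u]:
                    for v2 in g2["adj"][v]:
                        acc += kn[u][v] * kn[u2][v2] * w[u2][v2]
                new_w[u][v] = acc
        w = new_w
        total += decay ** step * sum(map(sum, w))
    return total


# Toy graphs: node lists plus adjacency lists; all values are made up.
g_a = {"nodes": [{"type": "helix", "length": 12, "labels": (0.4, 0.1)},
                 {"type": "sheet", "length": 5, "labels": (0.7, 0.3)},
                 {"type": "helix", "length": 9, "labels": (0.5, 0.2)}],
       "adj": {0: [1], 1: [0, 2], 2: [1]}}
g_b = {"nodes": [{"type": "helix", "length": 10, "labels": (0.45, 0.1)},
                 {"type": "sheet", "length": 6, "labels": (0.6, 0.3)}],
       "adj": {0: [1], 1: [0]}}
g_c = {"nodes": [{"type": "turn", "length": 3, "labels": (0.1, 0.9)},
                 {"type": "turn", "length": 4, "labels": (0.2, 0.8)}],
       "adj": {0: [1], 1: [0]}}

k_ab = walk_kernel(g_a, g_b)
k_ac = walk_kernel(g_a, g_c)
```

In this sketch `g_a` and `g_b` share helix-sheet steps and receive a positive similarity, while `g_a` and `g_c` have no element types in common, so every step kernel vanishes and the kernel value is zero.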

In this way, they defined a new graph kernel method and evaluated its performance using Enzyme Commission (EC) numbers as functional labels.

Reference

Karsten M. Borgwardt et al., Protein function prediction via graph kernels, Bioinformatics (2005)