PIMKL, Computational systems biology, IBM Research Zurich

Stable and interpretable molecular signatures

Reliable identification of molecular biomarkers is essential for accurate patient stratification. Although state-of-the-art machine learning approaches for classification continue to push the boundaries of performance, most of these methods cannot integrate different data types and generalize poorly, which limits their application in clinical settings. Furthermore, they behave as black boxes and provide limited insight into the mechanisms that lead to a prediction.

Pathway-induced multiple kernel learning (PIMKL) is a novel, interpretable methodology that reliably classifies samples while highlighting the molecular mechanisms responsible for the prediction of a phenotype of interest. PIMKL exploits prior knowledge, in the form of a molecular interaction network and annotated gene sets, by optimizing a mixture of pathway-induced kernels with a multiple kernel learning (MKL) algorithm. The model provides a stable molecular signature that is interpretable in light of the ingested prior knowledge and can be reused in transfer learning tasks.
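
As a rough illustration of the idea described above, the following Python sketch builds one kernel per gene set from the subnetwork that the gene set induces, then mixes the kernels with non-negative weights. It is a minimal sketch under stated assumptions, not the PIMKL implementation: the kernel form (sample profiles multiplied through the normalized graph Laplacian of the induced subnetwork), the toy data, and all function and variable names are illustrative assumptions, and a simple kernel-target alignment heuristic stands in for the MKL algorithm.

```python
import numpy as np


def normalized_laplacian(adjacency):
    """Symmetric normalized Laplacian I - D^{-1/2} A D^{-1/2} of an undirected graph.

    Nodes with zero degree keep a diagonal entry of 1 in this sketch.
    """
    degree = adjacency.sum(axis=1)
    inv_sqrt = np.zeros_like(degree, dtype=float)
    connected = degree > 0
    inv_sqrt[connected] = 1.0 / np.sqrt(degree[connected])
    scaled = inv_sqrt[:, None] * adjacency * inv_sqrt[None, :]
    return np.eye(len(degree)) - scaled


def pathway_kernel(X, adjacency, gene_indices):
    """Gram matrix K = X_S L_S X_S^T for one gene set S (assumed kernel form)."""
    idx = np.asarray(gene_indices)
    # Restrict the interaction network to the genes of this pathway.
    laplacian = normalized_laplacian(adjacency[np.ix_(idx, idx)])
    X_sub = X[:, idx]
    return X_sub @ laplacian @ X_sub.T


def alignment_weights(kernels, y):
    """Non-negative mixture weights from kernel-target alignment.

    This heuristic is a stand-in for an MKL solver, not the method used by PIMKL.
    """
    target = np.outer(y, y)
    scores = []
    for K in kernels:
        denom = np.linalg.norm(K) * np.linalg.norm(target)
        score = np.sum(K * target) / denom if denom > 0 else 0.0
        scores.append(max(score, 0.0))
    scores = np.array(scores)
    if scores.sum() == 0:
        # Fall back to a uniform mixture when no kernel aligns with the labels.
        return np.full(len(kernels), 1.0 / len(kernels))
    return scores / scores.sum()


# Toy data (hypothetical): 6 samples, 5 genes, binary phenotype labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 5))
y = np.array([1, 1, 1, -1, -1, -1])

# Hypothetical undirected interaction network over the 5 genes.
adjacency = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)]:
    adjacency[i, j] = adjacency[j, i] = 1.0

# Hypothetical annotated gene sets, given as column indices into X.
gene_sets = {"pathway_a": [0, 1, 2], "pathway_b": [2, 3, 4]}

kernels = [pathway_kernel(X, adjacency, genes) for genes in gene_sets.values()]
weights = alignment_weights(kernels, y)
combined = sum(w * K for w, K in zip(weights, kernels))

for name, w in zip(gene_sets, weights):
    print(f"{name}: weight {w:.3f}")
```

In this sketch the per-pathway weights play the role of the molecular signature: a larger weight suggests that the corresponding pathway contributes more to separating the phenotypes. The combined Gram matrix could then be passed to any classifier that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`.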

PIMKL schema

Reference

“PIMKL: Pathway-Induced Multiple Kernel Learning,”
Matteo Manica et al.,
npj Systems Biology and Applications 5(1), 8, Nature Publishing Group, 2019.

Web service

Interpretable molecular signatures from phenotype prediction.

Source code

Pathway-induced multiple kernel learning for computational biology (PIMKL)

Source code

Matrix-induced multiple kernel learning (MIMKL)

Questions?

Ask the experts

Matteo Manica
IBM Research Scientist

Joris Cadow
IBM Data Scientist

Funding

This research is funded by the PrECISE EU project.