Stable and interpretable molecular signatures

Reliable identification of molecular biomarkers is essential for accurate patient stratification. Although state-of-the-art machine learning approaches for classification continue to improve predictive performance, most of them cannot integrate different data types and generalize poorly, which limits their application in clinical settings. Furthermore, they behave as black boxes and provide limited insight into the mechanisms that lead to a prediction.

Pathway-induced multiple kernel learning (PIMKL) is a novel, interpretable methodology that reliably classifies samples while highlighting the molecular mechanisms responsible for the prediction of a phenotype of interest. PIMKL exploits prior knowledge in the form of a molecular interaction network and annotated gene sets by optimizing a mixture of pathway-induced kernels using a multiple kernel learning (MKL) algorithm. The model provides a stable molecular signature, interpretable in light of the ingested prior knowledge, that can be further used in transfer learning tasks.
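To make the idea of a mixture of pathway-induced kernels more concrete, here is a minimal Python sketch on toy data. It rests on assumptions that are not stated in the text above: each kernel is taken to be K = X_p L_p X_p^T, where L_p is the normalized Laplacian of the interaction sub-network restricted to one gene set, and the kernel weights come from a simple kernel-target alignment heuristic that stands in for a proper MKL solver. The network, the gene sets, the data, and all function names are hypothetical.

```python
import numpy as np

def normalized_laplacian(adjacency):
    """Symmetric normalized Laplacian D^{-1/2} (D - A) D^{-1/2}."""
    degree = adjacency.sum(axis=1)
    inv_sqrt = np.zeros_like(degree, dtype=float)
    connected = degree > 0
    inv_sqrt[connected] = 1.0 / np.sqrt(degree[connected])
    laplacian = np.diag(degree) - adjacency
    return inv_sqrt[:, None] * laplacian * inv_sqrt[None, :]

def pathway_kernel(X, adjacency, gene_idx):
    """Kernel induced by one gene set (assumed form: X_p L_p X_p^T)."""
    sub_adjacency = adjacency[np.ix_(gene_idx, gene_idx)]
    L_p = normalized_laplacian(sub_adjacency)
    X_p = X[:, gene_idx]
    return X_p @ L_p @ X_p.T

def alignment_weights(kernels, y):
    """Heuristic weights from kernel-target alignment.

    This is a stand-in for an MKL algorithm, not the method itself.
    """
    target = np.outer(y, y)
    scores = np.array([
        np.sum(K * target) / (np.linalg.norm(K) * np.linalg.norm(target) + 1e-12)
        for K in kernels
    ])
    scores = np.clip(scores, 0.0, None)
    total = scores.sum()
    if total == 0:
        return np.full(len(kernels), 1.0 / len(kernels))
    return scores / total

# Toy interaction network over 6 hypothetical genes (undirected edges).
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5)]
adjacency = np.zeros((6, 6))
for i, j in edges:
    adjacency[i, j] = adjacency[j, i] = 1.0

# Two hypothetical annotated gene sets ("pathways").
pathways = {"pathway_a": [0, 1, 2], "pathway_b": [3, 4, 5]}

# Toy expression matrix (8 samples x 6 genes) and binary phenotype labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 6))
y = np.array([1, 1, 1, 1, -1, -1, -1, -1])

kernels = [pathway_kernel(X, adjacency, idx) for idx in pathways.values()]
weights = alignment_weights(kernels, y)

# The weighted sum is the combined kernel; the weights act as the signature.
K_combined = sum(w * K for w, K in zip(weights, kernels))
signature = dict(zip(pathways, weights))
```

In this sketch the weight vector plays the role of the molecular signature: each entry indicates how much one gene set contributes to the combined kernel. `K_combined` could then be passed to any classifier that accepts a precomputed kernel.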

PIMKL schema


“PIMKL: Pathway-Induced Multiple Kernel Learning,”
Matteo Manica et al.,
npj Systems Biology and Applications 5(1), 8, Nature Publishing Group, 2019.


Interpretable molecular signatures from phenotype prediction.


Ask the experts
Matteo Manica
IBM Research Scientist

Joris Cadow
IBM Data Scientist


This research is funded by the PrECISE EU project
