Stable and interpretable molecular signatures
Reliable identification of molecular biomarkers is essential for accurate patient stratification. Although state-of-the-art machine learning approaches for classification continue to push boundaries in terms of performance, most of these methods cannot integrate different data types and generalize poorly, which limits their application in clinical settings. Furthermore, they behave as black boxes and provide limited insight into the mechanisms that lead to a prediction.
Pathway-induced multiple kernel learning (PIMKL) is a novel, interpretable methodology that reliably classifies samples while highlighting the molecular mechanisms responsible for the prediction of a phenotype of interest. PIMKL exploits prior knowledge, in the form of a molecular interaction network and annotated gene sets, by optimizing a mixture of pathway-induced kernels with a multiple kernel learning (MKL) algorithm. The model provides a stable molecular signature, interpretable in light of the ingested prior knowledge, that can be reused in transfer learning tasks.
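The idea of a mixture of pathway-induced kernels can be illustrated with a minimal sketch. It rests on several assumptions that are not stated on this page: each kernel is induced by the symmetric normalized Laplacian of the sub-network spanned by one gene set, and the mixture weights come from a simple kernel-target alignment heuristic, swapped in here for a full MKL solver. All names (normalized_laplacian, pathway_kernel, alignment_weights), the toy network, and the random data are hypothetical and are not taken from the PIMKL code.

```python
import numpy as np


def normalized_laplacian(adjacency):
    """Symmetric normalized Laplacian of an undirected graph.

    Isolated nodes get an all-zero row and column.
    """
    degree = adjacency.sum(axis=1)
    connected = degree > 0
    inv_sqrt = np.zeros_like(degree, dtype=float)
    inv_sqrt[connected] = 1.0 / np.sqrt(degree[connected])
    scaled = inv_sqrt[:, None] * adjacency * inv_sqrt[None, :]
    return np.diag(connected.astype(float)) - scaled


def pathway_kernel(X, adjacency, gene_idx):
    """Kernel between samples, induced by one pathway's sub-network (assumed form)."""
    sub_network = adjacency[np.ix_(gene_idx, gene_idx)]
    laplacian = normalized_laplacian(sub_network)
    X_pathway = X[:, gene_idx]
    return X_pathway @ laplacian @ X_pathway.T


def alignment_weights(kernels, y):
    """Heuristic mixture weights from kernel-target alignment (not a full MKL solver)."""
    target = np.outer(y, y)
    scores = []
    for K in kernels:
        norm = np.linalg.norm(K) * np.linalg.norm(target)
        scores.append(max(np.sum(K * target) / norm, 0.0) if norm > 0 else 0.0)
    scores = np.array(scores)
    total = scores.sum()
    # Fall back to uniform weights if no kernel aligns with the labels.
    return scores / total if total > 0 else np.full(len(kernels), 1.0 / len(kernels))


# Toy interaction network over 6 genes (hypothetical).
adjacency = np.array(
    [
        [0, 1, 1, 0, 0, 0],
        [1, 0, 1, 0, 0, 0],
        [1, 1, 0, 1, 0, 0],
        [0, 0, 1, 0, 1, 1],
        [0, 0, 0, 1, 0, 1],
        [0, 0, 0, 1, 1, 0],
    ],
    dtype=float,
)
# Two hypothetical annotated gene sets, given as column indices.
pathways = {"pathway_a": [0, 1, 2], "pathway_b": [3, 4, 5]}

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 6))                  # 8 samples x 6 genes
y = np.array([1, 1, 1, 1, -1, -1, -1, -1])   # binary phenotype labels

kernels = [pathway_kernel(X, adjacency, idx) for idx in pathways.values()]
weights = alignment_weights(kernels, y)
combined = sum(w * K for w, K in zip(weights, kernels))

for name, w in zip(pathways, weights):
    print(f"{name}: weight {w:.3f}")
```

In this sketch the per-pathway weights play the role of the molecular signature: a larger weight marks a pathway whose kernel agrees more closely with the phenotype labels. The combined kernel could then be passed to any kernel classifier that accepts a precomputed kernel matrix.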
Reference
“PIMKL: Pathway-Induced Multiple Kernel Learning,”
Matteo Manica et al.,
npj Systems Biology and Applications 5(1), 8, Nature Publishing Group, 2019.
Interpretable molecular signatures from phenotype prediction.
Pathway-induced multiple kernel learning for computational biology (PIMKL)
Matrix-induced multiple kernel learning (MIMKL)
Questions?
Ask the experts
Matteo Manica
IBM Research Scientist
Joris Cadow
IBM Data Scientist
Funding
This research is funded by the PrECISE EU project.