Interpretability for machine learning and computational biology
Understanding real-world datasets is often challenging because of their size, their complexity, or limited knowledge about the problem being tackled (e.g. electronic health records, omics data). To achieve high accuracy on important tasks, correspondingly complex machine- and deep-learning models are typically used. In many situations, the decisions made by such automated systems can have significant—and potentially deleterious—consequences.
In biology and healthcare, interpretability becomes important for three main reasons.
1 Building trust
Doctors and patients need to be confident in the decisions reached by a deployed model. Providing the rationale behind a decision can make a model more trustworthy.
2 Diagnosing poor performance
A model can return unexpected predictions, possibly indicating poor performance. Interpretability can help by shedding light on the causes, such as unfair dataset bias or flawed model training.
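As a minimal sketch of this diagnostic use, a permutation-importance check can reveal when a model relies on a spurious feature rather than the genuine signal. The synthetic data, feature names, and use of scikit-learn below are illustrative assumptions, not methods described in the original text.

```python
# Hedged illustration: permutation feature importance flagging dataset bias.
# All data here are synthetic; 'batch' mimics a leaked confounder (e.g. a
# batch label correlated with the outcome) that the model can exploit.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n = 500
signal = rng.normal(size=n)                 # genuine biological signal
noise = rng.normal(size=n)                  # uninformative measurement
y = (signal > 0).astype(int)
batch = y + rng.normal(scale=0.1, size=n)   # spurious shortcut: leaked label proxy

X = np.column_stack([signal, noise, batch])
model = RandomForestClassifier(random_state=0).fit(X, y)

# Shuffle each feature in turn and measure the drop in accuracy.
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for name, imp in zip(["signal", "noise", "batch"], result.importances_mean):
    print(f"{name}: {imp:.3f}")
```

If the `batch` feature dominates the importances, the model is likely exploiting a dataset artefact rather than biology, which is exactly the kind of unexpected behaviour interpretability can surface.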
3 Generating biological hypotheses
Surprising results do not always have a negative connotation. Rather, they may arise because the trained model has exploited a true pattern in the data that is unknown even to field experts, such as an uncharacterized protein–protein interaction. Interpretable methods can potentially uncover these patterns, which can then serve as the basis for novel biological hypotheses.
Ask the experts
María Rodríguez Martínez
IBM Research scientist