Master’s student

Modeling 3D genome structure from Hi-C data with deep learning

Ref. 2020_032

Project description

A plethora of studies during the past decade have highlighted that the 3D structure of the genome influences critical cell functions, such as DNA replication, gene expression, cell fate decisions and differentiation [Bonev & Cavalli 2016]. To determine how chromatin is folded within the nucleus, a number of experimental techniques are employed, among which chromosome conformation capture methods, notably Hi-C, are gaining popularity [Lieberman-Aiden et al. 2009]. Chromatin interactions captured by Hi-C are represented as a contact matrix, where each entry gives the frequency of interactions between a pair of genomic bins in a population of cells [Dekker et al. 2013]. One of the main applications of Hi-C involves building realistic 3D models of chromatin structure from the extracted contact matrices. Numerous methods have been proposed in past years; however, they suffer from important limitations in terms of underlying assumptions, low resolution or scalability. To address this, early efforts in our group resulted in REACH-3D (REcurrent Autoencoders for CHromatin 3D structure prediction) [Cristescu et al. 2018], a novel deep-learning approach based on autoencoders with recurrent neural units that infers an ensemble of structures from a Hi-C matrix.

The proposed project involves developing a deep-learning framework for inferring 3D chromatin structure. To achieve this, the student will extend our prior REACH-3D model by exploiting recent developments in attention mechanisms [Bahdanau et al. 2016]. Alternative neural-network architectures based on graph neural networks [Veličković et al. 2018] or transformers [Vaswani et al. 2017] will also be considered. The goal of the Master’s thesis is not only to infer the 3D genome structure, but also to provide feedback on the chromatin contacts driving genome folding in a location-specific manner. To test the methods, publicly available datasets will be exploited [HiC 2018], and the inferred structures will be benchmarked against established methods. Focus will be placed not only on the accuracy of the results, but also on the scalability of the methods with respect to the genome size and the Hi-C resolution.

Requirements

We invite applications from ETH Master’s students with a background in Computer Science, Computational Biology/Bioinformatics or related fields. The ideal candidate should have a solid background in machine learning, deep learning and data analysis. Strong programming skills in Python and practical experience with at least one deep-learning framework (TensorFlow, PyTorch, Keras) are essential. Prior knowledge of molecular biology is not a prerequisite.

Diversity

IBM is committed to diversity at the workplace. With us you will find an open, multicultural environment. Excellent flexible working arrangements enable all genders to strike the desired balance between their professional development and their personal lives.

How to apply

Interested candidates are welcome to submit an application including CV and transcript of grades.

B. Bonev and G. Cavalli,
“Organization and function of the 3D genome,”
Nature Reviews Genetics 17(11), 661, 2016,
doi: 10.1038/nrg.2016.112.

E. Lieberman-Aiden et al.,
“Comprehensive Mapping of Long-Range Interactions Reveals Folding Principles of the Human Genome,”
Science 326(5950), 289–293, 2009,
doi: 10.1126/science.1181369.

J. Dekker, M.A. Marti-Renom, and L.A. Mirny,
“Exploring the three-dimensional organization of genomes: interpreting chromatin interaction data,”
Nature Reviews Genetics 14(6), 390–403, 2013,
doi: 10.1038/nrg3454.

B.-C. Cristescu et al.,
“Inference of the three-dimensional chromatin structure and its temporal behavior,”
arXiv preprint arXiv:1811.09619, 2018.

D. Bahdanau, K. Cho, and Y. Bengio,
“Neural Machine Translation by Jointly Learning to Align and Translate,”
arXiv:1409.0473 [cs, stat], May 2016, Accessed: May 26, 2020. [Online].

P. Veličković et al.,
“Graph Attention Networks,”
arXiv:1710.10903 [cs, stat], Feb. 2018, Accessed: May 26, 2020. [Online].

A. Vaswani et al.,
“Attention Is All You Need,”
arXiv:1706.03762 [cs], Dec. 2017, Accessed: May 26, 2020. [Online].


“Hi-C Data Browser.” (accessed Mar. 06, 2018).