Institute of Medical Statistics and Computational Biology, Faculty of Medicine
The Computational Biology group has its biological focus on gene regulation, mRNA metabolism, and epigenomics. We perform integrative analyses of high-dimensional biomedical omics data generated by, e.g., (single-cell) RNA-Seq, chromatin immunoprecipitation (ChIP-Seq), bisulfite sequencing, and high-throughput microscopic imaging. Our goal is to identify and reconstruct intracellular signaling networks that are active in development, aging, stress response, and disease. To this end, we develop statistical and machine learning methods, together with efficient inference algorithms for training them.
Our research: We are interested in all aspects of RNA metabolism, foremost the epigenomic control of RNA synthesis, the dynamics of nuclear RNA export, and RNA degradation. Our primary objective is the precise, genome-wide quantification of these processes. As the technology for RNA quantification is developing rapidly, we continuously adapt our statistical repertoire for data analysis. Further, we want to know how DNA- and RNA-binding proteins contribute to RNA metabolism. We construct models for the joint interpretation of multiple ChIP-Seq experiments, which shed light on the composition of the molecular complexes involved in RNA synthesis and on how they change during transcription. Recently, we have begun to analyze time-lapse microscopy images as well as magnetic resonance tomography images. We use novel neural network architectures for automated segmentation and object recognition in these images.
As a computational biology group, our main work revolves around the development of statistical and machine learning algorithms and their application to biomedical data. Specifically, we use graphical models for the detection and interpretation of gene regulatory interactions. For example, we developed Nested Effects Models, which infer a hierarchical regulatory cascade from measurements of genome-wide gene expression after targeted interventions. To learn these models, which often contain hundreds of parameters, we improve state-of-the-art optimization and inference methods, such as Markov Chain Monte Carlo and penalized linear models (and their generalizations).
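To give a feel for how a hierarchy can be read off intervention data, the following is a minimal, noise-free sketch of the subset intuition commonly associated with nested effects: if perturbing one regulator affects a proper superset of the genes affected by perturbing another, the first is placed upstream of the second. This is an illustration only, not our probabilistic model or its inference procedure. The function name `nested_effect_edges`, the regulator labels, and the toy effect sets are hypothetical.

```python
def nested_effect_edges(effects):
    """Return (upstream, downstream) pairs of perturbed regulators.

    `effects` maps each perturbed regulator to the set of genes whose
    expression changed after the intervention. A regulator is called
    upstream of another if the other's effect set is a proper subset
    of its own. Noise is ignored here (a simplifying assumption).
    """
    edges = []
    for upstream, up_genes in effects.items():
        for downstream, down_genes in effects.items():
            # `<` on Python sets tests for a proper subset.
            if upstream != downstream and down_genes < up_genes:
                edges.append((upstream, downstream))
    return sorted(edges)


# Toy data (invented for illustration): effects of three interventions.
effects = {
    "S1": {"g1", "g2", "g3", "g4"},
    "S2": {"g2", "g3"},
    "S3": {"g3"},
}

edges = nested_effect_edges(effects)
print(edges)
```

With real expression data, effects are noisy and the subset relation rarely holds exactly, which is why a probabilistic formulation and the inference methods mentioned above are needed.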
Another focus of our group is on new sequencing technologies, such as single-cell RNA-Seq, single-cell ATAC-Seq, and direct RNA sequencing (Oxford Nanopore), as well as on the detection of post-transcriptional RNA modifications.
Figure 1: The bidirectional hidden Markov model (bdHMM). Background: Multiple ChIP-Seq tracks of general transcription factors and Polymerase II CTD modifications are jointly analyzed. Foreground: In contrast to a standard HMM (middle track), the bdHMM (bottom track) automatically identifies directional processes such as transcription along genes (top track).
Figure 2: The tree hidden factor graph model (treeHFM). Left: Pre-processing: differentiating hematopoietic progenitor cells are recorded by time-lapse microscopy, followed by automated cell tracking and image feature extraction. Middle: The treeHFM is a probabilistic graphical model that learns an unbiased, automated annotation of cell types (right).
Figure 3: Comparative Dynamic Transcriptome Analysis (cDTA) of mRNA synthesis and degradation reveals a global mRNA buffering mechanism, which is mediated by Xrn1.
Top: Scatter plot showing global changes in mRNA decay rates (DRs; log fold change of median mRNA decay rates in mutant versus wild-type, x-axis) and synthesis rates (SRs; log fold change of median mRNA synthesis rates in mutant versus wild-type, y-axis) in 46 yeast deletion strains. The center of each circle is determined by the median DR and SR of the strain. Bottom: Bar plot depicting the buffering index (BI). BI is 1 when mRNA level buffering is perfect. BI is between 0 and 1 when mRNA level buffering is partial. A BI of 0 or below indicates that there is no mRNA level buffering. The only mutant in which the imbalance in mRNA levels is even exacerbated is the Xrn1 deletion strain.
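The coordinates used in the scatter plot can be sketched as follows. The example computes the log fold change of median rates in a mutant versus wild-type. The base-2 logarithm, the function name `log2_fold_of_medians`, and all rate values are assumptions made for illustration. The last step additionally assumes a simple steady-state picture in which the mRNA level is proportional to synthesis rate divided by decay rate; that assumption is not part of the caption, and the sketch does not reproduce the buffering index itself.

```python
import math
from statistics import median


def log2_fold_of_medians(mutant_rates, wild_type_rates):
    """Log2 fold change of the median rate in mutant versus wild-type."""
    return math.log2(median(mutant_rates) / median(wild_type_rates))


# Invented per-gene rates (arbitrary units), for illustration only.
wt_sr = [1.0, 2.0, 4.0]       # wild-type synthesis rates
wt_dr = [0.1, 0.2, 0.4]       # wild-type decay rates
mut_sr = [0.5, 1.0, 2.0]      # mutant synthesis rates (halved)
mut_dr = [0.05, 0.1, 0.2]     # mutant decay rates (halved)

sr_fold = log2_fold_of_medians(mut_sr, wt_sr)  # y-axis coordinate
dr_fold = log2_fold_of_medians(mut_dr, wt_dr)  # x-axis coordinate

# Assumed steady-state model: level ~ SR / DR, so the log fold change
# of the level is the difference of the two log fold changes.
level_fold = sr_fold - dr_fold

print(sr_fold, dr_fold, level_fold)
```

In this toy case, synthesis and decay rates change by the same factor, so the implied change in mRNA level is zero, which is the situation described above as perfect buffering.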