The CECAD Bioinformatics Facility analyzes high-throughput molecular data from model systems and clinical cases of aging-associated diseases. To this end, we take a data integration approach that combines multiple OMICS data types with phenotypic and clinical data. Our scientific information systems for high-throughput data analysis and integration have substantially lowered the analytical hurdles posed by emerging high-throughput technologies, making these technologies more accessible to bench scientists and clinicians. The Bioinformatics Facility uses high-performance computing (HPC) infrastructure, databases and web resources operated in collaboration with the Regional Computing Center of the University of Cologne (RRZK).
Services: The Bioinformatics Facility offers Next-Generation Sequencing (NGS) analyses using state-of-the-art methods, with the aim of delivering publication-ready results within a few days. Our analyses rely on QuickNGS, a highly scalable NGS analysis system developed in-house (Wagle et al., 2015). It currently comprises analysis pipelines for transcriptomics (RNA-Seq), epigenomics (ChIP-Seq) and whole-genome sequencing (WGS), as well as solutions for WGS and whole-exome sequencing (WXS) of tumor samples. Integrated analyses of the transcriptome and epigenome are performed when both data types are available. Downstream analyses of the results and assistance with the publication process are provided upon individual request.
Scientific achievements: The highly scalable QuickNGS analysis pipelines developed by the Bioinformatics Facility have already been used to process approximately 4000 samples in analyses of the genome, transcriptome, or epigenome. Through these analyses, the group has made significant contributions to many widely recognized papers (see Publications). QuickNGS has been adopted by scientists in more than 15 countries worldwide to set up local NGS analysis infrastructures. Currently, the system is undergoing a substantial transformation into an integrated multi-OMICS analysis platform capable of comprehensive analyses of several orthogonal NGS data types from the same tissue of origin.
Future plans: The Bioinformatics Facility will further expand its efforts in NGS data integration in order to make multi-OMICS sequencing a standard analysis in the near future. In addition, we will strengthen our efforts in applying complex mathematical models to the analysis of regulatory circuits in aging-associated diseases. We also plan to apply methods of big data analytics and artificial intelligence to the large-scale NGS data volumes processed by our platforms. This will push science at CECAD further into the digital age and benefit research and health beyond the cluster as well.
Figure 1: The QuickNGS framework developed by the facility combines a broad collection of widely adopted Next-Generation Sequencing (NGS) data analysis software tools into a professional high-end analysis system, enabling rapid in-depth analysis of tens or hundreds of samples from NGS experiments with extremely low hands-on time. The lab operators only need to enter the sample information into a MySQL database; everything else runs fully automatically, and the results are delivered through a convenient web interface for easy user access.
Figure 2: The lab operates an in-house software package that trains models of molecular regulatory networks by taking advantage of the large and ever-increasing RNA-Seq/ChIP-Seq databases behind the QuickNGS framework. Integrating these networks with additional a priori knowledge from public databases has yielded a very powerful tool for downstream analysis of RNA-Seq data from the labs within CECAD.
The figure shows a sub-network of a trained miRNA regulatory network in Caenorhabditis elegans enriched for genes involved in lifespan regulation and aging.