Prof. Dr. Alexander Dilthey

Head of Bioinformatics Core Facility


 +49 221 478 84017

CECAD Bioinformatics Facility
Joseph-Stelzmann-Str. 26
50931 Cologne

Prof. Dr. Alexander Dilthey and the CECAD Bioinformatics Core Facility team support the bioinformatics analysis needs of the CECAD community with state-of-the-art analysis algorithms and genomic data science.
Prof. Dilthey also leads the Genome Informatics research group, which focuses on bioinformatics methods development and population-scale genomic analyses.

Our research: The Genome Informatics Group is working on translating advances in DNA sequencing technology into biological insight and novel diagnostic approaches. To achieve this, we're using computational genomics and genome informatics; we develop new approaches for the analysis of sequencing data and apply these to large datasets (sometimes comprising tens or hundreds of thousands of individuals). In close collaboration with our colleagues, we're working on novel approaches that leverage the power of scalable DNA sequencing technologies and algorithms for the rapid and comprehensive interrogation of clinical samples. Current research projects include the genetic characterization and genomic epidemiology of the novel coronavirus SARS-CoV-2; de novo assembly and population reference graphs; and high-throughput immunogenetics in some of the world's largest human genome cohorts.

Our successes:

  • We have pioneered the use of genome graph approaches for the representation and analysis of variation in human genomes [6, 8, 9].
  • We have developed some of the most accurate and cost-effective methods for the analysis of metagenomic communities [4] and large cohorts of bacterial isolates [2].
  • We were among the first groups to use modern tools of genomic epidemiology to characterize the genetic structure of the novel coronavirus SARS-CoV-2 in Germany and remain one of the leading SARS-CoV-2 sequencing groups in Germany [1].
  • Our algorithms for statistical and sequence-based HLA typing have enabled the immunogenetic characterization of some of the world’s largest human genetics cohorts [3, 8, 10].

Our methods/techniques: Our research combines methods from statistical genetics, sequence analysis (including alignment and sketch-based algorithms), and genomic data science.


Figure 1: Minimum spanning tree and genomic neighbourhood analysis of 44 unambiguously resolved SARS-CoV-2 sequences from North Rhine-Westphalia, including 8 isolates from the Heinsberg district. Dashed and solid edges without adjacent numbers indicate distances of 0 and 1, respectively; all other distances are shown explicitly. The genomic neighbourhood analysis is based on GISAID data. Source: Walker et al. [1].

Figure 2: Sequence homology between HLA-A, -B and -C. Graphs visualizing sequence homology between HLA-A, -B and -C across exons 2 (left) and 3 (right), based on an IMGT/HLA-provided multiple sequence alignment (MSA) of 3284 -A, 4077 -B and 2799 -C alleles. Source: Dilthey et al. [8].