In 2003, the Human Genome Project (HGP) succeeded in mapping more than 90% of the human genome for the first time. Subsequently, the 1000 Genomes Project (2007–2015) provided initial insights into the diversity of the human genome. Now, ten years after the conclusion of that project, an international research consortium has re-analysed the samples using state-of-the-art techniques, in collaboration with Professor Dr Tobias Marschall, bioinformatician at Heinrich Heine University Düsseldorf (HHU). The findings are presented in two publications in the same issue of the scientific journal Nature. They show the diversity of the human genome at an unprecedented level of depth.
Building on the HGP, the 1000 Genomes Project set out to sequence the genomes of a cross-section of the global population. By the time the project concluded in 2015, that goal had been exceeded: data from 2,500 people from five continents and 26 populations had been collated. Both projects have made significant contributions toward understanding the human genome.
Ten years after the conclusion of the project, an international research group involving the team headed by Professor Marschall (Institute of Medical Biometry and Bioinformatics) has now examined the human genome in more detail. In the studies now published in the scientific journal Nature, genomes collected within the framework of the 1000 Genomes Project were re-analysed using advanced technologies that were not available in 2015.
The new aspect: At the time of the 1000 Genomes Project, sequencing, i.e. determining the sequence of letters in the human genome, was based largely on short DNA segments, which were not sufficient to assemble a complete genome. The new long-read sequencing methods now permit a much more detailed analysis of the genomes. These technologies supply the sequences of longer DNA segments in one piece, making it easier to identify genetic differences between individuals.
These so-called genetic variants can occur in various forms, such as differences of one or several base pairs – the letters – of the DNA sequence. However, they can also be more extensive, e.g. where longer DNA segments are deleted, inverted, repeated or added in certain individuals. These instances are also referred to as structural variants and they play an important role in the development of a variety of genetic diseases, including rare and as yet unexplained genetic syndromes.
The pangenome: The mapping of the human genome
In 2023, the Human Pangenome Reference Consortium (HPRC), in which Professor Marschall was also involved, published a draft “pangenome reference”, i.e. a map of human genetic diversity, based on 47 individuals. The goal is for this draft to replace the reference genome used to date. The new study data will also contribute toward this.
In the first of the studies now published (Schloissnig, Pani, et al.), 1,019 genomes were sequenced, making this cohort more than 20 times larger than the 47 individuals behind the HPRC draft. This new, significantly larger reference dataset proves particularly helpful when studying structural variants that occur less frequently in the population. “Having variants in a diverse cohort of healthy individuals is essential for gaining a better understanding of which variants in the genomes of patients can be the cause of the conditions they are suffering from,” says Professor Dr Dagmar Wieczorek (Institute of Human Genetics at HHU), who was also involved in the study.
The second study (Logsdon, Ebert, Audano, Loftus, et al.) also expands existing knowledge of the human genome. However, the focus here was not on the number of genomes, but rather on sequencing the genomes as completely as possible. A total of 65 samples, which are also part of the 1000 Genomes Project, were examined using highly sophisticated sequencing methods. The researchers were able to reconstruct complete sequences (known as “T2T” or “telomere-to-telomere”) for 1,161 chromosomes (39%). “This is particularly noteworthy, as human chromosomes can contain hundreds of millions of base pairs and it was only a few years ago that the first ever complete reconstruction of an individual genome was achieved,” says the HHU bioinformatician Professor Dr Alexander Dilthey (research team leader at the Institute of Medical Microbiology and Hospital Hygiene), who was also involved in the study.
Furthermore, the complete genomes have now made it possible to understand certain regions such as centromeres, which have not been accessible using conventional methods. Centromeres are the points at which the two chromatids are linked during cell division – they form the familiar X shape. Research into the significance and consequences of genetic variants in centromeres has been limited to date. The new study now enables follow-up investigation of their influence e.g. on immune disorders and cancers.
Professor Dr Jan Korbel from the European Molecular Biology Laboratory (EMBL) in Heidelberg, co-author of both papers, believes that the simultaneous publication of the two studies is a particular success: “Although the first study uses less sophisticated sequencing methods, it is based on a much larger cohort, while the second study is based on a smaller cohort but uses more advanced sequencing methods. This enables us to gain extremely robust and precise insights into the variation of our genomes.”
A new resource for genome research worldwide
Professor Marschall also emphasises that the findings of the two studies not only supply important insights, but also significantly increase the amount of data available, which will have a positive impact on research in the long term. “These studies establish an extensive and medically relevant resource, which can now be used by researchers all over the world to gain a better understanding of the mutational mechanisms driving human genome variation,” says Professor Marschall. “This is an outstanding example of collaborative research and open science, which opens up new perspectives in genome research and represents a step toward a more complete knowledge of the human genome. I am confident that, on the basis of these important findings, we will be able to identify many links between structural genetic variants and disease risks in the future.”
The new datasets have been made publicly available to researchers all over the world for analysis and use.
In addition to HHU and EMBL, research facilities worldwide were involved in the two studies.
• Research facilities involved in the examination of the 1,019 datasets included the Research Institute of Molecular Pathology Vienna (IMP), the Centre for Genomic Regulation Barcelona (CRG) and Pompeu Fabra University in Barcelona (UPF).
• Alongside HHU and EMBL, institutions involved in the sequencing of the 65 genome datasets included the University of Washington School of Medicine (USA), the Jackson Laboratory for Genomic Medicine (USA), Clemson University (USA), the University of Connecticut (USA) and the University of Pennsylvania (USA).
Prof. Dr. Tobias Marschall
“Structural Variation in 1,019 Diverse Humans based on Long-Read Sequencing”
S. Schloissnig, S. Pani, B. Rodriguez-Martin, J. Ebler, C. Hain, V. Tsapalou, A. Söylev, P. Hüther, H. Ashraf, T. Prodanov, M. Asparuhova, H. Magalhães, W. Höps, J. Sotelo-Fonseca, T. Fitzgerald, W. Santana-Garcia, R. Moreira-Pinhal, S. Hunt, F. J. Pérez-Llanos, T. Wollenweber, S. Sivalingam, D. Wieczorek, M. Cáceres, C. Gilissen, E. Birney, Z. Ding, J. Jensen, N. Podduturi, J. Stutzki, B. Rodriguez-Martin, T. Rausch, T. Marschall, J. Korbel. Nature 2025.
DOI: 10.1038/s41586-025-09290-7
“Complex genetic variation in nearly complete human genomes”
G. Logsdon, P. Ebert, P. Audano, M. Loftus, D. Porubsky, J. Ebler, F. Yilmaz, P. Hallast, T. Prodanov, D. Yoo, C. Paisie, W. Harvey, X. Zhao, G. V. Martino, M. Henglin, K. Munson, K. Rabbani, C.-S. Chin, B. Gu, H. Ashraf, S. Scholz, O. Austine-Orimoloye, P. Balachandran, M. Bonder, H. Cheng, Z. Chong, J. Crabtree, M. Gerstein, L. Guethlein, P. Hasenfeld, G. Hickey, K. Hoekzema, S. Hunt, M. Jensen, Y. Jiang, S. Koren, Y. Kwon, C. Li, H. Li, J. Li, P. J. Norman, K. Oshima, B. Paten, A. Phillippy, N. Pollock, T. Rausch, M. Rautiainen, Y. Song, A. Söylev, A. Sulovari, L. Surapaneni, V. Tsapalou, W. Zhou, Y. Zhou, Q. Zhu, M. Zody, R. Mills, S. Devine, X. Shi, M. Talkowski, M. Chaisson, A. Dilthey, M. Konkel, J. Korbel, C. Lee, C. Beck, E. Eichler, T. Marschall. Nature 2025.
DOI: 10.1038/s41586-025-09140-6
https://www.nature.com/articles/s41586-025-09290-7
https://www.nature.com/articles/s41586-025-09140-6
The studies analyzed genome data sets from five continents and 26 populations. This way it became po ...
Copyright: Siegfried Schloissnig / HHU – Berit Meisenkothen