idw – Informationsdienst Wissenschaft

23.01.2025 12:55

Artificial intelligence in biomedicine: A key to analyzing millions of individual cells

Julia Rinner, Corporate Communications Center
Technische Universität München

    Our bodies are made up of around 75 trillion cells. But what function does each individual cell perform, and how greatly do a healthy person’s cells differ from those of someone with a disease? Drawing conclusions requires analyzing and interpreting enormous quantities of data, and machine learning methods are applied for this purpose. Researchers at the Technical University of Munich (TUM) and Helmholtz Munich have now evaluated self-supervised learning as a promising approach for analyzing 20 million cells or more.

    In recent years, researchers have made considerable progress with single-cell technology. It makes it possible to investigate tissue at the level of individual cells and to readily determine the functions of the individual cell types. Such analyses can be used, for instance, to compare diseased cells with healthy ones and find out how smoking, lung cancer or a COVID infection changes individual cell structures in the lung.

    At the same time, the analysis is generating ever-increasing quantities of data. The researchers intend to apply machine learning methods to support the process of re-interpreting existing datasets, deriving conclusive statements from the patterns and applying the results to other areas.

    Self-supervised learning as a new approach

    Fabian Theis holds the Chair of Mathematical Modelling of Biological Systems at TUM. With his team, he has investigated whether self-supervised learning is better suited to the analysis of large data quantities than other methods. The study was recently published in Nature Machine Intelligence. This form of machine learning works with unlabelled data: no classified sample data are required, so the data do not need to be assigned to particular groups in advance. Unlabelled data are available in large quantities and permit the robust representation of enormous data volumes.

    Self-supervised learning is based on two methods. In masked learning – as the name suggests – a portion of the input data is masked and the model is trained to reconstruct the missing elements. In addition, the researchers apply contrastive learning, in which the model learns to bring similar data together and to separate dissimilar data.

    The team applied both methods of self-supervised learning to more than 20 million individual cells and compared the results with those of classical learning methods. In their assessment of the different methods, the researchers focused on tasks such as predicting cell types and reconstructing gene expression.
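
    The two training objectives described above can be illustrated with a minimal sketch. It is not the study's implementation: the function names, the toy expression values, the 50% masking fraction, the use of zero as the mask value and the InfoNCE-style form of the contrastive loss are all assumptions chosen for illustration.

```python
import math
import random


def mask_expression(profile, mask_fraction, rng):
    """Hide a random subset of genes by setting them to 0.0 (assumed mask value).

    Returns the masked copy and the sorted indices of the hidden genes.
    """
    n_masked = max(1, int(len(profile) * mask_fraction))
    hidden = sorted(rng.sample(range(len(profile)), n_masked))
    masked = list(profile)
    for i in hidden:
        masked[i] = 0.0
    return masked, hidden


def masked_reconstruction_loss(profile, prediction, hidden):
    """Mean squared error, computed only over the hidden genes."""
    return sum((profile[i] - prediction[i]) ** 2 for i in hidden) / len(hidden)


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def contrastive_loss(anchor, positive, negatives, temperature=0.5):
    """InfoNCE-style loss: small when the anchor is close to the positive
    example and far from the negative examples."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Masked learning on a toy expression profile (hypothetical values).
rng = random.Random(0)
cell = [5.0, 0.0, 2.0, 7.0, 1.0, 3.0]
masked_cell, hidden = mask_expression(cell, mask_fraction=0.5, rng=rng)
perfect = masked_reconstruction_loss(cell, cell, hidden)       # ideal model
naive = masked_reconstruction_loss(cell, masked_cell, hidden)  # predicts zeros

# Contrastive learning on toy 2-D embeddings (hypothetical values).
similar = contrastive_loss([1.0, 0.0], [0.9, 0.1], [[-1.0, 0.0]])
dissimilar = contrastive_loss([1.0, 0.0], [-1.0, 0.0], [[0.9, 0.1]])
```

    In this toy setting a model that restores the hidden genes exactly reaches a reconstruction loss of zero, and the contrastive loss is lower when the paired example really is the similar one. In practice a neural network would produce the predictions and embeddings.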

    Prospects for the development of virtual cells

    The results of the study show that self-supervised learning improves performance especially in transfer tasks – that is, when analyzing smaller datasets informed by insights from a larger auxiliary dataset. The results of zero-shot cell predictions – in other words, tasks performed without task-specific fine-tuning of the pre-trained model – are also promising. The comparison between masked and contrastive learning shows that masked learning is better suited for applications with large single-cell datasets.
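
    Zero-shot evaluation is often implemented by labelling new cells directly from a frozen, pre-trained representation, for example with a nearest-neighbour vote among annotated reference cells. The sketch below assumes that setup; the press release does not describe the study's evaluation procedure, and the embeddings, cell-type labels and the `knn_predict` helper are hypothetical.

```python
import math
from collections import Counter


def knn_predict(query, reference, labels, k=3):
    """Label a query embedding by majority vote among its k nearest
    reference embeddings (Euclidean distance). No model weights are updated."""
    order = sorted(range(len(reference)),
                   key=lambda i: math.dist(query, reference[i]))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical embeddings, as a frozen pre-trained encoder might produce them.
reference_embeddings = [
    [0.0, 0.1], [0.2, 0.0], [0.1, 0.2],   # annotated as "T cell"
    [5.0, 5.1], [5.2, 4.9], [4.9, 5.0],   # annotated as "B cell"
]
reference_labels = ["T cell", "T cell", "T cell", "B cell", "B cell", "B cell"]

prediction_a = knn_predict([0.1, 0.1], reference_embeddings, reference_labels)
prediction_b = knn_predict([5.0, 5.0], reference_embeddings, reference_labels)
```

    The quality of such predictions depends entirely on how well the pre-trained representation already separates cell types, which is why this kind of test is informative about the pre-training itself.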

    The researchers are using the data to work on the development of virtual cells: comprehensive computer models that reflect the diversity of cells in different datasets. Such models are promising for analyzing the cellular changes seen in diseases, for example. The results of the study offer valuable insights into how such models could be trained more efficiently and further optimized.


    Scientific contact:

    Prof. Fabian Theis
    Technical University of Munich
    Chair of Mathematical Modelling of Biological Systems
    theis@mytum.de


    Original publication:

    Richter, T., Bahrami, M., Xia, Y. et al. Delineating the effective use of self-supervised learning in single-cell genomics. Nat Mach Intell (2024). https://doi.org/10.1038/s42256-024-00934-3


    Further information:

    https://www.tum.de/en/news-and-events/all-news/press-releases/details/a-key-to-a...


    Images

    Fabian Theis, Professor of Mathematical Modelling of Biological Systems
    Astrid Eckert / TUM
    © Astrid Eckert, München (Free for use in reporting on TUM, with the copyright noted)


    Attributes of this press release:
    Journalists
    Information technology, Mathematics, Medicine
    transregional
    Research results, Scientific publications
    English


     
