idw - Informationsdienst Wissenschaft
+++ RESEARCH TICKER UNIVERSITY OF BONN: Artificial Intelligence for Drug Design +++ In the search for new drugs, artificial intelligence in the form of diffusion models is being used in drug design. What exactly does AI do in this context? Dr. Andrea Mastropietro and Prof. Dr. Jürgen Bajorath from Life Science Informatics at the University of Bonn and the Lamarr Institute for Machine Learning and Artificial Intelligence have investigated this.
WHAT IS IT ABOUT?
Diffusion models have mainly been used for image and video generation. Recently, their use has been extended to new domains, such as chemistry, where they generate new molecules. For our analysis, we aimed at generality and set out to explain diffusion models that design linkers for molecules with different applications.
WHAT IS A “LINKER”?
A linker is a substructure of a molecule that connects two or more disconnected fragments of atoms. Linker design is an important task in drug development, as it plays a central role in the design of effective molecules with specific properties.
HOW DO DIFFUSION MODELS WORK IN PRINCIPLE?
Diffusion models learn a data distribution and generate new data by sampling from that distribution. The diffusion model itself is an advanced AI model. We try to understand its generative process.
HOW DOES “NOISE” COME INTO PLAY?
Adding and removing noise is the hallmark of diffusion models. Starting from a sample in the dataset (an image or, in our case, a molecule), they add “noise” step by step until the original sample is “destroyed”, like the transition from a detailed image to a “TV static effect.” The model then learns how the added noise needs to be removed to retrieve a valid sample, which generates a new image (or molecule).
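As a rough illustration of the noising half of this process, the following Python sketch blends toy 3D atom coordinates with Gaussian noise. It is not the model examined in the study: the linear noise schedule, its parameters, the coordinates, and the function names alpha_bar_schedule and add_noise are all assumptions chosen for illustration. The denoising half is omitted because it requires a trained neural network.

```python
import math
import random


def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Fraction of the original signal left after each step.

    Uses a linear noise schedule, which is an assumed choice here.
    """
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(coords, alpha_bar, rng):
    """Blend clean coordinates with Gaussian noise.

    Each value becomes sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps.
    """
    signal = math.sqrt(alpha_bar)
    noise = math.sqrt(1.0 - alpha_bar)
    return [
        tuple(signal * c + noise * rng.gauss(0.0, 1.0) for c in atom)
        for atom in coords
    ]


rng = random.Random(0)

# Hypothetical coordinates of three atoms (not taken from any dataset).
toy_coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.5, 0.0)]

alpha_bars = alpha_bar_schedule(1000)
slightly_noisy = add_noise(toy_coords, alpha_bars[0], rng)   # almost unchanged
nearly_static = add_noise(toy_coords, alpha_bars[-1], rng)   # almost pure noise

print("signal kept after first step:", alpha_bars[0])
print("signal kept after last step:", alpha_bars[-1])
```

After the last step almost none of the original signal remains, which corresponds to the “TV static” state mentioned above.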
HOW DID YOU PROCEED?
For our study, we selected a state-of-the-art diffusion model for linker design and developed a novel explainability strategy extending a well-known concept in the field of explainable artificial intelligence: Shapley values. For our method, DiffSHAPer, we adapted the widely used Shapley value formalism for explaining machine learning predictions to diffusion models. Our goal was to find which fragment atoms were the most influential for linker generation.
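To make the Shapley value idea concrete, here is a minimal Python sketch of the classical, exact computation for a handful of "players". It is not DiffSHAPer, and it does not involve a diffusion model: the atom labels and the scoring function toy_value are hypothetical stand-ins for "how well a linker is generated when only these fragment atoms are kept".

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values by enumerating every subset of the other players."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of p when it joins coalition s.
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


def toy_value(kept_atoms):
    """Hypothetical score: two 'anchor' atoms matter only together,
    every other atom adds a small fixed amount."""
    score = 0.0
    if "C1" in kept_atoms and "N7" in kept_atoms:
        score += 1.0
    score += 0.1 * sum(1 for a in kept_atoms if a not in ("C1", "N7"))
    return score


atoms = ["C1", "C2", "O5", "N7"]  # hypothetical fragment atom labels
phi = shapley_values(atoms, toy_value)

for atom in atoms:
    print(atom, round(phi[atom], 3))
```

In this toy setup the two anchor atoms share the joint score equally and the remaining atoms receive only their small individual contribution. Exact enumeration grows exponentially with the number of atoms, so practical methods typically rely on approximations.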
WHAT IS THE MOST IMPORTANT FINDING?
We found that, to generate chemically valid linkers, diffusion models do not learn or exploit chemical principles but instead rely mostly on distance constraints between atoms. In other words, they pick up recurrent statistical patterns in the data without learning generalizable chemical rules.
WHAT WAS THE BIGGEST CHALLENGE?
From a computational perspective, running inference and explaining the generations of diffusion models are time-consuming tasks. From a methodological perspective, our approach is new, so we had to find the best way to present our results effectively.
IS THERE AN APPLICATION?
Our methodology can be used to understand what molecular diffusion models learn. In the specific case of linker design, it’s useful to determine what drives the generation of the linker. Linkers are important in drug design, as they can improve critical molecular properties (such as potency and stability). Consequently, a linker generated solely based on distance and geometric constraints does not guarantee optimization of properties or practical chemical utility.
WHAT ARE THE NEXT STEPS?
The first step would be to apply DiffSHAPer to molecular diffusion models tailored to different tasks. Future research will be focused on the development of models able to include more chemical context in their internal reasoning.
WHAT IS THE SOURCE?
Andrea Mastropietro and Jürgen Bajorath: Explaining a molecular diffusion model, Cell Reports Physical Science, DOI: 10.1016/j.xcrp.2026.103270, URL: https://www.cell.com/cell-reports-physical-science/fulltext/S2666-3864(26)00176-...
WHERE CAN I FIND OUT MORE?
Prof. Dr. Jürgen Bajorath, Dr. Andrea Mastropietro, Bonn-Aachen International Center for Information Technology (b-it), Lamarr Institute for Machine Learning and Artificial Intelligence, Tel. +49 228 7369 100, E-Mails: bajorath@bit.uni-bonn.de, mastropietro@bit.uni-bonn.de
Isolated fragments are connected by a linker, generated by the diffusion model. DiffSHAPer rationali ...
Copyright: Image: Andrea Mastropietro
