State-of-the-art AI programs can support the development of drugs by predicting how proteins interact with small molecules. However, researchers at the University of Basel have shown that these programs only memorize patterns, rather than understanding physical relationships. They often fail when it comes to new proteins that would be of particular interest for innovative drugs.
Proteins play a key role not only in the body, but also in medicine: they either serve as active ingredients, such as enzymes or antibodies, or they are target structures for drugs. The first step in developing new therapies is therefore usually to decipher the three-dimensional structure of proteins.
For a long time, elucidating protein structures was a highly complex endeavor, until machine learning found its way into protein research. AI models with names such as AlphaFold or RoseTTAFold have ushered in a new era: they calculate how the chain of protein building blocks, known as amino acids, folds into a three-dimensional structure. In 2024, the developers of these programs received the Nobel Prize in Chemistry.
Suspiciously high success rate
The latest versions of these programs go one step further: they calculate how the protein in question interacts with another molecule – a docking partner or “ligand”, as experts call it. This could be an active pharmaceutical ingredient, for example.
“This possibility of predicting the structure of proteins together with a ligand is invaluable for drug development,” says Professor Markus Lill from the University of Basel. Together with his team at the Department of Pharmaceutical Sciences, he researches methods for designing active pharmaceutical ingredients.
However, the apparently high success rates of these structural predictions puzzled Lill and his team, especially as only around 100,000 already elucidated protein structures together with their ligands are available for training the AI models, which is relatively few compared to other AI training data sets. “We wanted to find out whether these AI models really learn the basics of physical chemistry using the training data and apply them correctly,” says Lill.
Same prediction for significantly altered binding sites
The researchers modified the amino acid sequence of hundreds of sample proteins in such a way that the binding sites for their ligands exhibited a completely different charge distribution or were even blocked entirely. Nevertheless, the AI models predicted the same complex structure – as if binding were still possible. The researchers pursued a similar approach with the ligands: they modified them in such a way that they would no longer be able to dock to the protein in question. This did not bother the AI models either.
In more than half of the cases, the models predicted the structure as if the changes to the amino acid sequence had never been made. “This shows us that even the most advanced AI models do not really understand why a drug binds to a protein; they only recognize patterns that they have seen before,” says Lill.
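As a rough illustration of how "predicting the same structure" might be quantified, the following Python sketch compares the ligand position predicted for an original protein with the one predicted for its modified variant. It is not the procedure used in the study: the function names, the toy coordinates, and the 2 Å cutoff (a common convention for judging docking poses) are assumptions made here, and both predictions are assumed to be already superimposed in the same coordinate frame.

```python
import math


def ligand_rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered lists of (x, y, z) atom positions."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def fraction_unchanged(pose_pairs, threshold=2.0):
    """Share of (original, modified) pose pairs whose ligand RMSD stays below the threshold.

    The threshold is in angstroms; 2.0 is an assumed cutoff, not a value from the study.
    """
    unchanged = sum(1 for orig, mod in pose_pairs if ligand_rmsd(orig, mod) < threshold)
    return unchanged / len(pose_pairs)


# Hypothetical toy data: a two-atom "ligand" predicted for an original protein.
original_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
# Case 1: the model places the ligand identically despite the modified binding site.
same_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
# Case 2: the model moves the ligand by 3 angstroms along x.
shifted_pose = [(3.0, 0.0, 0.0), (4.5, 0.0, 0.0)]

pairs = [(original_pose, same_pose), (original_pose, shifted_pose)]
share = fraction_unchanged(pairs)
```

A high share of unchanged poses after a modification that should prevent binding would point to the kind of pattern recall the researchers describe, rather than to a prediction that responds to the altered chemistry.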
Unknown proteins are particularly difficult
The AI models faced particular difficulties if the proteins did not show any similarity to the training data sets. “When they see something completely new, they quickly fall short, but that is precisely where the key to new drugs lies,” emphasizes Markus Lill.
AI models should therefore be viewed with caution when it comes to drug development. It is important to validate the predictions of the models using experiments or computer-aided analyses that actually take the physicochemical properties into account. The researchers also used these methods to examine the results of the AI models in the course of their study.
“The better solution would be to integrate the physicochemical laws into future AI models,” says Lill. With their more realistic structural predictions, these could then provide a better basis for the development of new drugs, especially for protein structures that have so far been difficult to elucidate, and would open up the possibility of completely new therapeutic approaches.
Prof. Dr. Markus A. Lill, University of Basel, Department of Pharmaceutical Sciences, tel. +41 61 207 61 35, email: markus.lill@unibas.ch
Matthew R. Masters, Amr H. Mahmoud, Markus A. Lill
Investigating whether deep learning models for co-folding learn the physics of protein-ligand interactions
Nature Communications (2025), doi: 10.1038/s41467-025-63947-5
https://doi.org/10.1038/s41467-025-63947-5