idw - Informationsdienst Wissenschaft
The SIGMOD community honors the research of BIFOLD researchers Arnab Phani and Matthias Böhm. Their work on efficiently reusing intermediate computations across multi-backend machine learning systems addresses a key challenge at the intersection of data engineering and machine learning.
BIFOLD researchers Dr. Arnab Phani and Prof. Dr. Matthias Böhm have received the 2026* ACM SIGMOD Research Highlight Award for their paper "MEMPHIS: Holistic Lineage-based Reuse and Memory Management for Multi-backend ML Systems." The award will be formally presented at SIGMOD 2026.
Modern machine learning (ML) systems use different types of computing resources, such as CPUs, GPUs, and distributed platforms like Apache Spark or Ray. These systems often split ML tasks into parts that run across these resources. However, during data analysis, the same computations are often repeated, which wastes time and resources. While earlier solutions can reuse results in specific systems, it is still difficult to do this efficiently across multiple types of computing environments, especially because of challenges like limited memory, data transfer costs, and coordination between tasks.
In this paper, the researchers introduce MEMPHIS, a new framework that helps ML systems reuse previously computed results more effectively across different computing resources. At the heart of MEMPHIS is an efficient caching system that keeps track of past computations and decides when to reuse, move, or discard them, while also managing memory efficiently. To address differences between computing environments, such as delayed or parallel execution and varying memory and communication speeds, they design adaptive strategies for managing this cache. They also enhance an ML compiler so it can better schedule tasks and exchange data efficiently.
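The caching idea described above can be illustrated with a small Python sketch. It is not the MEMPHIS or Apache SystemDS implementation: the names `lineage_key` and `LineageCache` are hypothetical, the least-recently-used eviction rule stands in for the adaptive strategies described in the paper, and the decision to move results between backends is left out entirely. The sketch only shows the basic principle of identifying an intermediate result by how it was computed (its lineage) and reusing it on a repeated request.

```python
from collections import OrderedDict
import hashlib


def lineage_key(op, *input_keys):
    """Hypothetical helper: derive a key from an operation and its inputs' lineage keys."""
    raw = op + "(" + ",".join(input_keys) + ")"
    return hashlib.sha256(raw.encode()).hexdigest()


class LineageCache:
    """Toy cache: reuse on a lineage hit, discard the least recently used entry on overflow."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # lineage key -> cached intermediate
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, key, compute):
        if key in self.entries:
            self.hits += 1
            self.entries.move_to_end(key)  # mark as recently used
            return self.entries[key]
        self.misses += 1
        value = compute()
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # discard the oldest entry
        return value


cache = LineageCache(capacity=2)
x_key = lineage_key("read", "X.csv")      # assumed input name, for illustration only
scaled_key = lineage_key("scale", x_key)  # same lineage yields the same key

first = cache.get_or_compute(scaled_key, lambda: [v * 2 for v in [1, 2, 3]])
second = cache.get_or_compute(scaled_key, lambda: [v * 2 for v in [1, 2, 3]])
```

The second call does not run the computation again; it returns the cached result because the lineage key matches. A real multi-backend system would additionally have to weigh memory limits and data transfer costs per backend, which this sketch does not attempt.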
Their experiments on a wide range of ML tasks show that MEMPHIS can make systems run up to 9.6 times faster than current approaches.
MEMPHIS is fully integrated into Apache SystemDS, an open-source machine learning system for the end-to-end data science lifecycle, from data preparation and feature transformations of input data such as text, images, and tabular data, to model training. It extends LIMA, an earlier framework from the same authors for reusing intermediate results within a single in-memory environment, to three backends: standard in-memory processing, distributed computing via Apache Spark, and GPU acceleration. The paper was also honored with the Best Research Paper Award at the EDBT Conference 2025.
About the ACM SIGMOD Research Highlight Award
The ACM SIGMOD Research Highlight Award is one of the most prestigious distinctions in database research. Presented annually since 2016 by the ACM Special Interest Group on Management of Data (SIGMOD), it recognizes research that addresses an important problem, represents a definitive milestone in solving it, and demonstrates significant impact potential within and beyond the research community.
In his technical review of the MEMPHIS paper, Arun Kumar (UC San Diego; here in the role of SIGMOD Record editor) explains that the paper demonstrates how classical database principles, such as query optimization, caching, and memory management, are becoming essential for operating AI infrastructure efficiently and sustainably. This trend highlights the key role the database community plays in powering the ongoing AI boom. (Source: https://sigmodrecord.org/publications/sigmodRecord/2603/pdfs/17_memphis-kumar.pd...)
This is the fourth time that BIFOLD and its predecessor organization have received the SIGMOD Research Highlight Award. Of the three earlier awards, two were associated with Prof. Dr. Volker Markl, today Co-Director of BIFOLD, and one with Prof. Dr. Matthias Böhm, chair of BIFOLD’s DAMS research group.
Voices from the Researchers
Dr. Arnab Phani: “The repetitive nature of data science workflows involves a high degree of redundancy. We implemented the LIMA and MEMPHIS frameworks to efficiently identify and reuse redundant intermediates across diverse compute backends. We integrated all our work into Apache SystemDS to enable future research and adoption. It is deeply gratifying to see that our work has been acknowledged by the broader data management community.”
Prof. Dr. Matthias Böhm: "Arnab has done outstanding work on a problem that affects anyone running large-scale machine learning experiments. I am very pleased that the broader research community is taking notice."
About BIFOLD
The Berlin Institute for the Foundations of Learning and Data (BIFOLD) is one of Germany's six national AI research centers, founded in 2019 through the merger of the Berlin Big Data Center and the Berlin Center for Machine Learning. Based at TU Berlin and in partnership with Charité - Universitätsmedizin Berlin, BIFOLD conducts foundational research at the intersection of machine learning and large-scale data management. The institute receives permanent funding from the State of Berlin and the Federal Ministry of Research, Technology, and Space, reflecting its role as a cornerstone of Germany's long-term AI research strategy.
Dr. Arnab Phani (https://www.bifold.berlin/people/arnab-phani.html) is a postdoctoral researcher in the DEEM Lab (https://deem.berlin/), a research group headed by Sebastian Schelter at BIFOLD and Technische Universität Berlin. He received his PhD from TU Berlin, where he was a research associate in the DAMS Lab group headed by Matthias Böhm. Prior to his PhD, he was a Senior Software Engineer at Teradata Labs, Hyderabad, India.
Please read the BIFOLD researcher Spotlight (https://www.bifold.berlin/news-events/news/view/news-detail/researcher-spotlight...) if you would like to learn more about Arnab and his research.
Prof. Dr. Matthias Böhm (https://www.bifold.berlin/people/prof-dr-matthias-boehm.html) is a full professor of large-scale data engineering at BIFOLD and Technische Universität Berlin. His research group, the Big Data Engineering Lab (https://www.tu.berlin/en/dams), focuses on high-level, data science-centric abstractions, as well as on systems and tools to execute these tasks efficiently and scalably.
Links
Paper: https://openproceedings.org/2025/conf/edbt/paper-82.pdf
Code: https://github.com/apache/systemds
SIGMOD Record, Research Highlight: https://sigmodrecord.org/category/research-highlights/
* "2026 ACM SIGMOD Research Highlight Award" in accordance with the official ACM SIGMOD award listing. Some sources may use 2025 instead, reflecting the year of the selection process.
Dr. Arnab Phani
BIFOLD / DEEM Lab
arnab.phani@tu-berlin.de
MEMPHIS: Holistic Lineage-based Reuse and Memory Management for Multi-backend ML Systems. Arnab Phani, Matthias Böhm. https://openproceedings.org/2025/conf/edbt/paper-82.pdf
SIGMOD Record, technical review by Arun Kumar (UC San Diego): https://sigmodrecord.org/2026/04/01/technical-perspective-on-memphis-holistic-li...
SIGMOD Research Highlight Awards List: https://sigmod.org/sigmod-awards/sigmod-research-highlights/
Memphis paper receives 2026 ACM SIGMOD Research Highlight Award
Copyright: BIFOLD