Machine Learning to increase biotechnology-based protein production

idw – Informationsdienst Wissenschaft

05/17/2019 15:41

Dr. Susanne Stöcker, Press and Information
Paul-Ehrlich-Institut - Bundesinstitut für Impfstoffe und biomedizinische Arzneimittel

    In a research co-operation, researchers of the Paul-Ehrlich-Institut (PEI) have developed a mathematical model that allows more accurate forecasts and improved yields in the biotechnological synthesis of proteins in host organisms. The new method has a wide range of applications in biotechnology, including the development of vaccines. Scientific Reports published an article on the results in its online edition of 17 May 2019.

    Biotechnological medicinal products are frequently based on tailor-made proteins produced in cell cultures or bacteria. For this purpose, the genes carrying the information on the amino acid sequence of the desired proteins are transferred to bacterial or mammalian cells. However, this alone is often not sufficient for the transferred genes to be read to the desired extent and for the proteins encoded by them to be formed. Usually, the genes must additionally be adapted to the host cell. Among other things, this is done by adapting the code for the amino acids. A sequence of three nucleobases of the messenger RNA (mRNA), called a codon, specifies an individual amino acid; the sequence of the codons determines the amino acid sequence of the protein. Codons have to be exchanged because different organisms, i.e. cell systems, prefer different codons for one and the same amino acid. The reason for this is not yet fully understood. Codon adaptation has therefore so far relied on heuristic approaches.
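
    The kind of heuristic codon adaptation described above can be illustrated with a minimal sketch. It assumes a lookup table of codon preferences for a hypothetical host; the weights below are invented for illustration and cover only three amino acids, and the function names are not taken from any published tool. Each codon is simply replaced by the host's most preferred codon for the same amino acid, so the encoded protein stays the same.

```python
# Illustrative, made-up codon preference weights for a hypothetical host.
# Real work would use a measured codon usage table for the actual host organism.
HOST_CODON_WEIGHT = {
    "L": {"CUG": 0.50, "UUA": 0.13, "CUU": 0.10},  # leucine (subset of its codons)
    "K": {"AAA": 0.74, "AAG": 0.26},               # lysine
    "E": {"GAA": 0.68, "GAG": 0.32},               # glutamic acid
}

# Reverse lookup: codon -> one-letter amino acid, built from the table above.
CODON_TO_AA = {
    codon: aa for aa, codons in HOST_CODON_WEIGHT.items() for codon in codons
}


def split_codons(mrna):
    """Split an mRNA string into consecutive three-base codons."""
    if len(mrna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return [mrna[i:i + 3] for i in range(0, len(mrna), 3)]


def translate(mrna):
    """Return the amino acid sequence encoded by the mRNA."""
    return "".join(CODON_TO_AA[codon] for codon in split_codons(mrna))


def adapt_codons(mrna):
    """Replace every codon by the host's most preferred synonymous codon."""
    adapted = []
    for codon in split_codons(mrna):
        options = HOST_CODON_WEIGHT[CODON_TO_AA[codon]]
        adapted.append(max(options, key=options.get))
    return "".join(adapted)


original = "UUAAAGGAG"
optimised = adapt_codons(original)
# The nucleotide sequence changes, the encoded protein does not.
assert translate(original) == translate(optimised)
```

    Such a rule looks at each codon in isolation, which is one reason why a purely frequency-based heuristic may miss effects that arise from the dynamics of translation.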

    How can suitable optimisation steps be predicted more reliably? In a research co-operation supported by the Adolf-Messer Foundation with researchers of the Max Planck Institute for Colloids and Interfaces, Potsdam, and the Goethe University at Frankfurt/Main, researchers working with Dr Jan-Hendrik Trösemeier and Dr Christel Kamp of the Biostatistics Section, Division of Microbiology, at the Paul-Ehrlich-Institut studied protein expression using the so-called codon-specific elongation model (COSEM). In this model, mathematical methods are used to simulate the dynamics of protein synthesis (translation) in the respective cells, and a codon-specific protein synthesis rate is derived from this.
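
    To give a feel for what simulating codon-specific elongation can mean, the following toy sketch follows a single ribosome along an mRNA, with each codon translated after a random waiting time whose rate depends on the codon. This is not COSEM itself: the rates are invented, and ribosome queueing, initiation and termination are all ignored. It only shows how codon choice can change the time needed to synthesise one protein.

```python
import random

# Hypothetical elongation rates in codons per second; the values are invented.
ELONGATION_RATE = {"CUG": 20.0, "UUA": 4.0, "AAA": 15.0, "AAG": 8.0}


def expected_translation_time(codons):
    """Analytic mean time: the sum of mean waiting times 1/rate per codon."""
    return sum(1.0 / ELONGATION_RATE[c] for c in codons)


def simulate_translation_time(codons, rng):
    """One stochastic run: exponential waiting time at every codon."""
    return sum(rng.expovariate(ELONGATION_RATE[c]) for c in codons)


def mean_simulated_time(codons, runs=5000, seed=0):
    """Average over many independent single-ribosome runs."""
    rng = random.Random(seed)
    total = sum(simulate_translation_time(codons, rng) for _ in range(runs))
    return total / runs


# Two synonymous sequences (leucine-lysine repeats) with different codon choices.
fast = ["CUG", "AAA"] * 10
slow = ["UUA", "AAG"] * 10

print("fast codons:", mean_simulated_time(fast), "s on average")
print("slow codons:", mean_simulated_time(slow), "s on average")
```

    A model that also tracks several ribosomes on the same mRNA would additionally capture traffic jams behind slow codons, which this single-ribosome sketch cannot show.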

    Using the data from this simulation, and taking into account additional predictors of protein output, the researchers used machine learning methods to derive the so-called protein expression score. This score serves to forecast the protein output and to optimise the codons of genes that are expressed in foreign cells (heterologous expression). In various model organisms, the researchers demonstrated that their simulation-based optimisation method was superior to conventional methods. With this newly developed modular model, not only can the protein output be increased, but further optimisations are also possible, e.g. improving the accuracy of translation.
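
    The idea of combining a simulated quantity with further predictors into a single score can be sketched as follows. The sketch is an assumption-laden stand-in: a plain linear model fitted by gradient descent replaces whatever learning method the researchers actually used, the two predictors are hypothetical, and the training data are synthetic values generated from a known rule rather than measurements.

```python
def predict(weights, features):
    """Linear score: bias plus weighted sum of the predictors."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))


def fit_linear_score(samples, targets, learning_rate=0.1, epochs=5000):
    """Fit the weights by batch gradient descent on the mean squared error."""
    n = len(samples)
    weights = [0.0] * (len(samples[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(samples, targets):
            err = predict(weights, x) - y
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        weights = [w - learning_rate * 2.0 * g / n for w, g in zip(weights, grad)]
    return weights


# Hypothetical predictors per gene variant, both scaled to [0, 1]:
# (simulated synthesis rate, a second sequence-derived predictor).
TRAIN_X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5), (0.2, 0.8)]
# Synthetic "measured" output, generated from a known rule so the fit can be checked.
TRAIN_Y = [1.0 + 2.0 * a - 1.0 * b for a, b in TRAIN_X]

weights = fit_linear_score(TRAIN_X, TRAIN_Y)

# Rank two hypothetical candidate sequences by their predicted score.
candidates = {"variant_a": (0.9, 0.1), "variant_b": (0.3, 0.7)}
best = max(candidates, key=lambda name: predict(weights, candidates[name]))
print("preferred variant:", best)
```

    Once such a score exists, codon optimisation can be framed as a search for the synonymous sequence with the highest predicted score, and de-optimisation as a search for a low one.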

    The algorithm is implemented in dedicated software and permits the user-defined optimisation of genes described above. It can also be used in the reverse direction, for de-optimisation. What is the purpose of this? Among other things, de-optimising genes can be used to genetically modify and attenuate pathogens. Such attenuation is used in vaccine development: live vaccines are derived from the original pathogens and genetically modified in such a way that, although they trigger an immune response in humans, they replicate only to a limited extent and are therefore no longer able to cause disease.

    This new approach to codon optimisation has led to a patent application.

    Original publication:

    Trösemeier JH, Rudorf S, Loessner H, Hofner B, Reuter A, Schulenborg T, Koch I, Bekeredjian-Ding I, Lipowsky R, Kamp C (2019): Optimizing the dynamics of protein expression. Sci Rep May 17 [Epub ahead of print].

    More information: press release of the Paul-Ehrlich-Institut


    The codon-specific elongation model (COSEM) simulates protein synthesis.
