BMBF is funding Sophie Burkhardt to establish a Computer Science junior research group

06/29/2020 13:13

Kathrin Voigt, Communication and Press
Johannes Gutenberg-Universität Mainz

    Junior research group will be investigating how to independently control the content and style of texts generated using artificial intelligence / German Federal Ministry of Education and Research to provide EUR 2 million in funding

    While it is relatively easy to turn a photo into a Picasso-style image using a computer, it is not yet possible to produce a text in a specific individual style, such as that of an author like Franz Kafka. The problem with texts is that style and subject matter are not necessarily related. A new research project headed by Dr. Sophie Burkhardt of the Institute of Computer Science at Johannes Gutenberg University Mainz (JGU) will be looking into exactly this problem. To this end, the project "Semantic Disentanglement: Differentiation of Style and Topic in Text Data" will receive some EUR 2 million from the German Federal Ministry of Education and Research (BMBF). The researchers intend to develop models and software to improve the automatic analysis and generation of high-quality texts. Possible areas of application involve communication between people and machines, such as in customer support and social media.

    Artificial intelligence has proven astonishingly successful in text creation. "By now, AI can produce texts that are barely distinguishable from those produced by humans," stated Burkhardt, describing the current status of the technology. However, specifying exactly what the content of a generated text should be and then separately manipulating its style is rather difficult. By disentangling or separating the styles and topics of textual data, the effects they have on the generated texts, and hence on their quality, can be enhanced. According to the computer scientist, the ideal outcome would be to transform, say, a Harry Potter novel to such an extent that it appears to be written in the style of Shakespeare. "But that is still a long way off."

    First successful steps for topic analysis of texts

    The first phases of topic analysis in complex texts have proven successful, but to date text style has not been taken into account. Initial progress towards incorporating text style could be achieved, for example, by generating a short version of a long article or summarizing it for posting on social media, or by reproducing a scientific article in simplified language or rewriting the text with another target group in mind. When it comes to influencing text style, the preliminary emphasis is being placed on the tonality of a text; a positive product review, for example, could be rewritten so that it is negative in tone. "Other, less apparent aspects of style are much more difficult to control," said Burkhardt. "Irony and sarcasm are a huge problem, especially as the system needs to understand the background knowledge involved."

    The aim of the new project sponsored by the German Federal Ministry of Education and Research (BMBF) is to combine language modeling and topic modeling techniques to create a common model that can represent both content and text style. This will require the use of state-of-the-art deep neural networks, and it will first be necessary to determine how these neural networks can best handle complicated data such as texts. Large datasets, in other words large text corpora, will also be needed to train the systems.

    Possible applications in dialog systems in the home, in customer support, or in vehicles

    Dr. Sophie Burkhardt expects that the automatic generation of high-quality texts could be of interest for many businesses and applications. For example, the newly developed methods could be used in combination with speech recognition for dialog systems in the home, in customer support, or in driving assistance systems. In the long term, this could also make media consumption more accessible if texts could be generated that are, for example, tailored specifically to the needs of blind people.

    The German Federal Ministry of Education and Research is funding the project as part of its support program for young researchers working in the field of artificial intelligence, thereby supporting the establishment of an interdisciplinary junior research group to be headed by Dr. Sophie Burkhardt. The group will receive funding of EUR 2 million over a period of four years.

    Sophie Burkhardt studied Philosophy and Computer Science at Johannes Gutenberg University Mainz and subsequently acquired a doctorate. She was awarded the Dissertation Prize by the JGU Faculty of Physics, Mathematics, and Computer Science for her dissertation on "Online Multi-label Text Classification using Topic Models". While working towards her doctorate she received a scholarship from PRIME Research in Mainz. She has contributed as lead author to a total of ten articles on the subject of topic models and text classification. Since January 2019, Sophie Burkhardt has been working as a postdoctoral researcher in the Data Mining work group at JGU led by Professor Stefan Kramer.


    Related links: – Data Mining group at the JGU Institute of Computer Science

    Read more: – press release "Computer-based weather forecast: New algorithm outperforms mainframe computer systems" (13 Feb. 2020) – press release "Carl Zeiss Foundation supports the establishment of a new research center for artificial intelligence at Mainz University" (2 Oct. 2019)

    Contact for scientific information:

    Dr. Sophie Burkhardt
    Data Mining group
    Institute of Computer Science
    Johannes Gutenberg University Mainz
    55099 Mainz, GERMANY
    phone +49 6131 39-21059

