idw – Informationsdienst Wissenschaft

04.02.2025 11:38

How languages make compromises: More complex languages seem to be more efficient

Dr. Annette Trabold, Press and Public Relations
Leibniz-Institut für Deutsche Sprache (Leibniz Institute for the German Language)

    How do languages balance the richness of their structures with the need for efficient communication? To investigate, researchers at the Leibniz Institute for the German Language (IDS) in Mannheim, Germany, trained computational language models on more than 6,500 documents in over 2,000 languages. They found that languages that are computationally harder to process compensate for this increased complexity with greater efficiency: more complex languages need fewer symbols to encode the same message. The analyses also reveal that larger language communities tend to use more complex but more efficient languages.

    Language models are computer algorithms that learn to process and generate language by analysing large amounts of text. They excel at identifying patterns without relying on predefined rules, making them valuable tools for linguistic research. Importantly, not all models are the same: their internal architectures vary, shaping how they learn and process language. These differences allow researchers to compare languages in new ways and uncover insights into linguistic diversity.

    In a new study, researchers at the IDS trained language models on a dataset of over 6,500 documents in more than 2,000 languages, covering almost 3 billion words. The texts included religious writings, legal documents, movie subtitles, newspaper articles, and more. The researchers estimated how difficult it is for the computational models to process or produce text, and used this estimate as a measure of language complexity. “We trained very different language models on this textual material,” says co-author Sascha Wolfer. “Some simple models only consider the last two words, which limits their ability to capture grammatical patterns over long distances. Others, such as transformers (similar to ChatGPT), use advanced mechanisms to analyse complex dependencies and uncover richer linguistic structures.”
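
    To make the idea of "difficulty of prediction" concrete, the following is a minimal Python sketch of the simplest kind of model mentioned above, one that looks only at the last two words. It is an illustration, not the study's actual method: the function name `trigram_bits_per_word`, the `alpha` parameter, and the use of add-alpha smoothing are all assumptions made for this example. The sketch reports the average number of bits the model needs per word; a higher value means the text is harder for this model to predict.

```python
import math
from collections import Counter


def trigram_bits_per_word(train_tokens, test_tokens, alpha=1.0):
    """Average bits per word under a toy trigram model.

    Hypothetical helper for illustration only. Each word is predicted
    from the two preceding words, with add-alpha smoothing (an assumed
    choice, not taken from the study).
    """
    if len(test_tokens) < 3:
        raise ValueError("need at least three test tokens")

    # Vocabulary size for smoothing: every word seen in either text.
    vocab_size = len(set(train_tokens) | set(test_tokens))

    # Count how often each two-word context and each trigram occurs.
    context_counts = Counter()
    trigram_counts = Counter()
    for i in range(2, len(train_tokens)):
        context = (train_tokens[i - 2], train_tokens[i - 1])
        context_counts[context] += 1
        trigram_counts[context + (train_tokens[i],)] += 1

    # Sum the surprisal (-log2 p) of each test word given its context.
    total_bits = 0.0
    predicted = 0
    for i in range(2, len(test_tokens)):
        context = (test_tokens[i - 2], test_tokens[i - 1])
        numerator = trigram_counts[context + (test_tokens[i],)] + alpha
        denominator = context_counts[context] + alpha * vocab_size
        total_bits -= math.log2(numerator / denominator)
        predicted += 1
    return total_bits / predicted


if __name__ == "__main__":
    # Toy data invented for this example, not from the study's corpus.
    train = "the cat sat on the mat".split() * 20
    familiar = "the cat sat on the mat".split() * 2
    scrambled = "mat the on sat cat the".split() * 2
    print("familiar word order: ", trigram_bits_per_word(train, familiar))
    print("scrambled word order:", trigram_bits_per_word(train, scrambled))
```

    In this toy setting, the scrambled text costs the model more bits per word than the familiar one. The study applies the same kind of reasoning at a far larger scale and with much more capable models.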

    Surprisingly, the results were consistent: despite significant architectural differences, the models produced remarkably similar rankings of language complexity. “If one language is harder to process than another for one model in one corpus, this relationship holds across other models, text types, and even if the model operates on a different symbolic level, e.g. characters instead of words,” explains co-author Peter Meyer. “These findings suggest that the results may not only reflect computational effort but could also offer insights into the intrinsic complexity of human languages.”

    Why, then, would some languages evolve to be more complex, given the increased effort required for processing? A key finding of the study may provide an answer: there is a trade-off between complexity and efficiency. Languages with higher complexity tend to produce shorter texts to convey the same content, reflecting a compensatory mechanism where increased structural intricacy is offset by greater efficiency in communication.
    “So maybe the extra effort required to learn a complex language has its benefits,” suggests Alexander Koplenig, lead author of the study. “Once you’ve mastered it, a complex language might offer more options to express yourself, which can make it easier to convey the same idea using fewer symbols. This is relevant, because we also show that this trade-off is shaped by the social environments in which languages are used, with larger communities tending to use more complex but more efficient languages.”

    So one could speculate that in large societies, institutionalised education might enable greater linguistic complexity by providing systematic and formalised language learning, which supports the acquisition and use of intricate linguistic structures. At the same time, the importance of written communication in larger societies may create pressure for shorter messages to reduce costs for production, storage, and transmission—such as book paper, storage space, or bandwidth. “This combination—education enabling complexity and practical needs driving efficiency—could explain why languages in larger communities evolve the way they do,” Koplenig continues. “Testing this speculative hypothesis is a fascinating direction for future research.”

    The Leibniz Institute for the German Language (IDS) is the central extramural institute for research and documentation of the German language in its contemporary usage and in its recent history. It is one of over 90 research and service institutions of the Leibniz Association. For more details see: http://www.ids-mannheim.de, https://bsky.app/profile/idsmannheim.bsky.social, http://www.facebook.com/ids.mannheim, http://www.instagram.com/ids_mannheim/ and http://www.leibniz-gemeinschaft.de.


    Scientific contact:

    Dr. Sascha Wolfer
    Leibniz Institute for the German Language
    R 5, 6-13
    D-68161 Mannheim
    Tel.: +49 621 1581-439
    Email: wolfer@ids-mannheim.de


    Original publication:

    Koplenig A., Wolfer S., Rüdiger J.-O., Meyer P. (2025): Human languages trade off complexity against efficiency. PLOS Complex Systems 2(1): e0000032.
    https://journals.plos.org/complexsystems/article?id=10.1371/journal.pcsy.0000032


    Features of this press release:
    Journalists
    Language / literature
    Supraregional
    Research results
    English


     
