idw – Informationsdienst Wissenschaft

News, Events, Experts

23.05.2022 18:15

Seeing speech

Christian Colmer Press & Public Relations
Fraunhofer-Institut für Digitale Medientechnologie IDMT

    Sound engineers can rely on “Dialogue Detection” to identify speech in the audio signal

    New algorithms from Fraunhofer IDMT form the basis for the “Dialogue Detection” in Steinberg Media Technologies’ latest version of its audio post-production software Nuendo. The function reliably recognises speech components in the audio track and in so doing enables audio professionals to easily separate passages with and without speech into different tracks. Fraunhofer IDMT supplied algorithms for measuring, evaluating and displaying speech intelligibility for the previous version of Nuendo too.

    Oldenburg, 23 May 2022. Identifying passages with and without speech components solely on the basis of the audio level can be a tedious task for professional sound engineers. To detect whether an audio passage is spoken word or merely background noise, they are obliged to listen to each one during editing. In cooperation with the Fraunhofer Institute for Digital Media Technology IDMT in Oldenburg, Steinberg Media Technologies GmbH wants to make professionals’ work in the areas of sound design, dialogue editing and speech synchronisation easier. To this end, Steinberg has integrated the “Dialogue Detection” feature in the latest update of its Nuendo digital audio workstation.
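
    The limitation of working from the audio level alone can be illustrated with a minimal sketch. It is not taken from Nuendo or Fraunhofer IDMT; the function names, the frame length and the -30 dB threshold are assumptions chosen for illustration. A simple level gate marks every loud frame as active, whether it contains speech or only noise:

```python
import math


def rms_db(frame):
    """Return the RMS level of a frame in dB relative to full scale (1.0)."""
    if not frame:
        return float("-inf")
    mean_square = sum(sample * sample for sample in frame) / len(frame)
    if mean_square == 0.0:
        return float("-inf")
    return 10.0 * math.log10(mean_square)


def level_gate(frames, threshold_db=-30.0):
    """Flag each frame whose level exceeds the (assumed) threshold."""
    return [rms_db(frame) > threshold_db for frame in frames]


# Three synthetic frames of 160 samples each (hypothetical test data):
silence = [0.0] * 160
loud_noise = [0.5, -0.5] * 80      # loud, but contains no speech at all
quiet_noise = [0.001, -0.001] * 80

active = level_gate([silence, loud_noise, quiet_noise])
```

    The gate flags the loud noise frame as active although it contains no speech, which is why an engineer relying on level alone still has to listen to each passage.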

    Spotlight on dialogue processing

    The new features in Nuendo 12 focus on the recording and editing of dialogue. “This especially brings to the fore the requirements of Nuendo users who, for example, need to concentrate more on speech when dubbing products and producing voiceovers. This is particularly important when creating content for streaming services,” says Timo Wildenhain, Head of ProAudio at Steinberg. For this, “Dialogue Detection” relies on technologies from Fraunhofer IDMT in Oldenburg. Algorithms based on machine learning (neural networks) detect speech activity in the audio signal independently of background noise. Sound engineers can listen to these passages and, if required, have parts without speech split automatically into different tracks. They can then start the actual editing process comfortably and conveniently with a separate dialogue track.
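
    The splitting step described above can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in Nuendo: the per-frame speech flags are assumed to come from some speech activity detector (the neural network itself is not shown), and the names find_segments and split_tracks are hypothetical.

```python
def find_segments(flags):
    """Group consecutive frames with equal flags into runs.

    Returns a list of (start, end, is_speech) tuples, with end exclusive.
    """
    segments = []
    start = 0
    for i in range(1, len(flags) + 1):
        if i == len(flags) or flags[i] != flags[start]:
            segments.append((start, i, flags[start]))
            start = i
    return segments


def split_tracks(frames, flags):
    """Split one track into a dialogue track and a track for everything else.

    Both output tracks keep the original length; a frame that belongs to
    the other track is replaced by silence so the timing stays aligned.
    """
    if len(frames) != len(flags):
        raise ValueError("frames and flags must have the same length")
    dialogue, other = [], []
    for frame, is_speech in zip(frames, flags):
        silence = [0.0] * len(frame)
        dialogue.append(frame if is_speech else silence)
        other.append(silence if is_speech else frame)
    return dialogue, other


# Hypothetical detector output: one speech probability per frame.
probabilities = [0.1, 0.9, 0.8, 0.2]
flags = [p >= 0.5 for p in probabilities]   # assumed decision threshold

frames = [[1.0], [2.0], [3.0], [4.0]]       # one-sample frames for brevity
segments = find_segments(flags)
dialogue_track, other_track = split_tracks(frames, flags)
```

    Keeping both tracks at full length, with silence in the gaps, is one possible design choice; it means the separated dialogue stays in sync with the picture.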

    Multiple applications for speech activity detection

    To reliably identify speech activity in the presence of background noise, Fraunhofer IDMT trained the “Speech Activity Detection” (SAD) algorithm used in the feature on a wide variety of data. “Our SAD algorithms are found in a variety of applications. As an independent feature, they can noticeably improve audio professionals’ workflow. In addition, they serve in other Fraunhofer IDMT solutions as a pre-processing tool for our in-house speech and speaker recognition, as noise cancellation algorithms or privacy filters,” explains Christian Rollwage, Head of Audio Signal Enhancement at the Oldenburg Branch for Hearing, Speech and Audio Technology HSA. Whether in the smart speaker in the living room at home, in speech-based machine control on the factory floor or in voice documentation in quality assurance: SAD can be used to ensure that non-speech components are filtered out before the audio is passed on to the next processing steps, or that speech is not recorded in the first place, thus protecting users’ privacy, for example in public places.
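
    The two uses mentioned here, pre-processing and privacy filtering, differ only in which frames are allowed through. A minimal sketch, assuming that per-frame speech flags are already available from a detector; the function name gate_frames and its keep_speech parameter are hypothetical and not part of any Fraunhofer IDMT interface:

```python
def gate_frames(frames, flags, keep_speech=True):
    """Pass on only the frames selected by the speech flags.

    keep_speech=True  -> pre-processing: only speech reaches the next step.
    keep_speech=False -> privacy filter: speech frames are never stored.
    """
    if len(frames) != len(flags):
        raise ValueError("frames and flags must have the same length")
    return [
        frame
        for frame, is_speech in zip(frames, flags)
        if bool(is_speech) == keep_speech
    ]


# Placeholder frames and assumed detector decisions:
frames = ["frame0", "frame1", "frame2"]
flags = [True, False, True]

for_recognition = gate_frames(frames, flags, keep_speech=True)
for_ambient_logging = gate_frames(frames, flags, keep_speech=False)
```

    In the privacy case the speech frames are dropped before anything is written, so no spoken content is retained downstream.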

    Successful cooperation between Steinberg and Fraunhofer IDMT

    Steinberg already used Fraunhofer IDMT’s technologies in the previous version, Nuendo 11, to measure, evaluate and display speech intelligibility. The intelligibility meter gave audio professionals a tool to keep speech as intelligible as possible in the final mix and also to take demographic change, with its associated hearing losses, into account.

    Hearing, Speech and Audio Technology HSA at Fraunhofer IDMT in Oldenburg

    Founded in 2008 by Prof. Dr. Dr. Birger Kollmeier and Dr. Jens-E. Appell, the Fraunhofer Institute for Digital Media Technology IDMT’s Branch for Hearing, Speech and Audio Technology HSA stands for market-oriented research and development with a focus on the following areas:

    • Speech and event recognition
    • Sound quality and speech intelligibility
    • Mobile neurotechnology and systems for networked healthcare

    With in-house expertise in the development of hardware and software systems for audio system technology and signal enhancement, over 100 employees at the Oldenburg site are responsible for transferring scientific findings into practical, customer-oriented solutions.

    Through scientific cooperation, the institute is closely linked to the Carl von Ossietzky University, Jade University of Applied Sciences, and the University of Applied Sciences Emden/Leer. Fraunhofer IDMT is a partner in the »Hearing4all« cluster of excellence.

    Further information at www.idmt.fraunhofer.de/hsa

    Contact for the media:
    Christian Colmer
    Head of Marketing and Communication

    Fraunhofer Institute for Digital Media Technology IDMT
    Oldenburg Branch for Hearing, Speech and Audio Technology HSA
    Marie-Curie-Str. 2
    26129 Oldenburg
    Phone +49 441 2172-436
    christian.colmer@idmt.fraunhofer.de
    http://www.idmt.fraunhofer.de/hsa




    Images

    “Dialogue Detection” in Steinberg’s Nuendo 12: algorithms from Fraunhofer IDMT in Oldenburg reliably identify speech activity in the presence of background noise.
    Steinberg Media Technologies


    Characteristics of this press release:
    Journalists, business representatives
    Electrical engineering, information technology, art / design, media and communication sciences
    transregional
    Research / knowledge transfer, cooperation agreements
    English


     
