idw - Informationsdienst Wissenschaft
IBC 2024: Fraunhofer IDMT's software solution “InsightPersona” enables meaningful media analysis through combined face and speaker recognition
At the International Broadcasting Convention (IBC) in Amsterdam, the Fraunhofer Institute for Digital Media Technology IDMT will be presenting a solution that uses AI to identify faces and voices in large media collections. The solution “InsightPersona” provides evaluations of the presence of people, groups of people, and conversation content in a matter of seconds. In combination with information on speech intelligibility or excitement in the voice, it also offers new perspectives for searches in video archives and databases.
September 05, 2024. How often does the German Chancellor appear, and on which channels? Which words are used particularly often during his appearances, and how excited does his voice sound? How often do women appear on a particular television program, and how much of the time do they speak? Broadcasters, streaming platforms, editorial offices, and research institutions ask these and other questions about media content. Whether the goal is optimizing content strategy, researching sound bites and video footage for the next news report, or analyzing diversity in television programming, Fraunhofer IDMT has developed the right solution with InsightPersona.
InsightPersona: Precise analysis results for faces, voices and perceived genders
The AI algorithms for combined face and speaker analysis enable the rapid evaluation of huge media collections, comprising many thousands of hours of material, with respect to specific faces, voices, perceived genders (male, female), and the differentiation between adults and children. This speed in turn makes it possible, for example, to compile news reports with audio and video clips of suitable people almost in real time.
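To make this kind of evaluation concrete, the following minimal Python sketch computes the share of speaking time per perceived gender from a list of labeled speaker segments. The segment layout, the sample values and the function name `speaking_share` are assumptions made for illustration only; they do not describe InsightPersona's actual output or interface.

```python
from collections import defaultdict

# Hypothetical analysis output: (label, start_seconds, end_seconds).
# The tuple layout and the values are assumptions for this sketch only.
segments = [
    ("female", 0.0, 30.0),
    ("male", 30.0, 90.0),
    ("female", 90.0, 120.0),
]


def speaking_share(segments):
    """Return each label's fraction of the total speaking time."""
    totals = defaultdict(float)
    for label, start, end in segments:
        if end < start:
            raise ValueError("segment ends before it starts")
        totals[label] += end - start
    overall = sum(totals.values())
    if overall == 0:
        return {}
    return {label: duration / overall for label, duration in totals.items()}


shares = speaking_share(segments)
```

In this sample, both labels account for 60 of the 120 seconds of speech, so each receives a share of 0.5.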
The results of the analysis requests are visualized for each customer, for example using heat maps or timelines. “This combination of audio- and video-based analysis technologies enables us to achieve highly informative and high-quality search results. This is particularly helpful when the pieces of information obtained complement each other, for example when people are speaking in a media report but are not shown in the image,” explains Uwe Kühhirt, an expert in video analysis at Fraunhofer IDMT.
The display of speech intelligibility can contribute to the accessibility of media content. Media libraries, streaming services, or communication service providers could offer their customers added value, for example by selecting particularly easy-to-understand content or offering an alternative audio track with optimized intelligibility.
Other features: Visualization of communication content, excitement, and instant face search in large archives
At this year's IBC in Amsterdam, the experts from Fraunhofer IDMT will be presenting their software solution live at the stand of the Fraunhofer Business Area Digital Media. Visitors will also be able to get to know the following features:
• Word clouds can be displayed for individual items from media archives, visualizing the key terms of the conversation content. For example, the tone of a political speech can be seen at a glance.
• The level of excitement in the voice is visualized with the help of color-coded audio tracks. “If a person speaks in a very calm and composed manner, the audio track is colored green. If the voice is very excited, it turns yellow or red,” explains Christian Rollwage, Head of the Audio Signal Enhancement group at Fraunhofer IDMT.
• InsightPersona Instant Face Search can also be used to search huge video collections for any face very quickly. All that is needed is a reference image showing the face of the person being searched for.
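The color coding of excitement described in the list above could be sketched as a simple threshold mapping. The score range of 0 to 1, the threshold values and the function name `excitement_color` are assumptions for illustration; the press release does not state how InsightPersona derives or thresholds its excitement values.

```python
def excitement_color(score, calm_max=0.33, excited_min=0.66):
    """Map an excitement score in [0, 1] to a display color.

    The thresholds are hypothetical defaults, not values from the product.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score <= calm_max:
        return "green"   # calm and composed speech
    if score < excited_min:
        return "yellow"  # moderately excited speech
    return "red"         # very excited speech


colors = [excitement_color(s) for s in (0.1, 0.5, 0.9)]
```

Keeping the thresholds as parameters means the boundaries between the three colors can be adjusted without changing the mapping logic.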
A tailored solution for every customer
The InsightPersona software solution can be run locally on the customer's hardware (on-premises) or by a service provider (off-premises). The Fraunhofer IDMT also offers to conduct media archive analyses with InsightPersona for its customers. The dashboard with its analysis modules and statistics can be adapted by the Fraunhofer IDMT to the customer's analysis requirements. Various licensing models are available for InsightPersona.
Get to know “InsightPersona” and visit us from September 13 to 16, 2024 at the IBC in Hall 8 at Stand B.80 of the Fraunhofer Digital Media Business Unit. Our experts will be happy to discuss the benefits of this cross-modal analysis tool for your individual areas of application.
About the Fraunhofer IDMT
The Fraunhofer Institute for Digital Media Technology IDMT is one of 76 institutes and research facilities of the Fraunhofer-Gesellschaft, one of the leading organizations for application-oriented research.
At the Fraunhofer IDMT headquarters in Ilmenau (Thuringia), experts are working on the secure and efficient AI-based recognition and classification of audio and video data. Areas of application include media, industrial production, transport and logistics as well as the environment and agriculture. Another focus is the development of customized solutions for the production and reproduction of authentic and spatial sound experiences for the professional audio, entertainment and automotive sectors.
The Oldenburg Branch Hearing, Speech and Audio Technology HSA stands for market-oriented research and development with a focus on speech and event recognition, sound quality and speech intelligibility as well as mobile neurotechnologies and systems for networked healthcare. With in-house expertise in the development of hardware and software systems for audio system technology and signal enhancement, the employees at the Oldenburg site translate scientific findings into customer-oriented, practical solutions.
Further information: www.idmt.fraunhofer.de and www.idmt.fraunhofer.de/hsa
Contact for the media:
Christian Colmer
Head of Marketing and Communication
Fraunhofer Institute for Digital Media Technology IDMT
Oldenburg Branch for Hearing, Speech and Audio Technology HSA
Marie-Curie-Str. 2
26129 Oldenburg
Phone +49 441 2172-436
Christian.colmer@idmt.fraunhofer.de
http://www.idmt.fraunhofer.de/hsa
https://www.idmt.fraunhofer.de/en/institute/projects-products/insightpersona.htm...
Fraunhofer IDMT's AI software “InsightPersona” identifies faces and voices in large media archives i ...
Fraunhofer IDMT/istock.com/vm
The results of audiovisual recognition of specific individuals are presented in an easy-to-understan ...
Fraunhofer IDMT
Criteria of this press release:
Business and commerce, Journalists, Scientists and scholars
Economics / business administration, Information technology, Media and communication sciences, Politics, Social studies
transregional, national
Research results, Transfer of Science or Research
English