2nd MLC-SLM Challenge Launches, Advancing Multilingual Conversational Speech Understanding
LOS ANGELES, CA, UNITED STATES, April 13, 2026 /EINPresswire.com/ -- The 2nd Multilingual Conversational Speech Language Model (MLC-SLM) Challenge has officially opened for registration, inviting research teams and practitioners worldwide to participate. Built on a multilingual conversational speech training set covering 14 languages and approximately 2,100 hours of data, this year's challenge focuses on key tasks including speaker segmentation, automatic speech recognition (ASR), and dialogue understanding, further pushing speech language model research from simple transcription toward deeper conversational understanding.
Targeting Real-World Multilingual Conversations
As speech language models continue to evolve, real-world multilingual conversations are becoming an increasingly important research direction. Unlike conventional ASR tasks, these scenarios involve multiple speakers, multi-turn interactions, and more complex acoustic and semantic information. Systems are expected not only to transcribe speech accurately, but also to determine who spoke when and ultimately understand the conversation as a whole.
The 2nd MLC-SLM Challenge is designed around this shift, focusing on multilingual conversational speech tasks that are closer to real application settings and providing an open benchmark and international platform for Speech LLM research.
Expanded Training Data: Around 2,100 Hours Across 14 Languages
One of the most significant highlights of this year's challenge is the dataset. The training set contains approximately 2,100 hours of multilingual conversational speech spanning 14 languages: English, French, German, Italian, Portuguese, Spanish, Japanese, Korean, Russian, Thai, Vietnamese, Tagalog, Urdu, and Turkish.
Among them, English contributes around 500 hours and includes diverse regional varieties such as US, UK, Australian, Indian, and Philippine English, while each of the other languages contributes roughly 100 hours. This expansion strengthens the challenge's foundation for multilingual conversational speech research in terms of scale, language coverage, and regional diversity.
Natural Two-Speaker Conversations Collected in Realistic Settings
The dataset is designed to better reflect real application scenarios. All recordings are natural two-speaker conversations, where participants discuss randomly assigned topics in a meaningful and fluent way. The audio was collected in quiet indoor environments using consumer devices such as iPhones, making the data closer to real-world collection conditions.
The dataset also includes real-time timestamps and speaker labels to support system development. In addition, Track 1 and Track 2 share the same training set, encouraging participants to explore unified modeling approaches across recognition, diarization, and conversational understanding.
Two Core Tasks: From “Who Spoke” to “What Was Understood”
The challenge includes two main tasks.
Track 1: Multilingual Conversational Speech Diarization and Recognition
Track 2: Multilingual Conversational Speech Understanding
Unlike traditional speech benchmarks that focus primarily on transcription, the 2nd MLC-SLM Challenge places greater emphasis on multilingual, multi-speaker, and dialogue-level understanding. The evaluation setting does not provide prior information such as pre-segmented utterances or speaker labels, making the tasks closer to real deployment conditions.
Building on the International Impact of the First Edition
The new edition builds on the success of the inaugural MLC-SLM Challenge, which was held as a satellite event of Interspeech 2025. The first challenge attracted 78 teams from 13 countries and regions, generated 489 valid leaderboard submissions across two tracks, and received 14 high-quality technical reports. Its summary paper has also been accepted by ICASSP 2026, further demonstrating the challenge's academic value and growing international visibility.
Registration for the 2nd MLC-SLM Challenge is now open
• March 30, 2026: Registration opens
• April 10, 2026: Training data release
• April 24, 2026: Development set and baseline system release
• June 15, 2026: Evaluation set release and leaderboard open
• June 25, 2026: Leaderboard freeze and paper submission portal opens (CMT system)
• July 10, 2026: Paper submission deadline
• July 20, 2026: Notification of acceptance
• October 2, 2026: Workshop date
By offering open data, realistic tasks, and an international exchange platform, the challenge aims to bring together more research teams to advance multilingual conversational speech language modeling. The launch of the second edition also provides a new benchmark for pushing speech language models from simply “hearing clearly” toward genuinely “understanding” conversations.
Registration Link: https://forms.gle/jfAZ95abGy4ZiNHo7
Official Website: https://www.nexdata.ai/competition/mlc-slm
Nexdata
MLC-SLM Competition Committee
mlc-slmw@nexdata.ai

