ATHENS, Greece, Sept. 4, 2024 /PRNewswire/ -- Nexdata, a leading global provider of AI data services, today released its latest data solutions for Multilingual Automatic Speech Recognition (ASR), Multi-timbral Text-to-speech (TTS), and Large Language Model (LLM) and Multi-modal applications at the INTERSPEECH 2024 Conference on Kos Island. As the world’s foremost conference on the science and technology of spoken language processing, INTERSPEECH serves as the perfect stage for Nexdata to showcase its advancements in high-quality speech data and services.
Leveraging the insights gained from the large-scale projects it has undertaken, Nexdata introduces its enhanced Multilingual ASR data solutions. The company reports reaching a milestone of 1 million hours of speech datasets spanning more than 60 countries and 100 languages, including 45,000 hours of Accented English Speech Data, 300,000 hours of Spontaneous Dialogue Data, and more. This achievement underscores Nexdata’s commitment to becoming the premier destination for datasets in the marketplace, further solidifying its expertise in natural language processing.
In Multi-timbral TTS, Nexdata has long demonstrated its prowess with over 2 million voice samples across about 20 languages, all recorded by native speakers. These high-quality, multi-scenario, multi-domain TTS data solutions have proven invaluable to a wide range of applications.
Recognizing the rapidly evolving need for LLM and Multi-modal data solutions, Nexdata has tailored its data services and extensive datasets to enable clients to launch their Generative AI projects with ease. The data solutions showcased at INTERSPEECH include data services for Supervised Fine-Tuning (SFT), Reinforcement Learning from Human Feedback (RLHF), and Red Teaming, along with a comprehensive selection of professional off-the-shelf multilingual RLHF datasets. These datasets include over 100 million natural conversation texts, correction pairs, question-answer pairs, and SFT data, as well as 200 million pairs of high-resolution images and videos, meticulously annotated with descriptive captions and metadata, among other offerings.
Nexdata is proud to be a Silver Sponsor of INTERSPEECH 2024. In-person attendees are invited to visit the company's booth (#01) to meet with its data solution experts, participate in Q&A sessions, and explore demonstrations of its latest speech data solutions, which are designed to enhance AI model performance for tens of thousands of companies worldwide.
During INTERSPEECH 2024, Nexdata, in collaboration with ELDA, a company specializing in Data and Language Resources for AI-based Language Technologies, is co-hosting a social event titled Social Night: Tech and Data for Speech, Connection, and Inspiration. The event brings together industry leaders and researchers from Google, Microsoft, LG, and Pindrop to exchange ideas and share insights.
For more information, visit: nexdata.ai
About Nexdata
Nexdata provides top-notch training data solutions and serves as a reliable partner to its clients. With an extensive array of off-the-shelf datasets and flexible data collection and annotation services, its mission is to unleash AI’s full potential and expedite the AI industry’s growth.
Nexdata firmly believes in the transformative power of AI. The company delivers high-quality data solutions to clients in various industries, including automotive, retail, finance, high-tech, and others, allowing customers’ AI initiatives to thrive and benefit humanity.
SOURCE Nexdata