Scientists develop machine learning tool to accurately identify Arabic dialects in 22 Arabic-speaking countries
Scientists from the University of Sharjah have developed an AI system capable of precisely identifying Arabic dialects across 22 Arabic-speaking nations. Published in IEEE Xplore, their work addresses the challenge posed by the language's rich dialectal diversity, which makes traditional systems ineffective. Ashraf Elnagar, a professor of Computer Science and Intelligence Systems, emphasized that each Arabic dialect has its own vocabulary, expressions, and pronunciation, which makes it difficult for technology to fully understand and differentiate them.
The authors, who acknowledge technical hurdles, drew on datasets of more than 3,000 hours of YouTube content covering 19 dialects from countries such as Algeria, Egypt, and Jordan. Even so, the system achieved an identification accuracy of 97.29% for regional dialects and 94.92% for country-specific dialects while using only 29% of the typical data input. The technology could strengthen voice-activated tools, including virtual assistants and automated customer service, helping to bridge communication divides across Arabic-speaking areas and make technology more accessible.
Such results attract considerable industry attention. Professor Elnagar and student researchers Amr Barakat and Abdulla Aldhaheri highlight the model's capability as a multi-modal linguistic tool that enhances AI-driven applications while requiring less data. The model has been released on the Hugging Face platform for further refinement, a step toward industry-wide application across several technology sectors. Prof. Elnagar says that, with collaborative contributions from industry, the model can continue to advance Arabic language technology, broadening the accessibility and efficiency of AI communication tools worldwide.