Polish outperforms English in AI long-context chatbot tasks

Polish achieves the highest accuracy in multilingual long-context AI tasks, surpassing English and other major languages, according to a new study by researchers from the University of Maryland and Microsoft.
A recent analysis found that AI models perform best when prompted in Polish, outperforming widely spoken languages such as English and Chinese.
The ranking comes from One Ruler to Measure Them All: Benchmarking Multilingual Long-Context Language Models, by Yekyung Kim, Jenna Russell, Marzena Karpinska, and Mohit Iyyer, affiliated with the University of Maryland and Microsoft.
The study introduces ONERULER, a benchmark for evaluating large language models across 26 languages, focusing on tasks requiring long contextual understanding.
Researchers found that Polish achieved the highest performance, while English ranked 6th.
Experiments included both open-weight and closed large language models, such as OpenAI's o3-mini-high, and tested context lengths from 8K to 128K tokens.
The results also highlighted performance drops in low-resource languages and fluctuations in cross-lingual scenarios where instructions and context appeared in different languages.
(mp)
Source: arXiv:2503.01996/Radio Poland