When AI Becomes Less Intelligent Beyond the English-speaking World
Social media platforms depend largely on English-language data to train their AI content moderation models. This has become a problem.
As artificial intelligence (AI) becomes more embedded in content moderation on social media platforms, its role in maintaining digital safety has never been more important or more contentious. While AI tools have significantly improved the detection of harmful or illegal content at scale, they remain deeply flawed, especially when operating in languages other than English. Beyond the English-speaking world, the challenges of deploying AI for content moderation are not merely technical, but also cultural, political, and ethical: a failure to train AI models adequately can result in real-world harm.
Content moderation algorithms are primarily trained on English-language data. About 95 per cent of large language models (LLMs) are trained primarily on English or on English combined with a handful of other dominant languages. A 2024 report by the Center for Democracy and Technology similarly found that major tech companies such as Meta (formerly Facebook), YouTube, and X (formerly Twitter) spent disproportionately more on English-language moderation than on moderation in other languages, even though the majority of their users live outside the anglophone world. As a result, AI systems often fail to grasp context, nuance, or regional dialects, leading to over-censorship of innocuous posts and under-enforcement against genuinely harmful content.
A stark example is Myanmar, where Facebook faced global criticism for its role in amplifying hate speech during the Rohingya crisis of 2017. Despite Facebook’s dominance in the country, the platform was slow to recognise and act on hate speech written in Burmese. Scholars, reporters, and United Nations special rapporteurs have unanimously concluded that Facebook was used to incite violence and that its content moderation policies were inadequate amid that year’s genocide of the Rohingya. The platform has since tried to improve its Burmese-language moderation, but the damage had already been done. This underscores the dangers of relying on English-centric AI systems in multilingual societies.
Even when local language support exists, AI struggles with local context. Sarcasm, coded language, and regional slang often slip past filters or get flagged erroneously. For example, during political protests in Thailand, protesters adopted creative and indirect ways of criticising the monarchy, such as using food metaphors or fictional characters. Lacking cultural fluency, AI systems missed these workarounds entirely or mistakenly flagged unrelated content.
A more recent example can be found in the recruitment tactics used by malicious actors. The image below is a screenshot of a Facebook post that mixes Thai, English, and symbols to form a sentence, as denoted in the green boxes. To local Thais, the text is quite obvious: it reads as a job post recruiting administrative staff to oversee auditing and financial transactions for an illicit “grey” business (known as สายเทา, or sai tao).
[Image: screenshot of the Facebook recruitment post, with the Thai-English-symbol mixture highlighted in green boxes]
This, however, is not so obvious to those who are not fluent in Thai. Like readers unfamiliar with the language, AI models fail to flag such posts when letters are replaced with symbols, rendering keyword-based filtering and tokenisation ineffective for content moderation. Harmful content can thus hide in plain sight on social media platforms, fuelling the global rise of scam operations.
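A minimal sketch illustrates the point, using a hypothetical English blocklist and sample posts in place of the Thai original. A filter that matches keywords verbatim is defeated the moment letters are swapped for look-alike symbols; a fixed substitution table recovers only the workarounds a platform has already catalogued.

```python
# Minimal sketch of why symbol-for-letter substitution defeats naive
# keyword filtering. BLOCKLIST and the sample posts are hypothetical,
# English-language stand-ins for the Thai example above.

BLOCKLIST = {"casino", "gambling"}

def naive_keyword_filter(post: str) -> bool:
    """Flag a post if any blocked keyword appears as a verbatim token."""
    tokens = post.lower().split()
    return any(keyword in tokens for keyword in BLOCKLIST)

plain = "now hiring admin staff for our casino operation"
obfuscated = "now hiring admin staff for our c@s1n0 operation"

print(naive_keyword_filter(plain))       # True:  the verbatim keyword is caught
print(naive_keyword_filter(obfuscated))  # False: the obfuscated post slips through

# A common countermeasure maps known look-alike symbols back to letters
# before filtering. The table below is illustrative, not exhaustive.
HOMOGLYPHS = str.maketrans({"@": "a", "1": "i", "0": "o", "$": "s"})

def normalised(post: str) -> str:
    """Undo catalogued symbol substitutions before filtering."""
    return post.translate(HOMOGLYPHS)

print(naive_keyword_filter(normalised(obfuscated)))  # True: caught after normalisation
```

The limitation is the one the Thai post exploits: a substitution table only reverses workarounds that the platform has already catalogued. When obfuscation mixes Thai script, English letters, and arbitrary symbols, the space of possible substitutions is effectively open-ended, and token-level matching breaks down entirely.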
There is also the challenge of data scarcity. Many languages lack the vast annotated datasets required to train effective AI models. This is especially true for low-resource languages spoken in parts of Africa, Southeast Asia, and Latin America. In these cases, companies fall back on poorly translated datasets to train machine-learning models, resulting in higher error rates.
To improve, platforms should invest more equitably in global language support. This includes hiring and fairly compensating native-speaking moderators, a move that runs counter to the emerging trend of downsizing content moderation teams among tech giants but is necessary to keep marginalised communities safe. Platforms should also provide more funding for the creation of high-quality local-language datasets that can then be used to build LLMs.
Transparency is also crucial. Companies should disclose which languages their AI tools support, how accurate moderation decisions are in each language, and what role human oversight plays. Currently, social media companies reveal little about what they do (or do not do) in their content moderation. While some form of human intervention is presumably employed, it remains unclear when human agents step in and how extensively they are actually used in the moderation process.
Ultimately, content moderation is not just a technological problem but a governance challenge. AI can assist in scaling moderation, but it should be complemented by local expertise. The current Anglocentric approach is unsustainable in a digital ecosystem where most users communicate in languages other than English. The path forward lies not in smarter algorithms alone, but in more inclusive, accountable, and culturally aware systems. Until then, the promise of safe and equitable digital spaces will remain out of reach for much of the world where English and other major languages are not the norm.
2025/184
Surachanee Sriyai was a Visiting Fellow with the Media, Technology and Society Programme at ISEAS – Yusof Ishak Institute. She is the interim director of the Center for Sustainable Humanitarian Action with Displaced Ethnic Communities (SHADE) under the Regional Center for Social Science and Sustainable Development (RCSD), Chiang Mai University.