China’s Open-Source Revolution in Generative AI: Policy and Business Implications for Southeast Asia

China has led the way in providing “open-weight” large language models (LLMs) such as DeepSeek-R1. This means that governments and companies in Southeast Asia can access them more easily and cheaply.

The emergence of the DeepSeek-R1 large language model (LLM) in January 2025 marked an important milestone from a technological, commercial, and political perspective. DeepSeek demonstrated that LLMs could be trained using 90 per cent less computational power than more established platforms such as ChatGPT, making them far cheaper to develop. The emergence of DeepSeek-R1 also underscores a broader point: what Silicon Valley can do, Hangzhou on the other side of the Pacific can do as well, if not better.

While significant in light of Sino-US technology rivalry, DeepSeek’s moment also holds a number of important implications for Southeast Asia and other emerging regions. The emergence of low-cost LLMs increases access for businesses, governments and researchers, and provides an opportunity for Southeast Asian nations to assert their data autonomy.

The first important point is that DeepSeek’s LLMs build largely on open-source technology, in which the source code, training data, and methodology are available for modification and improvement. The models themselves, such as DeepSeek-R1, are released on an “open-weight” basis: the model’s parameters can be freely copied and adjusted, but the data and methods used to train it are not fully disclosed.

Being open-source or open-weight impacts both the technological and commercial trajectory of LLMs.

Chinese technology companies such as Baidu, Alibaba, and Tencent have been active in developing open-source artificial intelligence (AI) models for many years. Their strategy, supported by Chinese universities and the government, can be seen as applying an open innovation model to accelerate technological development and leapfrog past the US.

However, Chinese companies are not the only ones investing in open-source AI. Meta and Google have also developed open-source LLMs as part of a commercial strategy aimed at lowering LLM development costs, attracting talent, and more effectively competing with proprietary LLMs.

A common competitive strategy for technology businesses is to try to “commoditise the complement”. For business users of OpenAI’s ChatGPT, whose models are closed, it can be smart to invest in open-source alternatives. The availability of free alternatives can erode OpenAI’s pricing power and lower the cost of LLM services overall. A similar strategy was followed by Oracle, which supported the open-source Linux operating system as a way to reduce the commercial power of Microsoft’s proprietary operating system.

Regardless of the underlying motivation, the availability of high-quality open-weight LLMs means that governments and companies in Southeast Asia can access them more easily and cheaply.

Governments can run their own LLMs without having to worry about the transfer of sensitive data abroad, giving them greater data autonomy. Vietnam, for example, required social media providers to store data locally in 2022.

The rapid fall in AI model training costs makes data and AI localisation economically viable. Singapore reportedly spent US$52 million developing its SEA-LION LLMs, eight times more than DeepSeek claims to have spent.

Looking beyond the public sector, open-source LLMs also level the commercial playing field: start-ups in Southeast Asia now have access to the same core LLMs as their counterparts in China and the US. Locally developed AI models can boost productivity growth across the wider economy, while ensuring value is captured by local businesses instead of foreign firms.

Yet, the emergence of Chinese AI has also highlighted a different, cultural problem. Chinese LLMs are known to be trained to repeat the Chinese Communist Party’s (CCP) version of history and its political views, thus conforming to the censorship system in China. Likewise, models trained primarily on English texts have a predominantly Western worldview.

Especially in Southeast Asia, with its large cultural, religious, and linguistic diversity, Western or Chinese LLMs may be insensitive to local social hierarchies, customs, and expressions. Poorly trained models based on unsuitable source material can pose significant societal risks. Just as Facebook allegedly contributed to interethnic violence in Myanmar, new AI models could exacerbate existing social tensions.

Fortunately, LLMs can be retrained with relative ease. Given the training that they have received, open-weight Chinese LLMs are effectively CCP members. But R1 1776, another open-source project, has shown that DeepSeek-R1 can be post-trained to remove these perceived biases.

For Southeast Asian countries, this highlights the importance of developing sufficient domestic capacity to localise and post-train LLMs for local conditions. Some of that capacity already appears to be present in the region, as highlighted by the aforementioned SEA-LION LLMs. Founded in 2015, Indonesia’s Kata.ai developed leading natural language processing technology specific to the Indonesian language, outperforming the Indonesian-language capability of foreign competitors. Vietnam’s VinAI, a developer of manufacturing AI applications founded in 2019, recently sold its generative AI division to US semiconductor producer Qualcomm, underscoring the world-class AI R&D taking place in the region.

In short, the open-source turn in LLM development means that Southeast Asian countries now have an opportunity to exert much greater autonomy in using and applying such models.

First, countries should take advantage of the smaller size of new LLMs, which makes them much cheaper to deploy, use, and retrain locally without relying on foreign technology providers.

Second, countries should develop the capacity to retrain LLMs, making them more useful for local languages and more sensitive to local culture. Investments in LLM retraining could be seen as a public good and anchored at local universities, thus nurturing local talent and advancing R&D.

Third, countries should host their own models to collect their own data. Because limited digital content is generated in some regional languages, assembling and curating a sufficiently large corpus of text is essential for improving local LLMs. Rather than being harvested by foreign firms, such data should ideally be stored and used by local organisations.

With its current generation of open-source LLMs, China seems to have opened a door for Southeast Asia to catch up with the technology leaders. Yet, for Southeast Asia to walk through it requires not just an investment in local infrastructure and capabilities but also a clear assertion of data and AI autonomy. Local universities and policymakers must ensure that they maintain a broad, society-centric view of AI, rather than a narrow commercial-technical perspective. This is essential to both achieve the full economic potential of AI and to ensure that the technology empowers, instead of harms, Southeast Asian societies.

2025/136

Pieter E. Stek is a Senior Lecturer at the Asia School of Business in Kuala Lumpur, Malaysia.