China proposes blacklist of training data for generative AI models

Artificial Intelligence words are seen in this illustration taken March 31, 2023. REUTERS/Dado Ruvic/Illustration

BEIJING, Oct 12 (Reuters) – China has published proposed security requirements for firms offering services powered by generative artificial intelligence, including a blacklist of sources that cannot be used to train AI models.

Generative AI, popularised by the success of OpenAI’s ChatGPT chatbot, learns how to take actions from past data, and creates new content like text or images based on that training.

The requirements were published on Wednesday by the National Information Security Standardization Committee, which includes officials from the Cyberspace Administration of China (CAC), the Ministry of Industry and Information Technology, and the police.

The committee proposes conducting a security assessment of each body of content used to train public-facing generative AI models, with those containing “more than 5% of illegal and harmful information” to be blacklisted.

Such information includes “advocating terrorism” or violence, as well as “overthrowing the socialist system”, “damaging the country’s image”, and “undermining national unity and social stability”.

The draft rules also state that information censored on the Chinese internet should not be used to train models.

The requirements' publication comes just over a month after regulators allowed several Chinese tech firms, including search engine giant Baidu (9988.HK), to launch their generative AI-driven chatbots to the public.

Since April, the CAC has said it wants firms to submit security assessments to authorities before launching generative AI-driven services to the public.

In July, the cyberspace regulator published measures governing such services that analysts said were far less onerous than measures outlined in an April draft.

The draft security requirements published on Wednesday require organisations training these AI models to seek the consent of individuals whose personal information, including biometric data, is used for training purposes.

They also lay out detailed guidelines on how to avoid intellectual property violations.

Countries globally are grappling with setting guardrails for the technology. China sees AI as an area in which it wants to rival the U.S., and has set its sights on becoming a world leader in the field by 2030.

Reporting by Eduardo Baptista. Editing by Jane Merriman and Jan Harvey


