ICT TREND

A team of Korean researchers has established a key international standard for verifying the safety and reliability of artificial intelligence (AI) systems. The result of more than five years of work, it positions South Korea to lead not only in AI technology but also in AI norms and reliability verification standards.

The Electronics and Telecommunications Research Institute (ETRI) announced that the “Overview of AI System Testing” standard1), which defines the procedures and methodologies for testing AI systems, was officially established by ISO/IEC Joint Technical Committee 1 (ISO/IEC JTC 1)2) on November 3, 2025.

1) AI System Testing Overview (ISO/IEC TS 42119-2) standard: A technical specification (TS) that provides requirements and guidance for applying the existing software testing standards (ISO/IEC/IEEE 29119 series) to AI systems. It defines test processes, test levels, and test types tailored to the characteristics of AI systems.

2) ISO/IEC JTC 1: “Joint Technical Committee 1”, established jointly in 1987 by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) to develop international standards in the field of information and communication technology (ICT). Its main purpose is to prevent conflicting standards between the two organizations and to promote efficient joint standardization.

Progress of the AI System Testing Overview (ISO/IEC TS 42119-2) standard establishment

This achievement holds great significance in that it is the first key international AI testing standard led and established by South Korea in the ISO/IEC Artificial Intelligence Technical Committee (SC 42).

This achievement shows that ETRI has moved beyond being a “Fast Follower” to become a “First Mover” in the global competition for AI technology leadership.

ETRI stated that the standard is the first to define test methodologies covering the full AI lifecycle, including data quality, model performance, and bias. The institute added that it is the first international standard intended for use in official international testing and international conformity assessment. It also provides a standard basis for the verification, certification, and conformity testing methods that the AI Basic Act and the EU AI Act require for high-impact and high-risk AI systems.

Specifically, this standard extends the existing software testing standard3) to fit AI systems, defining new test levels tailored to AI characteristics, such as ▲data quality testing4) and ▲model testing5).

3) Software Testing Standard (ISO/IEC/IEEE 29119 series): An international standard series on software testing. It defines test terminology, processes, documentation, and techniques that can be used throughout the software development lifecycle. It replaces and integrates several existing standards such as IEEE 829 (Test Documentation) and IEEE 1008 (Unit Testing).

4) Data quality testing: A test level that verifies the quality of the data used to create (train) an AI model. It aims to reduce the risk of model quality degradation by evaluating data accuracy, completeness, bias, and representativeness.

5) Model testing: A test level in which the trained AI model itself is the subject of testing. It verifies whether the model shows acceptable performance within the intended context of use and whether there are risks related to functional accuracy or bias.
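As a rough illustration of what a data quality test could look like in practice, the following Python sketch measures two of the properties named above, completeness and representativeness, on a small made-up dataset. The function name `data_quality_report`, the field names, and the sample rows are all hypothetical; the standard itself does not prescribe this code or these metrics.

```python
from collections import Counter


def data_quality_report(rows, required_fields, group_field):
    """Return completeness per required field and the share of each group.

    Hypothetical helper for illustration only; not defined by ISO/IEC TS 42119-2.
    """
    total = len(rows)
    # Completeness: fraction of rows where the field is present and non-empty.
    completeness = {
        field: sum(1 for r in rows if r.get(field) not in (None, "")) / total
        for field in required_fields
    }
    # Representativeness (simplified): share of rows belonging to each group.
    groups = Counter(r.get(group_field) for r in rows)
    representation = {group: n / total for group, n in groups.items()}
    return {"completeness": completeness, "representation": representation}


# Made-up training records; one record is missing the "age" value.
rows = [
    {"age": 34, "label": 1, "sex": "F"},
    {"age": None, "label": 0, "sex": "M"},
    {"age": 51, "label": 1, "sex": "M"},
    {"age": 29, "label": 0, "sex": "M"},
]
report = data_quality_report(rows, ["age", "label"], "sex")
print(report)
```

A tester would compare such figures against thresholds agreed for the project, for example a minimum completeness per field or a minimum share per group.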

This lays a foundation for comprehensively verifying AI systems, from data quality, a core element, through to model performance. In addition, the concept of “risk-based testing”6) was introduced to check potential risks of AI in advance.

6) Risk-based testing (RBT): A systematic testing strategy that identifies and analyzes potential risks associated with an AI system and determines the testing approach and effort based on the priority of these risks.
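The idea of setting test effort according to risk priority can be sketched in a few lines of Python. The example below assumes a simple scoring scheme, likelihood multiplied by impact on 1 to 5 scales, and splits a test budget in proportion to the scores. The scheme, the function name `prioritize`, and the risk entries are illustrative assumptions, not requirements taken from the standard.

```python
def prioritize(risks, total_hours):
    """Rank risks by score and split the test budget proportionally.

    risks: list of (name, likelihood, impact) tuples, each rated 1-5.
    Returns a list of (name, score, hours) sorted by descending score.
    Hypothetical scoring scheme for illustration only.
    """
    scored = [(name, likelihood * impact) for name, likelihood, impact in risks]
    scored.sort(key=lambda item: item[1], reverse=True)
    total_score = sum(score for _, score in scored)
    return [
        (name, score, total_hours * score / total_score)
        for name, score in scored
    ]


# Made-up risk register for an AI system.
risks = [
    ("performance drift", 4, 3),
    ("biased output", 3, 5),
    ("adversarial input", 1, 3),
]
plan = prioritize(risks, total_hours=60)
for name, score, hours in plan:
    print(f"{name}: score={score}, hours={hours:.1f}")
```

In this sketch the highest-scoring risk receives the largest share of testing time; a real project would choose its own risk criteria and allocation rules.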

Based on this, AI-specific testing procedures were specified, such as ▲bias testing7) to verify AI bias, ▲adversarial testing8) using altered input values, and ▲drift testing9) to check for performance degradation during operation.

7) Unwanted bias testing: A type of testing that analyzes and measures whether an AI model produces unfavorable or unfair results for specific groups based on sensitive attributes such as gender, race, or age. Bias can occur when the training data itself under- or over-represents a certain group (selection bias) or follows past prejudices (confirmation bias), and this test aims to detect discriminatory patterns before model deployment.

8) Adversarial testing: A type of testing that identifies vulnerabilities that cause a model to malfunction unexpectedly, by generating inputs (adversarial examples) intentionally designed to deceive the AI model. This improves the robustness and security of the model.

9) Drift testing: A type of testing that detects the “concept drift” phenomenon, by which the performance of an AI model deployed in an operational environment degrades over time. It continuously monitors whether the model accurately processes the latest data.
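Two of these test types, bias testing and drift testing, can be illustrated with a minimal Python sketch. It assumes one common fairness measure (the gap in positive-prediction rates between groups) and a simple drift rule (accuracy falling more than a tolerance below a baseline). The function names, the 0.05 tolerance, and the sample data are hypothetical, and the standard may describe other measures.

```python
def selection_rate_gap(predictions, groups):
    """Return (gap, rates): the largest difference in positive-prediction
    rate between any two groups, plus the per-group rates."""
    counts = {}
    for pred, group in zip(predictions, groups):
        positives, n = counts.get(group, (0, 0))
        counts[group] = (positives + (1 if pred == 1 else 0), n + 1)
    rates = {g: positives / n for g, (positives, n) in counts.items()}
    return max(rates.values()) - min(rates.values()), rates


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def drift_detected(baseline_acc, recent_preds, recent_labels, tolerance=0.05):
    """Flag drift when recent accuracy falls more than `tolerance`
    below the accuracy measured at deployment time."""
    return baseline_acc - accuracy(recent_preds, recent_labels) > tolerance


# Bias check on made-up predictions for two groups, A and B.
preds = [1, 1, 0, 1, 0, 0, 0, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
gap, rates = selection_rate_gap(preds, groups)
print(rates, gap)

# Drift check on a made-up batch of recent operational data.
drifted = drift_detected(0.9, [1, 0, 1, 1], [1, 1, 0, 1])
print(drifted)
```

Here group A receives a positive prediction far more often than group B, and recent accuracy has dropped well below the baseline, so both checks would prompt further investigation.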

This standard is a “general” standard that serves as the basis for subsequent standards to be established in the future, such as ▲AI Red Team Testing10) and ▲Generative AI Testing11), so it is of great significance that Korea designed the foundation of the AI reliability verification system.

10) AI Red Team Testing (ISO/IEC 42119-7): Exploratory and aggressive testing methods that intentionally try different attacks and test the limits of a system to identify potential vulnerabilities, harmful or biased results, security issues, and similar problems in an AI system. It aims to proactively detect real-world threats and strengthen the defense system.

11) Generative AI Testing (ISO/IEC 42119-8): A Technical Specification (TS) for “Quality evaluation of prompt-based text-to-text generative AI systems.” It provides definitions, requirements, and guidance for evaluating the quality and safety of generative AI such as chatbots, and includes evaluation methodologies such as benchmark testing for text-based generative AI systems.

This international standard was first proposed by Jeon Jong Hong, Principal Researcher at ETRI’s Intelligence & Information Standards Research Section, and was finalized in collaboration with Dr. Stuart Reid, an internationally recognized authority on software testing and Technical Director of STA Testing Consulting, with both serving as co-editors. STA Testing Consulting, a company specializing in software testing, is one of ETRI’s startup companies.

The two organizations formed a Joint Working Group (JWG 2) between the AI Standardization Committee (SC 42) and the SW Testing Committee (SC 7) under ISO/IEC JTC 1 and jointly led development of the standard for five years.

This achievement serves as a technical basis for supporting “Sovereign AI”12) and the “AI G3 Leap” strategy promoted by the government, which aims to “implement safe and reliable AI.”

12) Sovereign AI: Technological sovereignty by which a country secures its own AI technology, data, and infrastructure, and develops and controls AI in accordance with its own values and norms without depending on external sources.

By establishing a global standard for objectively verifying AI system performance and risk, Korea has laid the groundwork to lead international norms on AI safety and trustworthiness.

ETRI President Bang Seung chan stated, “Ensuring the safety and reliability of AI is a core task in the era of artificial intelligence. The establishment of this international standard will be a turning point that enables Korea to lead not only AI technology but also AI testing and evaluation norms.”

Lee Seung yun, assistant vice president of the Standards Research Division at ETRI, also said, “This standard is the ‘skeleton’ of common criteria for testing and evaluating the safety and reliability of AI systems worldwide, created by our own hands. We will actively strive to lead ‘Sovereign AI testing technology and standardization’ in the future.”

Principal researcher Jeon Jong Hong, who led the establishment of this standard and is also a member of the National AI Strategy Committee, emphasized, “Standardization activities of JTC 1/SC 42, which creates common core AI standards, should be strengthened,” and “It is time to create a national AI standardization strategy and make more active and long-term investments in international AI standardization.”

Building on this standard, ETRI plans to lead the development of the international AI testing standard series by continuing work on subsequent parts, including the Red Teaming standard (ISO/IEC 42119-7)13), which is currently under development, as well as the Ontology standard (ISO/IEC 42119-10) and the AI Benchmark standard (ISO/IEC 42119-11).

13) Red Teaming standard (ISO/IEC 42119-7): The first common international standard for AI red teams, defining terminology, procedures, and methods for AI red team testing. Development of the standard began in March 2025 and is currently under way, with a target date of December 2027. (Editor: Jeon Jong Hong, Principal Researcher, ETRI)

This achievement builds on and extends two projects, “Development of International Standards for AI-based Medical Device Performance Evaluation Technology” (September 2020 to February 2023) and “Development of International Standards for AI/Machine Learning Medical Device Performance Evaluation” (January 2023 to December 2025), both carried out as part of the “Inter-ministerial Full-cycle Medical Device R&D Project” supported by the Ministry of Food and Drug Safety.

Jeon Jong Hong, Principal Researcher
Intelligence & Information Standards Research Section
(+82-42-860-5333, hollobit@etri.re.kr)

ETRI Claims Historic First in Establishing Global Standard for “AI Testing”
