NIST Issues New AI Risk Mitigation Guidelines and Software

WilmerHale Privacy and Cybersecurity Law Blog

On July 26, 2024, the National Institute of Standards and Technology (“NIST”), part of the Department of Commerce, released guidelines, a global engagement plan, and software covering various aspects of AI technology pursuant to President Biden’s Executive Order on AI (“AI EO”). The release comprises both draft and finalized documents: risk mitigation guidelines for entities to consider when deploying AI; software for testing AI systems’ response to cyberattacks; and a plan for global engagement to develop AI standards. While following the guidelines and using the software to test models are not legally required, entities that use or develop AI should review these materials and consider the extent to which they may want to incorporate the recommended risk mitigation measures.

A.  Risk Mitigation Guidelines for Developers of Generative AI and Dual-Use Foundation Models

1.  Managing Misuse Risk for Dual-Use Foundation Models (NIST AI 800-1)

Per Section 4.1(a)(ii) of the AI EO, NIST’s U.S. AI Safety Institute, which was recently launched to research and promote AI policy, published an initial draft of guidelines titled “Managing Misuse Risk for Dual-Use Foundation Models.” The document outlines voluntary best practices for developers of dual-use foundation models, which the AI EO defines as AI models that are trained on broad data, contain at least tens of billions of parameters, and exhibit performance that can pose a serious risk to security, national economic security, national public health, or safety. The draft guidelines do not discuss other types of potential risks that may be present in foundation models, such as bias, discrimination, or hallucinations. Although the guidelines are aimed primarily at foundation models’ initial developers, they emphasize that managing misuse of dual-use foundation models will ultimately require caution from stakeholders at all stages of AI development and deployment.

The document first recognizes the key challenges of managing risks for dual-use foundation models, such as their unpredictable and diverse use cases. It then outlines seven objectives that any developer of foundation models should consider:

  1. Anticipate potential misuse risk;
  2. Establish plans for managing misuse risk;
  3. Manage the risks of model theft;
  4. Measure misuse risk;
  5. Ensure that misuse risk is managed before deploying foundation models;
  6. Collect and respond to information about misuse after deployment; and
  7. Provide appropriate transparency about misuse.

For each objective, the guidelines provide more specific recommended practices. For example, developers should plan for various misuse risks, including development of chemical, biological, radiological, or nuclear weapons; cyber operations; and generation of child sexual abuse material or nonconsensual intimate images. Threat profiles should be identified, and models should be evaluated for their capabilities and red-teamed before and after deployment. The guidelines also warn against possible theft of models and recommend developing and complying with strict cybersecurity practices; such measures should be taken especially seriously if the model’s weights are publicly accessible. Lastly, the guidelines recommend that developers disclose their risk management practices and any misuse cases for transparency and accountability.

2.  AI Risk Management Framework (RMF) Generative AI (GAI) Profile (NIST AI 600-1)

NIST also released the final version of its AI RMF GAI Profile. The profile describes risks unique to or exacerbated by GAI and provides a set of suggested practices that organizations can adopt to manage these risks based on their business requirements, risk tolerances, and resources. The profile is to be used as a companion resource to NIST’s AI RMF, per Section 4.1(a)(i)(A) of the AI EO.

The profile describes 12 risks unique to or exacerbated by the development and use of GAI, including (i) confabulation, i.e., the production of confidently stated but erroneous or false content; (ii) difficulty controlling public exposure to dangerous, violent, or hateful content; (iii) data privacy impacts due to leakage and unauthorized use, disclosure, or de-anonymization of PII or sensitive data; (iv) harmful bias and homogenization possibly occurring due to nonrepresentative training data; (v) human-AI configuration, i.e., arrangements of or interactions between a human and an AI system, which can result in algorithmic aversion, automation bias, or inappropriate anthropomorphizing of, overreliance on, or emotional entanglement with GAI systems; (vi) risks to information security, through lowered barriers for offensive cyber capabilities; and (vii) intellectual property risks, including eased exposure of trade secrets or illegal replication.

The GAI profile then uses a tabular format to provide high-level suggested actions for managing these risks, organized by the relevant AI RMF subcategories, and lists the GAI risks that each suggested action addresses. For example, Manage 2.4 of the AI RMF recommends ensuring that “[m]echanisms are in place and applied, and responsibilities are assigned and understood, to supersede, disengage, or deactivate AI systems that demonstrate performance or outcomes inconsistent with intended use.” The profile maps this subcategory to the “information security” and “human-AI configuration” risks and suggests four action items to address it, including that organizations “[e]stablish and regularly review specific criteria that warrant[] the deactivation of GAI systems in accordance with set risk tolerances and appetites.”
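
For illustration only, the sketch below shows one way an organization might encode such profile entries for internal tracking. The subcategory ID, risk tags, and quoted action paraphrase the example above, while the schema, field names, and ownership field are assumptions made for this sketch rather than anything specified in NIST AI 600-1.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ProfileEntry:
    subcategory: str                # AI RMF subcategory ID, e.g., "MANAGE 2.4"
    gai_risks: list[str]            # GAI risks the profile maps to this subcategory
    actions: list[str] = field(default_factory=list)  # suggested actions the organization adopts
    owner: str = ""                 # internal accountability assignment (assumption, not from the profile)


manage_2_4 = ProfileEntry(
    subcategory="MANAGE 2.4",
    gai_risks=["Information Security", "Human-AI Configuration"],
    actions=[
        "Establish and regularly review specific criteria that warrant the "
        "deactivation of GAI systems in accordance with set risk tolerances "
        "and appetites.",
    ],
    owner="AI risk committee",
)

# Simple coverage report: which risks this subcategory addresses and which
# suggested actions the organization has adopted for it.
print(f"{manage_2_4.subcategory}: {', '.join(manage_2_4.gai_risks)}")
for action in manage_2_4.actions:
    print(" -", action)
```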

3. Secure Software Development Practices (“SSDPs”) for Generative AI and Dual-Use Foundation Models (NIST Special Publication (SP) 800-218A)

NIST also published the final version of the supplemental SSDPs for Generative AI and Dual-Use Foundation Models.1 This document supplements the Secure Software Development Framework (SSDF) version 1.1, which broadly addresses secure software coding practices, by exploring best practices for developers and acquirers of AI models and AI systems. The practices note unique challenges to developing secure AI-based software, such as the blurring of traditional boundaries between system code and system data and the vulnerability of AI systems to malicious training data. The practices ultimately recommend that software developers anticipate security vulnerabilities at the organizational, model, and software programming levels and maintain robust processes for assessing and responding to those vulnerabilities, including ones related to training data.
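
SP 800-218A describes practices at the level of tasks and recommendations rather than code, but a minimal sketch can illustrate one such practice: verifying training data against pinned hashes before a training run so that tampered or substituted data is caught early. The manifest format, file names, and function names below are assumptions chosen for the example, not part of the NIST publication.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large datasets never load fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_training_data(manifest_path: Path, data_dir: Path) -> bool:
    """Return True only if every file listed in the manifest matches its pinned hash."""
    manifest = json.loads(manifest_path.read_text())  # e.g., {"train.csv": "<sha256 hex>", ...}
    ok = True
    for name, expected in manifest.items():
        actual = sha256_of(data_dir / name)
        if actual != expected:
            print(f"INTEGRITY FAILURE: {name} hash {actual} != expected {expected}")
            ok = False
    return ok


if __name__ == "__main__":
    # Example invocation; abort the pipeline if any file fails verification.
    if not verify_training_data(Path("manifest.json"), Path("data")):
        raise SystemExit("Training data failed integrity checks; halting build.")
```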

B.  Software for Testing AI Systems’ Response to Cyberattacks

In response to Section 4.1(a)(ii)(B) of the AI EO, NIST released software known as “Dioptra.” Dioptra is a security testbed (i.e., a testing platform for evaluating the performance, reliability, and safety of new technology) that makes it easier for users both to determine the sorts of attacks that would make their AI model perform less effectively and to quantify the resulting performance reduction, so that the user can learn how often and under what circumstances the system would fail. The software supports the “Measure” function of the NIST AI RMF by providing functionality to assess, analyze, and track identified AI risks.
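
The sketch below does not use Dioptra’s actual API; it is a generic, self-contained illustration of the kind of measurement the testbed is meant to orchestrate: quantifying how a model’s accuracy degrades as an attacker’s perturbation budget grows. The toy logistic regression, the FGSM-style evasion attack, and the epsilon values are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary classification data (stand-in for a real evaluation set).
n, d = 1000, 20
X = rng.normal(size=(n, d))
true_w = rng.normal(size=d)
y = (X @ true_w + 0.5 * rng.normal(size=n) > 0).astype(float)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Train a plain logistic regression with gradient descent.
w, b = np.zeros(d), 0.0
for _ in range(500):
    p = sigmoid(X @ w + b)
    w -= 0.5 * (X.T @ (p - y) / n)
    b -= 0.5 * np.mean(p - y)


def accuracy(X_eval):
    return np.mean((sigmoid(X_eval @ w + b) > 0.5).astype(float) == y)


print(f"clean accuracy: {accuracy(X):.3f}")

# FGSM-style evasion attack: perturb each input in the direction that increases
# the loss, then measure how accuracy drops as the perturbation budget grows.
p = sigmoid(X @ w + b)
grad_x = (p - y)[:, None] * w[None, :]  # d(logistic loss)/dX
for eps in (0.05, 0.1, 0.25, 0.5):
    X_adv = X + eps * np.sign(grad_x)
    print(f"accuracy at eps={eps}: {accuracy(X_adv):.3f}")
```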

Dioptra is designed to be used for AI model testing, research purposes, evaluations and challenges, and red-teaming exercises. NIST has identified four primary user levels: (i) individuals who have little or no hands-on experience with the testbed; (ii) individuals who want to analyze a wider variety of scenarios; (iii) individuals who want to run experiments using novel metrics, algorithms, and analytical techniques; and (iv) individuals who want to expand the testbed’s core capabilities by contributing to the distribution. The software is available for download from NIST’s GitHub repository.

C.  Plan for Global Engagement to Develop International AI Standards (NIST AI 100-5)

Lastly, pursuant to Section 11(b) of the AI EO, NIST published the Department of Commerce’s final plan for global coordination of AI standards development.2 “Standards” are documents “established by consensus and approved by a recognized body, that provide[] for common and repeated use, rules, guidelines or characteristics for activities or their results, aimed at the achievement of the optimum degree of order in a given context.”3 The plan prioritizes, among other things, standardizing terminology and taxonomy, transparency about the origins of digital content and data characteristics, risk-based management of AI systems, and security and privacy. The plan notes that standards ultimately should be accessible, amenable to adoption, and reflective of diverse input from global stakeholders, which will require continued domestic and global capacity building and international collaboration among AI experts. Notably, the plan briefly mentions the importance of continued exploration of the relationship between standards and open-source software in the context of AI systems, a relationship that has yet to fully develop.


1 Initial draft available at: https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-218A.ipd.pdf

2 Initial draft available at: https://airc.nist.gov/docs/NIST.AI.100-5.Global-Plan.ipd.pdf

3 ISO/IEC Guide 2, “Standardization and related activities — General vocabulary,” available at: https://isotc.iso.org/livelink/livelink/fetch/2000/2122/4230450/8389141/ISO_IEC_Guide_2_2004_%28Multilingual%29_%2D_Standardization_and_related_activities_%2D%2D_General_vocabulary.pdf?nodeid=8387841&vernum=-2
