Go Buy AI: A Breakdown of OMB's M-24-18 Memorandum

"Go buy AI, but be careful, and talk more, but do it better/faster/cheaper too"

Oct 4, 2024

tl;dr:

The OMB's M-24-18 memo outlines new requirements for federal agencies acquiring AI. It emphasizes:

  • Collaboration: Agencies need clear cross-functional collaboration procedures for AI acquisition, involving CAIOs, CIOs, CISOs, CFOs, SAOPs, and others. This includes a plan for coordination, submitted to OMB by March 23, 2025.

  • Risk Management: Existing contracts for "rights-impacting" or "safety-impacting" AI must be compliant with OMB M-24-10 by December 1, 2024. This includes ongoing testing, cybersecurity, incident reporting, and potentially, public notice requirements.

  • Competition: To avoid vendor lock-in, agencies should prioritize interoperability, use performance-based acquisitions, and consider innovative practices like modular contracting (see Appendix I of the memo).

Cross-Functional Collaboration is Key

The memo stresses that buying AI isn't just an IT issue. It requires a joined-up approach within agencies.

  • Formalized Processes: Agencies need written policies ensuring any AI acquisition follows the memo's guidelines and OMB M-24-10's risk management framework. By March 23, 2025, CAIOs must update OMB on their progress.

    • These processes should cover roles and responsibilities throughout the acquisition lifecycle.

    • They need to specify when and how decisions about AI acquisition, deployment, and even decommissioning are escalated to higher levels.

  • Strategic Alignment: It's not enough to just tick boxes; agencies need to think strategically about AI.

    • The CAIO must work with the CAO, CIO, CFO, SAOP, and others to develop a plan for coordinating AI acquisition with the agency's broader goals. This should include forecasted budgets.

    • Critically, this planning MUST factor in the costs of managing AI risks, especially those to privacy, civil liberties, and safety. This needs to be reflected in budget submissions.

  • Sharing is Caring: The memo wants to avoid agencies constantly reinventing the wheel.

    • The CAIO Council, with OMB and others, will create a central repository for AI acquisition information.

    • This includes things like successful and unsuccessful acquisition attempts, contract templates, and best practices.

    • GSA will explore making this repository easily accessible online for agencies.

Managing AI Risks: Going Beyond the Usual

The memo acknowledges that AI brings unique risks, requiring steps beyond typical IT procurement.

  • Knowing What You're Buying: It seems obvious, but agencies need to be clear if they're even buying AI!

    • The memo provides guidance on what constitutes an "AI system" versus common commercial products where AI is just a small part.

    • Officials should list key AI features in solicitations, ask vendors to identify any AI used in their proposed solutions, and train staff to understand these aspects.

    • SAOPs and privacy programs MUST be involved EARLY in the process, ensuring privacy is baked in, not an afterthought.

  • Rights and Safety First: For AI that could impact people's rights or safety, the bar is higher.

    • By November 1, 2024, agencies must identify ALL existing contracts where AI use falls under this category.

    • By December 1, 2024, these contracts MUST be compliant with OMB M-24-10, including additional requirements outlined in Sections 4(d), 4(e), and 4(f)(ii) of M-24-18.

    • From December 1, 2024 onward, this compliance is mandatory for ALL new contracts in this category.

Performance and Risk: Baking It In From the Start

The memo pushes for a more hands-on approach to ensure AI works as intended and risks are controlled.

  • Performance-Based Acquisition: Forget just buying a box; agencies need to buy OUTCOMES.

    • Requirements should focus on the desired results, allowing agencies to evaluate whether AI can actually deliver, not just what vendors claim it can do.

    • This ties into ongoing monitoring and performance measurement throughout the contract's life.

  • Privacy and Security: No surprises here, but the memo lays out specific steps.

    • Contracts must include requirements for cybersecurity approvals and data protection that meet existing government-wide standards.

    • Before signing on the dotted line, agencies need detailed documentation on how the AI was trained and how data will be managed. This includes data sources, labeling, access controls, and disclosure of any copyrighted material used in training.

    • This due diligence extends to the vendor's data supply chain, not just their internal practices.

  • Avoiding Vendor Lock-In: Agencies should prioritize interoperability and data portability, ensuring they're not stuck with one vendor forever.

    • Contracts must clearly state what rights the government has over its OWN data, even after it's been used to train an AI model.

    • Agencies should push for open-source development practices where feasible, avoiding proprietary black boxes.

Specific Requirements for Rights and Safety

When AI could impact fundamental rights or public safety, the memo gets even more granular in its instructions.

  • Transparency is Paramount: Agencies must be upfront about using AI in these sensitive areas.

    • Whenever possible, solicitations should explicitly state whether the AI being procured will be used for rights or safety-impacting applications.

    • This allows potential vendors to understand the higher stakes and tailor their bids accordingly.

  • Proof is in the Data: It's not enough to just say an AI is safe or fair; agencies need evidence.

    • Contracts MUST require vendors to provide all documentation and access needed to monitor the AI's impact on rights and safety.

    • This includes details on training data, model design, and any bias mitigation strategies used.

  • Continuous Monitoring and Evaluation: Deploying an AI isn't the finish line; it's just the beginning.

    • Contracts must include provisions for regular monitoring and evaluation of the AI's performance and risks throughout its lifecycle.

    • Agencies need to know if the AI's performance degrades over time or if its impact on rights and safety changes with new data.

  • Incident Reporting: Even with the best precautions, things can go wrong.

    • Contracts must require vendors to report any "serious AI incidents" within 72 hours.

    • This includes malfunctions, unexpected outcomes impacting rights or safety, and disruptions to critical infrastructure (a sketch of what such a report payload might look like follows this list).

  • Public Notice and Feedback: For AI that directly impacts the public, transparency is crucial.

    • When feasible, agencies should notify individuals if an AI-enabled decision affects them.

    • This might involve explaining the AI's role and providing avenues for recourse or appeal.

    • Contracts should require vendors to support these notification and appeal processes, ensuring the agency has the necessary information and access.
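Neither M-24-18 nor M-24-10 prescribes a schema for these incident reports, so the format will land in individual contracts. Here is a minimal sketch of what a vendor-submitted report payload could look like; every field name is our own assumption, not anything either memo specifies.

import json
from datetime import datetime, timezone

# Hypothetical payload for a "serious AI incident" report. Neither M-24-18
# nor M-24-10 prescribes a schema; every field name here is an assumption
# about what a vendor and agency might agree to in a contract deliverable.
incident_report = {
    "incident_id": "2024-0042",
    "reported_at": datetime.now(timezone.utc).isoformat(),
    "system": {"name": "example-benefits-triage-model", "version": "3.1.4"},
    "category": "rights-impacting",  # or "safety-impacting"
    "summary": "Model began denying claims at anomalous rates for one region.",
    "detected_at": "2024-10-01T14:03:00+00:00",
    "impact": {
        "affected_individuals_estimate": 1200,
        "critical_infrastructure_disruption": False,
    },
    "interim_mitigation": "Model output suspended; claims routed to human review.",
    "root_cause_status": "under investigation",
}

# The 72-hour clock runs from detection, so the report should carry both
# timestamps, letting the agency verify timeliness mechanically.
print(json.dumps(incident_report, indent=2))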

Generative AI: New Kid on the Block, New Rules Apply

Recognizing the unique nature and potential risks of generative AI, the memo outlines additional guidelines.

  • General Use, Enterprise-Wide Systems: These are powerful AI models used across multiple agency components for a wide range of tasks, not just specific use cases.

  • Transparency in Training and Evaluation: Agencies must document extensively how these AI systems were developed and vetted.

    • This includes data sources, labeling processes, model architecture, and any bias mitigation efforts.

  • Mitigating Harmful Outputs: Safeguards are crucial to prevent these AI systems from generating inappropriate or dangerous content.

    • Contracts must include requirements for vendors to implement best practices to minimize risks, such as filtering out harmful content from training data.

    • Agencies must also be able to configure the AI to limit harmful outputs, ensuring they retain control over its behavior.
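What "configure the AI to limit harmful outputs" means in practice is left to contracts and vendors. Below is a minimal sketch of what agency-side controls might look like; the configuration keys, categories, and screening function are invented for illustration and are not any vendor's actual API.

from dataclasses import dataclass, field

# A sketch of agency-configurable output controls for an enterprise
# generative AI deployment. Everything here is a hypothetical construct,
# not a real vendor interface.
@dataclass
class SafetyConfig:
    refuse_categories: list[str] = field(
        default_factory=lambda: ["malware-generation", "pii-extraction"]
    )
    max_output_tokens: int = 2048
    log_refusals: bool = True  # feed refusal logs into agency monitoring

def screen_request(prompt_category: str, config: SafetyConfig) -> bool:
    """Return True if the request may proceed under agency policy."""
    return prompt_category not in config.refuse_categories

config = SafetyConfig()
assert screen_request("document-summarization", config)
assert not screen_request("pii-extraction", config)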

Candidly…

Frankly, some of the memo reads as "let's not reinvent the wheel 37 times." It would be helpful to know who currently sets the gold standard for these processes, so that we can start looking in that direction for where the rest of USG is going.

It is possible, however, that no one is the gold standard.


Formalize Cross-Functional Collaboration to Manage AI Performance and Risks. Each agency must establish or update policies and procedures for internal agency collaboration to ensure that acquisition of an AI system or service will have the appropriate controls in place to comply with the requirements of this memorandum, and that the agency’s use of the acquired AI will conform to OMB Memorandum M-24-10. Within 180 days of issuance of this memorandum, agency CAIOs must submit written notification to OMB identifying progress made toward implementing this requirement, and identifying any challenges encountered or best practices identified during implementation. These policies and procedures should facilitate the cross-functional collaboration necessary to achieve timely acquisition and proactive risk management. Agencies must address: 

A. How planned acquisitions that involve an AI system or service will be initially reviewed by relevant agency officials to determine whether additional practices for managing AI performance and risk, as delineated in Section 4, are necessary; 

B. How officials with AI expertise and relevant equities (e.g., acquisition (including competition advocates), IT, cybersecurity, privacy, civil rights and civil liberties, budgeting, data, legal, program evaluation) are included in decision-making and coordination processes associated with the acquisition; and 

C. Conditions under which reviews and decision-making must be escalated, and to whom, including for planned AI acquisitions, implementing performance and risk management practices, monitoring and post-award management, and decommissioning. 

How is this done currently? We have some thoughts…

Centralization of Interagency Information and Knowledge Sharing. The CAIO Council, in consultation with OMB, the CIO Council, the General Services Administration (GSA) and the AI Community of Practice (CoP), will identify information and artifacts on AI acquisition to be collected and made available to all executive branch agencies. At a minimum, the CAIO Council should consider information such as: 

A. Examples and lessons learned from successful and unsuccessful attempts to acquire AI, including sample requirements and contract clauses and provisions, and issues discovered in procured AI, especially for generative AI and other models which are commonly used across agencies; 

B. Templates, including on novel and innovative AI practices as discussed in Section 4(c)(i) and Section 6, mechanisms to monitor and manage risk, disclosures from vendors (e.g., model and dataset cards, reports about company policies and processes); 

C. Best practices, guides, methodologies (e.g., for responsible AI acquisition; testing, evaluation, and continuous monitoring; data access and restrictions related to the appropriate control of Federal data; and applying modular contracting practices in the AI context); and 

D. Resources for assessing benefits and trade-offs between in-house AI development, contracted AI development, and licensing of AI-enabled software. 

It will be interesting to see who is first to start putting this language into their RFIs, RFPs, and RFQs.

Agencies are required by Section 3 of OMB Memorandum M-24-10 to maintain and annually update an AI use case inventory. Understanding when AI is being acquired is also a prerequisite to managing the risks and performance of AI systems and services. To help agencies identify the acquisition of AI covered by this memorandum, officials responsible for acquisition planning, requirements development, and proposal evaluation should: 

i. Communicate to the vendor, to the greatest extent practicable, whether the acquired AI system or service is intended to be used in a manner that could impact rights or safety. In cases where an agency intends to procure AI capacity without full awareness of potential future use cases, the agency should decide during acquisition planning whether or not to require that any awards support use cases involving rights-impacting or safety-impacting AI, and plan accordingly; 

ii. In cases where an agency’s solicitation does not explicitly ask for an AI system, consider requirements language asking vendors to report any proposed use of AI as part of their proposal submissions; 

iii. Require contractors to provide a notification to relevant agency stakeholders prior to the integration of new AI features or components into systems and services being delivered under contract. When notified, agencies should leverage their standard processes for determining whether risks from the use of AI are sufficiently managed, consistent with the requirements of OMB Memorandum M-24-10 and this memorandum, prior to accepting the contractor’s proposed integration. This includes cases where integration of new AI features or components could impact rights or safety; in such cases agencies must ensure compliance with all applicable requirements for use of such AI; and 

iv. Communicate with vendors to determine when AI is a primary feature or component in an acquired system or service. This should also include questions to the vendor to understand if AI is being used in the evaluation or performance of a contract that does not explicitly involve AI.

This section SHOULD be easy. The government retains data rights to all of the data it gives vendors. Vendors retain rights to what they bring to the table. They negotiate any shared IP. Why is this even an issue? Because the government has been burned many times in the past by companies ingesting its data, converting it to proprietary formats, and then not letting it go.

Determine Appropriate Intellectual Property Rights and Ownership. Consistent with applicable laws and governmentwide policy, an agency must include appropriate contractual terms that clearly delineate the respective ownership and intellectual property (IP) rights of the Government and the contractor. Careful consideration of respective IP licensing rights is even more important when an agency procures an AI system or service, including where agency information is used to train, fine-tune, and develop the AI system. 

To that end, agencies must develop an approach to IP that considers what rights and deliverables are necessary for the agency to successfully accomplish its mission, protects Federal information used by vendors in the development and operation of AI systems and services for the Federal Government, considers the exploration of open-source development practices of AI code, avoids vendor lock-in, and avoids unnecessary costs. Agencies must scrutinize terms of service and licensing terms, including those that specify what information, models, and transformed agency data should be provided as deliverables, to ensure that they clearly articulate the scope of rights needed by the Government over its own data and any derived products. Furthermore, agencies should conduct careful due diligence on the supply chain of a vendor’s data. Best practices include the following: 

A. Negotiating the appropriate scope of licensing rights and other rights that are necessary to accomplish the Government’s mission in the long term while avoiding vendor lock-in. This includes strategically selecting the appropriate FAR or agency supplemental clauses and making affirmative decisions about which alternates to these clauses are necessary. For example, as part of its acquisition planning, an agency may determine it needs unlimited rights to certain contractor deliverables based on its long-term approach to IP. Another agency may determine it requires assignment of copyright to deliverables specified in the contract. In all circumstances, agencies must consider their mission, long-term needs, and larger enterprise architecture while avoiding vendor lock-in and maximizing competition; 

B. Ensuring the contract clearly defines the process and timeline for delivery of components needed to operate and monitor the AI system, including as appropriate: data; inputs to the development, testing, and operation process; models; software; other technical components; and documentation as described in the agency’s technical requirements. Contracts should ensure such components and their foundational code remain available for the acquiring agency to access and use for as long as it may be necessary (e.g., to re-train the model); 

C. Ensuring complete and timely delivery of information necessary to fulfill requirements of OMB Memorandum M-24-10, including incident reporting. This information should be provided in a machine-readable and native format for ingestion and analysis; 

D. Requiring appropriate handling, access, and use of agency information, such as original input data, prompts, processed data, output data, weights, and models, at least in part by providing clear parameters to ensure that such information must only be collected and retained by a vendor when reasonably necessary to serve the intended purposes of the contract; and 

E. Opting out of or prohibiting the contractor from using agency data to train AI without an agency’s consent. The contract should permanently prohibit the use of inputted agency data and outputted results to further train publicly or commercially available AI algorithms, including generative AI, consistent with applicable law. 

This is going to be messy and honestly wasteful. PMs and KOs are going to crap CDRLs into contracts demanding data and artifacts that are a pain to compile, without a standard or format, which will then not get viewed or used. The government will pay a ton of money to comply with policy that buys no tangible benefit.
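The one mitigation here is the memo's insistence (item C above) on delivery in a "machine-readable and native format." If agencies standardized even a strawman manifest for AI deliverables, the CDRL churn would at least be parseable. A sketch of what that might look like, with the structure and field names entirely our own assumption:

import json

# A strawman machine-readable manifest for the deliverables items (B) and
# (C) above call for. Nothing in M-24-18 mandates this structure; the field
# names are assumptions about what an agency might standardize.
manifest = {
    "contract_number": "HYPOTHETICAL-24-C-0001",
    "deliverable": "model-package",
    "components": {
        "model_weights": {
            "path": "weights/v3.1.4.safetensors",
            "sha256": "<digest of the weights file>",
        },
        "training_data_card": {"path": "docs/dataset_card.json"},
        "evaluation_report": {"path": "docs/eval_2024q3.json"},
    },
    "formats": ["safetensors", "json"],  # native and machine-readable
    "retention": "retained by agency for model re-training per contract",
}
print(json.dumps(manifest, indent=2))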

Incorporate Transparency Requirements into Contractual Terms and Solicitations to Obtain Necessary Information and Access. Agencies must ensure that vendors provide them with the information and documentation necessary to monitor the performance of an AI system or service and implement applicable requirements of OMB Memorandum M-24-10. This may include information about the AI’s functionality and use that may be publicly posted in the agency’s AI use case inventory. 

The level of transparency agencies must require of a vendor, both in the solicitation and evaluation process and through resulting contractual obligations, should be commensurate with the risk and impact of the use case for which the AI system or service will be used. Furthermore, careful consideration should be given to the range of potential agency use cases for the acquired AI system or service, and how the information required to facilitate compliance may depend on whether vendors are developers or deployers of an AI system or service. Agencies must consider whether any or all of the following categories of information must be provided by the vendor to satisfy the requirements of OMB Memorandum M-24-10 or to meet the agency’s objectives: 

A. Performance metrics, including real-world performance for specific sub-groups and demographic groups to surface discriminatory outcomes; 

B. Information about the training data, including the source, provenance, selection, quality, and appropriateness and fitness-for-purpose of the training data, the input features used, time period across which training data was collected, and any filters used; 

C. Information about programmatic evaluations of the AI system or service, including the methodology, design, data, and results of how the evaluation of the program delivering the AI system or service was conducted; 

D. Information about testing and validation data, including the source, provenance, quality, and appropriateness and fitness-for-purpose of the testing and validation data, the time period across which it was collected, and the extent of overlap or other possible lack of independence from training data; 

E. Information about how input data is used, transformed, and retained by the AI and whether such data is accessible to the vendor; 

F. Information about the AI model(s) integrated into an AI system or service, including the model’s version, capabilities, and mitigations, to the extent it is available to the vendor; 

G. The intended purpose of the AI system or service, known or likely unintended consequences that may occur when deployed for the intended purpose, and known limitations; and 

H. Data protection metrics or assurance indicators for data in transit and at rest in AI systems. 
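Item A is the technically interesting one: real-world performance disaggregated by sub-group to surface discriminatory outcomes. A toy sketch of that computation, stdlib only, with the records and group labels invented for illustration:

from collections import defaultdict

# Disaggregated accuracy and false-positive rates per group, per item (A)
# in the list above. The records here are fabricated for illustration.
records = [
    # (group, predicted, actual)
    ("group_a", 1, 1), ("group_a", 0, 1), ("group_a", 0, 0),
    ("group_b", 1, 0), ("group_b", 1, 1), ("group_b", 0, 0),
]

counts = defaultdict(lambda: {"correct": 0, "total": 0, "false_pos": 0, "neg": 0})
for group, pred, actual in records:
    c = counts[group]
    c["total"] += 1
    c["correct"] += pred == actual
    if actual == 0:
        c["neg"] += 1
        c["false_pos"] += pred == 1

for group, c in sorted(counts.items()):
    fpr = c["false_pos"] / c["neg"] if c["neg"] else float("nan")
    print(f"{group}: accuracy={c['correct'] / c['total']:.2f}, FPR={fpr:.2f}")

A gap in false-positive rates between groups is exactly the kind of discriminatory outcome this disclosure requirement is meant to surface.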

This is going to be interesting: one wonders how many agencies have quality test data or protocols. Certainly there are data provider companies champing at the bit for the solicitations that are bound to flow from this:

Delineate Responsibilities for Ongoing Testing and Monitoring and Build Evaluations into Vendor Contract Performance. OMB Memorandum M-24-10 generally requires agencies to institute ongoing procedures to monitor degradation of the functionality of AI systems or services and to detect changes in their impact on rights and safety. However, there are instances when a vendor is best equipped to carry out those activities on the agency’s behalf, and so is required under a contract to closely monitor and evaluate the performance and risks of an AI system. In such instances, agencies must still provide oversight and require sufficient information from a vendor to determine compliance with OMB Memorandum M-24-10. Agencies must ensure that contractual terms provide the ability to regularly monitor and evaluate (e.g., on a quarterly or biannual basis, based on the needs of the program) performance and risks throughout the duration of the contract. To do so: 

A. Agencies must use data defined by the agency (e.g., agency validation and testing datasets) when conducting independent evaluations to ensure the AI system or service is fit for purpose. To the extent practicable, the data used when conducting independent evaluations should not be accessible to the vendor, and should be as similar as possible to the data used when the system is deployed; 

B. Contracts must require vendors to provide agencies with sufficient access and time to conduct any required testing in a real-world context, including testing carried out by others on behalf of or under agreement with the agency. Alternatively, agencies may require a vendor to regularly provide the results of an AI system or service’s testing in a real-world operational context and the benchmarks used, with sufficient detail such that the testing could be independently verified or reproduced, if practicable; 

C. Contracts must not prohibit agencies from disclosing how they conduct testing and the results of testing; 

D. Contracts must detail the examination, testing, and validation procedures the vendor is responsible for and the frequency with which they need to be carried out; 

E. Where appropriate, agency contracts for AI systems or services must also include terms that require vendors to provide the government with the results of performance testing for algorithmic discrimination, including demographic and bias testing, demographic characteristics of groups the performance testing has been conducted on, or third-party evaluations and assessments providing an equivalent level of detail. Alternatively, agencies may require a vendor to provide the results of performance testing to address these issues; and 

F. Agencies must also consider how testing and monitoring, including as part of post-award management, impacts financial planning and budgeting requirements in Sections 3(a)(ii) and 4(c)(vi) of this memorandum. 
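Mechanically, items A and D above boil down to re-scoring the model on agency-held data the vendor never sees and flagging drift against the accepted baseline. A sketch, where the baseline, threshold, and scoring function are all assumptions rather than anything the memo prescribes:

# Quarterly independent evaluation against agency-held test data. The
# baseline, tolerance, and metric are contract-specific assumptions.

BASELINE_ACCURACY = 0.91      # accepted at award, from the agency test set
DEGRADATION_THRESHOLD = 0.03  # contract-defined tolerance (assumed)

def score_model(predictions: list[int], labels: list[int]) -> float:
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

def quarterly_check(predictions: list[int], labels: list[int]) -> None:
    accuracy = score_model(predictions, labels)
    if BASELINE_ACCURACY - accuracy > DEGRADATION_THRESHOLD:
        # In practice this would trigger the escalation path defined under
        # the agency's Section 3 procedures, not just a console message.
        print(f"DEGRADED: accuracy {accuracy:.2f} vs baseline {BASELINE_ACCURACY:.2f}")
    else:
        print(f"OK: accuracy {accuracy:.2f}")

quarterly_check([1, 0, 1, 1, 0, 1, 1, 0], [1, 0, 1, 0, 0, 1, 1, 0])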

This could get really messy. Is there a watermarking metadata standard? I'm certain there are major tech companies that are ready and willing to provide their constructs… (https://deepmind.google/discover/blog/watermarking-ai-generated-text-and-video-with-synthid/)

i. Provide Transparency About Risks and Generated Content. When procuring general use enterprise-wide generative AI, agencies must include contractual requirements for vendors to: 

A. Ensure that any audio, image, and video outputs of AI systems that are not readily distinguishable from reality are created or modified using mechanisms, such as through watermarks, cryptographically-signed metadata, or other technical artifacts, that allow the outputs to be identified as generated by AI, attributed to the specific model that was used to produce the output, and linked with other relevant information about the origin or history of outputs; 

B. Document how the general use enterprise-wide generative AI was or will be trained and evaluated, including relevant information about data, data labor, compute, model architecture, and relevant evaluations. 

ii. Mitigate Inappropriate Use. When procuring general use enterprise-wide generative AI, agencies must consider including contractual requirements that ensure vendors provide appropriate protections, where practicable, against the AI systems or services being used in ways that are contrary to law and policy. This may include providing methods for monitoring how the general use enterprise-wide generative AI is used in the agency and guidance for how to monitor such use effectively, as well as potentially implementing technical safeguards against the AI being used in prohibited or otherwise sensitive contexts, such as refusing prompts asking for prohibited outputs.
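For item A in section i above, the closest existing art is provenance standards like C2PA manifests and watermarking schemes like SynthID. As a toy illustration of the underlying idea of binding signed metadata to generated content (a real deployment would use public-key signatures under a standard manifest format, not a shared secret):

import hashlib
import hmac
import json

# Toy illustration only: bind provenance metadata to content with a MAC.
# Real systems would use public-key signatures (e.g., C2PA manifests); the
# shared secret here is a stand-in to keep the sketch self-contained.
SECRET = b"agency-and-vendor-shared-secret"

def sign_output(content: bytes, metadata: dict) -> dict:
    digest = hashlib.sha256(content).hexdigest()
    payload = digest + json.dumps(metadata, sort_keys=True)
    tag = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return {"metadata": metadata, "content_sha256": digest, "tag": tag}

def verify_output(content: bytes, record: dict) -> bool:
    digest = hashlib.sha256(content).hexdigest()
    payload = digest + json.dumps(record["metadata"], sort_keys=True)
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["tag"])

record = sign_output(
    b"synthetic image bytes",
    {"model": "example-gen-v2", "created": "2024-10-04"},
)
assert verify_output(b"synthetic image bytes", record)   # intact: verifies
assert not verify_output(b"tampered bytes", record)      # altered: fails

This is the attribution property the memo is after: the metadata identifies the generating model, and any tampering with the content or the metadata breaks verification.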