
### Scale AI and DOD’s Chief Digital & AI Office Collaborate on Testing Large Language Models

WASHINGTON, Feb. 20, 2024 — Scale AI, a leading test and evaluation (T&E) partner for fro…

Collaboration between Scale AI and the U.S. Department of Defense’s Chief Digital and Artificial Intelligence Office

Scale AI, a test and evaluation (T&E) partner for cutting-edge artificial intelligence companies, announced on February 20, 2024, a strategic partnership with the U.S. Department of Defense’s (DoD) Chief Digital and Artificial Intelligence Office (CDAO). The goal of the collaboration is to establish a comprehensive T&E framework that ensures the responsible use of large language models (LLMs) within the DoD.

Under the agreement, Scale will develop benchmark assessments tailored to DoD-specific scenarios, integrate them into its T&E platform, and help the CDAO formulate a T&E strategy for deploying LLMs. The expected outcome is a structured framework that lets the CDAO use AI technologies securely by measuring model performance, offering real-time insights to military personnel, and building specialized evaluation datasets that test AI models on defense-related tasks such as consolidating information from post-mission reports.

The partnership aims to help the DoD adapt its T&E protocols to generative AI by combining quantitative benchmarking with qualitative user feedback. The evaluation criteria are intended to identify generative AI models that give accurate and relevant results aligned with DoD terminology and knowledge repositories. The rigorous T&E process is meant to strengthen the resilience and efficiency of AI systems in classified environments, easing the integration of LLM technology in secure operational settings.

Alexandr Wang, founder and CEO of Scale AI, emphasized the company’s commitment to ensuring the legitimacy of future AI applications for defense purposes and to reinforcing the United States’ global leadership in deploying secure, reliable, and ethical AI. Wang stated, “By subjecting generative AI to thorough testing and evaluation, the DoD can gain insights into both its capabilities and limitations, ensuring responsible deployment. Scale is honored to collaborate with the DoD on this crucial framework.”

T&E has long been standard practice in product development across industries, where it ensures that products meet safety standards before they reach the market, but AI safety benchmarks are still evolving. Scale’s methodology, introduced in summer 2023, represents the industry’s first comprehensive technical blueprint for LLM T&E. Its endorsement by the DoD highlights Scale’s dedication to exploring the potential and challenges of LLMs, mitigating the associated risks, and meeting the unique requirements of military operations.

To explore further details about Scale’s methodology for test and evaluation, visit https://scale.com/llm-test-evaluation.

About Scale AI

Scale is leading the way in advancing the Generative AI revolution. By leveraging high-quality data and human insights as the foundation of its operations, Scale’s proprietary Data Engine drives the development of cutting-edge models. With extensive partnerships with major model developers over the years, Scale is a trusted ally for any organization looking to harness the power of AI. Industry leaders such as Meta, Microsoft, the U.S. Army, the DoD’s Defense Innovation Unit, OpenAI, Cohere, Anthropic, General Motors, Toyota Research Institute, and NVIDIA rely on Scale’s expertise and capabilities.

Source: Scale

Last modified: February 21, 2024