
### Microsoft Unveils Advanced AI Security System to Combat Malicious Content

The tools will be built into the company’s Azure AI Studio.

Conversational artificial intelligence (AI) technologies such as ChatGPT are reshaping the internet, albeit with some reservations about their impact. Microsoft, a key partner of OpenAI, the creator of ChatGPT, has invested heavily in the field: it has integrated its Copilot AI into a range of products and offers cloud-based tools that let businesses build their own large language model (LLM) applications. However, concerns about AI “hallucinations” persist, and Microsoft is now reassuring users of its Azure AI platform with new tools designed to keep model behavior in check.

The developers of LLMs are sometimes astounded when these systems deviate unexpectedly despite rigorous testing and refinement. Even when a model is meticulously tuned to avoid generating sexist, false, or violent content, users sometimes manage to provoke undesirable responses through what Microsoft calls “prompt injection attacks,” a term that hints at the inventive ways users manipulate an AI through their queries.

Microsoft has unveiled five new safety capabilities for Azure AI Studio. Three are available in preview today, with the remaining two to follow: Prompt Shields, Risk and Safety Monitoring, and Safety Evaluations. Prompt Shields guard against indirect attacks that coax a model into producing harmful output, Risk and Safety Monitoring provides tools to promptly detect and mitigate dangerous content, and Safety Evaluations scrutinize model outputs for content and security vulnerabilities, helping developers build adversarial test datasets for human “red team” evaluations.
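To make the idea behind a prompt shield concrete, here is a deliberately simple sketch: before text reaches an LLM, scan it for phrases commonly used in injection attempts. The function name, the pattern list, and the pattern-matching approach are illustrative assumptions for this article; Microsoft’s actual Prompt Shields are a managed classifier service, not a regex filter.

```python
import re

# Phrases often seen in prompt-injection attempts (illustrative, not
# an exhaustive or official list).
INJECTION_PATTERNS = [
    r"ignore (all )?(previous|prior) instructions",
    r"disregard the system prompt",
    r"you are now (in )?developer mode",
]

def shield_prompt(text: str) -> bool:
    """Return True if the text looks like a prompt-injection attempt.

    Toy sketch only: real prompt shields use trained classifiers rather
    than keyword matching, which is easy to evade.
    """
    lowered = text.lower()
    return any(re.search(pattern, lowered) for pattern in INJECTION_PATTERNS)
```

A caller would screen both direct user input and any third-party documents fed into the prompt, since indirect attacks arrive through retrieved content rather than the user’s own query.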

In the near future, Azure AI Studio will add safety system message templates that guide developers in steering models toward safer outputs. Groundedness Detection, aimed at catching hallucinations, will be the final addition: it flags outputs that are confidently inaccurate or not supported by the model’s source material.
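The intuition behind groundedness detection can be sketched with a toy check: does the model’s answer actually draw on the source document it was given? The word-overlap heuristic below is an illustrative assumption for this article; Azure’s real Groundedness Detection uses a fine-tuned language model, not word counting.

```python
def is_grounded(answer: str, source: str, threshold: float = 0.5) -> bool:
    """Toy groundedness check (illustrative assumption, not Azure's API).

    Treats an answer as grounded when at least `threshold` of its words
    also appear in the source text. Real groundedness detection reasons
    about claims and entailment, which simple overlap cannot capture.
    """
    answer_words = {w.strip(".,!?").lower() for w in answer.split()}
    source_words = {w.strip(".,!?").lower() for w in source.split()}
    if not answer_words:
        return True  # an empty answer makes no ungrounded claims
    overlap = len(answer_words & source_words) / len(answer_words)
    return overlap >= threshold
```

Even this crude version shows the shape of the feature: outputs are judged against supplied context, and answers with no support in that context are flagged rather than passed through.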

Microsoft will enable these safety features automatically for GPT-4 models, while users of less popular LLMs may need to integrate the tools manually. The company’s emphasis on security and safety reflects its determination to avoid the kind of embarrassing incidents that have dogged generative AI since its public debut, and to protect the momentum that growing interest in AI has given the business.

Last modified: April 1, 2024