General availability of Prompt Shields in Azure AI Content Safety and Azure OpenAI Service by info.odysseyx@gmail.com September 3, 2024 written by info.odysseyx@gmail.com September 3, 2024 0 comment 4 views 4 Today, we are announcing the general availability of Prompt Shields in Azure AI Content Safety and Azure OpenAI Service, powerful AI security capabilities we announced in 2016. Preview March 2024. Prompt Shields is fully integrated with the Azure OpenAI service content filter and is available in Azure AI Content Safety, providing a strong defense against various types of prompt injection attacks. Prompt Shields effectively identifies and mitigates potential threats from user prompts and third-party data by leveraging advanced machine learning algorithms and natural language processing. This cutting-edge capability supports the security and integrity of AI applications, protecting the system from malicious manipulation or exploitation attempts. Key Features Rapid Shield for Direct Attacks: Formerly known as Jailbreak Threat Detection, this shield targets direct rapid injection attacks where a user intentionally exploits a system vulnerability to cause unauthorized actions in the LLM. This can lead to the creation of inappropriate content or violation of restrictions imposed by the system. Prompt Shield for Indirect Attacks: This shield aims to protect against attacks that use information that is not directly provided by the user or developer, such as external documents. An attacker can embed hidden instructions in such materials to gain unauthorized control of the LLM session. Prompt Shields API: Input and Output Prompt Shield for Azure OpenAI Services User Scenario Azure AI Content Safety’s “Prompt Shields” are specifically designed to prevent generative AI systems from generating harmful or inappropriate content. These shields detect and mitigate risks associated with user prompt attacks (malicious or harmful user-generated input) and document attacks (inputs containing harmful content embedded in documents). Using “Prompt Shields” in environments where GenAI is used is critical to ensuring that AI output is secure, compliant, and trustworthy. The main purposes of the “Prompt Shields” feature for GenAI applications are: Detect and block harmful or policy-violating user prompts (direct attacks) that could lead to unsafe AI output. Identify and mitigate indirect attacks that involve malicious content in user-provided documents. Prevent misuse of GenAI systems by maintaining the integrity, safety, and compliance of AI-generated content. Example Use Cases AI Content Creation Platform: Harmful Prompt Detection Scenario: An AI content creation platform uses generative AI models to generate marketing copy, social media posts, and articles based on user-provided prompts. To prevent harmful or inappropriate content from being created, the platform integrates “Prompt Shields.” Users: Content creators, platform administrators, compliance officers. Action: The platform uses Azure AI Content Safety’s “Prompt Shields” to analyze user prompts before generating content. If a prompt is detected that encourages the creation of something potentially harmful or likely to lead to policy-violating output (such as a prompt requesting defamatory content or hate speech), the shield blocks the prompt and warns the user to modify their input. Results: The platform builds user trust and protects the platform’s reputation by ensuring that all AI-generated content is safe, ethical, and compliant with community guidelines. AI-based chatbots: Mitigating risks from user prompt attacks Scenario: A customer service provider uses an AI-based chatbot for automated support. To protect against user prompts that could manipulate the model to generate inappropriate or unsafe responses, the provider uses “Prompt Shields.” Users: Customer service representatives, chatbot developers, compliance teams. Action: The chatbot system integrates “Prompt Shields” to monitor and evaluate user input in real time. If a user prompt is identified as attempting to exploit the AI (e.g., eliciting an inappropriate response or attempting to extract sensitive information), the shields intervene by blocking the response or redirecting the query to a human agent. Results: Customer service providers maintain high standards for interaction safety and compliance, preventing chatbots from generating responses that could harm users or violate policies. eLearning platforms: Preventing inappropriate AI-generated training content Scenario: An eLearning platform uses GenAI to generate personalized learning content based on student input and reference documents. To avoid security threats that could result in inappropriate or misleading learning content, the platform leverages “Prompt Shields.” Users: Educators, content developers, compliance officers. Action: The platform uses a “Prompt Shield” to analyze both user prompts and linked data sources, such as documents, to look for content that could manipulate the application to produce unsafe or policy-violating AI output. If a user prompt or linked document is detected as likely to produce inappropriate educational content, the shield blocks it and suggests safe alternative input. Outcome: The platform ensures that all AI-generated educational materials are relevant and academically compliant, creating a safe and effective learning environment. With the general availability of Prompt Shields, AI systems can now be more effectively protected from injection attacks immediately, enabling stronger security. Our Customers AXA has joined the wave of generative AI, bringing this new era of AI to its 140,000 employees, with a focus on enabling them to perform their jobs safely and responsibly. “Our goal was to enable all employees to use the same technology as public AI tools in a safe and trusted environment. We wanted to move quickly, and we did it in less than three months.” – Vincent de Ponto, Head of Software and AI Engineering at AXA. AXA can leverage the content filter of the Azure OpenAI service to prevent everything by adding a layer of security to their application using Prompt Shields. Model’s Jailbreak Ensures optimal level of reliability. Read more evil We are leveraging Prompt Shields AXA Secure GPT. resources Source link Share 0 FacebookTwitterPinterestEmail info.odysseyx@gmail.com previous post SharePoint Roadmap Pitstop: August 2024 next post Active Directory Hardening Series – Part 5 – Enforcing LDAP Channel Binding You may also like Insights from MVPs at the Power Platform Community Conference October 10, 2024 Restoring an MS SQL 2022 DB from a ANF SnapShot October 10, 2024 Your guide to Intune at Microsoft Ignite 2024 October 10, 2024 Partner Blog | Build your team’s AI expertise with upcoming Microsoft partner skilling opportunities October 10, 2024 Attend Microsoft Ignite from anywhere in the world! October 10, 2024 Get tailored support with the new Partner Center AI assistant (preview) October 10, 2024 Leave a Comment Cancel Reply Save my name, email, and website in this browser for the next time I comment.