AI Guardrails Profile

This profile is required for users who have enabled AI Defense and integrated it with their Multicloud Defense tenant. Guardrails (security rules and policies) are applied across different use cases; this profile lets you configure a security profile within Multicloud Defense that is a direct reflection of the guardrails available in the AI Defense policy. Configuring this profile sets up secure communication between Multicloud Defense and AI Defense so that the following types of AI guardrails can be enforced (a configuration sketch follows the list):

  • Security guardrails - The security guardrail detects attempts to override the LLM's internal safety and alignment rules through direct and indirect prompt injection. It also detects software code in model endpoint interactions, reducing risks such as malicious code execution.

  • Privacy guardrails - Privacy attacks attempt to reveal sensitive information contained in an ML model or its data. The privacy guardrail detects personal, confidential, or sensitive information that can cause harm if leaked to unauthorized parties, including Personally Identifiable Information (PII), Protected Health Information (PHI), and Payment Card Industry (PCI) data.

  • Safety guardrails - Safety harms can encompass various categories, including user-specific, societal, reputational, and financial impacts. This guardrail examines prompts and responses to detect harmful language, including toxic content, hate speech, harassment, profanity, sexual content and exploitation, messages driving social division and polarization, and violence and public safety threats.
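
The exact fields of the profile are defined by the Multicloud Defense API; purely as an illustration, the Python sketch below assembles a hypothetical profile payload that mirrors the three guardrail types above and submits it over an authenticated HTTPS request. The endpoint URL, field names, and token placeholder are all assumptions made for this sketch, not the documented Multicloud Defense API.

    # Hypothetical sketch: the endpoint, field names, and auth scheme are
    # illustrative assumptions, NOT the documented Multicloud Defense API.
    import json
    import urllib.request

    # The three guardrail types described above, each toggled on with an
    # enforcement action (for example, block the request versus log it).
    profile = {
        "name": "ai-guardrails-profile",
        "guardrails": {
            "security": {                  # prompt injection / code detection
                "enabled": True,
                "action": "block",
            },
            "privacy": {                   # PII / PHI / PCI detection
                "enabled": True,
                "action": "block",
                "data_types": ["PII", "PHI", "PCI"],
            },
            "safety": {                    # harmful-language detection
                "enabled": True,
                "action": "block",
            },
        },
    }

    # Submit the profile to a hypothetical tenant endpoint using an API
    # token established during the AI Defense integration.
    req = urllib.request.Request(
        url="https://example.multicloud-defense.invalid/api/v1/profiles/ai-guardrails",
        data=json.dumps(profile).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer <API_TOKEN>",  # placeholder credential
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        print(resp.status, resp.read().decode("utf-8"))

A real profile would enable only the guardrail types defined in your AI Defense policy, since the profile is intended to be a direct reflection of that policy.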