Mistral AI provides a sophisticated content moderation service that enables users to detect harmful text content across multiple policy dimensions, helping to secure LLM applications and ensure safe AI interactions.

To get started with Mistral, visit their documentation:

Get Started with Mistral Moderation

Using Mistral with Portkey

1. Add Mistral Credentials to Portkey

  • Navigate to the Integrations page under Settings
  • Click on the edit button for the Mistral integration
  • Add your Mistral La Plateforme API Key (obtain this from your Mistral account)

2. Add Mistral’s Guardrail Check

  • Navigate to the Guardrails page and click the Create button
  • Search for “Moderate Content” and click Add
  • Configure your moderation checks by selecting which categories to filter:
    • Sexual
    • Hate and discrimination
    • Violence and threats
    • Dangerous and criminal content
    • Self-harm
    • Health
    • Financial
    • Law
    • PII (Personally Identifiable Information)
  • Set the timeout in milliseconds (default: 5000ms)
  • Set any actions you want on your check, and create the Guardrail!

Guardrail Actions allow you to orchestrate your guardrail logic. You can learn more about them here.

  • Check Name: Moderate Content
  • Description: Checks if content passes the selected content moderation checks
  • Parameters: Moderation Checks (array), Timeout (number)
  • Supported Hooks: beforeRequestHook, afterRequestHook
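
If you work with guardrails as raw objects rather than through the UI, the parameters above map onto a check definition roughly like the sketch below. This is only an illustration: the check id (`mistral.moderateContent`) and parameter keys are assumptions based on Portkey's general guardrail-check format, so confirm them against the Guardrails documentation; the category slugs follow Mistral's moderation API naming.

// Hypothetical shape of a raw guardrail definition; verify field names
// against Portkey's Guardrails documentation before using.
const mistralModerationGuardrail = {
  checks: [
    {
      id: "mistral.moderateContent", // assumed check id
      parameters: {
        // Categories to filter, using Mistral's moderation category slugs
        moderationChecks: ["sexual", "hate_and_discrimination", "pii"],
        timeout: 5000 // milliseconds
      }
    }
  ]
};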

3. Add Guardrail ID to a Config and Make Your Request

  • When you save a Guardrail, you’ll get an associated Guardrail ID - add this ID to the input_guardrails or output_guardrails params in your Portkey Config
  • Create these Configs in Portkey UI, save them, and get an associated Config ID to attach to your requests. More here.

Here’s an example config:

{
  "input_guardrails": ["guardrails-id-xxx", "guardrails-id-yyy"],
  "output_guardrails": ["guardrails-id-xxx", "guardrails-id-yyy"]
}
Attach this Config to your requests using the Portkey SDK:

import Portkey from 'portkey-ai';

const portkey = new Portkey({
    apiKey: "PORTKEY_API_KEY",
    config: "pc-***" // Supports a string config id or a config object
});
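
With the Config in place, any request made through this client runs the attached guardrails. A minimal usage sketch (the model name below is illustrative; use whichever model your Config routes to):

// Send a chat completion request; the input/output guardrails from the
// Config run automatically around this call.
const chatCompletion = await portkey.chat.completions.create({
    model: 'mistral-large-latest', // illustrative model name
    messages: [{ role: 'user', content: 'Hello, how can I reset my password?' }]
});

console.log(chatCompletion.choices[0].message.content);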

For more, refer to the Config documentation.

Your requests are now guarded by Mistral’s moderation service, and you can see the verdict and any actions taken directly in your Portkey logs!

Content Moderation Categories

Mistral’s moderation service can detect content across 9 key policy categories:

  1. Sexual: Content of sexual nature or adult content
  2. Hate and Discrimination: Content expressing hatred or promoting discrimination
  3. Violence and Threats: Content depicting violence or threatening language
  4. Dangerous and Criminal Content: Instructions for illegal activities or harmful actions
  5. Self-harm: Content related to self-injury, suicide, or eating disorders
  6. Health: Unqualified medical advice or health misinformation
  7. Financial: Unqualified financial advice or dubious investment schemes
  8. Law: Unqualified legal advice or recommendations
  9. PII: Personally identifiable information, including email addresses, phone numbers, etc.

Mistral’s moderation service is natively multilingual, with support for Arabic, Chinese, English, French, German, Italian, Japanese, Korean, Portuguese, Russian, and Spanish.
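
To inspect these category verdicts outside of Portkey, you can call Mistral's moderation endpoint directly. Here is a minimal sketch using fetch; the endpoint, model name, and response fields follow Mistral's moderation API documentation, but verify them there before relying on the exact shape:

// Query Mistral's moderation endpoint for per-category verdicts.
const response = await fetch("https://api.mistral.ai/v1/moderations", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${process.env.MISTRAL_API_KEY}`,
  },
  body: JSON.stringify({
    model: "mistral-moderation-latest",
    input: ["Example text to classify."],
  }),
});

const data = await response.json();
// `categories` holds boolean flags per policy; `category_scores` holds raw scores.
console.log(data.results[0].categories);
console.log(data.results[0].category_scores);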

Get Support

If you face any issues with the Mistral integration, join the Portkey community forum for assistance.

Learn More