Mistral
Mistral AI provides a sophisticated content moderation service that enables users to detect harmful text content across multiple policy dimensions, helping to secure LLM applications and ensure safe AI interactions.
To get started with Mistral, visit their documentation:
Get Started with Mistral Moderation
Using Mistral with Portkey
1. Add Mistral Credentials to Portkey
- Navigate to the Integrations page under Settings
- Click the edit button for the Mistral integration
- Add your Mistral La Plateforme API key (obtain this from your Mistral account)
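Before wiring the key into Portkey, it can help to sanity-check it against Mistral's moderation endpoint directly. The sketch below builds (but does not send) a raw request; the endpoint URL and `mistral-moderation-latest` model name follow Mistral's public API docs, and `MISTRAL_API_KEY` is an assumed environment variable name.

```python
import json
import os
import urllib.request

# Mistral's moderation endpoint, per their public API docs.
MISTRAL_MODERATION_URL = "https://api.mistral.ai/v1/moderations"

def build_moderation_request(api_key: str, texts: list[str]) -> urllib.request.Request:
    """Build (but do not send) a raw moderation request."""
    payload = {"model": "mistral-moderation-latest", "input": texts}
    return urllib.request.Request(
        MISTRAL_MODERATION_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_moderation_request(os.environ.get("MISTRAL_API_KEY", "<key>"), ["hello"])
# Only send once a real key is configured:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["results"][0]["categories"])
```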
2. Add Mistral’s Guardrail Check
- Navigate to the Guardrails page and click the Create button
- Search for "Moderate Content" and click Add
- Configure your moderation checks by selecting which categories to filter:
  - Sexual
  - Hate and discrimination
  - Violence and threats
  - Dangerous and criminal content
  - Self-harm
  - Health
  - Financial
  - Law
  - PII (Personally Identifiable Information)
- Set the timeout in milliseconds (default: 5000 ms)
- Set any actions you want on your check, and create the Guardrail!
Guardrail Actions allow you to orchestrate your guardrail logic. You can learn more about them here.
| Check Name | Description | Parameters | Supported Hooks |
|---|---|---|---|
| Moderate Content | Checks if content passes the selected content moderation checks | Moderation Checks (array), Timeout (number) | `beforeRequestHook`, `afterRequestHook` |
3. Add Guardrail ID to a Config and Make Your Request
- When you save a Guardrail, you'll get an associated Guardrail ID; add this ID to the `input_guardrails` or `output_guardrails` params in your Portkey Config
- Create these Configs in the Portkey UI, save them, and get an associated Config ID to attach to your requests. More here.
Here’s an example config:
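A minimal sketch of such a config, with `guardrails-id-xxx` standing in for the Guardrail ID you received when saving:

```json
{
  "input_guardrails": ["guardrails-id-xxx"],
  "output_guardrails": ["guardrails-id-xxx"]
}
```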
For more, refer to the Config documentation.
Your requests are now guarded by Mistral’s moderation service, and you can see the verdict and any actions taken directly in your Portkey logs!
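As a sketch of the request side, the saved Config ID can be attached via Portkey's gateway headers. This assumes the config itself carries your provider details; `pc-xxx` is a placeholder Config ID, `PORTKEY_API_KEY` an assumed environment variable, and the model name is illustrative.

```python
import json
import os
import urllib.request

def guarded_chat_request(portkey_api_key: str, config_id: str, user_text: str) -> urllib.request.Request:
    """Build (but do not send) a chat request routed through the saved Config."""
    payload = {
        "model": "mistral-small-latest",  # illustrative model name
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        "https://api.portkey.ai/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "x-portkey-api-key": portkey_api_key,
            "x-portkey-config": config_id,  # the guardrails ride along with the config
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = guarded_chat_request(os.environ.get("PORTKEY_API_KEY", "<key>"), "pc-xxx", "Hello!")
# Send only with real credentials:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```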
Content Moderation Categories
Mistral’s moderation service can detect content across 9 key policy categories:
- Sexual: Content of sexual nature or adult content
- Hate and Discrimination: Content expressing hatred or promoting discrimination
- Violence and Threats: Content depicting violence or threatening language
- Dangerous and Criminal Content: Instructions for illegal activities or harmful actions
- Self-harm: Content related to self-injury, suicide, or eating disorders
- Health: Unqualified medical advice or health misinformation
- Financial: Unqualified financial advice or dubious investment schemes
- Law: Unqualified legal advice or recommendations
- PII: Personally identifiable information, including email addresses, phone numbers, etc.
Mistral’s moderation service is natively multilingual, with support for Arabic, Chinese, English, French, German, Italian, Japanese, Korean, Portuguese, Russian, and Spanish.
Get Support
If you face any issues with the Mistral integration, join the Portkey community forum for assistance.