Mistral AI provides a content moderation service that detects harmful text across multiple policy dimensions, helping to secure LLM applications and ensure safe AI interactions. To get started with Mistral, see their documentation.
Get Started with Mistral Moderation
Using Mistral with Portkey
1. Add Mistral Credentials to Portkey
- Navigate to the Integrations page under Settings
- Click the edit button for the Mistral integration
- Add your Mistral La Plateforme API Key (obtain this from your Mistral account)
2. Add Mistral’s Guardrail Check
- Navigate to the Guardrails page and click the Create button
- Search for “Moderate Content” and click Add
- Configure your moderation checks by selecting which categories to filter:
- Sexual
- Hate and discrimination
- Violence and threats
- Dangerous and criminal content
- Self-harm
- Health
- Financial
- Law
- PII (Personally Identifiable Information)
- Set the timeout in milliseconds (default: 5000ms)
- Set any actions you want on your check, and create the Guardrail!
Guardrail Actions allow you to orchestrate your guardrail logic. You can learn more about them here.
| Check Name | Description | Parameters | Supported Hooks |
|---|---|---|---|
| Moderate Content | Checks if content passes selected content moderation checks | Moderation Checks (array), Timeout (number) | beforeRequestHook, afterRequestHook |
3. Add Guardrail ID to a Config and Make Your Request
- When you save a Guardrail, you’ll get an associated Guardrail ID; add this ID to the input_guardrails or output_guardrails params in your Portkey Config
- Create these Configs in the Portkey UI, save them, and get an associated Config ID to attach to your requests. More here.
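As a sketch, a Portkey Config that runs the moderation Guardrail on both input and output might look like the following (the guardrail IDs shown are placeholders; substitute the IDs from your saved Guardrail):

```json
{
  "input_guardrails": ["gr-your-guardrail-id"],
  "output_guardrails": ["gr-your-guardrail-id"]
}
```

Saving this Config in the Portkey UI yields a Config ID you can attach to requests.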
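With the Config ID in hand, any request routed through that Config runs the moderation check. Below is a minimal sketch of the request headers and body, using only the standard library; the API key, Config ID, and model name are placeholders, and the header names follow Portkey's x-portkey-* convention:

```python
import json

# Placeholder credentials; substitute your own Portkey API key and Config ID.
PORTKEY_API_KEY = "YOUR_PORTKEY_API_KEY"
CONFIG_ID = "pc-your-config-id"

# Headers for Portkey's gateway; the Config ID attaches the guardrail checks.
headers = {
    "Content-Type": "application/json",
    "x-portkey-api-key": PORTKEY_API_KEY,
    "x-portkey-config": CONFIG_ID,
}

# A standard chat-completion body; the guardrail runs before and/or after
# the model call, depending on the hooks you configured.
body = {
    "model": "mistral-large-latest",
    "messages": [{"role": "user", "content": "Hello, world"}],
}

payload = json.dumps(body)
print(payload)
```

POST this payload to Portkey's chat completions endpoint (for example with the requests library or the portkey_ai SDK); if the configured guardrail action denies the request, you receive a guardrail error instead of a model response.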
Content Moderation Categories
Mistral’s moderation service can detect content across 9 key policy categories:
- Sexual: Content of a sexual nature or adult content
- Hate and Discrimination: Content expressing hatred or promoting discrimination
- Violence and Threats: Content depicting violence or threatening language
- Dangerous and Criminal Content: Instructions for illegal activities or harmful actions
- Self-harm: Content related to self-injury, suicide, or eating disorders
- Health: Unqualified medical advice or health misinformation
- Financial: Unqualified financial advice or dubious investment schemes
- Law: Unqualified legal advice or recommendations
- PII: Personally identifiable information, including email addresses, phone numbers, etc.
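If you call Mistral's moderation endpoint directly, the response reports a verdict per category. The sketch below shows filtering the flagged categories from such a result; the snake_case category keys are assumptions derived from the list above, so check Mistral's API reference for the exact identifiers:

```python
# Hypothetical moderation result: category name -> flagged boolean.
result = {
    "sexual": False,
    "hate_and_discrimination": False,
    "violence_and_threats": True,
    "dangerous_and_criminal_content": False,
    "selfharm": False,
    "health": False,
    "financial": False,
    "law": False,
    "pii": True,
}

# Collect the categories that were flagged, preserving dict order.
flagged = [category for category, hit in result.items() if hit]
print(flagged)  # ['violence_and_threats', 'pii']
```

An application would typically block or redact content when any selected category in `flagged` is non-empty, mirroring what the Portkey Guardrail does automatically.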

