This guide walks you through creating a custom guardrail plugin for the Portkey Gateway, configuring it locally, and testing it end-to-end. By the end, you’ll have a working guardrail that runs on every request through your self-hosted gateway.
## Prerequisites
- Node.js 18+ installed
- The Portkey Gateway repository cloned locally
- An API key for any external service your guardrail calls (optional)
## How gateway plugins work

Portkey's open-source gateway supports a plugin system designed specifically for guardrails. Each plugin:

- Lives in the `/plugins` directory of the gateway repository
- Declares its configuration in a `manifest.json` file
- Exports a handler function that receives the request/response context and returns a verdict
- Can hook into `beforeRequestHook` (input guardrail) or `afterRequestHook` (output guardrail)
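Concretely, a handler is just an async function that inspects the context and returns a verdict. The sketch below uses simplified stand-in types (the real `PluginContext`, `PluginParameters`, and `HookEventType` come from the gateway's `plugins/types` module), and `maxLength` is a made-up parameter for illustration only:

```typescript
// Simplified stand-ins for the types exported from the gateway's
// plugins/types module -- just enough to illustrate the handler contract.
type HookEventType = "beforeRequestHook" | "afterRequestHook";

interface PluginContext {
  request?: { text?: string };
  response?: { text?: string };
}

type PluginParameters = Record<string, any>;

interface PluginResult {
  error: unknown;
  verdict: boolean; // true = content passes, false = flagged
  data: unknown;
}

// A toy guardrail: flag any text longer than a configured limit.
const handler = async (
  context: PluginContext,
  parameters: PluginParameters,
  eventType: HookEventType
): Promise<PluginResult> => {
  // Input guardrails read the request; output guardrails read the response
  const text =
    eventType === "beforeRequestHook"
      ? context.request?.text
      : context.response?.text;
  const length = (text ?? "").length;
  const verdict = length <= (parameters.maxLength ?? 1000);
  return { error: null, verdict, data: { length } };
};
```

The steps below walk through building a real plugin with this shape.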
## Step 1: Create the plugin folder structure

Inside the gateway repository, create a new folder under `/plugins`:

```
/plugins
  /your-plugin-name
    - manifest.json
    - main.ts
    - main.test.ts
```
## Step 2: Define the manifest

The `manifest.json` file declares your plugin's identity, required credentials, and the guardrail functions it exposes.
```json
{
  "id": "your-plugin-id",
  "description": "A brief description of your guardrail plugin",
  "credentials": {
    "type": "object",
    "properties": {
      "apiKey": {
        "type": "string",
        "label": "API Key",
        "description": "Your API key for the external guardrail service",
        "encrypted": true
      }
    },
    "required": ["apiKey"]
  },
  "functions": [
    {
      "name": "Your Guardrail Function",
      "id": "yourGuardrailId",
      "supportedHooks": ["beforeRequestHook", "afterRequestHook"],
      "type": "guardrail",
      "description": [
        {
          "type": "subHeading",
          "text": "Description of what this guardrail checks"
        }
      ],
      "parameters": {
        "type": "object",
        "properties": {
          "threshold": {
            "type": "number",
            "label": "Threshold",
            "description": [
              {
                "type": "subHeading",
                "text": "The confidence threshold for flagging content"
              }
            ]
          }
        },
        "required": ["threshold"]
      }
    }
  ]
}
```
Key fields:

| Field | Description |
|---|---|
| `id` | Unique identifier for your plugin. Used in `conf.json` to enable it. |
| `credentials` | Defines secrets your plugin needs (API keys, tokens). Set `encrypted: true` for sensitive values. |
| `functions[].supportedHooks` | Which hooks this guardrail supports: `beforeRequestHook`, `afterRequestHook`, or both. |
| `functions[].type` | Must be `"guardrail"` for guardrail plugins. |
| `functions[].parameters` | Input parameters users can configure when adding this guardrail. |
## Step 3: Implement the handler

Create your main TypeScript file (e.g., `main.ts`) that exports the guardrail handler:

```typescript
import {
  HookEventType,
  PluginContext,
  PluginHandler,
  PluginParameters,
} from "../types";

export const handler: PluginHandler = async (
  context: PluginContext,
  parameters: PluginParameters,
  eventType: HookEventType
) => {
  let error = null;
  let verdict = true;
  let data = null;

  try {
    // Determine what text to evaluate based on the hook type
    const textToCheck =
      eventType === "beforeRequestHook"
        ? context.request?.text
        : context.response?.text;

    if (!textToCheck) {
      return { error: null, verdict: true, data: null };
    }

    // -- Your guardrail logic here --
    // Example: call an external moderation API
    const response = await fetch("https://your-guardrail-api.com/check", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${context.credentials?.apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        text: textToCheck,
        threshold: parameters.threshold,
      }),
    });

    if (!response.ok) {
      throw new Error(`Moderation API returned ${response.status}`);
    }

    const result = await response.json();

    // Set verdict to false if the content is flagged
    verdict = !result.flagged;
    data = result;
  } catch (e: any) {
    error = e;
    // On error, default to passing the request through (fail open)
    verdict = true;
  }

  return { error, verdict, data };
};
```
### Handler parameters

| Parameter | Description |
|---|---|
| `context.request` | The user's request. Contains `json` (full request body), `text` (last message content), and `isStreamingRequest`. |
| `context.response` | The LLM's response (only populated in `afterRequestHook`). Contains `json`, `text`, and `statusCode`. |
| `context.credentials` | Credentials defined in your manifest and configured in `conf.json`. |
| `parameters` | User-configured parameters from the guardrail check definition. |
| `eventType` | Either `"beforeRequestHook"` or `"afterRequestHook"`. |
### Return values

| Field | Type | Description |
|---|---|---|
| `error` | `object \| null` | Error object if something went wrong, `null` otherwise. |
| `verdict` | `boolean` | `true` if the content passes the guardrail, `false` if it should be flagged. |
| `data` | `object \| null` | Any additional data to include in the guardrail result. |
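As a quick illustration, the three common return shapes look like this (field values are illustrative, not prescribed by the gateway):

```typescript
// Content passed the guardrail: request proceeds unchanged.
const pass = { error: null, verdict: true, data: null };

// Content was flagged: verdict is false, with any diagnostic payload in data.
const flagged = {
  error: null,
  verdict: false,
  data: { reason: "matched blocked pattern" },
};

// Something went wrong: report the error but fail open (verdict true),
// matching the catch block in the handler above.
const failed = {
  error: new Error("moderation API timed out"),
  verdict: true,
  data: null,
};
```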
## Step 4: Write tests

Create a test file to validate your guardrail logic:

```typescript
import { handler } from "./main";
import { HookEventType, PluginContext, PluginParameters } from "../types";

describe("your-plugin-name guardrail", () => {
  it("should pass clean content", async () => {
    const context: PluginContext = {
      request: {
        text: "What is the weather today?",
        json: {
          messages: [{ role: "user", content: "What is the weather today?" }],
        },
      },
      credentials: { apiKey: "test-key" },
    };
    const parameters: PluginParameters = { threshold: 0.8 };
    const eventType: HookEventType = "beforeRequestHook";

    const result = await handler(context, parameters, eventType);

    expect(result.error).toBeNull();
    expect(result.verdict).toBe(true);
  });

  it("should flag harmful content", async () => {
    const context: PluginContext = {
      request: {
        text: "Some harmful content here",
        json: {
          messages: [{ role: "user", content: "Some harmful content here" }],
        },
      },
      credentials: { apiKey: "test-key" },
    };
    const parameters: PluginParameters = { threshold: 0.8 };
    const eventType: HookEventType = "beforeRequestHook";

    const result = await handler(context, parameters, eventType);

    expect(result.verdict).toBe(false);
  });

  it("should handle afterRequestHook", async () => {
    const context: PluginContext = {
      request: {
        text: "Tell me a story",
        json: {
          messages: [{ role: "user", content: "Tell me a story" }],
        },
      },
      response: {
        text: "Once upon a time...",
        json: {
          choices: [
            {
              message: { role: "assistant", content: "Once upon a time..." },
            },
          ],
        },
        statusCode: 200,
      },
      credentials: { apiKey: "test-key" },
    };
    const parameters: PluginParameters = { threshold: 0.8 };
    const eventType: HookEventType = "afterRequestHook";

    const result = await handler(context, parameters, eventType);

    expect(result.error).toBeNull();
    expect(result.verdict).toBe(true);
  });
});
```
Run the tests from the repository root with the repo's test runner; the test file above uses Jest-style `describe`/`it`/`expect`.
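Assuming the repository's Jest setup, a single plugin's tests can be run by path (path filtering is standard Jest CLI behavior):

```shell
# Run only this plugin's tests
npx jest plugins/your-plugin-name
```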
## Step 5: Enable the plugin in conf.json

Edit the `conf.json` file in the root of the gateway repository to enable your plugin and provide credentials:
```json
{
  "plugins_enabled": ["default", "your-plugin-id"],
  "credentials": {
    "your-plugin-id": {
      "apiKey": "your-api-key-here"
    }
  },
  "cache": false
}
```
- `plugins_enabled` — List of plugin IDs to load. `"default"` includes the built-in plugins.
- `credentials` — Maps plugin IDs to their credential values. Keys must match the `credentials.properties` defined in your `manifest.json`.
## Step 6: Build and run locally

Build the plugins and start the gateway:

```sh
# Build plugins
npm run build-plugins

# Start the gateway in development mode
npm run dev
```

Alternative dev commands:

```sh
# Node.js runtime
npm run dev:node

# Cloudflare Workers runtime
npm run dev:workerd
```

The gateway starts on `http://localhost:8787` by default.
## Step 7: Test with a live request

Send a request through your local gateway to verify the guardrail works:

```sh
curl http://localhost:8787/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-portkey-provider: openai" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H 'x-portkey-config: {"before_request_hooks":[{"type":"guardrail","id":"test-guardrail","checks":[{"id":"your-plugin-id.yourGuardrailId","parameters":{"threshold":0.8}}]}]}' \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ]
  }'
```
If the guardrail passes (`verdict: true`), the request proceeds to the LLM. If it fails (`verdict: false`), the gateway returns a `446` status code (when deny is enabled) or a `246` status code (when deny is disabled).

The `x-portkey-config` header accepts an inline JSON config. For production use, create a config through the Portkey app and reference it by ID instead.
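That status-code behavior can be summarized as a tiny decision function (this helper is ours, written only to illustrate the documented outcomes; it is not part of the gateway API, and `200` stands in for whatever status the LLM itself returns on success):

```typescript
// Mirrors the documented guardrail outcomes:
//   pass                  -> request proceeds (LLM's own status, e.g. 200)
//   fail + deny enabled   -> 446 (request blocked)
//   fail + deny disabled  -> 246 (request allowed, but result is flagged)
function guardrailStatus(verdict: boolean, denyEnabled: boolean): number {
  if (verdict) return 200;
  return denyEnabled ? 446 : 246;
}
```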
## Example: profanity filter guardrail

Here's a complete example of a simple profanity filter plugin.

### Create the plugin folder

```sh
mkdir -p plugins/profanity-filter
```
### Add the manifest

`plugins/profanity-filter/manifest.json`:

```json
{
  "id": "profanity-filter",
  "description": "Blocks requests and responses containing profanity",
  "credentials": {
    "type": "object",
    "properties": {},
    "required": []
  },
  "functions": [
    {
      "name": "Check Profanity",
      "id": "checkProfanity",
      "supportedHooks": ["beforeRequestHook", "afterRequestHook"],
      "type": "guardrail",
      "description": [
        {
          "type": "subHeading",
          "text": "Scans text for profanity and blocks flagged content"
        }
      ],
      "parameters": {
        "type": "object",
        "properties": {
          "blockedWords": {
            "type": "string",
            "label": "Blocked Words",
            "description": [
              {
                "type": "subHeading",
                "text": "Comma-separated list of words to block"
              }
            ]
          }
        },
        "required": ["blockedWords"]
      }
    }
  ]
}
```
### Implement the handler

`plugins/profanity-filter/main.ts`:

```typescript
import {
  HookEventType,
  PluginContext,
  PluginHandler,
  PluginParameters,
} from "../types";

export const handler: PluginHandler = async (
  context: PluginContext,
  parameters: PluginParameters,
  eventType: HookEventType
) => {
  const textToCheck =
    eventType === "beforeRequestHook"
      ? context.request?.text
      : context.response?.text;

  if (!textToCheck) {
    return { error: null, verdict: true, data: null };
  }

  const blockedWords = parameters.blockedWords
    .split(",")
    .map((w: string) => w.trim().toLowerCase());

  const lowerText = textToCheck.toLowerCase();
  const foundWords = blockedWords.filter((word: string) =>
    lowerText.includes(word)
  );

  return {
    error: null,
    verdict: foundWords.length === 0,
    data: {
      foundWords,
      checkedText:
        textToCheck.substring(0, 100) +
        (textToCheck.length > 100 ? "..." : ""),
    },
  };
};
```
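To sanity-check the verdict logic without the gateway runtime, the core check can be restated as a standalone function (a restatement for illustration only, not part of the plugin API):

```typescript
// Standalone restatement of the profanity check from main.ts above.
function checkProfanity(
  text: string,
  blockedWords: string
): { verdict: boolean; foundWords: string[] } {
  // Parse the comma-separated blocklist, normalizing case and whitespace
  const blocked = blockedWords.split(",").map((w) => w.trim().toLowerCase());
  const lower = text.toLowerCase();
  const foundWords = blocked.filter((w) => lower.includes(w));
  // verdict is false (flagged) when any blocked word appears
  return { verdict: foundWords.length === 0, foundWords };
}
```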
### Configure and run

Update `conf.json`:

```json
{
  "plugins_enabled": ["default", "profanity-filter"],
  "credentials": {},
  "cache": false
}
```

Build and start:

```sh
npm run build-plugins && npm run dev
```
## Next steps