Models DeepInfra
sentence-transformers/clip-ViT-B-32-multilingual-v1
8K max output
chat
Pricing (per 1M tokens)
Input: —
Cached input: —
Output: —
Cache write: —
Modalities
Input: text
Output: text
Features
Streaming
Function calling
Vision
Reasoning
JSON mode