Smart Routing
Select the optimal model per context, run quick A/B tests, and fall back automatically on errors or overload.
- Cost-, latency-, or quality-first routing strategies
- Flexible rules & weights
A/B · Fallback · Canary
AI Gateway intelligently routes queries to the right AI models (LLM, Embedding, Vision) to optimize outcomes, lower cost, and improve user experience. It provides a standardized API, A/B testing with fallback, centralized observability, and enterprise-grade security.
Route by intent, cost, or latency; run A/B and canary tests with automatic fallback; manage keys and team policies; get centralized observability for tokens, latency, and errors; supports LLM, Embedding, and Vision models.
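As an illustration, a routing rule with a weighted A/B or canary split and a fallback chain might be expressed as below. The field names and the "LLM-4" model name are assumptions for the sketch, not a documented schema.

// Illustrative routing rule; field and model names are hypothetical.
interface RouteRule {
  match: { intent?: string };                      // which requests the rule applies to
  strategy: "cost-first" | "latency-first" | "quality-first";
  targets: { model: string; weight: number }[];    // weighted split for A/B or canary
  fallback: string[];                              // tried in order on error/overload
}

const routes: RouteRule[] = [
  {
    match: { intent: "summarize" },
    strategy: "latency-first",
    targets: [
      { model: "LLM-4-mini", weight: 0.9 },  // primary
      { model: "LLM-4", weight: 0.1 },       // 10% canary
    ],
    fallback: ["LLM-3.5"],
  },
];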
RBAC, quota, key rotation, audit logs, data boundary & IP allowlist.
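A minimal sketch of a team policy covering these controls could look like the following; every field name here is an assumption for illustration only.

// Hypothetical team policy object; field names are illustrative, not a documented API.
interface TeamPolicy {
  role: "admin" | "developer" | "viewer";  // RBAC role
  monthlyTokenQuota: number;               // usage cap in tokens
  keyRotationDays: number;                 // forced key rotation interval
  auditLog: boolean;                       // record every request
  dataRegion: "eu" | "us";                 // data boundary
  ipAllowlist: string[];                   // CIDR ranges allowed to call the gateway
}

const analyticsTeam: TeamPolicy = {
  role: "developer",
  monthlyTokenQuota: 5_000_000,
  keyRotationDays: 90,
  auditLog: true,
  dataRegion: "eu",
  ipAllowlist: ["10.0.0.0/8"],
};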
Metrics, tracing, alerting; export to Prometheus/Grafana/Datadog.
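As a sketch of how the centralized token/latency/error metrics might be consumed for alerting, the snippet below polls a hypothetical metrics endpoint and flags models with elevated error rates; the endpoint path and response shape are assumptions, and a production setup would feed Prometheus/Grafana/Datadog instead of logging.

// Sketch: poll a hypothetical /v1/ai-gateway/metrics endpoint and flag noisy models.
interface ModelStats {
  model: string;
  tokens: number;
  p95LatencyMs: number;
  errorRate: number; // 0..1
}

async function checkErrorRates(baseUrl: string, apiKey: string): Promise<void> {
  const res = await fetch(`${baseUrl}/v1/ai-gateway/metrics`, {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  const stats: ModelStats[] = await res.json();
  for (const s of stats) {
    if (s.errorRate > 0.05) {
      // Stand-in for a real alerting pipeline (Prometheus Alertmanager, Datadog monitors).
      console.warn(`High error rate on ${s.model}: ${(s.errorRate * 100).toFixed(1)}%`);
    }
  }
}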
// POST /v1/ai-gateway/chat
{
  "prompt": "Summarize customer profile",
  "strategy": "latency-first",
  "fallback": true,
  "constraints": { "max_latency_ms": 1200 }
}
// → Gateway routes: embed→rerank, or LLM-4-mini; on error → fallback to LLM-3.5
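For reference, a minimal client call to the endpoint shown above might look like this; the response fields model_used and fell_back are assumptions about the gateway's reply shape, not documented output.

// Minimal TypeScript client for the request shown above.
async function summarizeProfile(baseUrl: string, apiKey: string) {
  const res = await fetch(`${baseUrl}/v1/ai-gateway/chat`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      prompt: "Summarize customer profile",
      strategy: "latency-first",
      fallback: true,
      constraints: { max_latency_ms: 1200 },
    }),
  });
  const data = await res.json();
  // Hypothetical response fields: which model answered and whether fallback was used.
  console.log(data.model_used, data.fell_back);
  return data;
}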