Smart Routing
Select the optimal model for each context, run quick A/B tests, and fall back automatically on errors or overload.
- Cost / Latency / Quality-first
- Flexible rules & weights
 
A/B · Fallback · Canary
AI Gateway intelligently routes queries to the right AI models (LLM, Embedding, Vision) to optimize outcomes, lower cost, and improve user experience. It provides a standardized API, A/B testing and fallback, centralized observability, and enterprise-grade security.
Route by intent, cost, or latency; run A/B and canary tests with automatic fallback; manage keys and team policies; get centralized observability of tokens, latency, and errors; supports LLM, Embedding, and Vision models.
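As an illustration of how such rules and weights could be expressed, here is a minimal routing-policy sketch; the endpoint, field names, and model identifiers below are assumptions, not a documented schema.

// PUT /v1/ai-gateway/routing-policy  (illustrative sketch; endpoint, fields, and model names are assumed)
{
  "strategy": "quality-first",
  "weights": {"cost": 0.2, "latency": 0.3, "quality": 0.5},
  "ab_test": {"LLM-4-mini": 0.9, "LLM-3.5": 0.1},
  "canary": {"model": "LLM-4", "traffic_percent": 5},
  "fallback_chain": ["LLM-4-mini", "LLM-3.5"],
  "constraints": {"max_latency_ms": 1200}
}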
Security & governance: RBAC, quotas, key rotation, audit logs, data boundaries, and IP allowlists.
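A team policy combining these controls could look like the following sketch; the endpoint, field names, and values are illustrative assumptions, not a documented interface.

// PUT /v1/ai-gateway/teams/{team_id}/policy  (illustrative sketch; endpoint and fields are assumed)
{
  "rbac": {"role": "developer", "allowed_models": ["LLM-4-mini"]},
  "quota": {"tokens_per_day": 2000000, "requests_per_minute": 600},
  "key_rotation_days": 90,
  "audit_logs": true,
  "ip_allowlist": ["10.0.0.0/8"]
}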
Observability: metrics, tracing, and alerting, with export to Prometheus, Grafana, or Datadog.
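Export targets might be wired up with a configuration like the sketch below; the endpoint, keys, and alert fields are assumptions for illustration.

// PUT /v1/ai-gateway/observability  (illustrative sketch; endpoint and fields are assumed)
{
  "metrics": {"exporter": "prometheus", "endpoint": "/metrics"},
  "tracing": {"exporter": "otlp", "endpoint": "https://otel-collector.internal:4317"},
  "alerts": [
    {"metric": "error_rate", "threshold": 0.05, "window_s": 300, "notify": "ops-oncall"}
  ]
}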
// POST /v1/ai-gateway/chat
{
  "prompt": "Summarize customer profile",
  "strategy": "latency-first",
  "fallback": true,
  "constraints": {"max_latency_ms": 1200}
}
// → Gateway routes: embed→rerank, or LLM-4-mini; on error → fallback to LLM-3.5
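A possible response to the request above, showing which model served the call and whether a fallback fired; these fields are assumptions for illustration, not a guaranteed schema.

// ← Example response (illustrative sketch; fields are assumed)
{
  "output": "…summary of the customer profile…",
  "model_used": "LLM-4-mini",
  "fallback_triggered": false,
  "latency_ms": 840,
  "tokens": {"prompt": 128, "completion": 96}
}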