1 critical

System overview

5 / 6 services healthy, 18.42M requests in the last 24h, 99.94% uptime.

Uptime 24h: 99.94% (+0.02%)
p50 latency: 128ms (+4ms)
Error rate: 0.80% (+0.2pp)
Deploys today: 24 (+8)
ml-inference · p95 latency · last 60 min: 2,240ms (↑ 4,660%) · SLO 1500ms
Services

Production services

api-gateway v2.18.4 · tpe1
4.8k rpm · p50 48ms · p95 142ms · error 1.20%
Healthy
ml-inference v3.4.2 · tpe1
1.8k rpm · p50 680ms · p95 2240ms · error 3.40%
Degraded
embeddings-service v1.22.0 · tpe1
720 rpm · p50 82ms · p95 280ms · error 0.40%
Healthy
job-queue-worker v1.5.8 · tpe1
2.3k rpm · p50 0ms · p95 0ms · error 0.20%
Healthy
vector-store v0.9.4 · tpe1
3.4k rpm · p50 24ms · p95 96ms · error 0.80%
Healthy
webhook-dispatcher v2.0.1 · tpe1
680 rpm · p50 18ms · p95 64ms · error 0.10%
Healthy
Alerts

Active alerts

2 open
p95 latency exceeded 2s
ml-inference · 18m ago
p95=2240ms · threshold=1500ms · sustained 12m
Error rate elevated to 3.4%
ml-inference · 18m ago
baseline 0.5% · affected endpoint /infer/chat
Rate limit approaching ceiling
api-gateway · 34m ago
82% of 10k rpm quota used
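
The two ml-inference alerts follow directly from the numbers above: p95 (2,240ms) is over the 1,500ms SLO, and the error rate (3.4%) is well above the 0.5% baseline. A minimal sketch of that evaluation in Python; the 3x-baseline multiplier and the function/field names are assumptions, since the dashboard does not show the actual trigger condition:

```python
# Illustrative re-check of the alert conditions shown above.
# P95_SLO_MS and ERROR_BASELINE come from the alert details;
# ERROR_MULTIPLIER is an assumed trigger condition, not from the dashboard.
from dataclasses import dataclass

P95_SLO_MS = 1500.0      # "SLO 1500ms"
ERROR_BASELINE = 0.005   # "baseline 0.5%"
ERROR_MULTIPLIER = 3.0   # assumption: fire when error rate > 3x baseline

@dataclass
class ServiceStats:
    name: str
    p95_ms: float
    error_rate: float  # fraction, e.g. 0.034 == 3.4%

def open_alerts(stats: ServiceStats) -> list[str]:
    """Return the alert descriptions this service would currently trigger."""
    alerts = []
    if stats.p95_ms > P95_SLO_MS:
        alerts.append(f"p95 latency exceeded: {stats.p95_ms:.0f}ms > {P95_SLO_MS:.0f}ms")
    if stats.error_rate > ERROR_MULTIPLIER * ERROR_BASELINE:
        alerts.append(f"error rate elevated: {stats.error_rate:.1%} (baseline {ERROR_BASELINE:.1%})")
    return alerts

# ml-inference trips both rules; api-gateway (1.2% error, p95 142ms) trips neither.
print(open_alerts(ServiceStats("ml-inference", p95_ms=2240, error_rate=0.034)))
print(open_alerts(ServiceStats("api-gateway", p95_ms=142, error_rate=0.012)))
```

In a real pipeline the "sustained 12m" qualifier would also apply, i.e. the condition must hold across consecutive evaluation windows before the alert opens; that is omitted here for brevity.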
Deployments

Recent deployments

Status · Service · Env · Commit · Branch · Author · Duration · Started
building · ml-inference · staging · a4c8de2 · feat/batching-v2 · Zhou H. · 3m 12s · 2m ago
ready · ml-inference · prod · e12f8b1 · main · Chen Y. · 4m 52s · 18m ago
ready · api-gateway · prod · 7f28c41 · main · Liu K. · 2m 08s · 2h ago
ready · embeddings-service · preview · ba1d9ef · fix/memory-leak · Wang R. · 3m 42s · 4h ago
error · api-gateway · preview · 1c3a582 · feat/rate-limit · Zhou H. · 0m 48s · 5h ago
ready · job-queue-worker · prod · 8d4e07a · main · Chen Y. · 1m 24s · 5h ago