LLM Speed Index — Live AI API Latency Rankings
Measured latency and reliability for 62 LLM API providers, ranked by 7-day p50 response time. The live page offers 7-day and 24-hour windows; the table below shows the 7-day view. DownForAI probes each provider roughly every 90 minutes.
Last updated: Fri, 19 Jun 2026 10:25:43 GMT
| # | Provider | Status | p50 (7d) | p95 (7d) | Uptime (7d) | Trend (24h) | Incidents (7d) | Reports (24h) |
|---|---|---|---|---|---|---|---|---|
| 1 | Ollama | OK | 84ms | 216ms | 100.00% | ~stable | 0 | 1 |
| 2 | Apple Intelligence | OK | 89ms | 413ms | 100.00% | ~stable | 0 | 0 |
| 3 | Nous Hermes | OK | 94ms | 325ms | 100.00% | ~stable | 0 | 0 |
| 4 | WizardLM | OK | 94ms | 260ms | 100.00% | ~stable | 0 | 0 |
| 5 | Falcon (TII) | OK | 99ms | 288ms | 100.00% | ~stable | 0 | 0 |
| 6 | H2O.ai | OK | 105ms | 287ms | 100.00% | −40ms | 0 | 0 |
| 7 | Reka AI | OK | 105ms | 284ms | 100.00% | ~stable | 0 | 0 |
| 8 | xAI Grok | OK | 105ms | 343ms | 100.00% | ~stable | 0 | 0 |
| 9 | Mistral AI | OK | 106ms | 422ms | 100.00% | −39ms | 0 | 0 |
| 10 | Microsoft Copilot | OK | 108ms | 599ms | 100.00% | −26ms | 0 | 0 |
| 11 | Corcel | OK | 116ms | 316ms | 100.00% | ~stable | 0 | 0 |
| 12 | Cohere | OK | 124ms | 519ms | 100.00% | ~stable | 0 | 0 |
| 13 | IBM Granite | OK | 129ms | 657ms | 100.00% | ~stable | 0 | 0 |
| 14 | OpenRouter | OK | 129ms | 335ms | 100.00% | ~stable | 0 | 0 |
| 15 | Upstage Solar | OK | 136ms | 267ms | 100.00% | ~stable | 0 | 0 |
| 16 | Copy.ai | OK | 141ms | 527ms | 100.00% | +22ms | 0 | 0 |
| 17 | Thinking Machines Lab | OK | 160ms | 349ms | 100.00% | ~stable | 0 | 0 |
| 18 | HuggingChat | OK | 162ms | 471ms | 99.40% | ~stable | 0 | 0 |
| 19 | Genspark | OK | 166ms | 777ms | 100.00% | ~stable | 0 | 0 |
| 20 | Google Gemini | OK | 166ms | 521ms | 100.00% | ~stable | 0 | 0 |
| 21 | Claude Chat | OK | 185ms | 438ms | 100.00% | +23ms | 0 | 0 |
| 22 | Google Gemma | OK | 218ms | 399ms | 99.39% | +55ms | 0 | 0 |
| 23 | Le Chat (Mistral) | OK | 228ms | 943ms | 100.00% | ~stable | 0 | 0 |
| 24 | Huawei Pangu | OK | 236ms | 479ms | 100.00% | +65ms | 0 | 0 |
| 25 | AI21 Labs | OK | 240ms | 566ms | 100.00% | ~stable | 0 | 0 |
| 26 | Writesonic | OK | 243ms | 659ms | 100.00% | −27ms | 0 | 0 |
| 27 | Poe | OK | 249ms | 576ms | 100.00% | ~stable | 0 | 0 |
| 28 | LM Studio | OK | 256ms | 460ms | 100.00% | ~stable | 0 | 0 |
| 29 | Writesmith | OK | 271ms | 679ms | 100.00% | −28ms | 0 | 0 |
| 30 | DeepSeek | OK | 274ms | 535ms | 100.00% | ~stable | 0 | 0 |
| 31 | Amazon Nova | OK | 276ms | 817ms | 100.00% | +59ms | 0 | 0 |
| 32 | Anthropic | OK | 280ms | 722ms | 100.00% | ~stable | 0 | 0 |
| 33 | DeepSeek Coder | OK | 285ms | 744ms | 100.00% | ~stable | 0 | 0 |
| 34 | Character.AI | OK | 296ms | 1259ms | 97.30% | ~stable | 0 | 0 |
| 35 | ChatGPT | OK | 309ms | 527ms | 100.00% | +36ms | 0 | 0 |
| 36 | OpenAI | OK | 374ms | 635ms | 100.00% | ~stable | 0 | 0 |
| 37 | DuckDuckGo AI | OK | 390ms | 951ms | 100.00% | −39ms | 0 | 0 |
| 38 | AI2 OLMo | OK | 406ms | 1038ms | 96.95% | −79ms | 0 | 0 |
| 39 | Skywork | OK | 415ms | 711ms | 100.00% | ~stable | 0 | 0 |
| 40 | Coze | OK | 420ms | 1028ms | 100.00% | −117ms | 0 | 0 |
| 41 | Aleph Alpha | OK | 482ms | 684ms | 100.00% | −49ms | 0 | 0 |
| 42 | Inflection Pi | OK | 569ms | 968ms | 100.00% | −200ms | 0 | 0 |
| 43 | YouChat | OK | 586ms | 1310ms | 100.00% | ~stable | 0 | 0 |
| 44 | LMArena | OK | 589ms | 1274ms | 100.00% | ~stable | 0 | 0 |
| 45 | Jais (Inception) | OK | 668ms | 817ms | 100.00% | +30ms | 0 | 0 |
| 46 | Yuan 2.0 | OK | 670ms | 1155ms | 100.00% | ~stable | 0 | 0 |
| 47 | LINE AI | OK | 695ms | 882ms | 100.00% | −34ms | 0 | 0 |
| 48 | Meta Llama | OK | 701ms | 1742ms | 100.00% | +158ms | 0 | 0 |
| 49 | QuillBot | OK | 733ms | 1398ms | 100.00% | ~stable | 0 | 0 |
| 50 | Tencent Hunyuan | OK | 769ms | 1178ms | 100.00% | +60ms | 0 | 0 |
| 51 | SenseTime SenseChat | OK | 970ms | 1473ms | 100.00% | −86ms | 0 | 0 |
| 52 | Zhipu AI (ChatGLM) | OK | 1045ms | 1458ms | 100.00% | +40ms | 0 | 0 |
| 53 | Alibaba Qwen | OK | 1202ms | 2107ms | 100.00% | −25ms | 0 | 0 |
| 54 | NAVER CLOVA | OK | 1327ms | 1611ms | 100.00% | +28ms | 0 | 0 |
| 55 | iFlytek Spark | OK | 1495ms | 2767ms | 100.00% | +36ms | 0 | 0 |
| 56 | Baidu ERNIE Bot | OK | 1507ms | 2020ms | 100.00% | −22ms | 0 | 0 |
| 57 | Baichuan AI | OK | 1676ms | 2663ms | 98.76% | −167ms | 0 | 0 |
| 58 | Qwen Chat | OK | 1732ms | 2759ms | 98.80% | −22ms | 0 | 0 |
| 59 | StepFun | OK | 1777ms | 4396ms | 99.40% | −225ms | 0 | 0 |
| 60 | 01.AI (Yi) | OK | 1813ms | 2400ms | 100.00% | +101ms | 0 | 0 |
| 61 | Moonshot AI (Kimi) | OK | 1833ms | 3227ms | 96.89% | −60ms | 0 | 0 |
| 62 | ByteDance Doubao | OK | 1903ms | 2782ms | 100.00% | −78ms | 0 | 0 |
Methodology: DownForAI measures the response latency of monitored APIs and endpoints through scheduled checks; each surface is re-checked roughly every 90 minutes. Latency reflects the response time of the monitored surface. It is NOT a tokens-per-second model generation benchmark. Rankings use a 7-day or 24-hour window and exclude services with insufficient observations (fewer than 10 probe results in the 7-day window). Trend compares the last 24 hours against the previous 24-hour period. Data is updated approximately every 15 minutes.
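
The ranking and trend rules described in the methodology can be sketched in a few lines of Python. This is an illustrative sketch only, not DownForAI's implementation: the function names, the nearest-rank p95 method, the assumption that Trend compares p50 values, and the 20 ms "~stable" band are all hypothetical. Only the 10-probe minimum and the 24-hour comparison come from the methodology text.

```python
from statistics import median

MIN_PROBES = 10        # from the methodology: fewer than 10 probes in 7 days means excluded
STABLE_BAND_MS = 20    # hypothetical threshold; the source does not state one


def p50(samples):
    """Median latency in milliseconds."""
    return median(samples)


def p95(samples):
    """95th percentile by the nearest-rank method (an assumed choice)."""
    ordered = sorted(samples)
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding.
    rank = max(1, (95 * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def rank_providers(probes_7d):
    """Return provider names sorted by 7-day p50, fastest first.

    probes_7d maps a provider name to its list of latency samples (ms).
    Providers with too few samples are left out of the ranking.
    """
    eligible = {name: s for name, s in probes_7d.items() if len(s) >= MIN_PROBES}
    return sorted(eligible, key=lambda name: p50(eligible[name]))


def trend_label(last_24h, previous_24h):
    """Compare the last 24 hours against the previous 24-hour period.

    Assumes the comparison is made on p50 values.
    """
    if not last_24h or not previous_24h:
        return "n/a"
    delta = round(p50(last_24h) - p50(previous_24h))
    if abs(delta) < STABLE_BAND_MS:
        return "~stable"
    sign = "+" if delta > 0 else "\u2212"  # U+2212 minus sign, as in the table
    return f"{sign}{abs(delta)}ms"
```

For example, `rank_providers({"a": [100] * 12, "b": [50] * 12, "c": [10] * 5})` places `b` ahead of `a` and drops `c`, which has only 5 samples. A positive trend value means the provider got slower over the last 24 hours.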