DownForAI

LLM Speed Index — Live AI API Latency Rankings

Measured latency and reliability for 62 LLM API providers, ranked by 7-day p50 response time. Toggle between 7-day and 24-hour windows. Probed every ~90 minutes by DownForAI.

Last updated: Fri, 19 Jun 2026 10:25:43 GMT

Fastest API: Ollama (84ms p50)
Most Reliable: Ollama (100.00% uptime)
Biggest Improvement: StepFun (225ms faster vs yesterday)
Most Reported 24h: Ollama (1 report)
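| # | Provider | Status | p50 7d | p95 7d | Uptime 7d | Trend 24h | Incidents 7d | Reports 24h |
|---|---|---|---|---|---|---|---|---|
| 1 | Ollama | OK | 84ms | 216ms | 100.00% | ~stable | 0 | 1 |
| 2 | Apple Intelligence | OK | 89ms | 413ms | 100.00% | ~stable | 0 | 0 |
| 3 | Nous Hermes | OK | 94ms | 325ms | 100.00% | ~stable | 0 | 0 |
| 4 | WizardLM | OK | 94ms | 260ms | 100.00% | ~stable | 0 | 0 |
| 5 | Falcon (TII) | OK | 99ms | 288ms | 100.00% | ~stable | 0 | 0 |
| 6 | H2O.ai | OK | 105ms | 287ms | 100.00% | −40ms | 0 | 0 |
| 7 | Reka AI | OK | 105ms | 284ms | 100.00% | ~stable | 0 | 0 |
| 8 | xAI Grok | OK | 105ms | 343ms | 100.00% | ~stable | 0 | 0 |
| 9 | Mistral AI | OK | 106ms | 422ms | 100.00% | −39ms | 0 | 0 |
| 10 | Microsoft Copilot | OK | 108ms | 599ms | 100.00% | −26ms | 0 | 0 |
| 11 | Corcel | OK | 116ms | 316ms | 100.00% | ~stable | 0 | 0 |
| 12 | Cohere | OK | 124ms | 519ms | 100.00% | ~stable | 0 | 0 |
| 13 | IBM Granite | OK | 129ms | 657ms | 100.00% | ~stable | 0 | 0 |
| 14 | OpenRouter | OK | 129ms | 335ms | 100.00% | ~stable | 0 | 0 |
| 15 | Upstage Solar | OK | 136ms | 267ms | 100.00% | ~stable | 0 | 0 |
| 16 | Copy.ai | OK | 141ms | 527ms | 100.00% | +22ms | 0 | 0 |
| 17 | Thinking Machines Lab | OK | 160ms | 349ms | 100.00% | ~stable | 0 | 0 |
| 18 | HuggingChat | OK | 162ms | 471ms | 99.40% | ~stable | 0 | 0 |
| 19 | Genspark | OK | 166ms | 777ms | 100.00% | ~stable | 0 | 0 |
| 20 | Google Gemini | OK | 166ms | 521ms | 100.00% | ~stable | 0 | 0 |
| 21 | Claude Chat | OK | 185ms | 438ms | 100.00% | +23ms | 0 | 0 |
| 22 | Google Gemma | OK | 218ms | 399ms | 99.39% | +55ms | 0 | 0 |
| 23 | Le Chat (Mistral) | OK | 228ms | 943ms | 100.00% | ~stable | 0 | 0 |
| 24 | Huawei Pangu | OK | 236ms | 479ms | 100.00% | +65ms | 0 | 0 |
| 25 | AI21 Labs | OK | 240ms | 566ms | 100.00% | ~stable | 0 | 0 |
| 26 | Writesonic | OK | 243ms | 659ms | 100.00% | −27ms | 0 | 0 |
| 27 | Poe | OK | 249ms | 576ms | 100.00% | ~stable | 0 | 0 |
| 28 | LM Studio | OK | 256ms | 460ms | 100.00% | ~stable | 0 | 0 |
| 29 | Writesmith | OK | 271ms | 679ms | 100.00% | −28ms | 0 | 0 |
| 30 | DeepSeek | OK | 274ms | 535ms | 100.00% | ~stable | 0 | 0 |
| 31 | Amazon Nova | OK | 276ms | 817ms | 100.00% | +59ms | 0 | 0 |
| 32 | Anthropic | OK | 280ms | 722ms | 100.00% | ~stable | 0 | 0 |
| 33 | DeepSeek Coder | OK | 285ms | 744ms | 100.00% | ~stable | 0 | 0 |
| 34 | Character.AI | OK | 296ms | 1259ms | 97.30% | ~stable | 0 | 0 |
| 35 | ChatGPT | OK | 309ms | 527ms | 100.00% | +36ms | 0 | 0 |
| 36 | OpenAI | OK | 374ms | 635ms | 100.00% | ~stable | 0 | 0 |
| 37 | DuckDuckGo AI | OK | 390ms | 951ms | 100.00% | −39ms | 0 | 0 |
| 38 | AI2 OLMo | OK | 406ms | 1038ms | 96.95% | −79ms | 0 | 0 |
| 39 | Skywork | OK | 415ms | 711ms | 100.00% | ~stable | 0 | 0 |
| 40 | Coze | OK | 420ms | 1028ms | 100.00% | −117ms | 0 | 0 |
| 41 | Aleph Alpha | OK | 482ms | 684ms | 100.00% | −49ms | 0 | 0 |
| 42 | Inflection Pi | OK | 569ms | 968ms | 100.00% | −200ms | 0 | 0 |
| 43 | YouChat | OK | 586ms | 1310ms | 100.00% | ~stable | 0 | 0 |
| 44 | LMArena | OK | 589ms | 1274ms | 100.00% | ~stable | 0 | 0 |
| 45 | Jais (Inception) | OK | 668ms | 817ms | 100.00% | +30ms | 0 | 0 |
| 46 | Yuan 2.0 | OK | 670ms | 1155ms | 100.00% | ~stable | 0 | 0 |
| 47 | LINE AI | OK | 695ms | 882ms | 100.00% | −34ms | 0 | 0 |
| 48 | Meta Llama | OK | 701ms | 1742ms | 100.00% | +158ms | 0 | 0 |
| 49 | QuillBot | OK | 733ms | 1398ms | 100.00% | ~stable | 0 | 0 |
| 50 | Tencent Hunyuan | OK | 769ms | 1178ms | 100.00% | +60ms | 0 | 0 |
| 51 | SenseTime SenseChat | OK | 970ms | 1473ms | 100.00% | −86ms | 0 | 0 |
| 52 | Zhipu AI (ChatGLM) | OK | 1045ms | 1458ms | 100.00% | +40ms | 0 | 0 |
| 53 | Alibaba Qwen | OK | 1202ms | 2107ms | 100.00% | −25ms | 0 | 0 |
| 54 | NAVER CLOVA | OK | 1327ms | 1611ms | 100.00% | +28ms | 0 | 0 |
| 55 | iFlytek Spark | OK | 1495ms | 2767ms | 100.00% | +36ms | 0 | 0 |
| 56 | Baidu ERNIE Bot | OK | 1507ms | 2020ms | 100.00% | −22ms | 0 | 0 |
| 57 | Baichuan AI | OK | 1676ms | 2663ms | 98.76% | −167ms | 0 | 0 |
| 58 | Qwen Chat | OK | 1732ms | 2759ms | 98.80% | −22ms | 0 | 0 |
| 59 | StepFun | OK | 1777ms | 4396ms | 99.40% | −225ms | 0 | 0 |
| 60 | 01.AI (Yi) | OK | 1813ms | 2400ms | 100.00% | +101ms | 0 | 0 |
| 61 | Moonshot AI (Kimi) | OK | 1833ms | 3227ms | 96.89% | −60ms | 0 | 0 |
| 62 | ByteDance Doubao | OK | 1903ms | 2782ms | 100.00% | −78ms | 0 | 0 |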
Methodology: DownForAI measures the response latency of monitored APIs and endpoints through scheduled checks; each surface is re-checked roughly every 90 minutes. Latency reflects the response time of the monitored surface. It is NOT a tokens-per-second model generation benchmark. Rankings use a 7-day or 24-hour window and exclude services with insufficient observations (fewer than 10 probe results in the 7-day window). Trend compares the last 24 hours against the previous 24-hour period. Data is updated approximately every 15 minutes.
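
To make the methodology concrete, here is a minimal Python sketch of how a ranking like this could be derived from raw probe results. It is an illustration under stated assumptions, not DownForAI's actual implementation: the `Probe` record, its field names, the nearest-rank percentile method, computing latency percentiles from successful probes only, and using the p50 as the trend statistic are all hypothetical choices. Only the 10-probe minimum and the 24-hour-versus-previous-24-hour comparison come from the text above.

```python
from dataclasses import dataclass

MIN_PROBES = 10  # from the methodology: fewer than 10 probe results means excluded


@dataclass
class Probe:
    """One scheduled check (hypothetical record layout)."""
    provider: str
    age_hours: float   # how long before "now" the probe ran
    latency_ms: int
    ok: bool           # whether the check succeeded


def percentile(values, q):
    """Nearest-rank percentile for integer q in 1..100 (assumed method)."""
    ordered = sorted(values)
    rank = max(1, (q * len(ordered) + 99) // 100)  # integer ceil(q * n / 100)
    return ordered[rank - 1]


def rank_providers(probes, window_hours=7 * 24):
    """Return rows sorted by p50, skipping providers with too few probes."""
    by_provider = {}
    for p in probes:
        if p.age_hours < window_hours:
            by_provider.setdefault(p.provider, []).append(p)

    rows = []
    for name, ps in by_provider.items():
        if len(ps) < MIN_PROBES:
            continue  # insufficient observations
        latencies = [p.latency_ms for p in ps if p.ok]
        if not latencies:
            continue  # no successful probe to measure
        rows.append({
            "provider": name,
            "p50": percentile(latencies, 50),
            "p95": percentile(latencies, 95),
            "uptime": 100.0 * sum(p.ok for p in ps) / len(ps),
        })
    return sorted(rows, key=lambda r: r["p50"])


def trend_24h(probes, provider):
    """p50 of the last 24 h minus p50 of the 24 h before; negative is faster."""
    mine = [p for p in probes if p.provider == provider and p.ok]
    recent = [p.latency_ms for p in mine if p.age_hours < 24]
    prior = [p.latency_ms for p in mine if 24 <= p.age_hours < 48]
    if not recent or not prior:
        return None  # not enough data to compare
    return percentile(recent, 50) - percentile(prior, 50)


# Synthetic data: two providers with 12 probes each, one with only 5.
probes = (
    [Probe("fast", i * 6, 100 + i * 10, True) for i in range(12)]
    + [Probe("slow", i * 6, 500, True) for i in range(12)]
    + [Probe("sparse", i * 6, 50, True) for i in range(5)]
)
ranking = rank_providers(probes)
print([row["provider"] for row in ranking])
print(trend_24h(probes, "fast"))
```

In this synthetic run, `sparse` has the lowest latency but is dropped by the minimum-observation rule, which is why such a filter matters for a ranking: a handful of lucky probes should not put a provider at the top.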