Model Comparison

Compare performance metrics across all tested LLM models
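Each model below is described by the same seven metrics. As a minimal sketch, one record could be represented in Python as follows; the class and field names are illustrative, not part of the dashboard:

```python
from dataclasses import dataclass

@dataclass
class ModelMetrics:
    """One row of the comparison table; names and types are assumptions."""
    model: str                  # e.g. "GPT-4-Turbo"
    provider: str               # e.g. "OpenAI"
    quality_score: float        # mean evaluator rating, as reported below
    conversion_rate_pct: float  # conversion rate in percent
    avg_latency_ms: float       # mean response latency
    p95_latency_ms: float       # 95th-percentile response latency
    error_rate_pct: float       # failed-call rate in percent
    cost_per_1k_tokens_usd: float
    total_interactions: int
```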

Charts (interactive views, not reproduced here): Quality vs Conversion, Latency Comparison, Cost-Quality Analysis.
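The Quality vs Conversion view, for example, can be rebuilt from the figures in the Detailed Comparison table below. This is a sketch of one way to do it with matplotlib, not the dashboard's actual plotting code:

```python
import matplotlib.pyplot as plt

# Figures taken from the Detailed Comparison table below.
models = ["GPT-4-Turbo", "GPT-3.5-Turbo", "Claude-3-Opus", "Claude-3-Sonnet",
          "Llama-3-70B", "Mixtral-8x7B", "Gemini-1.5-Pro"]
quality = [7.82, 6.92, 8.15, 7.54, 7.22, 7.05, 7.68]
conversion_pct = [0.72, 0.65, 0.75, 0.70, 0.68, 0.65, 0.71]

fig, ax = plt.subplots()
ax.scatter(quality, conversion_pct)
for name, x, y in zip(models, quality, conversion_pct):
    # Label each point with its model name, offset slightly for readability.
    ax.annotate(name, (x, y), textcoords="offset points", xytext=(5, 5))
ax.set_xlabel("Quality Score")
ax.set_ylabel("Conversion Rate (%)")
ax.set_title("Quality vs Conversion")
plt.show()
```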
Detailed Comparison
| Model | Provider | Quality Score | Conversion Rate | Avg Latency | P95 Latency | Error Rate | Cost / 1k Tokens | Total Interactions |
|---|---|---|---|---|---|---|---|---|
| GPT-4-Turbo | OpenAI | 7.82 | 0.72% | 1,245 ms | 2,890 ms | 0.012% | $0.0100 | 125,000 |
| GPT-3.5-Turbo | OpenAI | 6.92 | 0.65% | 458 ms | 890 ms | 0.008% | $0.0005 | 98,000 |
| Claude-3-Opus | Anthropic | 8.15 | 0.75% | 1,420 ms | 3,200 ms | 0.015% | $0.0150 | 87,000 |
| Claude-3-Sonnet | Anthropic | 7.54 | 0.70% | 820 ms | 1,650 ms | 0.011% | $0.0030 | 72,000 |
| Llama-3-70B | Meta | 7.22 | 0.68% | 920 ms | 1,850 ms | 0.018% | $0.0009 | 65,000 |
| Mixtral-8x7B | Mistral AI | 7.05 | 0.65% | 680 ms | 1,200 ms | 0.014% | $0.0002 | 54,000 |
| Gemini-1.5-Pro | Google | 7.68 | 0.71% | 750 ms | 1,480 ms | 0.010% | $0.0070 | 48,000 |
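As one way to read the Cost-Quality Analysis: dividing quality score by cost per 1k tokens gives a rough quality-per-dollar ranking. The ratio itself is an assumption made for illustration, but the inputs are taken directly from the table above:

```python
# (model, quality score, cost per 1k tokens in USD) from the table above.
rows = [
    ("GPT-4-Turbo", 7.82, 0.0100),
    ("GPT-3.5-Turbo", 6.92, 0.0005),
    ("Claude-3-Opus", 8.15, 0.0150),
    ("Claude-3-Sonnet", 7.54, 0.0030),
    ("Llama-3-70B", 7.22, 0.0009),
    ("Mixtral-8x7B", 7.05, 0.0002),
    ("Gemini-1.5-Pro", 7.68, 0.0070),
]

# Hypothetical efficiency metric: quality points per dollar per 1k tokens.
for model, quality, cost in sorted(rows, key=lambda r: r[1] / r[2], reverse=True):
    print(f"{model:16s} quality/cost = {quality / cost:>10.1f}")
```

On this ratio the cheap models dominate (Mixtral-8x7B, then GPT-3.5-Turbo and Llama-3-70B), while the premium models trade cost for absolute quality: Claude-3-Opus has both the highest quality score and the lowest quality-per-dollar.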