Real-time insights into your LLM A/B testing experiments
| Experiment Name | Status | Users | Primary Metric | Action |
|---|---|---|---|---|
| GPT-4 vs Claude-3 Code Generation | Completed | 4521 | response_quality_score | View |
| Llama-3 vs Mixtral Q&A | Completed | 7834 | converted | View |
| GPT-3.5 vs GPT-4 Cost-Quality Tradeoff | Running | 3245 | response_quality_score | View |
| Gemini-1.5 vs GPT-4 Summarization | Running | 2156 | converted | View |