# Evaluation

Evaluate and benchmark your AI models and agents.
| Metric | Current Value | Change vs. last month |
|---|---|---|
| Accuracy | 94.8% | +2.4% |
| Response Time | 124 ms | -15 ms |
| Relevance Score | 8.7/10 | +0.5 |
| Error Rate | 0.8% | -0.3% |
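The summary metrics above (accuracy, average response time, relevance, error rate) can all be derived from a list of raw evaluation records. A minimal sketch, assuming a simple per-request record shape (the field names here are illustrative, not part of any dashboard API):

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class EvalRecord:
    correct: bool      # did the model produce the expected answer?
    latency_ms: float  # end-to-end response time in milliseconds
    relevance: float   # graded relevance on a 0-10 scale
    errored: bool      # did the request fail outright?


def summarize(records: list[EvalRecord]) -> dict[str, float]:
    """Aggregate raw evaluation records into dashboard-style metrics."""
    total = len(records)
    return {
        "accuracy_pct": 100.0 * sum(r.correct for r in records) / total,
        "avg_latency_ms": mean(r.latency_ms for r in records),
        "relevance": mean(r.relevance for r in records),
        "error_rate_pct": 100.0 * sum(r.errored for r in records) / total,
    }
```

Month-over-month deltas like "+2.4%" then fall out of running `summarize` over two time windows and subtracting.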
## Performance by Model

*(Performance chart placeholder)*
## Metrics Over Time

*(Trends chart placeholder)*
## Model Performance
| Model | Accuracy | Response Time | Relevance | Error Rate | RAGAS Score | Last Evaluated |
|---|---|---|---|---|---|---|
| GPT-4o | 96.2% | 145ms | 9.1/10 | 0.5% | 0.87 | 2 days ago |
| Claude 3 | 95.8% | 120ms | 8.9/10 | 0.7% | 0.85 | 3 days ago |
| Llama 3 | 93.5% | 95ms | 8.5/10 | 1.2% | 0.82 | 1 week ago |
| Mistral Large | 94.2% | 110ms | 8.7/10 | 0.9% | 0.83 | 5 days ago |
| Custom Fine-tuned Model | 97.1% | 130ms | 9.3/10 | 0.4% | 0.89 | 1 day ago |
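Picking a model from a comparison table like this means minimizing some columns (latency, error rate) while maximizing others (accuracy, relevance, RAGAS). A small sketch using the figures from the table above; the `best_by` helper is illustrative, not part of the dashboard:

```python
# Metrics transcribed from the comparison table above.
MODELS = {
    "GPT-4o":        {"accuracy": 96.2, "latency_ms": 145, "relevance": 9.1, "error_rate": 0.5, "ragas": 0.87},
    "Claude 3":      {"accuracy": 95.8, "latency_ms": 120, "relevance": 8.9, "error_rate": 0.7, "ragas": 0.85},
    "Llama 3":       {"accuracy": 93.5, "latency_ms": 95,  "relevance": 8.5, "error_rate": 1.2, "ragas": 0.82},
    "Mistral Large": {"accuracy": 94.2, "latency_ms": 110, "relevance": 8.7, "error_rate": 0.9, "ragas": 0.83},
    "Custom Fine-tuned Model":
                     {"accuracy": 97.1, "latency_ms": 130, "relevance": 9.3, "error_rate": 0.4, "ragas": 0.89},
}


def best_by(metric: str, lower_is_better: bool = False) -> str:
    """Return the model name that wins on a single metric."""
    pick = min if lower_is_better else max
    return pick(MODELS, key=lambda name: MODELS[name][metric])
```

So accuracy, relevance, and RAGAS all favor the fine-tuned model, while Llama 3 wins on raw latency; which trade-off matters depends on the workload.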