Run: anthropic · claude-sonnet-4-5 · Apr 3, 07:58 PM · Asset 88Lb01q · Score 1 · Latency 5.26s · Cost $0.0109
Workflow Eval Detail
Generates concise summaries and smart tags from your content—perfect for search, discovery, and quick recaps.
Summarization quality is high across providers: `claude-sonnet-4-5` is best on quality and `gemini-3.1-flash-lite-preview` is best on latency and cost. Results are only directional given the 6-case sample size.
Each eval run captures efficacy, efficiency, and expense. We use this data to compare providers and track regressions over time.
We score summary quality, tag relevance, and semantic similarity while tracking latency, token usage, and cost.
| Provider | Model | Cases | Avg Score | Avg Latency | Avg Tokens | Avg Cost | Avg Cost / Min |
|---|---|---|---|---|---|---|---|
| anthropic | claude-sonnet-4-5 | 4 | 0.99 | 5.79s | 3,108 | $0.011 | $0.0189/min |
| google | gemini-2.5-flash | 4 | 0.98 | 7.23s | 2,386 | $0.0029 | $0.0047/min |
| google | gemini-3-flash-preview | 4 | 0.99 | 6.49s | 2,889 | $0.0026 | $0.0054/min |
| google | gemini-3.1-flash-lite-preview | 4 | 0.99 | 3.57s | 2,319 | $0.0007 | $0.0013/min |
| openai | gpt-5-mini | 4 | 0.93 | 17.27s | 3,633 | $0.002 | $0.0043/min |
| openai | gpt-5.1 | 3 | 0.99 | 4.36s | 1,870 | $0.0016 | $0.0026/min |
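
As a rough illustration of how per-run records could be rolled up into per-model averages like those in the table, here is a minimal Python sketch. The `EvalRun` record, its field names, and the `summarize` helper are hypothetical and not taken from the eval tooling itself. The "Avg Cost / Min" column is left out because its definition is not given here.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class EvalRun:
    """One eval case for one model (hypothetical record shape)."""
    provider: str
    model: str
    score: float      # quality score between 0 and 1
    latency_s: float  # wall-clock latency in seconds
    tokens: int       # total tokens used by the run
    cost_usd: float   # cost of the run in US dollars


def summarize(runs):
    """Group runs by (provider, model) and average each metric."""
    groups = defaultdict(list)
    for run in runs:
        groups[(run.provider, run.model)].append(run)

    rows = []
    for (provider, model), items in groups.items():
        rows.append({
            "provider": provider,
            "model": model,
            "cases": len(items),
            "avg_score": mean(r.score for r in items),
            "avg_latency_s": mean(r.latency_s for r in items),
            "avg_tokens": mean(r.tokens for r in items),
            "avg_cost_usd": mean(r.cost_usd for r in items),
        })

    # Highest average score first; ties broken by lower latency.
    return sorted(rows, key=lambda row: (-row["avg_score"], row["avg_latency_s"]))


if __name__ == "__main__":
    # Made-up values for illustration only, not real eval results.
    sample = [
        EvalRun("provider-a", "model-x", 1.0, 2.0, 100, 0.01),
        EvalRun("provider-a", "model-x", 0.5, 4.0, 300, 0.03),
        EvalRun("provider-b", "model-y", 0.9, 1.0, 50, 0.001),
    ]
    for row in summarize(sample):
        print(row)
```

With only a handful of cases per model, a single outlier run shifts these averages noticeably, which is why small-sample comparisons are best read as directional.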