anthropic · claude-sonnet-4-5 · Apr 3, 07:58 PM
Asset: 1XIUcA9 · Score: 0.96 · Latency: 3.53s · Cost: $0.0103
Workflow Eval Detail
Chapters automatically segments long-form video content into navigable chapters with timestamps and titles, enabling viewers to jump to key moments instantly.
Chapters performs accurately and cheaply across providers, with Google gemini-3.1-flash-lite-preview as the best current all-round option, though per-model results are based on only 2 runs each.
Each eval run captures efficacy, efficiency, and expense. We use this data to compare providers and track regressions over time.
We evaluate chapter segmentation quality, timestamp accuracy, and title relevance alongside latency and cost metrics.
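As a rough illustration of how per-case results could roll up into per-model averages of this kind, the sketch below assumes a hypothetical `EvalRun` record and `aggregate` helper. The field names, the sample values, and the choice to compute cost per minute as each run's cost divided by its video duration are all assumptions, not the eval harness's actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class EvalRun:
    """One eval case result (hypothetical schema)."""
    provider: str
    model: str
    score: float          # efficacy, assumed to lie in 0..1
    latency_s: float      # efficiency
    tokens: int
    cost_usd: float       # expense
    video_minutes: float  # assumed: duration of the source video


def aggregate(runs):
    """Group runs by (provider, model) and average each metric."""
    groups = defaultdict(list)
    for run in runs:
        groups[(run.provider, run.model)].append(run)

    summary = {}
    for key, items in groups.items():
        summary[key] = {
            "cases": len(items),
            "avg_score": mean(r.score for r in items),
            "avg_latency_s": mean(r.latency_s for r in items),
            "avg_tokens": mean(r.tokens for r in items),
            "avg_cost_usd": mean(r.cost_usd for r in items),
            # Assumption: per-run cost/min, averaged across runs.
            "avg_cost_per_min": mean(r.cost_usd / r.video_minutes for r in items),
        }
    return summary


# Made-up values, for illustration only.
runs = [
    EvalRun("example", "model-a", 0.9, 2.0, 3000, 0.010, 10.0),
    EvalRun("example", "model-a", 1.0, 4.0, 3500, 0.020, 10.0),
]
summary = aggregate(runs)
```

Averaging per-run cost per minute is only one option; dividing total cost by total minutes would weight long videos more heavily and can give a different figure.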
| Provider | Model | Cases | Avg Score | Avg Latency | Avg Tokens | Avg Cost | Avg Cost / Min |
|---|---|---|---|---|---|---|---|
| anthropic | claude-sonnet-4-5 | 5 | 0.98 | 4.02s | 3,272 | $0.0113 | $0.0013/min |
| google | gemini-2.5-flash | 5 | 0.98 | 6.37s | 4,367 | $0.0041 | $0.0006/min |
| google | gemini-3-flash-preview | 5 | 0.98 | 6.06s | 4,165 | $0.0039 | $0.0004/min |
| google | gemini-3.1-flash-lite-preview | 5 | 0.99 | 1.75s | 3,282 | $0.001 | $0.0001/min |
| openai | gpt-5-mini | 4 | 0.95 | 14.71s | 3,728 | $0.0025 | $0.0003/min |
| openai | gpt-5.1 | 3 | 0.98 | 2.71s | 2,856 | $0.0017 | $0.0002/min |
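
One way to sanity-check the "best all-round" reading of the table is a Pareto-dominance comparison on average score, latency, and cost. The sketch below is a minimal illustration under that assumption; the `ModelStats` type and the `dominates` and `pareto_front` helpers are hypothetical names, and this is not necessarily how the summary was produced.

```python
from typing import NamedTuple


class ModelStats(NamedTuple):
    model: str
    avg_score: float      # higher is better
    avg_latency_s: float  # lower is better
    avg_cost_usd: float   # lower is better


# Averages copied from the table above.
STATS = [
    ModelStats("claude-sonnet-4-5", 0.98, 4.02, 0.0113),
    ModelStats("gemini-2.5-flash", 0.98, 6.37, 0.0041),
    ModelStats("gemini-3-flash-preview", 0.98, 6.06, 0.0039),
    ModelStats("gemini-3.1-flash-lite-preview", 0.99, 1.75, 0.001),
    ModelStats("gpt-5-mini", 0.95, 14.71, 0.0025),
    ModelStats("gpt-5.1", 0.98, 2.71, 0.0017),
]


def dominates(a: ModelStats, b: ModelStats) -> bool:
    """True if a is no worse than b on every metric and better on at least one."""
    no_worse = (
        a.avg_score >= b.avg_score
        and a.avg_latency_s <= b.avg_latency_s
        and a.avg_cost_usd <= b.avg_cost_usd
    )
    strictly_better = (
        a.avg_score > b.avg_score
        or a.avg_latency_s < b.avg_latency_s
        or a.avg_cost_usd < b.avg_cost_usd
    )
    return no_worse and strictly_better


def pareto_front(stats):
    """Return the models that no other model dominates."""
    return [m.model for m in stats if not any(dominates(o, m) for o in stats)]


front = pareto_front(STATS)
print(front)  # ['gemini-3.1-flash-lite-preview']
```

On these averages a single model dominates the rest, but case counts range from 3 to 5 per model, so small differences in score should be read cautiously.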