@mux/ai Eval Results

Measuring efficacy, efficiency, and expense on every run.

Every workflow ships with evals that measure quality, speed, and cost. This dashboard surfaces the latest results so you can compare providers and trust the defaults we recommend.
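
Under the hood, each suite runs on Evalite (the same tool behind the "Evalite suite coverage" labels below). As a rough sketch of what one of these suites might look like, where the test case, task body, and answerQuestion helper are illustrative assumptions rather than the actual @mux/ai eval code:

```ts
import { evalite } from "evalite";
import { Levenshtein } from "autoevals";

// Hypothetical stand-in for the workflow under test.
declare function answerQuestion(question: string): Promise<string>;

evalite("Ask Questions", {
  // Each entry becomes one eval case.
  data: async () => [
    { input: "What is this video about?", expected: "A short product walkthrough." },
  ],
  // The task runs the workflow against a case's input.
  task: async (input) => answerQuestion(input),
  // Scorers grade output against `expected` on a 0-1 scale.
  scorers: [Levenshtein],
});
```

Each case execution yields the score, latency, and cost figures that the cards below average.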

Latest Run (completed)
muxinc/ai · main · b7cce22 · @mux/ai v0.13.1
Cases: 92 · Providers: 3 · Started: Apr 3, 07:56 PM · Completed: Apr 3, 07:58 PM

Workflow Scorecards

Run b7cce22

Ask Questions (Evalite suite coverage)
Cases: 6 · Avg Score: 0.98 · Avg Latency: 5.86s · Avg Cost: $0.0036

Answers natural-language questions about a video by retrieving relevant context and returning a concise response.
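
This page doesn't show the workflow's API, but purely as an illustration, invoking it might look something like the following; the askQuestions export, option names, and provider value are all hypothetical:

```ts
// Hypothetical sketch only; not the actual @mux/ai API surface.
import { askQuestions } from "@mux/ai"; // hypothetical export name

const answer = await askQuestions({
  assetId: "your-mux-asset-id", // hypothetical option: the video to query
  question: "Who is interviewed in this clip?",
  provider: "openai",           // hypothetical option: one of the evaluated providers
});

console.log(answer); // a concise natural-language response
```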

Burned-in Captions (Evalite suite coverage)
Cases: 18 · Avg Score: 0.98 · Avg Latency: 5.8s · Avg Cost: $0.0029

Analyzes video frames to detect hardcoded captions baked into the visual content—useful for compliance checks and accessibility audits.
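
One concrete piece of a pipeline like this is frame sampling: Mux serves timestamped thumbnails from image.mux.com, which is one plausible way to gather frames for a vision model. The `thumbnail.jpg?time=` URL format is real Mux functionality; the vision-model step after building the URLs is an assumption:

```ts
// Sample frames from a Mux asset at fixed intervals for vision analysis.
const playbackId = "your-playback-id"; // placeholder: a real Mux playback ID
const sampleTimes = [0, 10, 20, 30];   // seconds into the video

// Mux's image API serves a frame at a given timestamp.
const frameUrls = sampleTimes.map(
  (t) => `https://image.mux.com/${playbackId}/thumbnail.jpg?time=${t}`,
);

// Each frame would then go to a vision model that looks for text rendered
// into the pixels themselves, i.e. burned-in captions.
```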

Caption Translation (Evalite suite coverage)
Cases: 18 · Avg Score: 0.96 · Avg Latency: 10.61s · Avg Cost: $0.0058

Converts captions into multiple languages, helping you reach global audiences without manual translation work.

Chapters (Evalite suite coverage)
Cases: 27 · Avg Score: 0.98 · Avg Latency: 5.85s · Avg Cost: $0.0043

Automatically segments long-form video content into navigable chapters with timestamps and titles—enabling viewers to jump to key moments instantly.
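
To make "navigable chapters with timestamps and titles" concrete, here is one plausible result shape and a conversion to a WebVTT chapters track, which most players can render as chapter markers. The Chapter interface is an assumption, not the actual @mux/ai return type:

```ts
// Assumed shape for a chapters result; the real @mux/ai types may differ.
interface Chapter {
  startTime: number; // seconds from the start of the video
  title: string;     // short human-readable chapter title
}

// Serialize chapters into a WebVTT chapters track.
function toVtt(chapters: Chapter[], durationSec: number): string {
  // Format seconds as an HH:MM:SS.mmm WebVTT timestamp (videos under 24h).
  const ts = (s: number) => new Date(s * 1000).toISOString().slice(11, 23);
  return [
    "WEBVTT",
    ...chapters.map((c, i) => {
      // Each chapter runs until the next one starts, or the end of the video.
      const end = chapters[i + 1]?.startTime ?? durationSec;
      return `\n${ts(c.startTime)} --> ${ts(end)}\n${c.title}`;
    }),
  ].join("\n");
}
```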

Summarization (Evalite suite coverage)
Cases: 23 · Avg Score: 0.98 · Avg Latency: 7.59s · Avg Cost: $0.0035

Generates concise summaries and smart tags from your content—perfect for search, discovery, and quick recaps.

Recent Runs

muxinc/ai · main · b7cce22 · Apr 3, 07:58 PM · @mux/ai v0.13.1 · completed
muxinc/ai · main · 699faf5 · Apr 3, 05:33 PM · @mux/ai v0.13.0 · completed
muxinc/ai · main · 4b0562a · Apr 3, 05:31 PM · @mux/ai v0.12.1 · completed
muxinc/ai · main · b4c8d4f · Mar 31, 10:08 PM · @mux/ai v0.12.1 · completed
muxinc/ai · main · 2fa9de6 · Mar 27, 09:03 PM · @mux/ai v0.12.0 · completed

How to read this

Each workflow card summarizes the latest eval suite for that workflow. Metrics are aggregated for the most recent run only.

  • Case count shows total provider/model executions.
  • Avg score is the mean of Evalite case scores.
  • Latency and cost are averaged across the run (a sketch of this aggregation follows below).
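
As referenced in the last bullet, here is a minimal sketch of that aggregation, assuming a flat list of per-case results; the CaseResult shape is an assumption, not @mux/ai's actual schema:

```ts
// Assumed per-case record; one entry per provider/model execution.
interface CaseResult {
  provider: string; // e.g. "openai"
  score: number;    // Evalite case score in [0, 1]
  latencyMs: number;
  costUsd: number;
}

// Reproduce the card metrics: case count plus run-wide averages.
function summarize(results: CaseResult[]) {
  const mean = (xs: number[]) => xs.reduce((a, b) => a + b, 0) / xs.length;
  return {
    cases: results.length,
    avgScore: mean(results.map((r) => r.score)),
    avgLatencySec: mean(results.map((r) => r.latencyMs)) / 1000,
    avgCostUsd: mean(results.map((r) => r.costUsd)),
  };
}
```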