@mux/ai Eval Results

Measuring efficacy, efficiency, and expense on every run.

Every workflow ships with evals that measure quality, speed, and cost. This dashboard surfaces the latest results so you can compare providers and trust the defaults we recommend.

Latest Run: completed
muxinc/ai
main · c15880c · @mux/ai v0.22.0
Cases: 108
Providers: 3
Started: May 18, 05:58 PM
Completed: May 18, 06:01 PM

Workflow Scorecards

Run c15880c

Ask Questions

Evalite suite coverage

Cases: 7
Avg Score: 0.95
Avg Latency: 5.36s
Avg Cost: $0.0036

Answers natural-language questions about a video by retrieving relevant context and answering with a concise response.

Burned-in Captions

Evalite suite coverage

Cases: 21
Avg Score: 0.98
Avg Latency: 5.13s
Avg Cost: $0.0028

Analyzes video frames to detect hardcoded captions baked into the visual content—useful for compliance checks and accessibility audits.

Caption Translation

Evalite suite coverage

Cases: 21
Avg Score: 0.96
Avg Latency: 9.11s
Avg Cost: $0.0059

Converts captions into multiple languages, helping you reach global audiences without manual translation work.

Chapters

Evalite suite coverage

Cases: 32
Avg Score: 0.97
Avg Latency: 6.72s
Avg Cost: $0.0048

Automatically segments long-form video content into navigable chapters with timestamps and titles—enabling viewers to jump to key moments instantly.

Summarization

Evalite suite coverage

Cases: 27
Avg Score: 0.97
Avg Latency: 6.89s
Avg Cost: $0.0042

Generates concise summaries and smart tags from your content—perfect for search, discovery, and quick recaps.

Recent Runs

muxinc/ai · main · c15880c · May 18, 06:01 PM · @mux/ai v0.22.0 · completed
muxinc/ai · main · 716fc94 · May 13, 01:52 PM · @mux/ai v0.21.2 · completed
muxinc/ai · main · 8d20931 · May 13, 01:43 PM · @mux/ai v0.21.1 · completed
muxinc/ai · main · e737ed7 · Apr 30, 07:50 PM · @mux/ai v0.21.1 · completed
muxinc/ai · main · 06ffdfb · Apr 30, 07:13 PM · @mux/ai v0.21.0 · completed

How to read this

Each workflow card summarizes the latest eval suite for that workflow. Metrics are aggregated for the most recent run only.

  1. Case count shows total provider/model executions.
  2. Avg score is the mean of Evalite case scores.
  3. Latency + cost are averaged across the run.
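
The aggregation described above can be sketched in a few lines of TypeScript. This is only an illustration of the arithmetic, not the dashboard's actual code: the `CaseResult` and `Scorecard` shapes, the field names, and the `buildScorecards` helper are all hypothetical, and the sample values are made up. It assumes one record per provider/model execution and that each card averages over that workflow's records within a single run.

```typescript
// Hypothetical shape of one provider/model execution (one "case").
// Field names are assumptions, not the @mux/ai or Evalite schema.
interface CaseResult {
  workflow: string;
  provider: string;
  score: number;     // Evalite case score, assumed to lie in [0, 1]
  latencyMs: number; // wall-clock time for this execution
  costUsd: number;   // cost attributed to this execution
}

// Hypothetical shape of one workflow card.
interface Scorecard {
  cases: number;
  avgScore: number;
  avgLatencyMs: number;
  avgCostUsd: number;
}

// Arithmetic mean; returns 0 for an empty list to avoid dividing by zero.
function mean(values: number[]): number {
  if (values.length === 0) return 0;
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Group the results of a single run by workflow, then aggregate each group.
function buildScorecards(results: CaseResult[]): Map<string, Scorecard> {
  const byWorkflow = new Map<string, CaseResult[]>();
  for (const r of results) {
    const bucket = byWorkflow.get(r.workflow) ?? [];
    bucket.push(r);
    byWorkflow.set(r.workflow, bucket);
  }

  const cards = new Map<string, Scorecard>();
  byWorkflow.forEach((rows, workflow) => {
    cards.set(workflow, {
      cases: rows.length, // every provider/model execution counts once
      avgScore: mean(rows.map((r) => r.score)),
      avgLatencyMs: mean(rows.map((r) => r.latencyMs)),
      avgCostUsd: mean(rows.map((r) => r.costUsd)),
    });
  });
  return cards;
}

// Illustrative data only; these are not values from the dashboard.
const sampleRun: CaseResult[] = [
  { workflow: "chapters", provider: "provider-a", score: 1.0, latencyMs: 4000, costUsd: 0.002 },
  { workflow: "chapters", provider: "provider-b", score: 0.5, latencyMs: 6000, costUsd: 0.004 },
  { workflow: "summarization", provider: "provider-a", score: 0.9, latencyMs: 7000, costUsd: 0.005 },
];

const sampleCards = buildScorecards(sampleRun);
console.log(sampleCards.get("chapters")?.cases);    // 2
console.log(sampleCards.get("chapters")?.avgScore); // 0.75
```

Under these assumptions, the per-card case counts add up to the run total, which matches the figures shown here: 7 + 21 + 21 + 32 + 27 = 108.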