@mux/ai Eval Results

Measuring efficacy, efficiency, and expense on every run.

Every workflow ships with evals that measure quality, speed, and cost. This dashboard surfaces the latest results so you can compare providers and trust the defaults we recommend.

Latest Run · completed

muxinc/ai

main·91c5674·@mux/ai v0.26.0

Cases

Providers

Started

Jul 6, 07:16 PM

Completed

Jul 6, 07:18 PM

Workflow Scorecards

Run 91c5674

Ask Questions

Evalite suite coverage


Cases

Avg Score

0.95

Avg Latency

5.89s

Avg Cost

$0.004

Answers natural-language questions about a video by retrieving relevant context and answering with a concise response.

Burned-in Captions

Evalite suite coverage


Cases

Avg Score

0.98

Avg Latency

5.56s

Avg Cost

$0.0036

Analyzes video frames to detect hardcoded captions baked into the visual content—useful for compliance checks and accessibility audits.

Caption Translation

Evalite suite coverage


Cases

Avg Score

0.95

Avg Latency

12.87s

Avg Cost

$0.0075

Converts captions into multiple languages, helping you reach global audiences without manual translation work.

Chapters

Evalite suite coverage


Cases

Avg Score

0.98

Avg Latency

5.36s

Avg Cost

$0.005

Automatically segments long-form video content into navigable chapters with timestamps and titles—enabling viewers to jump to key moments instantly.

Summarization

Evalite suite coverage


Cases

Avg Score

0.98

Avg Latency

6.61s

Avg Cost

$0.0041

Generates concise summaries and smart tags from your content—perfect for search, discovery, and quick recaps.

Recent Runs

muxinc/ai ·main

91c5674·Jul 6, 07:18 PM·@mux/ai v0.26.0

completed

muxinc/ai ·main

67d79cb·Jul 1, 09:07 AM·@mux/ai v0.26.0

completed

muxinc/ai ·main

72d06c4·Jun 30, 05:55 PM·@mux/ai v0.25.0

completed

muxinc/ai ·main

1808ff2·Jun 23, 10:19 PM·@mux/ai v0.25.0

completed

muxinc/ai ·main

595f998·Jun 23, 03:44 PM·@mux/ai v0.24.0

completed

How to read this

Each workflow card summarizes the latest eval suite for that workflow. Metrics are aggregated for the most recent run only.

1. Case count shows total provider/model executions.
2. Avg score is the mean of Evalite case scores.
3. Latency + cost are averaged across the run.
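
The three rules above can be sketched as a small aggregation function. This is an illustrative sketch only: the `CaseResult` and `Scorecard` shapes, their field names, and the `summarize` helper are hypothetical and are not taken from the @mux/ai or Evalite APIs. It assumes each provider/model execution yields one record with a score, a latency, and a cost.

```typescript
// Hypothetical record for one provider/model execution in a run.
// Field names are assumptions for illustration, not a real schema.
interface CaseResult {
  provider: string;
  model: string;
  score: number;      // case score, assumed to lie in [0, 1]
  latencyMs: number;  // wall-clock time for this execution
  costUsd: number;    // estimated spend for this execution
}

// Hypothetical summary shown on a workflow card.
interface Scorecard {
  cases: number;
  avgScore: number;
  avgLatencyMs: number;
  avgCostUsd: number;
}

// Arithmetic mean; returns 0 for an empty list to avoid dividing by zero.
function mean(values: number[]): number {
  if (values.length === 0) return 0;
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Aggregate the results of a single run (the most recent one) into a card.
function summarize(results: CaseResult[]): Scorecard {
  return {
    cases: results.length,                          // rule 1: total executions
    avgScore: mean(results.map((r) => r.score)),    // rule 2: mean of case scores
    avgLatencyMs: mean(results.map((r) => r.latencyMs)), // rule 3: averaged latency
    avgCostUsd: mean(results.map((r) => r.costUsd)),     // rule 3: averaged cost
  };
}
```

Because every execution counts as one case, a workflow evaluated against more providers contributes more records to each average; a per-provider breakdown would need a grouping step before calling `summarize`.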