@mux/ai Eval Results

Measuring efficacy, efficiency, and expense on every run.

Every workflow ships with evals that measure quality, speed, and cost. This dashboard surfaces the latest results so you can compare providers and trust the defaults we recommend.
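
Under the hood, each suite is an Evalite eval: a set of cases, a task under test, and scorers that grade every output from 0 to 1. Here is a minimal sketch of that shape, assuming Evalite's evalite() API and an autoevals scorer; the workflow name, data, and stand-in task are illustrative, not the actual @mux/ai suites:

```ts
import { evalite } from "evalite";
import { Levenshtein } from "autoevals";

// Hypothetical stand-in for the workflow under test.
async function summarize(input: string): Promise<string> {
  return "A short demo of the new editor.";
}

// Minimal Evalite suite shape; the real @mux/ai suites differ in data and scorers.
evalite("Summarization", {
  // Each case pairs an input with an expected output.
  data: async () => [
    {
      input: "transcript of a two-minute product demo...",
      expected: "A short demo of the new editor.",
    },
  ],
  // The task under test: normally the workflow call itself.
  task: async (input) => summarize(input),
  // Scorers grade each case from 0 to 1; Levenshtein is one autoevals built-in.
  scorers: [Levenshtein],
});
```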

Latest Run · completed
muxinc/ai · main · d5b5d84 · @mux/ai v0.7.4
Cases: 50 · Providers: 3 · Started: Feb 18, 09:10 PM · Completed: Feb 18, 09:12 PM

Workflow Scorecards

Run d5b5d84

Ask Questions

Evalite suite coverage

Cases: 5 · Avg Score: 0.98 · Avg Latency: 7.12s · Avg Cost: $0.0039

Answers natural-language questions about a video by retrieving relevant context and answering with a concise response.
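
As a rough sketch of this workflow's shape (askQuestion() and its signature are hypothetical, not the published @mux/ai API):

```ts
// Hypothetical sketch; askQuestion() is illustrative, not the published @mux/ai surface.
async function askQuestion(assetId: string, question: string): Promise<string> {
  // 1. Retrieve context relevant to the question (transcript chunks, metadata).
  // 2. Prompt a model with the question plus that context.
  // 3. Return a concise answer.
  return `For ${assetId}: the launch is announced at 02:14.`;
}

const answer = await askQuestion("asset_123", "When is the launch announced?");
console.log(answer);
```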

Burned-in Captions

Evalite suite coverage

Cases: 15 · Avg Score: 0.96 · Avg Latency: 6.15s · Avg Cost: $0.003

Analyzes video frames to detect hardcoded captions baked into the visual content—useful for compliance checks and accessibility audits.
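
A plausible result shape for this kind of detection; the fields below are illustrative, not the documented @mux/ai output:

```ts
// Hypothetical result shape for burned-in caption detection.
interface BurnedInCaptionsResult {
  detected: boolean;           // were hardcoded captions found in the sampled frames?
  language?: string;           // best-guess language of the on-screen text
  sampleTimestamps: number[];  // seconds into the video where captions were seen
}

const example: BurnedInCaptionsResult = {
  detected: true,
  language: "en",
  sampleTimestamps: [12.0, 45.5, 88.2],
};
```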

Caption Translation

Evalite suite coverage

Cases: 15 · Avg Score: 0.94 · Avg Latency: 17.02s · Avg Cost: $0.0077

Converts captions into multiple languages, helping you reach global audiences without manual translation work.
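
A sketch of the data flow, with translateCaptions() as a hypothetical stand-in for the real workflow:

```ts
// Hypothetical sketch; translateCaptions() is illustrative, not the published API.
const englishVtt = "WEBVTT\n\n00:00.000 --> 00:02.500\nWelcome to the demo.";

async function translateCaptions(
  vtt: string,
  targetLanguages: string[],
): Promise<Record<string, string>> {
  const out: Record<string, string> = {};
  for (const lang of targetLanguages) {
    // The real workflow would translate each cue's text while preserving timings.
    out[lang] = vtt; // placeholder: cue text left untranslated in this sketch
  }
  return out;
}

const translated = await translateCaptions(englishVtt, ["es", "de", "ja"]);
```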

Chapters

Evalite suite coverage

Cases: 10 · Avg Score: 0.96 · Avg Latency: 8.32s · Avg Cost: $0.0047

Automatically segments long-form video content into navigable chapters with timestamps and titles—enabling viewers to jump to key moments instantly.
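
Illustratively, the output is a list of timestamped titles; the field names below are assumptions, not the documented schema:

```ts
// Hypothetical chapter shape.
interface Chapter {
  startTime: number; // seconds from the start of the video
  title: string;     // short, viewer-facing chapter title
}

const chapters: Chapter[] = [
  { startTime: 0, title: "Introduction" },
  { startTime: 93, title: "Feature walkthrough" },
  { startTime: 412, title: "Q&A" },
];
```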

Summarization

Evalite suite coverage

Cases: 5 · Avg Score: 0.94 · Avg Latency: 8.81s · Avg Cost: $0.004

Generates concise summaries and smart tags from your content—perfect for search, discovery, and quick recaps.
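
A plausible output shape, again with illustrative field names rather than the documented schema:

```ts
// Hypothetical output shape for summarization.
interface SummaryResult {
  summary: string; // concise recap of the content
  tags: string[];  // smart tags for search and discovery
}

const result: SummaryResult = {
  summary: "A two-minute walkthrough of the new editor, closing with a Q3 launch date.",
  tags: ["product demo", "editor", "launch"],
};
```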

Recent Runs

muxinc/ai · main · d5b5d84 · Feb 18, 09:12 PM · @mux/ai v0.7.4 · completed
muxinc/ai · main · 11c4311 · Feb 18, 08:55 PM · @mux/ai v0.7.3 · completed
muxinc/ai · vb/add-smaller-models-to-published-evals · 22e484b · Feb 18, 08:55 PM · @mux/ai v0.7.3 · completed
muxinc/ai · main · 9298365 · Feb 18, 01:10 PM · @mux/ai v0.7.3 · completed
muxinc/ai · main · c942a12 · Feb 17, 08:37 PM · @mux/ai v0.7.3 · completed

How to read this

Each workflow card summarizes the latest eval suite for that workflow. Metrics are aggregated for the most recent run only.

  1. Case count shows total provider/model executions.
  2. Avg score is the mean of Evalite case scores.
  3. Latency and cost are averaged across the run; see the sketch below.
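
A minimal sketch of that aggregation, assuming each case in a run reports a score, a latency, and a cost (the CaseResult fields are illustrative):

```ts
// One row per provider/model execution in the run.
interface CaseResult {
  score: number;     // Evalite case score, 0 to 1
  latencyMs: number; // wall-clock latency for the case
  costUsd: number;   // provider cost for the case
}

function aggregate(cases: CaseResult[]) {
  const mean = (xs: number[]) => xs.reduce((a, b) => a + b, 0) / xs.length;
  return {
    cases: cases.length,                               // case count
    avgScore: mean(cases.map((c) => c.score)),         // mean of Evalite scores
    avgLatencyMs: mean(cases.map((c) => c.latencyMs)), // averaged across the run
    avgCostUsd: mean(cases.map((c) => c.costUsd)),
  };
}
```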