Case study / Descript

Building the data backend behind Descript’s ASR accuracy benchmark.

Descript needed a practical way to manage automatic speech recognition (ASR) benchmark data, review Word Error Rate (WER) results and support a public comparison of automatic transcription services. I delivered an Airtable-backed workflow and a custom WER reader application to help the team organize, inspect and publish the evaluation data.


Input: Audio samples and transcripts

Benchmark audio, reference transcripts and ASR-generated transcripts organized for repeatable comparison.

Analysis: WER reader application

A custom workflow for reading Word Error Rate outputs and comparing transcription accuracy across providers.

Output: Publishable benchmark data

Structured results prepared for review, filtering and publication in Descript’s public ASR accuracy article.

System-level workflow representation showing how benchmark evidence, WER analysis and publishable ASR results connect.

Overview

From raw transcription tests to a structured benchmark workflow.

The project supported Descript’s comparison of automatic speech-recognition providers by organizing test data, reference transcripts, ASR outputs and WER results into a workflow the team could inspect and publish with confidence.

Context

Descript was evaluating automatic transcription providers to understand which ASR engine offered the strongest combination of speed, accuracy and affordability for its customers.

Challenge

The benchmark required many audio samples, professional reference transcripts, ASR-generated transcripts and WER outputs to be organized in a way that made the results clear, comparable and publishable.

Solution

I built the Airtable backend structure and a custom WER reader application so Descript could review transcription accuracy data and support the final benchmark publication.

Outcome snapshot

A data workflow that supported a published ASR accuracy benchmark.

The public article presented a transparent comparison of transcription services using Word Error Rate, grounded in representative audio samples and professionally verified reference transcripts.

Audio clips evaluated

The benchmark used about 50 clips, each around 3–5 minutes, covering broadcast, calls, meetings and other customer-like audio.

ASR engines compared

The published results compared providers including Google Speech, Temi, Amazon, Speechmatics, Trint, Microsoft and IBM Watson.

16%

Best average WER reported

Google Speech Video achieved the strongest reported average Word Error Rate in Descript’s 2018 benchmark.

Solution design

A backend and analysis workflow designed around benchmark clarity.

The solution focused on making the evaluation data easier to store, inspect and compare, so Descript could move from raw ASR outputs to a credible public accuracy report.

Airtable backend

Structured backend records for audio samples, reference transcripts, ASR providers, test outputs and benchmark metadata.

WER reader app

A custom application layer for reading Word Error Rate outputs and helping the team inspect accuracy results.
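For readers unfamiliar with the metric, Word Error Rate is conventionally the number of word substitutions, deletions and insertions needed to turn the ASR output into the reference transcript, divided by the number of words in the reference. The sketch below shows that calculation in Python. It is an illustration only, not the code of the WER reader app, and the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative sketch only; not the implementation used in the project.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # previous[j] is the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1      # reference word missing from output
            insertion = current[j - 1] + 1  # extra word in the output
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("a b c d", "a x c d"))
```

Because insertions count as errors, the value can exceed 1.0 when the ASR output is much longer than the reference.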

Provider comparison flow

Data structures that supported side-by-side comparison across multiple automatic transcription services.
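A side-by-side comparison of this kind usually comes down to grouping per-clip WER values by provider and averaging them. The following is a minimal sketch under that assumption. The tuple layout, the function name `average_wer_by_provider` and the sample values are hypothetical and are not taken from the benchmark data.

```python
from collections import defaultdict
from statistics import mean


def average_wer_by_provider(results):
    """Return (provider, average WER) pairs, lowest average first.

    `results` is an iterable of (provider, clip_id, wer) tuples.
    This layout is an assumption made for illustration.
    """
    by_provider = defaultdict(list)
    for provider, _clip_id, wer in results:
        by_provider[provider].append(wer)
    averages = {name: mean(values) for name, values in by_provider.items()}
    return sorted(averages.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Made-up values, not benchmark results.
    sample_results = [
        ("provider_a", "clip_01", 0.125),
        ("provider_a", "clip_02", 0.375),
        ("provider_b", "clip_01", 0.5),
        ("provider_b", "clip_02", 0.25),
    ]
    for provider, avg in average_wer_by_provider(sample_results):
        print(f"{provider}: {avg:.3f}")
```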

Publication-ready evidence

Organized benchmark data that could support public reporting, filtering and explanation of the ASR accuracy results.

Benchmark flow

How ASR benchmark data moves from sample to result.

Audio samples, provider transcripts, and normalized Word Error Rate data are organized into a clear benchmark flow for comparing ASR performance with less manual review.
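One common reading of "normalized" in a WER context is that reference and provider transcripts are put into a comparable form before scoring, so that casing and punctuation differences are not counted as word errors. The sketch below assumes that reading. The specific rules (lowercasing, stripping ASCII punctuation except apostrophes, collapsing whitespace) and the name `normalize_transcript` are illustrative assumptions, not the rules used in the benchmark.

```python
import re
import string

# Strip ASCII punctuation but keep apostrophes so contractions stay one word.
# These rules are assumptions for illustration only.
_PUNCTUATION_TABLE = str.maketrans("", "", string.punctuation.replace("'", ""))


def normalize_transcript(text: str) -> str:
    """Lowercase, drop punctuation and collapse runs of whitespace."""
    lowered = text.lower()
    without_punctuation = lowered.translate(_PUNCTUATION_TABLE)
    return re.sub(r"\s+", " ", without_punctuation).strip()


if __name__ == "__main__":
    print(normalize_transcript("Hello,  World! It's  fine."))
```

Applying the same normalization to both the reference and the provider transcript keeps the comparison consistent across providers.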

Benchmark pipeline

Audio input, provider output and WER comparison.

Result analysis

Clear movement from ASR output to comparable accuracy results.

WER outputs are framed around audio samples, reference transcripts and provider records so results can be reviewed consistently.

Evidence structure

Reference transcripts, test transcripts, provider metadata and WER values in one benchmark model.

This structure supports a credible account of the benchmark without relying on mock screenshots or unverified claims about a production interface.
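The relationships in such a benchmark model can be sketched as two linked record types: one per audio sample and one per provider output for that sample. This is a hypothetical illustration in Python, not the actual Airtable schema. All class names, field names and sample values below are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioSample:
    """One benchmark clip and its reference transcript (hypothetical fields)."""
    sample_id: str
    category: str  # e.g. "broadcast", "call", "meeting"
    reference_transcript: str


@dataclass(frozen=True)
class ProviderOutput:
    """One provider's transcript and WER for one clip (hypothetical fields)."""
    sample_id: str  # links back to AudioSample.sample_id
    provider: str
    transcript: str
    wer: float


def outputs_for_sample(sample, outputs):
    """Collect every provider output linked to the given sample."""
    return [o for o in outputs if o.sample_id == sample.sample_id]


if __name__ == "__main__":
    # Made-up records, not benchmark data.
    clip = AudioSample("clip_01", "meeting", "thanks for joining the call")
    records = [
        ProviderOutput("clip_01", "provider_a", "thanks for joining the call", 0.0),
        ProviderOutput("clip_01", "provider_b", "thanks for joining a call", 0.2),
        ProviderOutput("clip_02", "provider_a", "unrelated clip", 0.5),
    ]
    for record in outputs_for_sample(clip, records):
        print(record.provider, record.wer)
```

Keeping the reference transcript on the sample record and the WER on the provider output record means each result stays tied to the clip and provider it came from.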

Delivery path

A practical delivery sequence for benchmark data infrastructure.

The work followed a practical operational path: structure the benchmark data, support WER review, organize provider comparisons and prepare the evidence for publication.

Phase 01: Benchmark data mapping

Define the relationships between audio samples, reference transcripts, ASR provider outputs and WER results.

Phase 02: Airtable backend setup

Structure the backend tables and fields needed to organize benchmark records and keep evaluation data reviewable.

Phase 03: WER reader implementation

Build the custom application workflow used to read WER outputs and support transcription accuracy analysis.

Phase 04: Results review and publication support

Prepare the organized benchmark data so the Descript team could review, explain and publish the ASR comparison.

Technical direction

Technical structure that supports benchmark accuracy and data clarity.

The technical direction centered on Airtable-backed data organization, WER analysis and benchmark reporting support. Only confirmed project details are described here; no claims are made about the wider production stack.

Airtable backend, WER reader app, ASR benchmark data, Reference transcripts, Provider comparison, Results reporting

Project takeaway

“The value of the build was in turning benchmark data, reference transcripts and WER outputs into a workflow Descript could use to compare ASR providers with clarity.”

My delivery note: a case-study takeaway focused on data structure, WER analysis and credible benchmark publication support.

