
Transcribe a 30-Minute Podcast in 90 Seconds (2026 Workflow)
A 30-Minute Episode Should Not Take an Afternoon
You finished editing the episode 20 minutes ago. The audio file is sitting on your desktop. The publish window is in 90 minutes. You still need a transcript, show notes, episode description, three social pull-quotes, and chapter markers.
The transcription step alone used to be the blocker. Eight years ago you sent the file to a human transcriber and waited until tomorrow. Three years ago you ran it through an AI tool and waited 8 to 10 minutes. In 2026 a 30-minute episode finishes in roughly 90 seconds on Whisper or 60 seconds on Deepgram, and the show notes write themselves immediately after.
This guide walks through the workflow podcasters use to turn a freshly edited 30-minute episode into a transcript, show notes, and chapter markers in about 3 minutes total.
Why Short Files Are Even Faster
Processing time scales with audio length, but startup costs (queue time, file validation, model load) are roughly constant. A 30-second clip and a 5-minute clip both spend 5 to 10 seconds in setup. After that, audio processing runs at 15 to 30 times real-time on Whisper Large-v3 and faster on Deepgram Nova-3.
For a 30-minute file:
- Deepgram Nova-3 (English-only): 50 to 75 seconds.
- Whisper Large-v3 (multilingual): 75 to 110 seconds.
If you record in English and do not need the multilingual accent handling, Deepgram is the faster path. If your show has international guests with strong accents, Whisper wins on accuracy, and the cost is roughly 30 extra seconds.
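The fixed-setup-plus-proportional-processing model described above can be sketched as a short calculation. This is only a rough estimate under the figures quoted in this section (5 to 10 seconds of setup, 15 to 30 times real-time for Whisper Large-v3); the function name and parameters are illustrative, not part of any tool.

```python
def estimate_seconds(audio_minutes, speed_low, speed_high,
                     setup_low=5.0, setup_high=10.0):
    """Return (best_case, worst_case) wall-clock seconds for one file.

    speed_low / speed_high are multiples of real-time (e.g. 15 and 30).
    setup_low / setup_high are the roughly constant startup costs in seconds.
    """
    audio_seconds = audio_minutes * 60
    best = setup_low + audio_seconds / speed_high
    worst = setup_high + audio_seconds / speed_low
    return best, worst


# Whisper Large-v3 at 15x to 30x real-time, using the setup range quoted above.
for minutes in (5, 30):
    best, worst = estimate_seconds(minutes, speed_low=15, speed_high=30)
    print(f"{minutes}-minute file: {best:.0f} to {worst:.0f} seconds")
```

For a 30-minute file this simple model gives a range that brackets the 75 to 110 seconds quoted for Whisper, and it shows why setup time dominates for short clips.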
The 3-Minute Workflow
Step 1: Export at 128 kbps Mono MP3
After your final edit, export the episode as 128 kbps mono MP3. A 30-minute file at this setting is about 30 MB and uploads in well under a minute on any reasonable connection.
There is no transcription benefit to higher bitrates or stereo. The model downsamples to 16 kHz mono internally anyway. A 320 kbps stereo MP3 is 2.5 times the upload size, and an uncompressed 16-bit, 44.1 kHz stereo WAV is roughly 11 times, for zero accuracy gain. See supported audio formats for the full reference.
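The file-size arithmetic is easy to check: size is bitrate times duration. The sketch below compares a 128 kbps MP3 with a 320 kbps MP3 and an uncompressed CD-quality WAV. It also builds an ffmpeg command for the export, assuming you have ffmpeg installed with the libmp3lame encoder; most editors can export the same settings from their own dialog, so treat the command as one option rather than a required step.

```python
def size_mb(bitrate_kbps, minutes):
    """Approximate file size in megabytes (1 MB = 1,000,000 bytes)."""
    return bitrate_kbps * 1000 * minutes * 60 / 8 / 1_000_000


def pcm_bitrate_kbps(sample_rate_hz=44_100, bit_depth=16, channels=2):
    """Bitrate of uncompressed PCM audio, the payload of a typical WAV file."""
    return sample_rate_hz * bit_depth * channels / 1000


def export_command(source, target):
    """ffmpeg arguments for a 128 kbps mono MP3 (assumes ffmpeg with libmp3lame)."""
    return ["ffmpeg", "-i", source, "-ac", "1",
            "-codec:a", "libmp3lame", "-b:a", "128k", target]


mp3_128 = size_mb(128, 30)
mp3_320 = size_mb(320, 30)
wav = size_mb(pcm_bitrate_kbps(), 30)

print(f"128 kbps MP3: {mp3_128:.1f} MB")
print(f"320 kbps MP3: {mp3_320:.1f} MB ({mp3_320 / mp3_128:.1f}x)")
print(f"16-bit 44.1 kHz stereo WAV: {wav:.1f} MB ({wav / mp3_128:.1f}x)")
print(" ".join(export_command("episode_final.wav", "episode_final.mp3")))
```

The file names in the last line are placeholders. To run the export, pass the list to `subprocess.run(..., check=True)`.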
Step 2: Upload to the Podcast Transcription Tool
Go to podcast transcription and drop the MP3. The tool runs Whisper or Deepgram depending on your selection and includes podcast-specific defaults like timestamp granularity (per-sentence) and speaker labels for interview shows.
If your episode is in a non-English language, use the matching language page, which preselects the right engine settings.
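To make per-sentence timestamps and speaker labels concrete, here is a minimal sketch of how such output could be merged into readable speaker turns. The `Sentence` shape is a hypothetical stand-in, not the tool's actual export format, so adapt the field names to whatever your export contains.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class Sentence:
    start: float   # seconds from the start of the episode
    speaker: str   # label such as "Host" or "Guest"
    text: str


def to_turns(sentences):
    """Merge consecutive sentences by the same speaker into one turn."""
    turns = []
    for speaker, group in groupby(sentences, key=lambda s: s.speaker):
        group = list(group)
        minutes, seconds = divmod(int(group[0].start), 60)
        body = " ".join(s.text for s in group)
        turns.append(f"[{minutes:02d}:{seconds:02d}] {speaker}: {body}")
    return turns


sample = [
    Sentence(0.0, "Host", "Welcome back to the show."),
    Sentence(2.4, "Host", "Today we are talking about transcripts."),
    Sentence(6.1, "Guest", "Thanks for having me."),
]
print("\n".join(to_turns(sample)))
```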
Step 3: Run the Podcast Episode Template
Once the transcript is back (roughly 50 to 110 seconds in, depending on the engine), the podcast episode template turns the transcript into:
- Episode summary (150 words for the show description).
- Three to five chapter markers with timestamps.
- Five pull quotes suitable for social.
- SEO-optimized show notes with H2 headings.
This second pass takes another 20 to 40 seconds.
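If you ever need to reformat chapter markers yourself, the conversion from seconds to a pasteable chapter list is small. The sketch below assumes chapters arrive as (start_seconds, title) pairs, which is an illustrative shape rather than the template's documented output. It also forces the list to begin at 00:00, which YouTube expects for chapters in a video description; check each platform's current rules before relying on that.

```python
def format_chapters(chapters):
    """Turn (start_seconds, title) pairs into 'MM:SS Title' lines.

    The list is sorted by start time, and an 'Intro' chapter is added at
    00:00 if no chapter starts there.
    """
    ordered = sorted(chapters)
    if not ordered or ordered[0][0] != 0:
        ordered.insert(0, (0, "Intro"))
    lines = []
    for start, title in ordered:
        hours, rem = divmod(int(start), 3600)
        minutes, seconds = divmod(rem, 60)
        if hours:
            stamp = f"{hours}:{minutes:02d}:{seconds:02d}"
        else:
            stamp = f"{minutes:02d}:{seconds:02d}"
        lines.append(f"{stamp} {title}")
    return "\n".join(lines)


print(format_chapters([(754, "Pricing"), (0, "Welcome"), (312, "Guest intro")]))
```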
Real Speeds From Real Episodes
A few recent files:
- 28-minute interview podcast (host + 1 guest, clean studio): 1 minute 4 seconds on Deepgram, 1 minute 32 seconds on Whisper.
- 31-minute solo episode (one voice, in-home studio): 58 seconds on Deepgram.
- 35-minute panel podcast (3 voices, light overlap): 1 minute 48 seconds on Whisper (recommended for multi-speaker).
- 29-minute French language podcast (2 voices): 1 minute 41 seconds on Whisper.
Total pipeline (upload + transcribe + template): 2 to 3 minutes for English, 3 to 4 minutes for multilingual.
What to Do With a 30-Minute Transcript
The transcript itself is the foundation. Almost every downstream task gets faster once it exists:
Show notes go from 30 minutes to 5. With a transcript in front of you, writing the episode description is editing, not composing.
Chapter markers stop being guessed. The template extracts natural topic transitions with timestamps you can paste into Spotify, YouTube, and Apple Podcasts.
Pull quotes write themselves. Instead of re-listening to find the best line, you skim the text. Pull quotes are how listeners decide whether to click play. Without a transcript, finding them is the slowest part of social promotion.
SEO finally works for podcasts. Search engines cannot index audio. They can index the transcript published alongside the episode. Shows that publish full transcripts can rank for the topics they discuss, often years after the episode airs.
Common Mistakes That Slow the Workflow Down
A few patterns that turn a 3-minute job into 30 minutes:
Uploading the unmixed multi-track export. A 12-track session export is huge and unnecessary. Transcribe the final mixdown.
Over-processing the audio first. Heavy compression, noise reduction, and EQ help human ears but sometimes hurt speech recognition. Use a moderately clean file, not an over-mastered one.
Manually adding timestamps in the editor. The template handles this. Do not waste 20 minutes pasting times into a text doc.
Re-transcribing edits. If you change 3 sentences in the audio after publishing, do not re-run the whole transcript. Edit the text version directly to match.
When 90 Seconds Becomes 30 Seconds
For very short clips (under 5 minutes), the free tier handles a full transcription with no signup in roughly 20 to 30 seconds. Most podcasters use the free tier to test the engine on a sample clip, then move to Pro once they want unlimited episodes and the AI templates.
For shows publishing weekly, the 60 minutes of free transcription per month does not stretch far. The unlimited Pro tier at $19.99 per month covers any podcast schedule short of multiple daily shows.
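A quick calculation shows how far the free allowance goes. This sketch uses the 60 free minutes quoted above and averages a year into months; the function name is illustrative.

```python
FREE_MINUTES_PER_MONTH = 60  # free-tier allowance quoted above


def monthly_minutes(episodes_per_week, minutes_per_episode):
    """Average audio minutes per month (52 weeks spread over 12 months)."""
    return episodes_per_week * minutes_per_episode * 52 / 12


needed = monthly_minutes(episodes_per_week=1, minutes_per_episode=30)
print(f"Weekly 30-minute show: about {needed:.0f} minutes per month")
print(f"Fits in the free tier: {needed <= FREE_MINUTES_PER_MONTH}")
```

A weekly 30-minute show needs roughly twice the free allowance, so only about two episodes per month fit.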
Cross-Show Workflow
If you run multiple shows or produce for clients, the transcription API lets you automate the upload + transcribe + template pipeline. A new episode lands in S3, a webhook fires, and the transcript and show notes appear in Notion two to three minutes later. That kind of automation is what makes podcast networks possible to run with a small team.
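The first step of such a pipeline, deciding which uploaded objects are episodes, can be sketched as follows. The code reads the standard S3 event notification structure (a `Records` list with `s3.bucket.name` and a URL-encoded `s3.object.key`). The transcription API call itself is not shown because its endpoints are not described in this article; the extension list is an assumption.

```python
from urllib.parse import unquote_plus

AUDIO_EXTENSIONS = (".mp3", ".wav", ".m4a")  # assumed set; adjust to your uploads


def audio_objects(event):
    """Return (bucket, key) pairs for audio files named in an S3 event."""
    found = []
    for record in event.get("Records", []):
        s3 = record.get("s3", {})
        bucket = s3.get("bucket", {}).get("name")
        # Object keys arrive URL-encoded in event notifications.
        key = unquote_plus(s3.get("object", {}).get("key", ""))
        if bucket and key.lower().endswith(AUDIO_EXTENSIONS):
            found.append((bucket, key))
    return found


example_event = {
    "Records": [
        {"s3": {"bucket": {"name": "episodes"},
                "object": {"key": "my+show/ep+42.mp3"}}},
        {"s3": {"bucket": {"name": "episodes"},
                "object": {"key": "my+show/cover.png"}}},
    ]
}
print(audio_objects(example_event))
```

Each returned pair would then be handed to whatever function submits the file for transcription and forwards the result to Notion.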
The Honest Comparison
Human transcription services still exist for podcasts. Rev's human tier charges $1.50 per audio minute, so a 30-minute episode costs $45 and arrives next day. For shows where every word matters (legal podcasts, medical content), that is still worth it. For the 95% of shows where AI transcription at 95 to 98% accuracy is fine after a 5-minute proofread, the 90-second AI path wins.
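The cost side of this comparison works out as below, using only the two prices quoted in this article ($1.50 per audio minute for human transcription, $19.99 per month for Pro). A weekly schedule is assumed for the monthly figure.

```python
HUMAN_RATE_PER_MINUTE = 1.50   # per audio minute, as quoted above
PRO_PRICE_PER_MONTH = 19.99    # flat monthly price, as quoted in this article


def human_cost(minutes, rate=HUMAN_RATE_PER_MINUTE):
    """Cost of human transcription for one episode."""
    return round(minutes * rate, 2)


def human_cost_per_month(minutes_per_episode, episodes_per_week=1):
    """Average monthly cost of human transcription (52 weeks over 12 months)."""
    episodes_per_month = episodes_per_week * 52 / 12
    return round(human_cost(minutes_per_episode) * episodes_per_month, 2)


print(f"One 30-minute episode, human: ${human_cost(30):.2f}")
print(f"Weekly show, human, per month: ${human_cost_per_month(30):.2f}")
print(f"Weekly show, Pro tier, per month: ${PRO_PRICE_PER_MONTH:.2f}")
```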
Try one episode and time it yourself. The "transcription takes too long" assumption is the one that breaks when you actually run it.
Building This Into Your Production Workflow
For podcasters who publish weekly or more often, the transcription step can be fully automated. Drop the final mixdown into a Dropbox folder; a watcher script picks it up, submits it to our API, and posts the transcript and templated show notes to your CMS draft folder. The producer's only job is editing the draft, not generating it.
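A watcher script of this kind can be as simple as a polling loop over the synced folder. The sketch below is one possible shape, not the reference implementation: `submit` is a hypothetical callable you supply (it would call the API and post to your CMS), and the polling interval is arbitrary.

```python
import time
from pathlib import Path


def new_mixdowns(folder, already_seen, suffix=".mp3"):
    """Return unprocessed audio files in folder, oldest first, and mark them seen."""
    candidates = sorted(
        (p for p in Path(folder).iterdir()
         if p.is_file() and p.suffix.lower() == suffix),
        key=lambda p: p.stat().st_mtime,
    )
    fresh = [p for p in candidates if p.name not in already_seen]
    already_seen.update(p.name for p in fresh)
    return fresh


def watch(folder, submit, poll_seconds=30):
    """Poll folder forever and pass each new file to submit(path).

    submit is a placeholder for your own upload-and-post function.
    """
    seen = set()
    while True:
        for path in new_mixdowns(folder, seen):
            submit(path)
        time.sleep(poll_seconds)
```

In practice you would also want to wait until a file has finished syncing (for example, until its size stops changing) before submitting it, and to persist the seen set across restarts.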
For the architecture, see building with transcription API and building an internal transcription tool. For larger podcast networks running multiple shows in parallel, this kind of automation is the difference between scaling and burnout.