Over 99% of video content on the web lacks audio descriptions
1.2.5 Audio Description (Prerecorded)
In Plain Language
[1.2.5 Audio Description (Prerecorded)](https://www.w3.org/WAI/WCAG22/Understanding/audio-description-prerecorded.html) is a Level AA success criterion that requires an audio description for every prerecorded video with a synchronised soundtrack[1]. An audio description is a narration track that conveys visual information -- actions, scene changes, on-screen text, charts, facial expressions -- during the natural pauses in the dialogue, so a blind or low-vision user listening to the soundtrack receives the same information a sighted viewer sees.
1.2.5 is the Level AA escalation of [1.2.3 Audio Description or Media Alternative (Prerecorded)](https://www.w3.org/WAI/WCAG22/Understanding/audio-description-or-media-alternative-prerecorded.html). At Level A, 1.2.3 lets you pick one of two paths: an audio description track, or a full text-based media alternative that describes everything in the video. At Level AA, 1.2.5 removes the text-alternative escape hatch -- the audio description track is mandatory[1]. If the original audio track already narrates the visual content (a talking-head lecture where the speaker describes every slide, for example), the criterion is satisfied without a second track, because the visual information is already available from the audio alone.
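The relationship between the two criteria can be sketched as a pair of checks. This is an illustrative model only: `VideoAsset` and its three fields are hypothetical names, and real conformance depends on human review of the video, not on three booleans.

```python
from dataclasses import dataclass


@dataclass
class VideoAsset:
    """Hypothetical record of what accompanies one prerecorded video."""
    has_audio_description: bool     # a described track or described version exists
    has_text_alternative: bool      # a full text-based media alternative exists
    audio_covers_all_visuals: bool  # the main soundtrack already narrates every visual


def meets_1_2_3(video: VideoAsset) -> bool:
    # Level A: either path is acceptable, or nothing extra is needed
    # because the soundtrack already carries the visual information.
    return (
        video.audio_covers_all_visuals
        or video.has_audio_description
        or video.has_text_alternative
    )


def meets_1_2_5(video: VideoAsset) -> bool:
    # Level AA: the text alternative no longer counts on its own.
    return video.audio_covers_all_visuals or video.has_audio_description


transcript_only = VideoAsset(
    has_audio_description=False,
    has_text_alternative=True,
    audio_covers_all_visuals=False,
)
print(meets_1_2_3(transcript_only), meets_1_2_5(transcript_only))  # True False
```

A video with only a transcript passes the Level A check and fails the Level AA check, which is the gap the rest of this page is about.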
Why It Matters
- A blind or low-vision user listening to a video without audio description receives only the dialogue and incidental sound. Every on-screen action, caption card, chart, diagram, demo cursor movement, and scene change is lost -- and in instructional or product-demo video, those visuals often carry the entire lesson.
- A text transcript or media alternative (the 1.2.3 escape hatch) forces the user out of the playback timeline into a separate document, breaking synchronisation with the soundtrack and with any shared viewing experience. 1.2.5 closes that loophole at Level AA specifically to keep blind users inside the timeline[1].
- Demos that rely on phrases like "click the blue button" or "drag this into that column" are unintelligible without description -- the referents only exist on screen. Without a description track, the video is not a tutorial for a blind viewer; it is a podcast missing half its words.
- 1.2.5 has been a Level AA criterion since WCAG 2.0 and remains one in WCAG 2.2, which means most public-sector procurement baselines (EN 301 549, Section 508 refresh, ADA Title II's WCAG 2.1 AA reference) inherit it[2]. For video-heavy products, it is a default requirement, not an optional enhancement.
Examples
<track kind="descriptions"> alongside the video
<video controls>
<source src='training.mp4' type='video/mp4'>
<track kind='descriptions' src='ad.vtt'
srclang='en' label='Audio Descriptions'>
</video>
✔ Audio description track narrates visual information during pauses in dialogue
<video controls>
<source src="training.mp4" type="video/mp4">
<!-- Audio description track for blind users -->
<track kind="descriptions"
src="training-ad.vtt"
srclang="en"
label="Audio Descriptions">
<!-- Captions for deaf users -->
<track kind="captions"
src="training-captions.vtt"
srclang="en"
label="English" default>
</video>
<!-- training-ad.vtt (separate file): narration timed to pauses in dialogue. Shown outside this comment because the WebVTT cue arrow would close an HTML comment early. -->
WEBVTT

00:00:03.000 --> 00:00:06.500
A woman walks to a whiteboard and draws
a flowchart with three connected boxes.

00:00:15.000 --> 00:00:18.000
She circles the middle box labeled
"Accessibility Review" in red.
<video controls>...</video>
<a href='transcript.html'>Read full transcript</a>
✘ A text transcript satisfies 1.2.3 (Level A) but does not satisfy 1.2.5 (Level AA) -- an audio description track is required
<!-- FAILS 1.2.5: text transcript is not
a substitute for audio description -->
<video controls>
<source src="demo.mp4" type="video/mp4">
<track kind="captions" src="captions.vtt"
srclang="en" label="English" default>
</video>
<a href="transcript.html">Read full transcript</a>
<!-- A transcript link meets 1.2.3 (Level A)
because 1.2.3 allows a text alternative OR
audio description. But 1.2.5 (Level AA)
specifically requires audio description --
users must be able to hear visual details
narrated while watching the video. -->
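Markup like the failing example can be flagged in bulk. The sketch below uses Python's standard `html.parser` to list videos that have no `descriptions` track; the function and class names are illustrative. A hit is only a candidate for review, since a video can still pass through a described version linked nearby or a soundtrack that already narrates the visuals, and a track being present says nothing about its quality.

```python
from html.parser import HTMLParser


class VideoTrackAudit(HTMLParser):
    """Collect the set of <track kind> values found inside each <video>."""

    def __init__(self):
        super().__init__()
        self.videos = []      # one set of track kinds per <video>, in page order
        self._current = None  # kinds for the <video> currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "video":
            self._current = set()
        elif tag == "track" and self._current is not None:
            # A <track> with no kind attribute defaults to subtitles in HTML.
            kind = dict(attrs).get("kind") or "subtitles"
            self._current.add(kind.lower())

    def handle_endtag(self, tag):
        if tag == "video" and self._current is not None:
            self.videos.append(self._current)
            self._current = None


def videos_missing_descriptions(html):
    """Return the zero-based positions of videos without a descriptions track."""
    audit = VideoTrackAudit()
    audit.feed(html)
    audit.close()
    return [i for i, kinds in enumerate(audit.videos) if "descriptions" not in kinds]


PAGE = """
<video controls>
  <source src="demo.mp4" type="video/mp4">
  <track kind="captions" src="captions.vtt" srclang="en" label="English" default>
</video>
<video controls>
  <source src="training.mp4" type="video/mp4">
  <track kind="descriptions" src="training-ad.vtt" srclang="en" label="Audio Descriptions">
  <track kind="captions" src="training-captions.vtt" srclang="en" label="English" default>
</video>
"""

print(videos_missing_descriptions(PAGE))  # [0]
```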
<p>Choose a version:</p>
<ul>
<li><a href='standard.html'>Standard video</a></li>
<li><a href='described.html'>With audio description</a></li>
</ul>
✔ When pauses are too short for inline description, provide an extended version
<p>Choose a version:</p>
<ul>
<li><a href="standard.html">Standard video</a></li>
<li><a href="described.html">
Video with audio description
</a></li>
</ul>
<!-- When the original video has little or no
natural pause for description, create a
second version where the video pauses to
allow the narrator to describe visual content.
This "extended audio description" approach
ensures all visual information is conveyed
even in fast-paced videos. Place the link
to the described version immediately adjacent
to the original video. -->
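How long the extended version has to pause can be estimated before recording. The sketch below assumes a narration pace of 2.5 words per second (about 150 words per minute); that figure and the function names are assumptions for illustration, and a real narrator's timing should replace the estimate once it is available.

```python
WORDS_PER_SECOND = 2.5  # assumed narration pace, roughly 150 words per minute


def narration_seconds(text, words_per_second=WORDS_PER_SECOND):
    """Estimate how long a narrator needs to speak the given description."""
    return len(text.split()) / words_per_second


def extra_pause_needed(text, gap_seconds, words_per_second=WORDS_PER_SECOND):
    """Seconds the video must freeze so the narration fits; 0.0 if it already fits."""
    return max(0.0, narration_seconds(text, words_per_second) - gap_seconds)


description = "She circles the middle box labeled Accessibility Review in red."

print(narration_seconds(description))        # 4.0 (10 words at 2.5 words per second)
print(extra_pause_needed(description, 3.0))  # 1.0 -> extended description needed
print(extra_pause_needed(description, 5.0))  # 0.0 -> fits inline
```

A non-zero result for any description in a video suggests the second, described version is the practical route for that asset.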
<track kind='descriptions' src='ad.vtt'>
Audio description says: "Sarah says hello and talks about the project."
But on screen: Sarah points to a chart showing 40% drop in compliance scores, then highlights three failing pages.
✘ Audio description must narrate visual information, not just restate dialogue
<!-- FAILS: audio description ignores key visuals -->
<track kind="descriptions" src="bad-ad.vtt"
srclang="en" label="Audio Descriptions">
<!-- The description track only says:
"Sarah says hello and talks about the project."
But the video shows Sarah pointing to a chart
with a 40% drop in compliance scores, then
highlighting three specific failing pages.
Audio description must convey visual information
that is not available from the audio alone --
charts, graphs, actions, scene changes, and
on-screen text. Simply restating dialogue
does not meet the requirement. -->
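Whether a description conveys the right visuals needs human review, but the timing structure of a description file can be checked by script. The following is a minimal sketch that handles only simple WebVTT with `hh:mm:ss.ttt` timestamps; it ignores cue settings, notes, and the shorter `mm:ss.ttt` form, and the function names are illustrative.

```python
import re

CUE_TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2})\.(\d{3})\s+-->\s+(\d{2}):(\d{2}):(\d{2})\.(\d{3})"
)


def to_seconds(hours, minutes, seconds, millis):
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_cues(vtt_text):
    """Return a list of (start, end, text) tuples from simple WebVTT text."""
    cues = []
    # WebVTT blocks are separated by blank lines.
    for block in re.split(r"\n\s*\n", vtt_text.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = CUE_TIMING.search(line)
            if match:
                parts = match.groups()
                text = " ".join(part.strip() for part in lines[i + 1:])
                cues.append((to_seconds(*parts[:4]), to_seconds(*parts[4:]), text))
                break
    return cues


def timing_problems(cues):
    """List structural problems: reversed, overlapping, or empty cues."""
    problems = []
    previous_end = 0.0
    for start, end, text in cues:
        if end <= start:
            problems.append(f"cue at {start}s ends before it starts")
        if start < previous_end:
            problems.append(f"cue at {start}s overlaps the previous cue")
        if not text:
            problems.append(f"cue at {start}s has no narration text")
        previous_end = max(previous_end, end)
    return problems


SAMPLE = """WEBVTT

00:00:03.000 --> 00:00:06.500
A woman walks to a whiteboard and draws
a flowchart with three connected boxes.

00:00:15.000 --> 00:00:18.000
She circles the middle box labeled
"Accessibility Review" in red.
"""

cues = parse_cues(SAMPLE)
print(len(cues), cues[0][:2])   # 2 (3.0, 6.5)
print(timing_problems(cues))    # []
```

A clean result here means only that the file is well formed; a track that restates dialogue would pass this check and still fail the criterion.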
How to Fix It
- Inventory every prerecorded video in scope and flag visual-only information. Watch each asset with the picture off and list everything the audio alone does not convey: on-screen text, diagrams, cursor movements in demos, chart values, scene changes, speaker gestures, and any visual that the dialogue references without naming. Anything on that list is a description target.
- Decide per-video whether inline or extended description is the right mechanism. If the dialogue has natural gaps long enough to drop a sentence of narration into, inline description against the original timeline meets 1.2.5. If the dialogue is wall-to-wall, the description cannot fit without overrunning the next line -- that is the case 1.2.7 Extended Audio Description is written for, and the practical fix at 1.2.5 is a second, described-video file where playback pauses to make room for the narration[1].
- Write the description script to the pause budget. For each visual target, write narration that fits the available pause without colliding with the next line of dialogue. Describe what is happening, not what it means -- "a chart drops from 90 percent to 50 percent" rather than "the results are disappointing." Prioritise information the viewer cannot recover from audio; drop redundant details when the pause runs out.
- Record a clearly distinct narrator voice. The description voice must be distinguishable from the on-screen speakers so listeners can tell narration from dialogue without guessing. Match the loudness of the programme audio so users do not have to ride the volume.
- Deliver the description through one of the supported mechanisms. For HTML5 video, attach a synchronised audio description track using `<track kind="descriptions">` referencing a WebVTT file -- the `kind` attribute is what signals to assistive technology that the track contains descriptive narration rather than captions. For platforms that do not expose a second audio track, publish a separate described-video file and link to it adjacent to the original. For low-visual-content talking-head footage, a description baked directly into the original audio track is sufficient provided it covers every visual-only element.
- Keep the transcript too. A text transcript is still useful for deafblind users and search, and it satisfies 1.2.3 at Level A, but it does not substitute for the audio description track 1.2.5 requires[1]. Plan both deliverables into the video production workflow so neither becomes a retrofit.
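The inline-versus-extended decision and the pause budget in the steps above can be sketched from caption timings. The example assumes dialogue cues are already available as `(start, end)` pairs in seconds, a minimum usable gap of 1.5 seconds, and a narration pace of 2.5 words per second; all three are assumptions to tune per project, and the function names are illustrative.

```python
def dialogue_gaps(dialogue_cues, video_duration, min_gap=1.5):
    """Return (start, end) gaps between dialogue cues at least min_gap seconds long."""
    gaps = []
    cursor = 0.0  # end of the latest dialogue heard so far
    for start, end in sorted(dialogue_cues):
        if start - cursor >= min_gap:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if video_duration - cursor >= min_gap:
        gaps.append((cursor, video_duration))
    return gaps


def word_budget(gap, words_per_second=2.5):
    """Approximate number of narrated words that fit in one gap."""
    start, end = gap
    return int((end - start) * words_per_second)


def collisions(description_cues, dialogue_cues):
    """Return description cues that overlap any dialogue cue."""
    return [
        (d_start, d_end)
        for d_start, d_end in description_cues
        if any(d_start < end and start < d_end for start, end in dialogue_cues)
    ]


dialogue = [(0.0, 3.0), (6.5, 15.0), (18.0, 30.0)]
gaps = dialogue_gaps(dialogue, video_duration=30.0)

print(gaps)                                 # [(3.0, 6.5), (15.0, 18.0)]
print([word_budget(g) for g in gaps])       # [8, 7]
print(collisions([(3.0, 6.5), (14.0, 16.0)], dialogue))  # [(14.0, 16.0)]
```

An empty gap list, or budgets too small for the visuals on the inventory, points toward the extended, second-file approach. Native browser support for voicing `descriptions` text tracks is limited, so it is worth testing the finished track in the player actually used.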
References
- [1] W3C (2023). Understanding Success Criterion 1.2.5: Audio Description (Prerecorded). W3C, Accessed 2026-04-07. https://www.w3.org/WAI/WCAG22/Understanding/audio-description-prerecorded.html
- [2] W3C (2023). Web Content Accessibility Guidelines (WCAG) 2.2. W3C, Accessed 2026-04-07. https://www.w3.org/TR/WCAG22/