EN 301 549 7.2 -- Audio Description Technology
What It Is
ETSI EN 301 549 v3.2.1 clause 7.2 "Audio description technology" is the parallel of clause 7.1 for the secondary audio channel that narrates visual information between lines of dialogue[1]. Sub-clauses cover three mechanics: 7.2.1 requires a mechanism to select and route audio description to the default audio output; 7.2.2 requires the description to stay synchronised with the video and the main programme audio; 7.2.3 requires the description data to be preserved through transmission, transcoding, and recording.
Why It Matters
Audio description is carried as a secondary audio track, a pre-mixed (broadcaster-mix) described variant, or a receiver-mix description stream with its mixing metadata (DVB AD, ATSC). Each link in the playback chain has to keep the track intact and reachable. Typical mechanism failures: HTML5 players that never parse <track kind="descriptions">; streaming transcoders that discard non-primary audio renditions on the way into HLS or DASH; smart TVs that list "Audio Description" in the settings menu but bind it to a stream the tuner is not decoding; conferencing tools that record only the active speaker mix and drop the description track entirely. In each case the description exists; the technology in the middle makes it unreachable.
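The transcoder failure above often comes down to stream selection. When no -map options are given, ffmpeg picks a single audio stream per output, so a description track can disappear without any error. The following is a minimal sketch of a command builder that maps every audio stream explicitly. The function name and the codec choices (libx264, aac) are assumptions for illustration, not a recommended encoding profile.

```python
from typing import List


def build_transcode_command(src: str, dst: str) -> List[str]:
    """Return an ffmpeg argument list that keeps every audio stream.

    Hypothetical helper: the name and the codec choices are assumptions.
    """
    return [
        "ffmpeg",
        "-i", src,
        "-map", "0:v",      # all video streams from input 0
        "-map", "0:a",      # all audio streams, including any description track
        "-c:v", "libx264",  # assumed video codec; requires an ffmpeg build with libx264
        "-c:a", "aac",      # re-encode every mapped audio stream
        dst,
    ]
```

The returned list can be passed to subprocess.run. Comparing the audio stream count of the input and the output afterwards is a cheap way to confirm that nothing was dropped.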
How It Relates to WCAG
WCAG 2.2 places the authoring obligation on the content creator: for prerecorded synchronised media, 1.2.3 (Level A) requires audio description or a media alternative, 1.2.5 (Level AA) requires audio description, and 1.2.7 (Level AAA) requires extended audio description[2]. Clause 7.2 covers the other half: it places the obligation on the technology. A site can ship a correctly described MP4 and satisfy 1.2.5, and the delivery chain can still fail 7.2 if the player has no audio-track selector or no kind="descriptions" handling, or if the packager does not pass the AD rendition through into the HLS manifest.
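One way to check the pass-through case is to inspect the HLS multivariant playlist that the packager emits. HLS marks a described audio rendition with the CHARACTERISTICS value public.accessibility.describes-video on its EXT-X-MEDIA tag. The sketch below lists those renditions. The function names and the sample playlist are invented for illustration, and the attribute parser is deliberately simplified rather than a full HLS parser.

```python
import re
from typing import Dict, List

DESCRIBES_VIDEO = "public.accessibility.describes-video"

# KEY=VALUE pairs, where VALUE is either a quoted string or a bare token.
_ATTR = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')


def audio_renditions(playlist: str) -> List[Dict[str, str]]:
    """Collect the attributes of every EXT-X-MEDIA tag with TYPE=AUDIO."""
    renditions = []
    for line in playlist.splitlines():
        line = line.strip()
        if not line.startswith("#EXT-X-MEDIA:"):
            continue
        attr_list = line.split(":", 1)[1]
        attrs = {key: value.strip('"') for key, value in _ATTR.findall(attr_list)}
        if attrs.get("TYPE") == "AUDIO":
            renditions.append(attrs)
    return renditions


def description_renditions(playlist: str) -> List[Dict[str, str]]:
    """Return the audio renditions flagged as describing the video."""
    found = []
    for rendition in audio_renditions(playlist):
        characteristics = rendition.get("CHARACTERISTICS", "").split(",")
        if DESCRIBES_VIDEO in (c.strip() for c in characteristics):
            found.append(rendition)
    return found


# Invented sample playlist: one main and one described English rendition.
SAMPLE = """#EXTM3U
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aud",NAME="English",LANGUAGE="en",DEFAULT=YES,URI="audio/en.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aud",NAME="English (described)",LANGUAGE="en",CHARACTERISTICS="public.accessibility.describes-video",URI="audio/en-ad.m3u8"
#EXT-X-STREAM-INF:BANDWIDTH=2000000,AUDIO="aud"
video/main.m3u8
"""
```

Running description_renditions on the packager output for a title whose source is known to be described, and failing the build when the result is empty, turns the silent drop into a visible error.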
Practical Implications
- Support at least one standard description delivery: a secondary audio track in the container (MP4, MKV, WebM), an HLS or DASH alternate audio rendition marked as description (CHARACTERISTICS="public.accessibility.describes-video" in HLS, a Role of "description" in DASH), or DVB/ATSC receiver-mix metadata. For HTML5, implement <track kind="descriptions"> where the engine supports it, and fall back to an audio-track selector where browser support is patchy.
- Expose an audio-track selector in the player UI when more than one audio track is present, and label tracks by role and language so assistive technology can announce them.
- Preserve synchronisation across seek, playback-rate change, and adaptive bitrate switches. A description cue that drifts past the shot it describes is worse than no description.
- Preserve the description track through recording, download, and transcoding pipelines. Dropping non-primary audio on re-encode is the common silent failure.
- Make the user's preference persist across sessions, so a viewer who needs description does not have to re-enable it on every video.
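The selector, labelling, and persistence points can be sketched together as plain player logic, independent of any UI toolkit. Everything here is hypothetical: the AudioTrack shape, the role values "main" and "description", the fallback order, and the JSON preference file are assumptions for illustration, not part of EN 301 549 or of any player API.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional


@dataclass(frozen=True)
class AudioTrack:
    track_id: str
    language: str  # e.g. "en"
    role: str      # "main" or "description" in this sketch


def label(track: AudioTrack) -> str:
    """Build a menu label that states both language and role."""
    suffix = " (audio description)" if track.role == "description" else ""
    return f"{track.language}{suffix}"


def choose_track(tracks: List[AudioTrack], wants_description: bool,
                 language: str) -> Optional[AudioTrack]:
    """Pick the preferred track, falling back to main audio, then to any track."""
    wanted_role = "description" if wants_description else "main"
    for track in tracks:
        if track.role == wanted_role and track.language == language:
            return track
    for track in tracks:
        if track.role == "main" and track.language == language:
            return track
    return tracks[0] if tracks else None


def save_preference(path: Path, wants_description: bool) -> None:
    """Persist the viewer's choice so it survives across sessions."""
    path.write_text(json.dumps({"audio_description": wants_description}))


def load_preference(path: Path) -> bool:
    """Read the stored choice; a missing or unreadable file means off."""
    try:
        return bool(json.loads(path.read_text()).get("audio_description", False))
    except (OSError, ValueError, AttributeError):
        return False
```

A player would call load_preference at start-up, pass the result to choose_track for each new video, and call save_preference whenever the viewer changes the setting. Falling back to main audio in the requested language is only one possible policy; a product might prefer a described track in another language instead.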