Level AAA

Level AAA criterion: prerecorded speech audio must have no background sounds, offer a mechanism to turn them off, or keep the background at least 20 dB below the foreground speech.

1.4.7 Low or No Background Audio

In Plain Language

1.4.7 Low or No Background Audio (Level AAA) applies to prerecorded audio-only content where the foreground is primarily speech. It requires at least one of three things to be true: the audio contains no background sounds, the background can be turned off, or the background is mixed at least 20 decibels below the foreground speech^[1]. Under the 20 dB option, occasional sounds that last only one or two seconds are exempt. The criterion does not apply to audio CAPTCHAs, audio logos, or vocalisation that is primarily musical expression, such as singing or rapping.

W3C frames the 20 dB threshold as "approximately four times quieter than the foreground speech", the separation at which users with mild-to-moderate hearing loss can still isolate the speaker's voice from a steady background bed. With a smaller gap, the speech and the background compete in the same perceptual channel and the narration can become unintelligible.
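
As a rough worked illustration of what the 20 dB figure means, the level difference between two signals is defined from their amplitude ratio. The symbols A_speech and A_bg are introduced here for illustration only and do not appear in the W3C text. The "four times quieter" wording is consistent with the common rule of thumb that perceived loudness roughly halves for every 10 dB reduction; that rule is an approximation, not part of the criterion.

```latex
\begin{align*}
\Delta L &= 20 \log_{10}\!\left(\frac{A_{\text{speech}}}{A_{\text{bg}}}\right)\ \text{dB} \\
\Delta L \ge 20\ \text{dB} &\iff \frac{A_{\text{speech}}}{A_{\text{bg}}} \ge 10 \\
\text{perceived loudness ratio} &\approx 2^{\Delta L / 10} = 2^{2} = 4
  \quad \text{at } \Delta L = 20\ \text{dB}
\end{align*}
```

In other words, a background that just meets the threshold has one tenth of the speech amplitude, which listeners tend to perceive as roughly a quarter as loud.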

Why It Matters

Age-related and noise-induced hearing loss degrades the cochlear frequency selectivity that non-disabled listeners use to separate a voice from a background bed. A hearing aid amplifies the whole mix, not just the speech, so a podcast with music at the same level as the narrator remains unintelligible after amplification. The user needs the mix itself to put space between the two sources.

The failure mode is specific: a continuous background track (music, room tone, synth pad) at a level within 20 dB of the narrator's voice collapses into a single masked signal. Raising the volume does not help, because it raises the masker in lockstep with the speech.

1.4.7 is distinct from 1.4.2 Audio Control^[2], which is a Level A requirement covering any audio that plays automatically for more than three seconds and mandates a pause, stop, or volume control independent of the system volume. 1.4.2 governs whether the user can silence an autoplaying track; 1.4.7 governs the mix inside a prerecorded speech track the user chose to play.

It is also distinct from the 1.2.x captions and transcript criteria, which provide a text alternative to audio. A transcript does not rescue a listener who is trying to use the audio channel; it routes them around it. 1.4.7 is about keeping the audio channel usable in the first place.

Examples

Do: Speech-only audio with no background

Tutorial: Screen Reader Basics

Clear narration with no background music or ambient noise.

Speech: -6 dB

Background: None

✔ No background audio -- meets 1.4.7

<!-- Good: audio recorded in quiet environment -->
<!-- No background music or ambient noise -->
<audio controls>
  <source src="tutorial-clean.mp3" type="audio/mpeg">
</audio>
<!-- Speech is the only audio content -->

Don't: Speech with loud background music

Podcast Episode 12

Narration competes with background music throughout.

Speech: -6 dB

Background: -12 dB

✘ Background only 6 dB below speech (needs 20 dB gap)

<!-- Bad: background music at -12 dB, speech at -6 dB -->
<!-- Only 6 dB separation (needs at least 20 dB) -->
<audio controls>
  <source src="podcast-noisy.mp3" type="audio/mpeg">
</audio>
<!-- Users with hearing loss cannot separate -->
<!-- the speech from the background music -->

Do: Background at least 20 dB below speech

Product Demo Walkthrough

Soft ambient music well below the narrator's voice.

Speech: -6 dB

Background: -28 dB

✔ 22 dB separation -- background barely audible
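
The markup for this case looks the same as any other audio embed, because the 20 dB separation is a property of the mix rather than of the HTML. A minimal sketch in the style of the examples above, assuming a hypothetical file name `demo-ducked.mp3`:

```html
<!-- Good: music bed at -28 dB, speech at -6 dB -->
<!-- 22 dB separation (at least 20 dB required) -->
<audio controls>
  <!-- File name is hypothetical -->
  <source src="demo-ducked.mp3" type="audio/mpeg">
</audio>
<!-- The separation is set in the mix, not in the markup -->
```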

Don't: No option to disable background audio

Training Module Audio

Background noise baked into single audio track with no controls.

Controls: Play, Pause

✘ No mechanism to turn off background sounds
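
A sketch of what this failing pattern might look like in markup, assuming a hypothetical file `training-module.mp3` whose background is also less than 20 dB below the speech (if the mix already met the 20 dB option, the missing off switch would not matter):

```html
<!-- Bad: speech and background baked into one file -->
<!-- No speech-only alternative, no way to mute the background -->
<audio controls>
  <!-- File name is hypothetical -->
  <source src="training-module.mp3" type="audio/mpeg">
</audio>
<!-- Play, pause and volume act on the whole mix, -->
<!-- so they do not turn off the background sounds -->
```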

How to Fix It

Eliminate the background at the source. The no-background option in 1.4.7 is the cheapest to meet: record speech in a treated room with a directional microphone and do not layer music or ambience under the narration track. If there is nothing to mask, there is nothing to measure.

If you mix a background bed, measure the gap. Use a meter in your DAW (peak or RMS, applied consistently to both tracks) and verify the background sits at least 20 dB below the foreground speech for the duration of the speech^[1]. A narrator peaking at -6 dBFS needs the music bed at or below -26 dBFS (-6 - 20 = -26). Ducking automation that pulls the music down only during speech segments is a standard technique.

Ship a speech-only alternate if you need a loud mix. The turn-off option is met by delivering a separate audio track with background sounds removed, or by a player control that mutes the background stem. A single baked-down MP3 with music married to voice cannot meet the turn-off option on its own: either a speech-only version has to be offered alongside it, or the stems have to remain separable at playback time.

Do not rely on the short-effect exception as a loophole. The exception covers occasional sound effects of one or two seconds: a door close, a phone ring, a sting. It does not cover a continuous music bed that dips briefly. If the background sound is present for most of the speech, it is not occasional and the 20 dB rule applies.

Know the scope. 1.4.7 applies only to prerecorded audio-only content where the foreground is speech. It does not apply to synchronised media such as video with a soundtrack (the 1.2.x criteria address synchronised media), to live audio, to musical performances, or to audio CAPTCHAs and audio logos. For autoplaying audio, the applicable criterion is 1.4.2 Audio Control at Level A, not 1.4.7.
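
The speech-only alternate described above could be offered with plain markup and no scripting. The following is a minimal sketch, not a definitive pattern: the file names, `id` values, and label wording are all hypothetical, and it assumes the second file really has been exported without the background stem.

```html
<!-- Sketch: full mix plus a speech-only alternative -->
<!-- File names, ids and labels are hypothetical -->
<p id="ep12-full">Episode 12 (with music)</p>
<audio controls aria-labelledby="ep12-full">
  <source src="podcast-ep12-full.mp3" type="audio/mpeg">
</audio>

<p id="ep12-speech">Episode 12 (speech only, no background)</p>
<audio controls aria-labelledby="ep12-speech">
  <!-- Exported from the narration stem alone -->
  <source src="podcast-ep12-speech-only.mp3" type="audio/mpeg">
</audio>
```

Placing the two players next to each other with clear labels lets a listener find the background-free version without hunting for it. A single player with a control that mutes the background stem is the other route, but it needs scripting to keep the stems in sync.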

References

[1] W3C (2023). Understanding Success Criterion 1.4.7: Low or No Background Audio. Accessed 2026-04-07. https://www.w3.org/WAI/WCAG22/Understanding/low-or-no-background-audio.html
[2] W3C (2023). Understanding Success Criterion 1.4.2: Audio Control. Accessed 2026-04-07. https://www.w3.org/WAI/WCAG22/Understanding/audio-control.html