The post below is excerpted and adapted from Dithering With Ozone: Tools, Tips, and Techniques, a guide produced by our friends at iZotope. Reprinted with permission.
What is dithering?
In your English class, to “dither” means to act nervously or indecisively. When we’re talking about digital audio and home studio recording, dithering is the process of adding noise to the audio signal. Adding noise, you say? Why would you want to add noise? Basically, it’s a trade: low-level hiss in exchange for a reduction in distortion when you convert 24-bit audio to 16-bit audio to transfer to a CD.
It starts by recognizing how digital audio is stored and represented. Unlike analog audio, which is “infinitely continuous,” digital audio is represented by individual bits. This “quantization” means that while we might hear a continuous sound, it’s really a whole bunch of discrete 1s and 0s.
As an example, from a distance, a digitally created sine wave looks and sounds pretty continuous.
If we zoom in though, we can see that our continuous looking waveform is actually a bunch of individual, or discrete, samples.
As we take more samples per second (e.g. 44.1 kHz, 48 kHz, 96 kHz, where “Hz” here is the number of samples per second and “kHz” is the number of thousands of samples per second), we get better “horizontal” resolution, which translates to being able to represent higher frequencies. As we use more bits per sample (e.g. 16 bits, 24 bits, 32 bits, 64 bits, etc.), we get better “vertical” resolution, which translates to better “dynamic resolution” (i.e. more dynamic range and/or a better signal-to-noise ratio). In the context of dithering, it is the “bits per sample” that we’re interested in.
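To put rough numbers on that “vertical” resolution, here is a minimal Python sketch. The function names are ours for illustration, not anything from the iZotope guide, and the decibel figure uses the common simplification of comparing full scale to a single quantization step.

```python
import math

def amplitude_levels(bits: int) -> int:
    """Count the distinct values a sample of the given word length can take."""
    return 2 ** bits

def ideal_range_db(bits: int) -> float:
    """Ratio of full scale to one quantization step, expressed in decibels."""
    return 20 * math.log10(2 ** bits)

for bits in (16, 24):
    print(f"{bits}-bit: {amplitude_levels(bits):,} levels, "
          f"about {ideal_range_db(bits):.1f} dB")
# 16-bit: 65,536 levels, about 96.3 dB
# 24-bit: 16,777,216 levels, about 144.5 dB
```

Each extra bit doubles the number of levels, which works out to roughly 6 dB per bit.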
So, better dynamic resolution is pretty easy to get on your PC; just use more bits per sample. Get a 24-bit A/D converter; mix and edit at 32 bits; process effects at 64 bits. End of story. Except… audio CDs are 16-bit. At some point, if you want to put your songs on a CD, you have to get that 32-bit word length down to 16 bits. And that’s the problem.
Truncate the naughty bits
So it comes down to fitting 24 bits on a 16-bit CD. As Prince sings in “Insatiable,” “We’ll just erase the naughty bits.” That’s one solution — the simplest way to convert is to simply throw away, or “truncate,” the lowest 8 bits. Take 24 bits, throw away the lowest 8, you’re left with 16, and you can make a CD. (Upon further listening, we realize Prince might not have been talking about word length reduction through truncation in that song.)
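In code, “throwing away the lowest 8 bits” can be as small as a bit shift. The sketch below assumes samples are held as plain signed integers in the 24-bit range; the function name is hypothetical.

```python
def truncate_24_to_16(sample: int) -> int:
    """Drop the 8 least significant bits of a signed 24-bit sample."""
    if not -(1 << 23) <= sample < (1 << 23):
        raise ValueError("sample is outside the signed 24-bit range")
    # Python's >> is an arithmetic shift, so negative values round toward
    # negative infinity rather than toward zero.
    return sample >> 8

print(hex(truncate_24_to_16(0x123456)))  # 0x1234
print(truncate_24_to_16(255))            # 0: anything smaller than one 16-bit step vanishes
print(truncate_24_to_16(-1))             # -1
```

Note that nothing is rounded here. The low bits are simply gone, which is what makes truncation the simplest option.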
Before you just throw the bits away though, let’s check out a little comparison. First, look at a spectrum of our 1 kHz 24-bit sine wave. The spectrum of a pure tone should be a single spike at the frequency of the tone, as it is here.
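If you want to check the “single spike” claim yourself, a naive discrete Fourier transform is enough. This is only a sketch under our own assumptions: a 48 kHz sample rate and a 10 ms window, picked so the 1 kHz tone fits a whole number of cycles and lands exactly on one frequency bin.

```python
import cmath
import math

SAMPLE_RATE = 48_000      # assumed; lets 1 kHz fit a whole number of cycles
TONE_HZ = 1_000
N = 480                   # 10 ms of audio, exactly 10 cycles of the tone
FULL_SCALE = 2 ** 23 - 1  # largest positive signed 24-bit value

samples = [round(FULL_SCALE * math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE))
           for n in range(N)]

def dft_magnitude(x, k):
    """Magnitude of bin k of a naive discrete Fourier transform."""
    total = len(x)
    return abs(sum(v * cmath.exp(-2j * math.pi * k * n / total)
                   for n, v in enumerate(x)))

# Bins 0..N/2 cover 0 Hz up to half the sample rate, 100 Hz apart here.
magnitudes = [dft_magnitude(samples, k) for k in range(N // 2 + 1)]
peak_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)
peak_hz = peak_bin * SAMPLE_RATE / N
print(f"strongest bin: {peak_hz:.0f} Hz")  # strongest bin: 1000 Hz
```

Every other bin holds only the tiny residue of rounding to 24 bits, far below the level of the tone.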
Now we’ll convert our nice smooth 24-bit sine wave to 16 bits by simply truncating or throwing away the least significant 8 bits.
Looking first at the waveform in time, we can see that things got a little less smooth. Especially in the area with the circle, you can see that with fewer bits, we’re not able to represent the original smooth curve.
Truncation introduces “quantization error” which is the difference between where the higher resolution sine wave had its samples and where the lower resolution sine wave has to put the samples.
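Quantization error is easy to measure directly: scale the truncated value back up and subtract it from the original. The sketch below does that for one cycle of a full-scale sine, assuming a 48 kHz sample rate; the helper names are hypothetical.

```python
import math

SAMPLE_RATE = 48_000        # assumed sample rate
TONE_HZ = 1_000
FULL_SCALE_24 = 2 ** 23 - 1
STEP = 256                  # one 16-bit step measured in 24-bit units (2 ** 8)

def sine_24bit(n: int) -> int:
    """Sample n of a full-scale 1 kHz sine, rounded to a 24-bit integer."""
    return round(FULL_SCALE_24 * math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE))

def quantization_error(sample: int) -> int:
    """Original 24-bit value minus the truncated value scaled back to 24 bits."""
    truncated = sample >> 8
    return sample - (truncated << 8)

errors = [quantization_error(sine_24bit(n)) for n in range(SAMPLE_RATE // TONE_HZ)]
print("largest error in one cycle:", max(errors), "out of a possible", STEP - 1)
```

With plain truncation the error is always between 0 and 255 in 24-bit units, that is, less than one 16-bit step, and it depends entirely on the sample value.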
When in doubt, dither
So what to do? We dither. We’ll get into the details of dithering later, but for now consider it as adding very low-level noise to the audio before it is converted from 24 bits to 16 bits. The results of converting with and without dithering are shown here:
Our dithered tone (white spectrum) seems to look better. We’ve effectively done away with the jagged quantization noise. It sounds better, but we have made a tradeoff and added some low-level noise.
The terrible truth is revealed: you’ve spent time and money on low-noise preamps, A/Ds, and everything else. But in the end, you’re going to deliberately add noise to your mix when you convert it from 24 bits to 16 bits to put on a CD.
On the bright side, we’ve traded “bad noise” for “good noise.” Instead of quantization error, we’ve added a smoother, more “continuous” noise.
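One way to see the difference between the two kinds of noise is to compare the error from one cycle of the tone to the next. The sketch below is not how Ozone does it; it assumes triangular (TPDF) dither, one common choice, spanning plus or minus one 16-bit step, and a 48 kHz sample rate so that each 1 kHz cycle is exactly 48 samples.

```python
import math
import random

SAMPLE_RATE = 48_000          # assumed so one 1 kHz cycle is exactly 48 samples
TONE_HZ = 1_000
PERIOD = SAMPLE_RATE // TONE_HZ
AMPLITUDE = (2 ** 23 - 1) // 2
STEP = 256                    # one 16-bit step in 24-bit units

def sine_24bit(n):
    # Reduce n modulo the period so every cycle is bit-for-bit identical.
    phase = 2 * math.pi * (n % PERIOD) / PERIOD
    return round(AMPLITUDE * math.sin(phase))

def truncate(sample):
    return sample >> 8

def dither_and_round(sample, rng):
    """Add triangular (TPDF) noise spanning +/- one 16-bit step, then round."""
    noise = rng.uniform(-STEP / 2, STEP / 2) + rng.uniform(-STEP / 2, STEP / 2)
    return math.floor((sample + noise) / STEP + 0.5)

rng = random.Random(0)        # seeded only to make the run repeatable
signal = [sine_24bit(n) for n in range(2 * PERIOD)]
trunc_err = [s - truncate(s) * STEP for s in signal]
dither_err = [s - dither_and_round(s, rng) * STEP for s in signal]

print("truncation error repeats each cycle:", trunc_err[:PERIOD] == trunc_err[PERIOD:])
print("dithered error repeats each cycle:  ", dither_err[:PERIOD] == dither_err[PERIOD:])
```

The truncation error is locked to the signal, so it repeats with every cycle and shows up as distortion at related frequencies. The dithered error changes from cycle to cycle, so it behaves like steady hiss instead.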
(Editor’s note: We’re simplifying not only the principles behind dithering but the benefits as well. Just understanding the tradeoff to begin with is a good place to start, though.)
The Atlantic puffin
At the risk of confusing the issue with an analogy, here’s another way to look at it. Image processing faces the same issues of bit depth, resolution and dithering. When representing a photo on a computer, each sample has a discrete number of colors or shades of gray that it can represent. The number of levels is determined by the bit depth of the file, just like the levels of samples of audio.
Now we only have 4 shades of gray to represent the photo and something is going to be lost. Same as in digital audio, each sample has to be forced to a level, and there aren’t enough levels anymore to represent the data as a continuous image (or a smooth continuous signal in the case of audio).
Like 8-bit audio (or even 16-bit compared to 24-bit), our 2-bit dithered picture is not great, but it’s better than when we just truncated. In fact, it shows three characteristics that are analogous to reducing the word length of audio with dither.
1) We’ve reduced the quantization error (the jaggedness or “blockiness”) with the tradeoff of adding a different type of noise (the speckles in the dithered picture).
2) We’ve actually maintained some detail from the original higher resolution picture. In audio, this relates to the concept that reducing the number of bits per sample with dithering can give you greater perceived dynamic range than the signal-to-noise ratio would suggest.
Not making sense? Well, the theoretical maximum signal-to-noise ratio of a 16-bit recording is about 96 dB. That means, in basic terms, that the noise floor of a 16-bit recording is around -96 dB. The noise floor should not be confused (but often is) with dynamic range, which represents the difference between the loudest and softest signal that you can hear in the recording. You can hear a dynamic range which extends past the noise floor with proper dithering. Even if a signal is “in the noise,” you will still hear it with proper dithering. The breadth of the dynamic range you can hear below the noise floor can’t be expressed mathematically. It’s a result of the program material, and how well you can hear — or more precisely: how well a listener can resolve signal from noise at low levels.
3) The picture also answers a common audio question: “If I know I’m making a 16-bit CD in the end, should I still record, mix and master at 24 bits?”
Yes. The extra bits provide you (or more specifically the dither) with information that allows the dynamic range to be greater than the noise floor. In short, 24-bit properly dithered to 16-bit will sound better than an original 16-bit recording.
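The idea that a signal “in the noise” survives dithering can be tried out numerically. The sketch below is an illustration under our own assumptions, not a measurement of any real converter: a 1 kHz tone whose peak is less than half of one 16-bit step, TPDF dither, and a simple correlation against a sine to estimate how much of the tone is left after conversion.

```python
import math
import random

SAMPLE_RATE = 44_100
TONE_HZ = 1_000
N = SAMPLE_RATE     # one second, exactly 1,000 cycles of the tone
STEP = 256          # one 16-bit step in 24-bit units
AMPLITUDE = 100     # below half a 16-bit step, so plain rounding loses it

def requantize(sample, rng=None):
    """Round a 24-bit value to 16 bits, optionally adding TPDF dither first."""
    noise = 0.0
    if rng is not None:
        noise = rng.uniform(-STEP / 2, STEP / 2) + rng.uniform(-STEP / 2, STEP / 2)
    return math.floor((sample + noise) / STEP + 0.5)

def tone_amplitude(samples_16bit):
    """Estimate the 1 kHz amplitude (in 24-bit units) by correlating with a sine."""
    total = sum(q * STEP * math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE)
                for n, q in enumerate(samples_16bit))
    return 2 * total / len(samples_16bit)

quiet_tone = [round(AMPLITUDE * math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE))
              for n in range(N)]
rng = random.Random(1)  # seeded only to make the run repeatable
plain = [requantize(s) for s in quiet_tone]
dithered = [requantize(s, rng) for s in quiet_tone]

print("samples that are non-zero without dither:", sum(1 for q in plain if q != 0))
print(f"estimated tone amplitude with dither: {tone_amplitude(dithered):.1f}")
```

Without dither, every output sample is zero and the tone is gone. With dither, the individual samples look like noise, but the estimate should come out close to the original amplitude of 100, because the tone is still encoded in how often each sample lands on the step above or below.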
Learn more – download the entire guide: Dithering With Ozone: Tools, Tips, and Techniques. Learn more about iZotope at iZotope.com.
Excellent article, and very nicely analyzed and explained–thank you!
I mostly dither with a WAVES RTAS L3 Multi-band Maximizer. @24 bit & try to create the session 88.2K . I was taught in college that 44.1K is 1/2 so when you dither the conversion will be easier for the algorithm, etc. to read & process. Thanks for the great article.
-Moshae Music
This is a wonderfully written and illustrated article! Now, can the author please write another piece to help us (both men and women) understand the opposite sex??? 😀
Fascinating! I’ll ask about this in production, on my next recording project.
My production facility has been using this practice for years……I never told anyone we were adding noise……and they never knew…….
I think this is the first time I actually understood dithering. Excellent article!
I record a lot on my Fostex VF160EX which is a 16 bit machine, so even though I’m not at 24 bits my recordings still sound very clean, even without the need for dithering.
This is awesome! Very informative and comprehensive! May I suggest one like this explaining the “mastering” process??
Andre – funny you should mention it. We’ll be doing another iZotope article borrowed from their mastering guide. Should be up in a few weeks. Thanks! –Andre Calilhanna, Echoes’ editor
This is the best explanation of dithering that I’ve ever read. I finally get it thanks to the use of the pics.
Terrific article! It dithered my brain! Good work.
This is an excellent article explaining dither. Very well written and easy to comprehend!