Dithering – adding “good noise” to improve your home recordings

by iZotope on January 10, 2013



The post below is excerpted and adapted from Dithering With Ozone: Tools, Tips, and Techniques, a guide produced by our friends at iZotope. Reprinted with permission.

What is dithering?
In your English class, to “dither” means to act nervously or indecisively. When we’re talking about digital audio and home studio recording, dithering is the process of adding noise to the audio signal. Adding noise, you say? Why would you want to add noise? Basically, it’s a trade: you accept a low-level hiss in exchange for less distortion when you convert 24-bit audio to 16-bit audio for transfer to a CD.

It starts by recognizing how digital audio is stored and represented. Unlike analog audio, which is “infinitely continuous,” digital audio is stored as a series of individual samples, each one a number made up of a fixed count of bits. This “quantization” means that while we might hear a continuous sound, it’s really a whole bunch of discrete 1s and 0s.

Sine Wave
As an example, from a distance, a digitally created sine wave looks and sounds pretty continuous.

Discrete Samples
If we zoom in though, we can see that our continuous looking waveform is actually a bunch of individual, or discrete, samples.

Sample Rate Bit Depth

Zooming in once more, you notice how much space is really in between these discrete samples. The horizontal distance between samples represents the sample rate, that is, how often samples are taken of the audio. The vertical distance between samples is a function of the bit depth, or “word length,” of the audio.

As we take more samples per second (e.g. 44.1 kHz, 48 kHz, 96 kHz, where “Hz” is literally the number of samples per second and “kHz” is the number of thousands of samples per second), we get better “horizontal” resolution, which translates to being able to represent higher frequencies. As we use more bits per sample (e.g. 16 bits, 24 bits, 32 bits, 64 bits, etc.) we get better “vertical” resolution, which translates to better “dynamic resolution” (i.e. more dynamic range and/or signal-to-noise ratio). In the context of dithering, it is the “bits per sample” that we’re interested in.
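To put rough numbers on the two kinds of resolution, here is a minimal Python sketch. The function names are made up for this illustration, and it leans on two standard results that the article does not spell out: a sample rate can represent frequencies up to half its value, and the ratio of full scale to one quantization step is 20·log10(2^bits) dB.

```python
import math

def quantization_levels(bits: int) -> int:
    """Number of distinct amplitude values an integer sample of this width can hold."""
    return 2 ** bits

def dynamic_range_db(bits: int) -> float:
    """Ratio of full scale to one quantization step, in dB (about 6 dB per bit)."""
    return 20 * math.log10(2 ** bits)

def highest_frequency_hz(sample_rate_hz: float) -> float:
    """Highest frequency a sample rate can represent: half the rate (assumed rule)."""
    return sample_rate_hz / 2

for bits in (16, 24):
    print(f"{bits}-bit: {quantization_levels(bits):,} levels, "
          f"about {dynamic_range_db(bits):.1f} dB")

for rate in (44_100, 48_000, 96_000):
    print(f"{rate} samples per second: up to {highest_frequency_hz(rate):.0f} Hz")
```

Only integer word lengths are shown here. The 32-bit and 64-bit formats used for mixing and effects are often floating point, which this simple formula does not describe.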

So, better dynamic resolution is pretty easy to get on your PC; just use more bits per sample. Get a 24-bit A/D converter; mix and edit at 32 bits; process effects at 64 bits. End of story. Except… audio CDs are 16-bit. At some point, if you want to put your songs on a CD, you have to get that 32-bit word length down to 16 bits. And that’s the problem.

Truncate the naughty bits
So it comes down to fitting 24 bits on a 16-bit CD. As Prince sings in “Insatiable,” “We’ll just erase the naughty bits.” That’s one solution — the simplest way to convert is to simply throw away, or “truncate,” the lowest 8 bits. Take 24 bits, throw away the lowest 8, you’re left with 16, and you can make a CD. (Upon further listening, we realize Prince might not have been talking about word length reduction through truncation in that song.)
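In code, “throw away the lowest 8 bits” amounts to a bit shift. The sketch below assumes samples are held as signed integers in the 24-bit range, and `truncate_24_to_16` is a name invented for this example rather than a function from any audio tool.

```python
def truncate_24_to_16(sample_24: int) -> int:
    """Drop the lowest 8 bits of a signed 24-bit sample.

    Python's >> is an arithmetic shift, so negative samples round toward
    negative infinity, as they would in two's-complement hardware.
    """
    if not -(1 << 23) <= sample_24 < (1 << 23):
        raise ValueError("sample is outside the signed 24-bit range")
    return sample_24 >> 8

samples_24 = [0, 255, 256, 1_000_000, -1, -257, 8_388_607]
samples_16 = [truncate_24_to_16(s) for s in samples_24]
print(samples_16)  # prints [0, 0, 1, 3906, -1, -2, 32767]
```

Notice that 0 and 255 both land on 0: whatever lived in those bottom 8 bits is simply gone.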

Before you just throw the bits away though, let’s check out a little comparison. First, look at a spectrum of our 1 kHz 24-bit sine wave. The spectrum of a pure tone should be a single spike at the frequency of the tone, as it is here.

24 Bit Sine Wave

Now we’ll convert our nice smooth 24-bit sine wave to 16 bits by simply truncating or throwing away the least significant 8 bits.

Sine Close Up
Looking first at the waveform in time, we can see that things got a little less smooth. Especially in the area with the circle, you can see that with fewer bits, we’re not able to represent the original smooth curve.

Truncation introduces “quantization error” which is the difference between where the higher resolution sine wave had its samples and where the lower resolution sine wave has to put the samples.
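The quantization error can be measured directly. This sketch builds a 1 kHz sine as 24-bit integers (the 44.1 kHz rate, half-scale amplitude, and helper names are assumptions made for the example) and records, for every sample, how far truncation moves it.

```python
import math

FULL_SCALE_24 = (1 << 23) - 1   # largest positive signed 24-bit value

def sine_24bit(freq_hz, sample_rate_hz, n_samples, amplitude=0.5):
    """Generate a sine wave as signed 24-bit integer samples."""
    return [
        round(amplitude * FULL_SCALE_24
              * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz))
        for n in range(n_samples)
    ]

def truncation_error(sample_24):
    """Error, in 24-bit steps, left behind when the low 8 bits are dropped."""
    truncated = (sample_24 >> 8) << 8    # the 16-bit value scaled back to 24 bits
    return sample_24 - truncated         # always between 0 and 255

wave = sine_24bit(1000, 44_100, 441)     # ten cycles of a 1 kHz tone
errors = [truncation_error(s) for s in wave]
print("smallest error:", min(errors), "largest error:", max(errors))
```

The error is never negative because truncation always rounds down, and it is computed from the signal itself, so it repeats whenever the signal repeats. That link to the signal is why it shows up as distortion rather than as a neutral hiss.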

No Dither Wave

A spectrum of the truncated 16-bit sine wave is even more revealing. Again, a pure sine wave should appear as a single spike. But as the truncated sine wave becomes more jagged and “squared off,” the spectrum reveals artifacts related to the quantization error (e.g. noise, harmonics). Whatever you want to call it, it doesn’t look or sound good.

When in doubt, dither
So what to do? We dither. We’ll get into the details of dithering later, but for now consider it as adding very low-level noise to the audio before it is converted from 24 bits to 16 bits. The results of converting with and without dithering are shown here:

Our dithered tone (white spectrum) looks better. We’ve effectively done away with the jagged quantization noise. It sounds better, but we have made a tradeoff and added some low-level noise.


The terrible truth is revealed: you’ve spent time and money on low noise preamps, A/Ds, and everything else. But in the end, you’re going to deliberately add noise to your mix when you convert it from 24 bits to 16 bits to put on a CD.

On the bright side, we’ve traded “bad noise” for “good noise.” Instead of quantization error, we’ve added a smoother, more “continuous” noise.
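One way to see the trade in numbers is to convert the same value many times and average the error. The sketch below is an illustration under stated assumptions, not the algorithm Ozone uses: it rounds to the nearest 16-bit step instead of truncating, and it uses triangular (TPDF) noise, a common textbook choice, spanning plus or minus one 16-bit step. All names are hypothetical.

```python
import math
import random

STEP = 256  # one 16-bit step, measured in 24-bit units (2**8)

def round_to_16(sample_24: int) -> int:
    """Requantize to 16 bits by rounding to the nearest step, with no dither."""
    return math.floor(sample_24 / STEP + 0.5)

def dither_to_16(sample_24: int, rng: random.Random) -> int:
    """Add triangular noise (the sum of two uniform values), then round."""
    noise = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)  # in 16-bit steps
    return math.floor(sample_24 / STEP + noise + 0.5)

def mean_error(requantize, sample_24: int, trials: int = 20_000) -> float:
    """Average error, in 24-bit units, over many conversions of one value."""
    total = sum(requantize(sample_24) * STEP - sample_24 for _ in range(trials))
    return total / trials

rng = random.Random(1)  # fixed seed so the run is repeatable
for value in (64, 128, 192):
    plain = mean_error(round_to_16, value)
    dithered = mean_error(lambda s: dither_to_16(s, rng), value)
    print(f"input {value}: plain error {plain:+.1f}, dithered error {dithered:+.1f}")
```

Without dither, the error is a fixed function of the input (here -64, +128, and +64), which is what ties it to the signal. With dither, each individual conversion is noisier, but the average error sits close to zero whatever the input is.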

(Editor’s note: We’re simplifying not only the principles behind dithering but the benefits as well. Just understanding the tradeoff to begin with is a good place to start, though.)

The Atlantic puffin
At the risk of confusing the issue with an analogy, here’s another way to look at it. Image processing faces the same issues of bit depth, resolution and dithering. When representing a photo on a computer, each sample has a discrete number of colors or shades of gray that it can represent. The number of levels is determined by the bit depth of the file, just like the levels of samples of audio.

Puffin 24 Bit

Check out this picture of an Atlantic puffin. He’s looking good in 24-bit gray scale, meaning that every dot that makes up the photo can be one of over 16 million shades of gray. It looks pretty “continuous.” Sixteen million shades of gray is a pretty good dynamic range.

Puffin 2 Bit

Here is what happens if we truncate the photo down to 2 bits.
Now we only have 4 shades of gray to represent the photo and something is going to be lost. Same as in digital audio, each sample has to be forced to a level, and there aren’t enough levels anymore to represent the data as a continuous image (or a smooth continuous signal in the case of audio).

Puffin 2 Bit Dithered

As with digital audio, we can dither images. The concept is the same: we add controlled noise before we convert from 24 bits to 2 bits. The result of the conversion, using dither, is shown here.
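The same idea fits in a few lines of Python. This sketch makes some simplifying assumptions: the source pixels are gray values from 0 to 255 rather than 24-bit values, the noise is plain uniform random noise one output step wide, and the function names are invented for the example. Real image software typically uses more refined dithering patterns.

```python
import random

LEVELS = 4                  # 2 bits per pixel
STEP = 255 / (LEVELS - 1)   # spacing between the four output shades (85.0)

def quantize(pixel: float) -> int:
    """Snap a 0..255 gray value to the nearest of the four 2-bit shades."""
    index = min(LEVELS - 1, max(0, round(pixel / STEP)))
    return round(index * STEP)

def quantize_dithered(pixel: float, rng: random.Random) -> int:
    """Add uniform noise one step wide before snapping to a shade."""
    return quantize(pixel + rng.uniform(-STEP / 2, STEP / 2))

rng = random.Random(1)      # fixed seed so the run is repeatable

# One row of a black-to-white gradient, 64 pixels wide.
gradient = [x * 255 / 63 for x in range(64)]
plain_row = [quantize(p) for p in gradient]
dithered_row = [quantize_dithered(p, rng) for p in gradient]
print("plain   :", plain_row[:16])
print("dithered:", dithered_row[:16])

# A flat patch of gray value 100, which none of the four shades can match.
patch = [100] * 10_000
plain_mean = sum(quantize(p) for p in patch) / len(patch)
dithered_mean = sum(quantize_dithered(p, rng) for p in patch) / len(patch)
print("patch average, plain:", plain_mean, "dithered:", round(dithered_mean, 1))
```

The plain patch comes out as a solid block of shade 85. The dithered patch is a speckle of 85s and 170s whose average brightness stays close to the original 100, which is how the eye recovers detail that no single pixel can hold.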

Like 8-bit audio (or even 16-bit compared to 24-bit) our 2-bit dithered picture is not great, but it’s better than when we just truncated. In fact, it shows three characteristics that are analogous to reducing the bit depth of audio with dither.

1) We’ve reduced the quantization error (the jaggedness or “blockiness”) with the tradeoff of adding a different type of noise (the speckles in the dithered picture).

2) We’ve actually maintained some detail from the original higher resolution picture. In audio, this relates to the concept that reducing the number of bits per sample with dithering can give you a greater perceived dynamic range than the signal-to-noise ratio alone would suggest.

Not making sense? Well, the theoretical maximum signal-to-noise ratio of a 16-bit recording is about 96 dB. That means, in basic terms, that the noise floor of a 16-bit recording is around -96 dB. The noise floor should not be confused (but often is) with dynamic range, which represents the difference between the loudest and softest signal that you can hear in the recording. You can hear a dynamic range which extends past the noise floor with proper dithering. Even if a signal is “in the noise,” you will still hear it with proper dithering. The breadth of the dynamic range you can hear below the noise floor can’t be expressed mathematically. It’s a result of the program material, and how well you can hear — or more precisely: how well a listener can resolve signal from noise at low levels.

3) The picture also answers a common audio question: “If I know I’m making a 16-bit CD in the end, should I still record, mix and master at 24 bits?”

Yes, the extra bits provide you (or more specifically the dither) with information that allows the dynamic range to be greater than the noise floor. In short, 24-bit properly dithered to 16-bit will sound better than an original 16-bit recording.
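Points 2 and 3 can be checked with a tone that is too quiet for 16 bits to hold on its own. In this sketch the test tone peaks at 100 in 24-bit units, less than half of one 16-bit step (256). The triangular dither, the rounding quantizer, and the correlation measurement are assumptions chosen for the illustration, and every name is hypothetical.

```python
import math
import random

STEP = 256     # one 16-bit step, measured in 24-bit units
PERIOD = 64    # samples per cycle of the test tone

def quiet_tone(n_samples, peak=100):
    """A 24-bit sine whose peak is below half of one 16-bit step."""
    return [round(peak * math.sin(2 * math.pi * n / PERIOD))
            for n in range(n_samples)]

def to_16_plain(sample_24):
    """Round to the nearest 16-bit step, no dither."""
    return math.floor(sample_24 / STEP + 0.5)

def to_16_dithered(sample_24, rng):
    """Add triangular noise of plus or minus one step, then round."""
    noise = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
    return math.floor(sample_24 / STEP + noise + 0.5)

def tone_level(samples_16):
    """Estimate the tone's peak (in 24-bit units) by correlating with the sine."""
    acc = sum(s * STEP * math.sin(2 * math.pi * i / PERIOD)
              for i, s in enumerate(samples_16))
    return 2 * acc / len(samples_16)

rng = random.Random(1)  # fixed seed so the run is repeatable
tone = quiet_tone(64_000)
plain = [to_16_plain(s) for s in tone]
dithered = [to_16_dithered(s, rng) for s in tone]

print("plain 16-bit tone level   :", tone_level(plain))
print("dithered 16-bit tone level:", round(tone_level(dithered), 1))
```

Without dither, every sample rounds to zero and the tone is erased. With dither, the 16-bit file is mostly noise sample by sample, yet the measured tone level comes back near the original 100. The information that made this possible came from the bits below the 16-bit step, which is the case for recording and mixing at 24 bits.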

Learn more – download the entire guide: Dithering With Ozone: Tools, Tips, and Techniques. Learn more about iZotope at iZotope.com.


{ 12 comments }

Bill Turner July 17, 2013 at 3:39 am

Excellent article, and very nicely analyzed and explained–thank you!


Moshae Beats January 23, 2013 at 5:52 am

I mostly dither with a WAVES RTAS L3 Multi-band Maximizer. @24 bit & try to create the session 88.2K . I was taught in college that 44.1K is 1/2 so when you dither the conversion will be easier for the algorithm, etc. to read & process. Thanks for the great article.

-Moshae Music


JSCE January 22, 2013 at 7:33 pm

This is a wonderfully written and illustrated article! Now, can the author please write another piece to help us (both men and women) understand the opposite sex??? 😀


Sheir McHenry January 22, 2013 at 6:47 pm

Fascinating! I’ll ask about this in production, on my next recording project.


Lucindra January 22, 2013 at 6:28 pm

My production facility has been using this practice for years……I never told anyone we were adding noise……and they never knew…….


VC January 22, 2013 at 6:13 pm

I think this is the first time I actually understood dithering. Excellent article!


jk January 22, 2013 at 5:15 pm

I record a lot on my Fostex VF160EX which is a 16 bit machine, so even though I’m not at 24 bits my recordings still sound very clean, even without the need for dithering.


André Lipsey January 22, 2013 at 5:10 pm

This is awesome! Very informative and comprehensive! May I suggest one like this explaining the “mastering” process??


Andre Calilhanna January 23, 2013 at 9:58 am

Andre – funny you should mention it. We’ll be doing another Izotope article borrowed from their mastering guide. Should be up in a few weeks. Thanks! –Andre Calilhanna, Echoes’ editor


Illson Fisk January 22, 2013 at 4:48 pm

This is the best explanation of dithering that I’ve ever read. I finally get it thanks to the use of the pics.


Janet Fisher January 16, 2013 at 1:10 pm

Terrific article! It dithered my brain! Good work.


Audy Kimura January 15, 2013 at 4:27 pm

This is an excellent article explaining dither. Very well written and easy to comprehend!

