Mastering for SoundCloud using iZotope’s Ozone 8

Illustration: Gabe Alcala

It’s happened to all of us.

After days, weeks, or even months, of tweaking a mix, you finally upload your creation to share with the world. At long last, you’re ready to bare your soul, to open your heart to the unfeeling abyss of your SoundCloud feed, prepared to suffer the slings and arrows of outrageous commenters. You savor those six sweet words like a sommelier swishing an ‘86 Bâtard-Montrachet: “Upload complete. Go to your track.” This is your destiny; you can do no other. Adrenaline rushes through your meatspace frame as you click through and see the play count increment. This is the moment — !

Until you realize your track sounds horrible. Your pristine 24-bit, 96 kHz WAV is popping and crackling like Rice Krispies. What used to be roomy, full sub-bass now reminds you of a newborn’s flatulence. Chattering, precise hats sound flat and uninspiring. Worst of all, you’ve got to crank your speakers to even hear it: compared to your mutuals’ tracks, your mix is noticeably softer.

What could have possibly transformed your masterpiece into such a disaster?

Loudness war is hell

In order to answer this question, we’re going to take a (very) brief detour into psychoacoustics. To make a long story short, all the way back in 1933 two very smart people named Fletcher and Munson discovered (among other aural revelations) that our ears don’t hear all frequencies equally, and that the balance shifts with volume: turn a track up and the bass and treble seem to fill out. The upshot is that louder sounds are more exciting. Makes sense, right? A high sound pressure level stimulates our ears’ stereocilia to a greater degree, releasing more neurotransmitters to our nerves and more electrical signals to our central nervous systems. In other words, we perceive more loudness as “better.”

You don’t need to be a neuroscientist to guess what this means for digital audio. Over time, mastered songs have become louder and louder, because a louder master seems more exciting to the ear compared to its competitors. You may have heard stories about the “Loudness War” – this was the colloquial term for the competition between engineers driving tracks louder and louder, compromising more and more dynamic range in order to stand out against their peers. RMS levels of -21 dBFS were common in pop music from the 1950s and 60s, but a 2000s pop record might be as hot as -10 dBFS RMS or louder!

With the advent of streaming services, however, came a solution to this seemingly impassable problem. Services like Tidal, Spotify and Apple Music now play tracks at a uniform volume across their platform so that listeners don’t need to constantly tweak the volume slider. This is accomplished through a process called loudness normalization, whereby incoming tracks are adjusted to hit a certain loudness level.

This ostensibly small change in audio processing effectively ended the loudness war, as mastering engineers noticed that even if their deliverables were incredibly loud before streaming normalization, the final product was automatically adjusted to fit the platform’s standards. Making tracks louder than the platform’s target (-14 LUFS on Spotify, for example) accomplishes nothing other than destroying dynamic range and squashing the track.
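To see why an extra-loud master gains nothing on a normalizing platform, here is a minimal sketch of the arithmetic. It assumes the simplest possible model: the player measures a track's integrated loudness and applies one static gain to land on the target. The function names are hypothetical, the -14 LUFS default is the Spotify figure quoted above, and real services may differ in the details (some only turn tracks down, for instance).

```python
def normalization_gain_db(measured_lufs: float, target_lufs: float = -14.0) -> float:
    """Static gain (in dB) a normalizing player would apply under this simple model."""
    return target_lufs - measured_lufs


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10 ** (gain_db / 20)


# A squashed master, an on-target master, and a dynamic master.
for measured in (-8.0, -14.0, -20.0):
    gain = normalization_gain_db(measured)
    print(f"{measured:6.1f} LUFS master -> {gain:+5.1f} dB gain "
          f"(amplitude x{db_to_linear(gain):.2f})")
```

Under these assumptions the -8 LUFS master is simply turned down by 6 dB, so it plays back no louder than its neighbors but keeps all the damage done by the limiting.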

What does SoundCloud do to my music and why?

Unlike other streaming services, SoundCloud does not apply loudness normalization to your tracks. This means that if you upload a track at -3 LUFS, it’s going to play back at -3 LUFS. Like the soldiers who continued fighting the Civil War after the surrender at Appomattox, the loudness war still exists on SoundCloud. If you mix your track with an ear towards preserving dynamic range and delivering a dynamic master, it’s just not going to stand out against the competition on SoundCloud.

But how can we explain those nasty artifacts that are totally there, dude, I swear it didn’t sound like this in my studio!!!? Well, some of it might be due to the compression that SoundCloud applies to your music. Note that in this context, “compression” is referring to compression of data, not dynamic range compression. SoundCloud takes your upload and converts it to an mp3 file, a process which does, necessarily, degrade the audio – no matter what quality audio file you upload to SoundCloud, lossy or not, you’re going to hear a 128kbps mp3 when you press that big orange play button.

Don’t freak out just yet! There’s a very good reason for this. SoundCloud delivers you mp3s because they are much smaller than lossless files. In the words of Rory Seydel at LANDR:

“In short an MP3 is a coded version of your track that has been strategically degraded in quality to minimize the data size for online streaming. The encoder reduces bits that are perceptually less important, giving us a smaller file.”

Unfortunately, bandwidth remains limited for many SoundCloud users. Data compression through mp3 transcoding is performed in order to improve the streaming experience for users on slower connections. This makes some sense: consider a 3:30 long song. A 16-bit, 44.1 kHz stereo WAV file with a bitrate of 1,411kbps is about 37 MB in size, whereas a 320kbps mp3 of the same duration is 8.4 MB and a 128kbps mp3 weighs in at a minuscule 3.4 MB. For a user with a fast Internet connection, streaming any of these files is no problem. But for someone on a shared connection, or someone using data on their phone, the roughly 90% reduction in data use might be a really big deal. By compressing streaming audio to a 128kbps mp3, SoundCloud is trading a little bit of audio quality for a lot of usability. It’s probably not the best for power users, but it’s a sound decision for those of us with slow or metered connections.
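The file sizes above are back-of-the-envelope arithmetic: bitrate times duration, divided by eight to get bytes. A quick sketch, assuming a stereo file, decimal megabytes (1 MB = 1,000,000 bytes), and no container or metadata overhead; the function name is made up for illustration. Figures quoted in binary megabytes will come out a few percent smaller.

```python
def stream_size_mb(bitrate_kbps: float, seconds: float) -> float:
    """Approximate audio size in decimal megabytes for a constant bitrate."""
    bits = bitrate_kbps * 1000 * seconds
    return bits / 8 / 1_000_000


DURATION_S = 3 * 60 + 30  # a 3:30 song

# Uncompressed PCM bitrate: bit depth x sample rate x channels.
wav_kbps = 16 * 44_100 * 2 / 1000  # 1411.2 kbps

for label, kbps in (("16-bit/44.1 kHz WAV", wav_kbps),
                    ("320kbps mp3", 320),
                    ("128kbps mp3", 128)):
    print(f"{label:>20}: {stream_size_mb(kbps, DURATION_S):5.1f} MB")

saving = 1 - 128 / wav_kbps
print(f"128kbps mp3 vs WAV: {saving:.0%} less data")
```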

“By today’s standards, 128kbps MP3 is pretty low quality. Spotify’s free tier streams at 160kbps for desktops, with the superior Ogg Vorbis codec. Spotify’s paid premium service can stream up to 320kbps Ogg Vorbis, which can be considered “transparent” for 99% of the population. Apple Music streams at 256kbps AAC, and TIDAL even streams lossless for $19.99/month. Meanwhile, even SoundCloud’s paid service streams at a measly 128kbps with an outdated codec.”

Brian Li, 441k

How much does all of this matter? That’s a good question. Most trained engineers can discern a difference in quality between 128kbps mp3 files and lossless WAVs, but for the average listener, it might not be a big deal. It is an incontrovertible fact, however, that the mp3 transcode process does introduce artifacts into your audio.

Try and see if you can figure out the difference between WAV, 128kbps and 320kbps MP3:

For a deeper dive into the effects of data compression on your music, check out Ian Corbett’s piece on the topic at Sound on Sound.

TL;DR

So now you understand what SoundCloud does to your music and why. How can we use this information to make sure our tracks sound as perfect as possible given these limitations?

First of all, let’s dispel a common myth. Bouncing to 128kbps mp3 and then uploading to SoundCloud will not result in a more faithful reproduction during streaming playback. Your track will go through the transcoding process anyway, meaning that in all likelihood this method will leave you with an even more heavily artifacted final product. All you can really do is upload a lossless file and hope for the best, keeping in mind that lossless audio will typically weather the transition better than lossy formats. If you really need to use SoundCloud to distribute lossless audio, make your track available for download and it will be accessible from the SoundCloud servers in all its uncompressed glory.

That said, you can also use Ozone to achieve some measure of control over your sound.

Maximizer

If we want to achieve the least distortion possible when transcoding to a 128kbps mp3, the Maximizer may be the most important component in our Ozone toolkit. Leaving plenty of headroom in your track will help mitigate the effects of clipping that might occur during the conversion. Intersample peaks can create nasty distortion in the process of the conversion, so leave some room for error.

iZotope themselves recommend setting the Maximizer’s margin to -0.3 dB at most. If you upload the track and still hear digital clipping effects, jump back into your DAW and bounce the track again with a margin of, say, -1.0 dB. You’ll be giving up a little bit of loudness, but if re-bouncing the track lets you avoid some of that nasty distortion you’ll be glad you took the time.
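To get a feel for why intersample peaks call for headroom, here is a small illustration using a deliberately contrived signal: a sine wave at one quarter of the sample rate, offset by 45 degrees so that no sample ever lands on the crest of the wave. The sketch evaluates the known sine formula on a denser grid rather than running a real reconstruction filter, and the names are hypothetical. Treat it as a worst-case toy; real program material usually overshoots by far less.

```python
import math


def to_db(amplitude: float) -> float:
    """Convert a linear amplitude ratio to decibels."""
    return 20 * math.log10(amplitude)


def sine(t: float) -> float:
    """Sine at fs/4 with a 45-degree phase offset; t is measured in sample periods."""
    return math.sin(math.pi / 2 * t + math.pi / 4)


# What a sample-peak meter sees: only the values at integer sample positions.
samples = [sine(n) for n in range(64)]
sample_peak = max(abs(s) for s in samples)

# What the waveform does between samples (16 points per sample period).
OVERSAMPLE = 16
dense = [sine(n / OVERSAMPLE) for n in range(64 * OVERSAMPLE)]
true_peak = max(abs(s) for s in dense)

overshoot_db = to_db(true_peak / sample_peak)
print(f"sample peak {sample_peak:.4f}, true peak {true_peak:.4f}, "
      f"overshoot {overshoot_db:.2f} dB")

# If a limiter pins the *sample* peaks at a given margin, the waveform
# between samples can still exceed it by the overshoot.
for margin_db in (-0.3, -1.0):
    print(f"margin {margin_db:+.1f} dB -> intersample peak near "
          f"{margin_db + overshoot_db:+.2f} dB")
```

The point is not the exact numbers but the mechanism: the samples can all sit under the ceiling while the reconstructed waveform pokes above it, and lossy encoding can shift peaks around further. Extra margin buys room for both.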

EQ

Part of the data compression process for a 128kbps mp3 includes filtering out any information that resides above 16 kHz. It would stand to reason that you shouldn’t have any critically necessary signal living in this range alone. Of course, the only sounds that live at 16 kHz and above are “air” or “sizzle”, engineering terms for extremely high-frequency content. As we age, humans lose the ability to even recognize this audio: most middle-aged people can’t hear this frequency range at all, and even some young people have trouble hearing these sounds. Since the mp3 format was designed to achieve maximum data compression with minimum degradation of audio quality, it makes sense that the transcode would filter out some of these super-high sounds.

With that in mind, consider using EQ to de-emphasize some of these frequencies.

Stereo Imager

MP3 encoding can “blur” carefully crafted stereo images and lead to increased distortion. Paradoxically, one of the best ways to pre-empt this loss is to use the stereo imager to make your track a little bit narrower. This will help prevent individual channel peaks from causing aliasing or clipping in your mix when transcoding.

Note: With great power comes great responsibility. Stereo Imagers are powerful tools, but they do have the ability to seriously mess up your track. If you’re going to convert your mix to mp3, there’s a good chance that the compression will destroy your meticulously panned elements anyway. However, mucking around with the stereo image can introduce phase cancellation issues. It’s always a good idea to check your mixes in mono after everything else is done!

Try reducing the width of your higher frequency bands within the Stereo Imager module – a good starting point is around 10%. This is going to require some trial and error to get completely right, so try bouncing out versions with 5%, 10%, and 15% stereo width reductions on your top 3 frequency bands. Then upload them to SoundCloud and stream them back, taking note of which stereo image feels right to you.

It’s also not a bad idea to mono out your sub bass while you’re already in the Imager module. There’s really no reason to have audio information below 80 Hz in stereo, since the human ear is unable to place audio signals in the stereo field below this frequency. It’s a good idea to do this directly on your tracks with sub bass info, but you can also set your sub frequency band to around 80 Hz and mono it out. This will clear up some space in your mix since your ears won’t be able to localize these sub frequencies anyway.
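For the curious, stereo width controls are commonly built on mid/side processing: split the signal into what both channels share (mid) and what differs between them (side), scale the side, and recombine. The sketch below shows that general technique on a handful of sample values. It is an assumption for illustration, not a description of how Ozone’s Imager works internally, and its 0-to-1 width scale will not necessarily line up with the percentages shown in the plugin. A real multiband imager would also split the signal into frequency bands first, which this sketch skips.

```python
def adjust_width(left, right, width):
    """Scale the side (L-R) component of a stereo signal.

    width = 1.0 leaves the signal untouched, 0.0 collapses it to mono,
    and 0.9 corresponds to a 10% width reduction under this model.
    """
    out_left, out_right = [], []
    for l, r in zip(left, right):
        mid = (l + r) / 2    # what both channels have in common
        side = (l - r) / 2   # what differs between them
        out_left.append(mid + width * side)
        out_right.append(mid - width * side)
    return out_left, out_right


# Three hypothetical stereo sample pairs.
left = [1.0, 0.5, -0.25]
right = [0.0, 0.5, 0.75]

narrow_l, narrow_r = adjust_width(left, right, 0.9)  # slightly narrower
mono_l, mono_r = adjust_width(left, right, 0.0)      # fully mono, as for a sub band

print("narrowed:", narrow_l, narrow_r)
print("mono:    ", mono_l, mono_r)
```

Notice that narrowing pulls the two channels toward each other, which also lowers the largest single-channel peak in this example (1.0 becomes roughly 0.95). That is the same effect the advice above relies on.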

These suggestions are just that – suggestions. It’s important to trust your ears: if going against one of these ideas makes your song sound better, then by all means go with it. That said, mastering and optimizing for SoundCloud is not a magic bullet for making your tracks sound good. A bad mix will produce a bad result no matter what mastering chain you apply to it, and a 128kbps .mp3 will never reproduce an original file as faithfully as a .wav will. But it’s still worth trying out these tips so that you can put your best foot forward to anyone who clicks onto your profile.

Ozone 9 Advanced, the latest version of iZotope’s mastering suite, offers even more powerful features — try it for free for three days.




audio 1 – 128kbps MP3
audio 2 – 320kbps MP3
audio 3 – WAV

December 11, 2017

Max Rewak is a record producer, audio engineer, and music writer, based in New York and currently working in Sounds content at Splice.