Have you ever wondered why some tracks sound “wide,” or why the explosions in an action movie seem to come from all around you in the theater?
When producers use the term “stereo image,” they’re specifically referring to the difference between the signals in the two channels of stereo audio. This difference can allow the listener to perceive the “location,” or “width,” of a sound source in a recording. But in a more technical sense, the term “stereographic” refers to a technique by which a three-dimensional object is mapped onto a two-dimensional plane. In this post, we’ll use this analogy to understand the various flavors of multi-channel spatial audio techniques used by audio professionals to trick our brains into “locating” sounds.
Monaural sound, usually referred to as “mono,” is the oldest and most basic category of sound reproduction, using only one channel of audio. When speakers reproduce audio signals, our brains perceive “width” based on differences in the signals. Like a single point on a geometric plane, mono sound only has one source, so when a mono sound is reproduced on a stereo speaker system (like headphones or most speaker setups), the mono signal is reproduced in exactly the same way in both audio channels. No difference in the signals means that no psychoacoustic localization occurs, so the sound source has no “width.”
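To make the "no difference, no width" idea concrete, here is a minimal Python sketch. The function names `mono_to_stereo` and `side_signal` are made up for this illustration and do not come from any particular audio library. It copies a mono signal into both channels and then measures the difference between them.

```python
def mono_to_stereo(mono):
    """Duplicate a mono signal into identical left and right channels."""
    left = list(mono)
    right = list(mono)
    return left, right


def side_signal(left, right):
    """Half the per-sample difference between the two channels."""
    return [(l - r) / 2 for l, r in zip(left, right)]


# A few hypothetical sample values standing in for a mono recording.
mono = [0.0, 0.5, -0.25, 1.0]
left, right = mono_to_stereo(mono)
side = side_signal(left, right)

# Every value in `side` is zero: the channels carry no difference,
# so there is nothing for the brain to use as a "width" cue.
print(side)
```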
While most audio today is recorded, produced, and played back in stereo, mono audio is still used for some applications like radio broadcasting. Because a mono signal needs only half as many channels as a stereo signal, it is simpler to handle and therefore a better choice than stereo for certain applications.
The vast majority of modern music is released in stereo, and stereo output is the default in most, if not all, DAWs. But what does it really mean? Stereophonic sound has two channels, left and right.
This might not sound like a big difference from mono, but having two channels lets engineers give audio a sense of “width”: being able to mix the left and right channels separately means being able to create two different “sides.” In stereo, listeners’ brains can locate audio based on amplitude in each channel; if a sound is louder in the left channel than it is in the right, the listener’s brain localizes the audio on the left. Stereo audio allows our signal to be measured across one spatial dimension, just like a straight line on a geometric plane.
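The amplitude-based localization described above is what a pan control does. The sketch below uses a constant-power pan law, which is one common choice among several and an assumption of this example; the function name `constant_power_pan` and the -1.0 to 1.0 pan range are also just conventions picked for illustration.

```python
import math


def constant_power_pan(sample, pan):
    """Split one mono sample into (left, right) using a constant-power law.

    pan runs from -1.0 (hard left) through 0.0 (center) to 1.0 (hard right).
    """
    # Map the pan position onto a quarter circle: 0 rad is hard left,
    # pi/2 rad is hard right.
    angle = (pan + 1.0) * math.pi / 4.0
    left_gain = math.cos(angle)
    right_gain = math.sin(angle)
    return sample * left_gain, sample * right_gain


# The same mono sample placed at three positions on the stereo "line".
for pan in (-1.0, 0.0, 1.0):
    left, right = constant_power_pan(1.0, pan)
    print(f"pan={pan:+.1f}  left={left:.3f}  right={right:.3f}")
```

With this law the left and right gains always satisfy left² + right² = 1, so the sound keeps roughly the same perceived loudness as it moves across the stereo field.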
As an aside, there is a specific case of stereo audio known as binaural audio that uses psychoacoustics to simulate spatial effects that do not exist in traditional stereo recordings. For more on this case, check out this blog post that explains more about binaural recordings.
Surround sound protocols allow engineers to mix even more channels together. In a standard 5.1 surround setup, six channels are used: five full-bandwidth channels and one for something called LFE, or Low-Frequency Effects. These channels’ signals are then sent to speakers set up around the listening position, including behind the listener. This way, the villain in a horror movie creeping up behind the hero really sounds like he’s creaking the floorboard in the hallway behind you.
If stereo is like a straight line, you can imagine surround sound as a square. Two channels of audio allowed us to place sounds to the left and right in the stereo field. Surround sound allows us to use that stereo field as a flat plane instead of simply a line: it has stereo “length.” Although it probably doesn’t make sense for most musical artists to mix their tracks in surround, this extra dimension has historically made surround a great choice for film, especially in the context of home theater experiences where a full professional audio installation is overkill.
Where do we go from here in terms of situating audio in a spatial field? This is a question being answered by companies, researchers, and developers in the realm of ambisonics, the next dimension up from surround sound. Through ambisonics, users are able to position sounds in three dimensions – like a sphere around the listening position. Just as a single point (mono) has no length and a single line (stereo) has no width, a single plane (surround) has no height.
Ambisonic audio adds the dimension of height to make sounds seem as though they are coming from above or below the listener. In the case of Dolby’s Atmos systems, for example, this is achieved by placing extra speakers on the ceiling, but one of the coolest features of ambisonic audio is its ability to be automatically downmixed when reproduced on a system without this type of speaker installation. This means that engineers who mix in ambisonics typically aren’t required to generate separate deliverables for stereo and surround sound formats.
There are several ways to create ambisonic recordings using fancy mic position schemes and arrays. However, the most common technique for creating ambisonic sound is by using our old friend, monaural sound. Software like Dolby’s Atmos Production Suite works by sending mono sounds around a three-dimensional sound field. You can “pan” sounds in three dimensions and leave them there, but superior results may be achieved by assigning mono sounds a directional vector and velocity. Directional cues created in this way help our brains understand and localize the sounds in the field.
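As a rough illustration of panning a mono sound in three dimensions, the sketch below encodes one sample into traditional first-order B-format (the four components W, X, Y, and Z) and then folds it down to stereo with two virtual cardioid microphones. This is a textbook-style simplification, not how any particular product works internally, and the function names and the angle convention (azimuth measured counterclockwise, so positive values are to the listener's left) are assumptions made for this example.

```python
import math


def encode_first_order(sample, azimuth_deg, elevation_deg):
    """Encode a mono sample into first-order B-format (W, X, Y, Z).

    Assumed convention: azimuth 0 is straight ahead, positive azimuth is
    to the left, positive elevation is above the listener.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)                # omnidirectional component
    x = sample * math.cos(az) * math.cos(el)   # front/back
    y = sample * math.sin(az) * math.cos(el)   # left/right
    z = sample * math.sin(el)                  # up/down (height)
    return w, x, y, z


def decode_to_stereo(w, x, y, z):
    """Fold B-format down to stereo with two virtual cardioid mics.

    The virtual mics point hard left and hard right, so front/back (x)
    and height (z) do not contribute to the result.
    """
    left = 0.5 * (math.sqrt(2.0) * w + y)
    right = 0.5 * (math.sqrt(2.0) * w - y)
    return left, right


# A sound directly to the listener's left, at ear height.
b_format = encode_first_order(1.0, azimuth_deg=90.0, elevation_deg=0.0)
print(decode_to_stereo(*b_format))

# The same sound directly overhead: the height cue is lost in stereo,
# so it lands equally in both channels.
b_format = encode_first_order(1.0, azimuth_deg=0.0, elevation_deg=90.0)
print(decode_to_stereo(*b_format))
```

The second half of the example also hints at why a downmix is possible at all: the three-dimensional description of the sound stays the same, and only the decoding step changes to match the speakers that are actually available.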
Most ambisonic audio deliverables for film or broadcast are created using a stereo “bed,” which is just a standard stereo (or maybe 5.1) audio composition using sounds that are not intended to move around the ambisonic sphere. Then, mono sound objects are placed in, or sent around, the space to create the illusion of “movement” around the listener. In film, this movement can be programmed to be in sync with what’s happening on-screen to enhance the action.
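One simple way to picture "programmed" movement is as a list of keyframes: at each listed time, the object should sit at a given angle, and the position in between is interpolated. The sketch below is a hypothetical illustration of that idea only. The keyframe format, the name `azimuth_at`, and the choice of linear interpolation are all assumptions, not a description of any real mixing tool.

```python
def azimuth_at(keyframes, t):
    """Return the object's azimuth in degrees at time t (seconds).

    keyframes is a list of (time_seconds, azimuth_degrees) pairs sorted
    by time. Positions between keyframes are linearly interpolated, and
    the object holds its first or last position outside the range.
    """
    if t <= keyframes[0][0]:
        return keyframes[0][1]
    for (t0, a0), (t1, a1) in zip(keyframes, keyframes[1:]):
        if t0 <= t <= t1:
            return a0 + (a1 - a0) * (t - t0) / (t1 - t0)
    return keyframes[-1][1]


# A hypothetical fly-by that sweeps from one side of the listener
# to the other over two seconds, in sync with the picture.
flyby = [(0.0, -90.0), (2.0, 90.0)]

for t in (0.0, 0.5, 1.0, 1.5, 2.0):
    print(f"t={t:.1f}s  azimuth={azimuth_at(flyby, t):+.1f} degrees")
```

The bed would be left untouched while each object's position is evaluated at every moment and passed to the panner.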
Ambisonic audio in music is still a nascent technology, but some engineers are already beginning to create and remix audio for reproduction on ambisonic systems. Have you tried mixing in any non-stereo formats? Let us know in the comments below.
September 18, 2019