It has taken years, but immersive audio has finally become as popular as 4K television and high dynamic range video. Whether used for video productions or audio-only programming, 360-degree spatial sound has a “wow” factor like little else.
As with any new technology, the various formats, flurry of products and unfamiliar terminology can be confusing and downright overwhelming. That, plus the lack of a single technical standard, means we are still in a bit of an immersive audio Wild West.
First, what exactly is immersive sound? Let’s start with surround sound, a channel-based technology where the audio is encoded to play through specific speakers in a pre-defined arrangement.
Immersive sound, on the other hand, transforms multi-channel surround to a new sonic level by using height or presence channels to create a dome of sound engulfing the listener. Typically, this is done by mounting speakers high on walls or the ceiling.
While traditional multi-channel audio can be used to create immersive sound, a type of encoding called “object-based audio” has emerged. Object-based audio allows a mix to direct sounds toward specific areas within a space. Compatible receivers can decode object-based audio and use available speakers to replicate the sound placement as originally intended.
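To make the object-based idea concrete, here is a minimal sketch in Python. It is not how Dolby, DTS or any other vendor's renderer works; those are proprietary and operate in three dimensions. It assumes a flat ring of speakers and simple constant-power panning between the two speakers nearest the object. The names `AudioObject` and `render_gains` are hypothetical and exist only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class AudioObject:
    """A sound plus positional metadata (hypothetical, 2D only)."""
    name: str
    azimuth_deg: float  # 0 = straight ahead, positive = toward the listener's left


def render_gains(obj, speaker_azimuths_deg):
    """Map one object onto whatever speakers are available.

    Returns a dict of {speaker azimuth (0-359): gain}. Only the two
    speakers on either side of the object receive signal, using
    constant-power panning. This is an assumed, simplified renderer.
    """
    speakers = sorted({a % 360 for a in speaker_azimuths_deg})
    if len(speakers) < 2:
        raise ValueError("need at least two distinct speaker positions")

    target = obj.azimuth_deg % 360
    gains = {a: 0.0 for a in speakers}

    for i, lo in enumerate(speakers):
        hi = speakers[(i + 1) % len(speakers)]  # wraps around the ring
        span = (hi - lo) % 360
        offset = (target - lo) % 360
        if offset <= span:
            frac = offset / span  # 0 at lo, 1 at hi
            gains[lo] = math.cos(frac * math.pi / 2)
            gains[hi] = math.sin(frac * math.pi / 2)
            break
    return gains


if __name__ == "__main__":
    helicopter = AudioObject("helicopter", azimuth_deg=15.0)
    # The same object rendered on two different rooms.
    print(render_gains(helicopter, [0, 30, -30, 110, -110]))  # five-speaker ring
    print(render_gains(helicopter, [30, -30]))                # plain stereo
```

The point of the sketch is that the mix stores where the sound should be, not which speaker plays it. The same `helicopter` object yields different gains for a five-speaker ring and a stereo pair, which is the job a compatible receiver does at playback.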
So why the popularity of immersive sound? It is mostly driven by the booming adoption of high-quality headphones, which are connected to mobile internet-connected computing devices and audio players. Since audiences have more ear time than eye time, immersive audio listeners outnumber immersive video viewers.
The idea of 3D audio has been around a long time; only the technologies are new. I remember an AES show in Los Angeles in the mid-1980s where I got my first taste of immersive sound. Hugo Zuccarelli demonstrated his stunning Holophonics technology to crowds standing in long lines to hear what all the fuss was about.
In those analog days, Zuccarelli’s headphone-based audio system was startling in its detail. A recording of the sound of barber shears cutting your hair was so realistic, you could actually feel your hair was being cut.
It was 1973, almost 50 years ago, when Neumann introduced the KU80 “dummy head” binaural recording system at the International Radio and Television Exhibition in Berlin. The dummy head was so popular that an upgraded model, the KU100, has been on the market since 1992. The design began as headphone-compatible only, but has since been made loudspeaker-compatible as well.
Major musicians, including Pink Floyd, Paul McCartney and Roger Waters, have made albums with 3D sound. Now, due to the lower cost and simplicity of the technology, immersive sound has come to basic YouTube videos. It is quite easy to use, even for amateurs.
There are myriad 3D formats. Auro-3D is one of the oldest, first hitting movie theaters with the release of George Lucas’ 2011 film, Red Tails. Then came DTS:X and Dolby Atmos. Atmos dominates the home market and is a current top contender in 3D sound processing.
Roughly one thousand commercial motion picture theaters have deployed Atmos speaker systems and nearly every major consumer electronics manufacturer now features some iteration of the technology in their products. There are also dozens of movie titles available in the format.
Atmos is an object-based audio codec that works with home speaker setups ranging from 5.1.2 (five ear-level channels, one subwoofer and two ceiling presence channels) to 7.1.4 or 9.1.2.
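The dotted layout labels can be read mechanically: the first number counts the ear-level speakers, the second the subwoofers and the third the height or ceiling speakers. The short Python sketch below decodes a label that way. The names `SpeakerLayout` and `parse_layout` are hypothetical helpers, not part of any Dolby tool.

```python
from typing import NamedTuple


class SpeakerLayout(NamedTuple):
    ear_level: int   # front, center and surround speakers
    subwoofers: int  # low-frequency effects channels
    height: int      # ceiling or upward-firing presence speakers

    @property
    def total(self) -> int:
        """Total number of speakers in the room."""
        return self.ear_level + self.subwoofers + self.height


def parse_layout(label: str) -> SpeakerLayout:
    """Turn a label such as '5.1.2' or '5.1' into speaker counts."""
    parts = label.strip().split(".")
    if len(parts) not in (2, 3):
        raise ValueError(f"expected a label like '5.1' or '5.1.2', got {label!r}")
    counts = [int(p) for p in parts]
    if len(counts) == 2:
        counts.append(0)  # classic surround label: no height speakers
    return SpeakerLayout(*counts)


if __name__ == "__main__":
    for label in ("5.1.2", "7.1.4", "9.1.2"):
        layout = parse_layout(label)
        print(label, layout, "total speakers:", layout.total)
```

Read this way, 5.1.2 is an eight-speaker room and 7.1.4 is a twelve-speaker room.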
Sennheiser’s Ambeo is another immersive sound technology that encompasses a range of products, including microphones, headphones, loudspeakers and production software. The Ambeo line includes the venerable Neumann KU100 binaural head, the newer Sennheiser Ambeo VR microphone, Ambeo smart headsets and a 3D soundbar.
The Ambeo software to create immersive audio has also been simplified and comes in a range of prices. One of Sennheiser’s basic offerings is the dearVR Micro, a simple, free software interface that can be used by anyone.
The Rode SoundField Microphone is a single-piece, four-capsule Ambisonic microphone designed for use in immersive audio, virtual reality, sound design and experimental recording applications. It captures broadcast-quality 360-degree sound in A-format on four separate tracks. Its output formats range from stereo to 7.1.4 surround sound. The price is about $1,000.
In the past, capturing 3D audio required a dedicated Ambisonic mic, a separate recorder and a computer for encoding the audio from raw Ambisonics A-format to Ambisonics B-format. Now Zoom has created the H3-VR, a compact recorder that combines encoding and decoding into a single device.
The Zoom recorder is perhaps the lowest-cost entry into immersive sound for video producers. The H3-VR ($229) has four Ambisonic mic capsules that capture 360-degree audio recordings at up to 24-bit/96 kHz. An onboard decoder allows users to export Ambisonics B-format files directly from the device.
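For readers curious about what the A-format to B-format step involves, the textbook version is a simple sum-and-difference matrix over the four capsule signals. The Python sketch below shows that idealized math only. The capsule order (front-left-up, front-right-down, back-left-down, back-right-up) and the 0.5 scaling are assumptions, and real products, including the recorders and microphones mentioned here, also apply calibration filters that are not modeled.

```python
def a_to_b(flu, frd, bld, bru):
    """Convert one A-format sample frame to first-order B-format.

    Inputs are the four capsule samples of an idealized tetrahedral mic:
    front-left-up, front-right-down, back-left-down, back-right-up.
    Returns (W, X, Y, Z): omni pressure, front-back, left-right, up-down.
    """
    w = 0.5 * (flu + frd + bld + bru)  # everything summed: omnidirectional
    x = 0.5 * (flu + frd - bld - bru)  # front capsules minus back capsules
    y = 0.5 * (flu - frd + bld - bru)  # left capsules minus right capsules
    z = 0.5 * (flu - frd - bld + bru)  # upper capsules minus lower capsules
    return w, x, y, z


def convert_frames(frames):
    """Convert a sequence of (flu, frd, bld, bru) frames to (W, X, Y, Z)."""
    return [a_to_b(*frame) for frame in frames]


if __name__ == "__main__":
    # A sound that reaches all four capsules equally has no direction,
    # so only the omni W component is non-zero.
    print(a_to_b(1.0, 1.0, 1.0, 1.0))
    # A sound picked up only by the front-left-up capsule.
    print(a_to_b(1.0, 0.0, 0.0, 0.0))
```

The appeal of B-format is that it describes the sound field rather than a speaker layout, so it can be decoded later to stereo, binaural or surround.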
The H3-VR comes with software to allow the conversion of the four microphones to binaural stereo, standard stereo and 5.1 channel surround sound for video. It also streams 360-degree surround sound live.
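Zoom's conversion software is not documented here, so the following is only a generic sketch of one common way to get standard stereo out of B-format: point two virtual cardioid microphones left and right and compute what each would have picked up. It assumes the W channel is unscaled omni pressure (some conventions store W about 3 dB lower), that positive Y points to the listener's left, and it ignores height. The function names are hypothetical. Binaural output needs head-related filtering that this example does not attempt.

```python
import math


def virtual_mic(w, x, y, azimuth_deg, pattern=0.5):
    """Signal of a virtual first-order mic aimed at azimuth_deg.

    pattern = 1.0 is omni, 0.5 is cardioid, 0.0 is figure-eight.
    Assumes W is unscaled pressure and positive Y points left.
    """
    az = math.radians(azimuth_deg)
    directional = math.cos(az) * x + math.sin(az) * y
    return pattern * w + (1.0 - pattern) * directional


def b_to_stereo(w, x, y, half_angle_deg=90.0):
    """Decode one B-format frame to (left, right) with two virtual cardioids."""
    left = virtual_mic(w, x, y, +half_angle_deg)
    right = virtual_mic(w, x, y, -half_angle_deg)
    return left, right


if __name__ == "__main__":
    # A source hard left: W = 1, X = 0, Y = 1.
    print(b_to_stereo(1.0, 0.0, 1.0))
    # A source straight ahead: W = 1, X = 1, Y = 0.
    print(b_to_stereo(1.0, 1.0, 0.0))
```

Changing `half_angle_deg` or `pattern` after the fact is the practical benefit of recording in Ambisonics: the stereo microphone setup is chosen in post-production rather than on location.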
Another option is Dolby Headphone, which creates a virtual surround field in real time using any set of two-channel stereo headphones. It takes an input of either a 5.1 or a 7.1 channel signal, a Dolby Pro Logic II encoded two-channel signal or a standard stereo signal.
As an output, it sends a two-channel stereo signal that includes audio cues intended to place the input channels in a simulated virtual soundstage.
Sony has introduced the 360 Reality Audio system. It’s an immersive audio experience utilizing object-based spatial audio technology and is being used by a number of streaming services.
Sony’s technology works with standard headphones when combined with a streaming service’s app for iOS/Android smartphones. The app analyzes the listener’s hearing characteristics using images of ear dimensions for Sony’s algorithm.
For speakers, users will require a speaker system with Sony’s decoder for the 360 Reality Audio music format, multiple speaker units and signal processing technology.
Google has an immersive audio platform called Resonance Audio that scales up to high-fidelity 3D sound for video. It works in both mobile and desktop applications and is compatible with the web and other platforms, including the Unity game engine. In addition, it’s available as a standalone VST plug-in, allowing it to be used in multiple audio creation programs.
ASMR (autonomous sensory meridian response) is also part of the 3D sound movement in video. Some ASMR video creators use binaural recording techniques to simulate the acoustics of 3D environments to enhance the experience of being in proximity to actors or vocalists. These videos are designed to create pleasurable sensations and the subjective experience of calm.
Immersive audio opens the door to a new generation of audio and video storytelling. Radio drama, popular in the 1930s and ’40s, is making a comeback with immersive audio. Its description as “theatre of the imagination” is especially true today with immersive sound.
Adam Clayton Powell III, a former media executive, wrote in the New York Times on the potential of audio. His words remain true as immersive sound emerges.
“The mind’s eye is a powerful projector. What we can imagine is usually scarier, funnier, more real and more vivid than the explicit images of video. Some of the wiser film directors have taken this lesson to heart: In horror movies, it is a mistake to show the monster before the end of the film. Left to the imagination, our minds create demons far scarier than any film maker’s image,” Powell wrote.
“Ask any baseball fan about the crack of the bat and the lilt of the announcer’s voice coming off the airwaves on a languid summer afternoon. The pictures are better on radio.”
Though the storytelling power of immersive audio is clear, users still must choose a technical standard to work in today. There are seemingly dozens of choices. SMPTE established a group to study issues related to immersive audio in 2013 and started to create standards in 2014. That work continues on several levels as the nascent technology continues to expand.