The buzz about the benefits of spatial sound recording for video is now constant. No doubt, it can bring an extra dimension to low-budget news reporting, documentaries and promotional videos. However, recording immersive sound is often mystifying to producers, since there are several types of recording technologies grouped under the same general topic.
Recording any kind of spatial sound requires changes to the workflow both on location and in post-production. The good news is it’s now simpler than ever before.
First, it is best to define what spatial audio actually means, since there are several interpretations and the term can be confusing. The most basic form is stereo, in which discrete left and right channels are recorded. Stereo can add depth to video sound and has been used for years in television broadcasting.
Next, there is surround sound, which targets different sounds to multiple speakers. The result is to “surround” the audience in a sound field. We’ve all heard surround audio in movie theaters and home theater setups. Several systems are available, including those from Dolby, THX and DTS. Depending on the system used, there can be 5.1, 7.1 or more channels.
Then there is binaural audio, which delivers a full 360-degree soundscape through a specially-encoded stereo file that requires the user to wear headphones. Interest in binaural sound soared due to the popularity of portable audio and the renewed interest in headphones.
Binaural sound reflects around the head and within the folds of the human ear. Neumann’s KU 100 Dummy Head Binaural stereo microphone ($9000), which mimics the size and shape of the human head, is a tried-and-true professional way to record binaural sound. But users can record binaural at much lower costs by wearing microphones in each ear or attached to glasses.
Finally, there’s the latest technology, 3D spatial audio. It can be used for super-realistic music recording, or for soundtracks in video productions. When used correctly, this truly immersive sound syncs to the visuals on the screen. When viewers turn their heads in one direction or another, the audio changes to reflect that movement. This kind of recording can bring very powerful realism to programming.
For this article, we will focus mainly on recording Ambisonic sound, a 3D immersive format that has been around a long time. In simple terms, Ambisonic audio is a multichannel technique that captures sound from all directions around a single point in space. The audio is rendered binaurally in playback, letting the editor virtually rotate perspective in all directions, vertically as well as horizontally.
Ambisonic audio is now supported by virtually all major post-production applications on the market, making it the appropriate technology for virtual reality and other applications that involve 3D sound and interactive audio.
The four capsules of an Ambisonic microphone deliver a raw, four-channel output called A-format. Before you can use A-format audio, it must be converted to Ambisonics B-format, another four-channel format. In B-format, the editor can position each sound element in the sound field.
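The A-format to B-format conversion can be sketched as a simple sum-and-difference matrix. This is only an illustration: it assumes a tetrahedral capsule order of front-left-up, front-right-down, back-left-down and back-right-up, and a 0.5 scaling factor chosen for convenience. The function name `a_to_b` is hypothetical. Real converters, including the manufacturers' plug-ins, also apply capsule-specific filtering that is not shown here.

```python
def a_to_b(flu, frd, bld, bru):
    """Convert A-format capsule signals to first-order B-format (W, X, Y, Z).

    Each argument is a list of samples from one capsule (assumed order:
    front-left-up, front-right-down, back-left-down, back-right-up).
    """
    w, x, y, z = [], [], [], []
    for a, b, c, d in zip(flu, frd, bld, bru):
        w.append(0.5 * (a + b + c + d))  # W: omnidirectional sum
        x.append(0.5 * (a + b - c - d))  # X: front minus back
        y.append(0.5 * (a - b + c - d))  # Y: left minus right
        z.append(0.5 * (a - b - c + d))  # Z: up minus down
    return w, x, y, z
```

A sound that reaches all four capsules equally ends up only in W, while a sound picked up by a single capsule appears in all four B-format channels with signs that encode its direction.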
This extra step in the post process allows the producer to think in terms of source directions rather than loudspeaker positions. There is no “front left” channel. Instead, the channels contain parts of the sound that are combined during the later decoding step. It also offers the listener a considerable degree of flexibility as to the layout and number of speakers used for playback.
Today, there are free downloads of VST, AU and AAX plug-ins that work with all types of personal computers. With these plug-ins, the user can audition any direction from the microphone during Ambisonics B-format playback.
Ambisonics was developed in the United Kingdom in the 1970s by the British National Research Development Corp. For many years, it was a tiny niche format and its patent has now expired. With advanced digital signal processing now available, Ambisonics has made a comeback in 3D recording. It has been proven effective for use in immersive video applications and can be easily decoded as binaural stereo.
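Auditioning a direction amounts to pointing a "virtual microphone" into the B-format sound field. The sketch below shows the general idea, not how any particular plug-in works. It assumes W carries the omnidirectional signal at the same gain as the directional channels (normalization conventions differ between tools), that azimuth is measured counter-clockwise from the front, and that `pattern` blends from omni (1.0) through cardioid (0.5) to figure-eight (0.0). The function name `virtual_mic` and its parameters are hypothetical.

```python
import math

def virtual_mic(w, x, y, z, azimuth_deg, elevation_deg=0.0, pattern=0.5):
    """Steer a virtual first-order microphone through a B-format recording."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    # Unit vector pointing where the virtual microphone is aimed.
    dx = math.cos(az) * math.cos(el)
    dy = math.sin(az) * math.cos(el)
    dz = math.sin(el)
    return [
        pattern * wi + (1.0 - pattern) * (dx * xi + dy * yi + dz * zi)
        for wi, xi, yi, zi in zip(w, x, y, z)
    ]
```

Under these assumptions, a virtual cardioid aimed at a source passes it at full level, and one aimed directly away from the source rejects it.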
To record spatial audio in the field, there are now some relatively inexpensive products available to small-scale video crews that are designed to make the job easier. Two one-piece, compact microphones are the Sennheiser Ambeo Ambisonics ($1300) and the RØDE NT-SF1 Ambisonic ($1000) models.
Both of these mics are fitted with four condenser capsules deployed in a tetrahedral array. Each capsule records to a separate channel on an audio recorder equipped to handle 3D sound recording.
An even cheaper alternative is Zoom’s H3-VR ($250), a small audio recorder that integrates an Ambisonic mic, recorder and decoder into a single device that fits in the palm of your hand. It can convert audio from raw Ambisonics A format to B format inside the recorder.
Some other sound equipment, beyond the Ambisonic mics and recorder above, can be used by crews when doing immersive recording. These include the Sennheiser MKH418 mid-side shotgun microphone ($1650) for stereo recording; the Sound Devices MixPre-6 II ($1060) or Zoom F8N PRO ($1100) multi-channel recorders; and the Zoom H2N two- or four-channel recorder ($180). The Zoom H2N doesn’t record full 3D spatial audio. It can capture sound from both the front and the back, but not height information.
In some cases, using standard audio gear in creative, but unusual, ways can help with some 3D recordings. A good example is clipping two standard omnidirectional lavalier mics to the sides of a baseball cap to record binaural sound. This pseudo-binaural recording method is certainly no $9,000 Neumann dummy head, but can provide a fairly wide sound field for ambience recording on the fly.
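A mid-side microphone records a forward-facing "mid" signal and a sideways-facing "side" signal, which are turned into left and right channels afterward. A minimal sketch of that decode follows. It assumes a positive side signal means sound from the left. The function name `ms_to_lr` and the `width` parameter are hypothetical, and recorders or plug-ins may scale the result differently.

```python
def ms_to_lr(mid, side, width=1.0):
    """Decode mid-side sample lists into left and right stereo channels."""
    # Left is mid plus side; right is mid minus side.
    left = [m + width * s for m, s in zip(mid, side)]
    right = [m - width * s for m, s in zip(mid, side)]
    return left, right
```

Setting `width` to 0 collapses the result to mono, while larger values widen the stereo image. This adjustment can be made in post because the mid and side signals are recorded separately.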
Since 3D recording is still new to most sound operators, one needs to understand the workflow before beginning recording. Make sure you have every part of the production and post chain in place. Extras like portable mounting gear and weather protection for Ambisonic mics and recorders should not be forgotten. They are essential when needed for field recording. Trying to handhold Ambisonic devices can generate unnecessary noise during recording.
On location, the sound operator has to anticipate, in the moment, both the action of the camera and what is being recorded. This can be tricky at first, but anticipating movement of the camera creates an awareness of the immersive sound field being recorded. It is not like recording a specific sound or voice with a mono shotgun.
Don’t arbitrarily change the direction of either the video or audio during the shoot without thoroughly documenting it. Undocumented movement can become quite confusing in post-production. Plan in advance and clearly understand how the audio works in relation to the picture. It can be useful to clap in two or three directions around the video rig to make audio and video syncing easier during post.
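A clap helps because it produces a sharp peak that is easy to line up between the camera's scratch audio and the separate recorder. The sketch below illustrates the idea with a brute-force cross-correlation over short lists of samples. The function name `find_offset` and its parameters are hypothetical, and editing software typically uses faster methods on full-length files.

```python
def find_offset(reference, other, max_lag):
    """Return the lag, in samples, at which `other` best lines up with `reference`.

    A positive result means the shared event (such as a clap) occurs that
    many samples later in `other` than in `reference`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(reference):
            j = i + lag
            if 0 <= j < len(other):
                score += r * other[j]  # correlation at this lag
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

Dividing the returned lag by the sample rate gives the offset in seconds, which is the amount by which one clip would be shifted on the timeline.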
When worried about confusing which video and audio clips belong together, verbally give a sound description at the top or bottom of the take. This verbal aid can be of great help to the editor later on, especially if you’re not present during the session. Also record extra ambient audio when doing 3D sound to allow the audience to become gradually grounded in the sound before the visual action begins on the screen.
We are in the early days of recording 3D audio and much of the work today is experimental. It is easy to record a combination of Ambisonic and point-source audio, which adds depth and richness for video viewers.
Field recording with immersive audio is less exotic and less costly than ever before. It can add serious impact to any video storytelling.