What music/tech companies won’t tell you about spatial audio

This is our first-ever collaborative article for Water & Music — highlighting not only our own original research, but also the collective perspectives and expertise of our member community, which was coordinated through our Discord server over the course of several weeks. Members who helped us conduct research, curate interesting links and resources and lead discussions on the topic in our Discord server are credited at the bottom of this report.


There seems to be a lot of recent momentum around spatial and immersive audio in music, entertainment and social media — without much accessible explanation about what it actually is, or why it matters.

Ever since Apple Music launched its Spatial Audio offering for consumers in June 2021, a wide range of companies and artists have jumped onto, or taken up more space on, the hype train:

Many of the companies that offer spatial audio experiences to consumers use the same enthusiastic marketing language — like hearing music “the way that artists intended,” or transforming sound “just like the shift from mono to stereo.” But from a music-industry research perspective, we found that beneath this hype was widespread confusion about the underlying commercial and power dynamics bringing us to this shiny moment.

For instance, in our research, we found very little information about creative decision-making with spatial audio, particularly the extent to which artists were actually involved in the spatialization of their own works. We also heard confusion from our members about the creative tools needed to create spatialized tracks, the price points for those tools and the extent to which streaming services like Apple Music or Amazon Music are incentivized to make those tools more accessible to creators, instead of focusing solely on the benefit of spatial audio for consumers.

Perhaps most importantly for the music industry, we thought there was not enough scrutiny of the fact that it’s big-tech companies — not artists and labels — that are leading the conversation and commercial hype around spatial audio for music, which raises important questions about what leverage music creators really have in the sector.

In this report, we’ve tried to tackle these and many other questions with the hopes of offering a newfound understanding of the role that spatial, immersive and 3D/360 audio play in the music industry — from the creative process flow and supply chain, to the power dynamics and emerging business opportunities that result.

Disclaimers:


First things first: Spatial audio and 360 audio are NOT the same thing

In our research, we found dozens of terms that have been used interchangeably in the media to describe different forms of immersive audio — spatial audio, spatialized audio, 3D audio, 8D audio, surround sound, binaural audio, 360 audio, the list goes on — in a way that understandably creates confusion for fans and industry professionals alike.

For the purposes of this report, the most important distinction to know is that between 360 audio and spatial audio. Many big-tech companies use these two terms interchangeably in interviews and marketing materials about their immersive audio services, when in reality they refer to widely different mastering techniques and/or listening experiences.

One of our community members, Paula Jones (producer/engineer and founder of Blind Chihuahua), provided this helpful explanation a few weeks ago in our Discord server that we’ve quoted here with her permission:

“Object-based 360 audio and spatial audio are two very different things, although the term spatial audio is being used to encompass all audio that makes use of dimensions beyond stereo [left and right]. 360 audio like [Dolby] Atmos and Sony [360 Reality Audio] are object-based audio — meaning you can take a single track in a multitrack mix, like a guitar for example, and put it in any space around you, above you, below you, on the left just behind your head or on the right a long way in the distance from your head. It’s like surround sound, except it has unlimited positional possibilities, including above and below. But your head is anchored in the middle — it doesn’t matter which way you turn your head, the positioning of the individual sounds will stay the same.

Spatial audio is not at all the same thing. Spatial audio is more about the position of the listener, as opposed to the position of the audio in a 360 space. For example, if I enable spatial audio in a 3D gaming environment, the perception of where the audio is in the room is relative to where the player is standing and which way they’re facing. So if I, as a character, turn 90 degrees away from the sound source that has spatial characteristics applied, that sound will appear more in my left or right ear as if it would in real life — namely if you turn your head away from someone who’s talking to you so that your right ear is facing them, you’ll hear their voice more in your right ear than your left. If you turn your back to them, their voice will be coming from behind you. You can hear this when you have spatial audio enabled watching Netflix let’s say on your iPhone with AirPods. If you turn your head, the position of the audio will follow the direction of your head relative to your phone. Spatial audio doesn’t make much sense on its own; it needs a visual element for reference.

Listening to music with Atmos is a completely different experience, as 360-degree placement of instruments within the listening space is a different creative concept than spatial audio.”

Let’s apply this definition to Apple’s Spatial Audio feature, which is framed specifically as “Spatial Audio with support from Dolby Atmos.” This separation is important, but perhaps has not been communicated clearly enough to the wider public. Dolby Atmos (like its competitors such as Sony 360 Reality Audio) is a creative tool for object-based, 360-degree audio that artists and engineers are involved with earlier on in the production, mixing and mastering process. In contrast, Spatial Audio as Apple defines it is not a creative tool; rather, it is a hardware/software layer built on top of a consumer-facing listening experience.

This will be especially relevant in section II of this report about the commercial incentives and power dynamics in spatial audio. In the meantime, we encourage you to check out this extensive spatial audio glossary from Mach1 Tech for a more technical breakdown of the differences among all the above terms and how they apply across music, gaming and other media formats.
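
For readers who think in code, the distinction between head-anchored 360 audio and head-tracked spatial audio can be sketched in a few lines of Python. This is only an illustration of the geometry described in the quote above, not how Apple, Dolby or Sony actually implement their renderers; the function names and the angle convention (degrees, 0 meaning straight ahead, positive meaning to the right) are our own assumptions.

```python
def wrap_degrees(angle: float) -> float:
    """Wrap any angle in degrees into the range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def perceived_azimuth(source_azimuth: float, head_yaw: float, head_tracked: bool) -> float:
    """Return where a sound appears relative to the listener's nose.

    source_azimuth: where the mix places the sound (0 = straight ahead).
    head_yaw: how far the listener has turned (negative = turned left).
    head_tracked: True for head-tracked "spatial" playback, False for a
        head-anchored 360 mix. Both names are hypothetical.
    """
    if head_tracked:
        # The sound stays pinned to the room (or screen), so turning the
        # head changes the angle at which it reaches the ears.
        return wrap_degrees(source_azimuth - head_yaw)
    # The whole mix turns with the head, so nothing changes.
    return wrap_degrees(source_azimuth)


# A voice placed straight ahead; the listener turns 90 degrees to the left.
anchored = perceived_azimuth(0.0, -90.0, head_tracked=False)
tracked = perceived_azimuth(0.0, -90.0, head_tracked=True)
print(anchored)  # 0.0: still straight ahead
print(tracked)   # 90.0: now at the listener's right ear
```

In the head-anchored case the voice never leaves the center of the mix, while in the head-tracked case it lands at the right ear, which matches the example of turning away from someone who is talking to you.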


I. Bottlenecks in spatial audio’s creative workflow

We should start off this section by saying that despite what Apple’s marketing may suggest, the world of spatial and 360 audio is nothing new. There are several books, research projects and academic institutions that have been dedicated to studying the creative and commercial potential of immersive audio for the past several decades.

That said, fast-forward to today, and the reality is still that most spatial or 360 songs you hear on streaming services were created as an afterthought. The song was likely written, recorded and mixed initially with stereo listening (i.e. the confines of audio just coming from your left and right) in mind; then, along came the opportunity to “spatialize” the mix by moving up to 128 sounds (or stems) from the stereo mix in 3D space.

This spatialization process — usually conducted by skilled, specialist sound engineers — is a creative expression of its own, with many intriguing decisions to make along the way that can fundamentally change the feeling of a song. Which sounds should remain static in space, and where (e.g. best practice seems to involve keeping more rhythmic elements like drums and bass relatively closer together and towards the front of the listener)? Which sounds should move, and where should they move? How fast? How often?

Furthermore, as with the transition from mono to stereo, you can sometimes hear in these early spatial mixes more of an exploration of what’s possible, rather than what’s necessarily appropriate. For instance, one spatial mix engineer talked to us about how important it is for vocals to remain stationary in space to avoid confusing the listener — which felt like a useful truth until we listened to a haunting spatial mix of a Björk track where the vocals moved all around. As with all great creators, perhaps the key is to learn the rules first, and then selectively break them.

Oftentimes, the goal of spatial audio is not to help listeners explicitly locate specific sounds in space, but rather to create a general feeling of immersion akin to what one might experience at a live show. Listen to Norah Jones sing “Don’t Know Why” in spatial audio, for example, and you won’t hear very much moving about — but you’ll feel a real intimacy that doesn’t come across in the stereo mix. Space is being used not to move sounds around the listener, but to make it feel like you’re in an intimate club with the band in a way that stereo audio just can’t.

How do artists and producers get started with working in spatial and 360 audio? There are off-the-shelf plugins like the Dolby Atmos Production Suite ($299 on the Avid Marketplace) and the Sony 360 Reality Audio Creative Suite (also $299), as well as select studios around the world like London’s Metropolis Studios and Eastcote Studios that have dedicated equipment for spatial recording and mixing. Apple Music executives have talked about plans to build “immersive music-authoring tools directly into Logic Pro” this year — but it’s unclear whether this will be a genuinely new, proprietary technology, or just an expansion on Dolby Atmos’ existing integration with the DAW. Outside of the big tech players, artists have access to some other independent software providers like Immersion Networks (cloud-based spatial mixing) and LANDR, which recently launched an “upmastering” service to remaster tracks in Dolby Atmos for $100/track.

A significant problem arises, though, once you get to the level of major DSPs like Apple and Amazon needing to process and remaster thousands of tracks spatially in a short period of time. Not only are few artists overseeing this spatialization process themselves, but many aren’t even listening to the full spatial playback experience before approving the new mix. In many cases, the featured artists in high-profile spatial audio campaigns, like Marvin Gaye for Apple Music, are deceased. In others, artists are struggling to get hold of equipment that allows them to listen properly to their work in all its spatial glory. This seriously calls into question whether tech companies’ claims that spatial/360 audio will help fans “hear music the way the artist intended” are actually true.

We have also heard stories of artists being sent a mix on an iPhone with AirPods Max as part of the approval process; while this is a great way to experience spatial audio, it’s much harder to appreciate the full spatialization this way than when listening on a dedicated spatial audio system with speakers in front, behind and above you. (This is consistent with consumer reviews of Apple’s Spatial Audio feature for music, which have characterized the tech as hit-or-miss.)

Even after talking to labels, mix engineers and technology companies who are deeply involved in spatial and 360 audio, we struggled to find artists who have considered spatial audio earlier in the process of creating a song, versus just as an afterthought for the sake of marketing priorities on DSPs. When we asked how artists would think differently about writing and recording if they were creating for a spatial audio world, we often felt like the first people to ask these questions. The music industry is relatively new to this, and many answers aren’t known yet.

And once the spatial audio track is created, it isn’t at all clear that the right people will get to listen to it. While several DSPs like Apple Music, Amazon Music, Tidal and Deezer HiFi technically offer spatial or 360 audio features at this point, the world’s largest paid music streaming platform, Spotify, does not. And backend distribution support for Apple’s Spatial Audio rollout remains spotty: According to Apple’s own data, the majority of music distributors today do not yet support Spatial Audio. Some major distributors like CD Baby, DistroKid, FUGA and The Orchard do support it, but other major ones like Believe/TuneCore, AWAL, Symphonic, Ingrooves and ONErpm do not.

We’re seeing little, if any, public push from labels and artists to move into spatial and 360 audio this year — which is strange given that rights holders have certainly been aware of the technology for years (Universal Music Group signed a direct partnership deal with Dolby Atmos back in 2019). Anecdotally, the main culprit seems to be simply a lack of understanding. We’ve heard several examples of unclear communication to the music industry of how fans, or even artists themselves, can easily access spatial versions of recordings. For instance, we spoke with a multi-Grammy award-winning producer who had proudly mixed some tracks spatially, only for the producer to then ask us if we could help work out how to get those tracks up on a music platform. We also spoke to a label that was proudly early to mixing in spatial audio, and heard some amazing tracks from one significant artist in the studio. But two weeks and much googling later, we were still waiting to find out where we could actually hear those tracks at home. The label wasn’t sure and ended up asking the artist … but we haven’t heard back. We suspect the spatialized versions of the tracks never actually got published.

Ironically, there seems to be much more investment in consumer-facing, rather than industry-facing, communication about the benefits of spatial and 360 audio. Artists and engineers arguably need to know how to create natively in spatial audio from the first moment they’re in the studio, in order to deliver genuinely groundbreaking experiences that will resonate with fans in the first place. The swooping in of consumer-facing, big-tech incentives is arguably muddying the waters of understanding and creating a fundamental barrier to the growth of the format.

Which brings us to our next section…


II. Tech companies, not artists, are driving the public conversation (for now)

Aside from the creative opportunities and challenges around spatial and 360 audio, one area that we feel has not gotten enough scrutiny in the trade media is the power dynamics driving the public conversation around the format.

In particular, one of our major takeaways from our discussions with artists, engineers and label reps is that big-tech companies — not music companies or rights holders — are driving most of the hype and setting the creative and commercial agenda for spatial audio as we know it today. Ironically, many tech companies seem to be using spatial audio and music as levers to drive sales in other revenue streams, like phones, smart speakers and other hardware, that have little to do with the music itself.

To understand how these power dynamics work in practice, we thought it would be helpful to home in on two specific companies, Apple Music and Amazon Music, and where spatial audio fits into each of their strategies. Generally speaking, having a unified system of hardware integrating with software to deliver a particular listening experience is a significant competitive advantage, and Apple and Amazon already have both — potentially making it difficult for other independent offerings to break through.

Apple Music: Spatial audio is a consumer-facing hardware play

By virtue of its huge reach and cult fandom, Apple essentially force-fed the conversation around super-surround-sound content to the wider public in mid-2021, with its announcement of “Spatial Audio (with Dolby Atmos support).” The announcement was made alongside some limited exclusive releases, and a lot of marketing content around the “magic” of the sound and how consumers are finally going to be able to experience music “as the artist intended.”

More recently, the company also released several stats that give us a better sense of the supply of Spatial Audio content on Apple Music: 40 million Apple Music listeners have experienced the format so far (keep in mind that Spatial Audio is enabled by default if you have an iPhone and AirPods Pro), and 21% of songs that have reached No. 1 on Apple Music’s Daily Top 100 charts since launch are available in Spatial Audio. There are also pretty significant differences in genre adoption, at least in the mainstream: while 60% of pop albums that reached No. 1 on the US Apple Music albums chart in August were also available in Spatial, only 36% of hip-hop albums with that milestone had the same setup.

As discussed earlier in this report, Spatial Audio and Dolby Atmos are not the same thing. While Dolby Atmos is a creative production and mixing tool for artists, Spatial Audio in Apple’s context refers to an enhanced, personalized audio experience for the consumer, with features like dynamic head-tracking built into Apple’s own proprietary hardware.

In general, what struck us the most about Apple’s Spatial Audio rollout for music is that they framed the format not as a new creative outlet, but rather as a consumer-facing benefit first and foremost. And there’s a clear strategic incentive involved: According to an Apple support document, you essentially need to be locked into the Apple ecosystem in order to experience Spatial Audio in its fullest form (e.g. AirPods Pro/Max, a variety of Beats headphones, an Apple TV, HomePod speakers), not to mention Dolby Atmos-compatible sound bars or TVs if you’re listening at home. (As a note of clarification, you do not need Apple headphones to experience Dolby Atmos technology in general, which is available across multiple streaming services and, again, is separate from Apple’s proprietary Spatial Audio technology.) Additionally, while Apple’s original press focus was around experiencing music in Spatial Audio, the technology is being used and actively marketed as part of the entire Apple ecosystem across apps like FaceTime and Apple TV, which supposedly makes the combination more appealing to Apple’s consumers.

Meanwhile, as the podcast The Attack and Release Show pointed out, the fact that Apple is only putting a secondary focus on artists and creators in its Spatial Audio rollout feels… very “un-Apple.” With almost all of its other software apps, Apple has always positioned itself as creator-friendly and creator-first. But the Spatial Audio rollout feels like a cart-before-the-horse situation, where music artists are now scrambling to understand what the format actually is, how to create it, how much it costs to create it, how to distribute it and whether or not it even matters. Sources tell us that Apple is also prioritizing spatially remastered music content on Apple Music — putting even more pressure on artists to deliver in a landscape that is still missing a lot of fundamental educational resources.

Amazon Music: 360 audio is an original content and hardware play

Amazon is in a similar position to Apple in that the former owns both the software (Amazon Music) and the hardware (Amazon Echo speakers) to deliver spatial and 360 audio experiences to consumers. Amazon currently has thousands of songs available under their “3D Audio” feature on Amazon Music HD, which is available to all Unlimited subscribers and boasts similar language of “hear[ing] music the way the artist intended.”

That said, there are a handful of crucial strategic differences.

Firstly, unlike Apple, Amazon Music supports tracks remastered in both Dolby Atmos and Sony 360 Reality Audio, with no additional spatial or dynamic head-tracking features layered on top. In other words, this is purely a 360 audio offering, not a spatial audio offering.

Sources tell us that Apple is unlikely to support two rival/competing immersive audio formats in this way, as both Apple’s and Dolby’s architectures are closed systems that are easier for the former to control. In contrast, Sony 360RA runs on the open MPEG-H standard, which allows that technology to be integrated more cost-effectively into a wider range of music streaming services including Amazon Music, Tidal and Deezer. Perhaps this dual offering across multiple DSPs will put more pressure on artists and rights holders to have their music mixed in multiple immersive formats moving forward.

Secondly, Amazon Music’s hardware upsell play with 360 audio is more about home hardware (like speakers and TVs) than about headphones. In particular, up until just a few days ago, the only way you could hear 3D audio on Amazon Music was if you also owned an Echo Studio smart speaker or a newer model of their Fire TV or Fire Tablet. For now, Amazon does not have a headphone business they can advertise along with 3D audio.

If you’re ever bored, a *fun* mental exercise is to think about the compatibility complexities that come with this level of vertical integration, and how it potentially creates a clunky experience for the consumer. For instance, it’s interesting, even if obvious, that Apple Music’s proprietary Spatial Audio features won’t work if you’re listening to Apple Music on your Echo Studio, even if both sides are leaning on Dolby Atmos technology to create the base-level “spatialized” track.

Last but not least, Amazon Music seems to be directly funding artists’ spatial audio content production at a higher rate than Apple Music, according to our sources. Whereas Apple Music’s approach is simply to demand that artists and labels deliver content to the platform in Dolby Atmos as well as stereo, Amazon is fronting the cost for a small but still significant slate of exclusive immersive releases, sources tell us. While Amazon does not provide their own creative production tools for artists per se in the same way that Apple does with Logic Pro, the former seems to be taking a more “creator-centric” approach in the sense of fronting the additional costs of spatial production. That said, the market of people who can actually listen to Amazon’s spatial releases in their fullest form — i.e. those who own an Echo Studio — is likely smaller than the addressable market for Apple’s Spatial Audio.

All in all, in studying Apple and Amazon’s strategies around spatial and 360 audio, the question that still surrounds this whole discussion is whether spatial and immersive audio is truly the future of music production and consumption, or if the format is just a gimmicky, FOMO-inducing marketing play and an opportunity to upsell and lock consumers into a wider tech ecosystem.


III. Communicating the value of immersive audio to consumers might require venturing outside of music — and giving artists the reins

While widespread claims that “spatial is to stereo what stereo is to mono” in terms of the format’s impact are ultimately subjective, they are partially true if you compare the slowness of historical music-industry reactions to each format over time.

For instance, labels have been slow to adapt their operations to spatial audio for the same reasons they were slow with stereo: Lack of education and understanding, and perhaps a lack of willingness to invest in new recording, mixing and mastering equipment. Just like with stereo early on, skills in spatial audio today are relatively rare and highly specialized, and the output often feels confusing, crude or over-exaggerated. Artists themselves might not necessarily be incentivized to invest more in perfecting their spatial mix (or stereo mix at the time), because the vast majority of their fans would listen to the stereo mix (or the mono mix) anyway.

If there’s such widespread confusion and lack of understanding within the very industry that is supposed to deliver spatial audio experiences, how do we ultimately communicate the creative value of the technology to consumers? Might other kinds of creative partnerships and opportunities outside of digital music streaming have a better chance of turning fans on to the power of spatial audio?

Based on our research, we identified a handful of emerging opportunities outside the immediate scope of music streaming, where artists and music rights holders could still break ground creatively and provide engaging, convincing experiences for fans. You’ll notice that almost all of these examples align with member Paula Jones’ earlier comment about how the value of spatial audio might be most effectively communicated to consumers with an accompanying visual or narrative reference, rather than in isolation on an audio-first music service.

Livestreaming

Because spatial audio requires, well, space (or at least a sense of it), many artists have already warmed up to the idea of spatialized audio experiences at in-person live events. One of the leaders in the live 3D audio market is L-Acoustics, which owns both the hardware and software to design and run spatialized live event experiences and counts the likes of Bon Iver, Lorde, alt-J, Odesza and Christine and the Queens as customers on tour.

Even though overall demand for music livestreaming is down significantly from early 2020, spatial audio could present an opportunity to make live broadcasts more immersive and engaging for remote viewers, especially if the in-person live experience relies on spatialized sound techniques to keep IRL audiences hooked. Dolby Atmos’ 360 audio technology is already incorporated into thousands of movie theaters around the world, delivering more immersive watching experiences for films like No Time to Die and Dune; especially as Dolby Atmos becomes highlighted on more and more streaming services, it’s only a matter of time before we see the technology incorporated into next-gen club shows.

Gaming and VR

Gaming and VR are perhaps the most powerful examples of using spatial and 360 audio as a tool to drive narrative progress and overall player immersion in a given virtual environment. Unsurprisingly, then, there are already many 360 audio tools on the market aimed at gaming and VR developers — from Facebook 360 Spatial Workstation and Microsoft’s spatial audio plugins for Unity and Unreal Engine to third-party tools like Dear Reality (a.k.a. dearVR) and Ambisonic Toolkit, as well as dedicated production studios like Pollen Music Group.

Interestingly, though, with respect to bringing spatial audio to consumers in a coherent and interoperable way, the gaming industry is currently facing many similar complexities around hardware/software compatibility to what we discussed earlier with the fragmented nature of Apple Music’s spatial audio ecosystem. For instance, for the PlayStation 5 console, Sony developed a brand-new, proprietary 3D audio technology known as Tempest 3D AudioTech, which competes directly with Dolby’s line of 3D audio products. But for now, the only way you can listen to the full scope of the 3D audio experience PS5 has to offer is through the Pulse 3D Wireless Headset, which is manufactured by Sony. Sound familiar?

In any case, it’s difficult to dispute that the future of gaming now goes hand-in-hand with the future of music, as artists increasingly partner with gaming companies like Epic Games and Roblox, as well as VR/AR/MR platforms like Magic Leap and Decentraland, to stage more immersive, interactive music experiences. As we’ve covered in the past for Water & Music, many of these virtual music events still suffer from a lack of spatial thinking, which limits the feel of immersion from the fan’s perspective. In turn, any immersive music/gaming partnership is also presumably a built-in opportunity to experiment with spatial and object-based 360 audio, with the visual cues in gaming potentially serving as creative inspiration for future audio-only production techniques.

Podcasts

Given how much previously music-centric streaming platforms like Spotify and Amazon Music are now investing in podcasts, it’s interesting that the spatial audio craze seems not to have hit the podcast world as hard.

Perhaps this is because podcast producers, creators and studios including Owl Field, Paragon Collective and QCODE have already been experimenting with capturing binaural audio and mixing using tools like Dolby Atmos for several years now, primarily for storytelling purposes. The format is especially effective for genres such as true crime/horror, relaxation/meditation and nature — plus adjacent trends on YouTube and TikTok, like ASMR — that thrive on a sense of total immersion. Even iHeartRadio announced a content slate known as “iHeart 3D Audio” earlier this year dedicated solely to binaural podcasting.

What’s interesting about spatial audio awareness in the podcast industry is that it flips music-industry power dynamics on their head — namely, spatial audio activity in podcasts is being driven primarily by the creative motivations of producers, instead of being imposed from the top down by tech platforms. This dynamic potentially presents interesting creative partnership opportunities for artists, many of whom are already developing in-depth podcasts to complement their albums or other musical works.

As British songwriter and producer Benbrick once said at the Abbey Road Spatial Audio Forum:

“… it represents a new method of storytelling. If our job as creators is to make people feel new emotions through our art, then we should be open to exploring new ideas. If you can have the instruments represented as characters that can also move around a 3D space, then you can start to paint new mental pictures for the listener.”

This brings us to perhaps the most crucial thesis to emerge from our research: To this day, tech companies have largely been treating consumer- versus artist-facing motivations to embrace spatial audio as separate, when in reality they are arguably interconnected.

Put another way, the existence of a compelling consumer use case for spatial audio arguably requires a critical mass of artists creating natively and intentionally for the format, instead of treating it merely as a marketing afterthought.

But as we covered in this report, there’s still a major gap in clear education even within the music industry about how to access the right tools for creation and distribution such that spatial music tracks land in the hands of the right fans — let alone what the difference even is among all the different terms being thrown around, like spatial and 360 audio. Meanwhile, big-tech companies are taking hold of the commercial and narrative reins to fatten their own wallets.

At the end of the day, it’s easy to critique all the hype around spatial audio and ask whether it’s something that most music consumers really want. But maybe that is the wrong question. Instead, we should ask: Why aren’t we making it easier and clearer for artists to make this case themselves?


This article came together with the help of the following contributors:

Core writers:

Members who helped contribute and curate resources: