How smart speakers are changing music listening
In 1999, a new radio station (WMOM 102.7 FM) came to my small town in Northern Michigan and promised to play Top 40 hits, something our town sorely lacked at the time. Every week I would call the new station and request my favorite songs; sometimes they would honor those requests, sometimes they wouldn’t. Other times, the DJs would add in their own favorite songs from the 80s, and I would desperately call and ask (with my very limited musical vocabulary) what had just played in hopes of getting an artist or track name. I would write it down on a piece of paper and take it to a music store where weeks later I would buy a CD, and then repeat this cycle over and over again.
Today, I’m a designer of personalities for smart speakers, and I laugh thinking about how easy it is to listen to and discover new music on devices now. I imagine kids asking Alexa what her favorite band is, and waiting just seconds for her reply. Maybe she likes Kraftwerk (being a computer) or something more universal, like The Beatles?
Designing Alexa’s personality takes a team of people from different disciplines, including screenwriters and designers as well as technical developers and linguists. Dispersed across the world, this team uses insight from user behaviors to decide everything from Alexa’s tone of voice and pronunciation, to how she’ll handle odd requests and questions about her identity. Over the years, her personality has been fine-tuned, and she has plenty of opinions on topics ranging from feminism to pets; she even has culturally different opinions based on the geographic region where you’re speaking to her.
I haven’t tried asking Alexa what her favorite band is, because I’d be skeptical of her recommendation. As a designer of tech, I know Alexa’s preferences aren’t impartial; they’re there to give us a sense of comfort and familiarity, and ultimately to drive us to take some kind of action, like buying a product on a website. Her existence is predicated on satisfying users’ existing tastes — unlike my local radio station, which, after several weeks of me calling and trying to sing the lyrics of “Send Me on My Way” by Rusted Root, gave up and recommended “Burning Down the House” by Talking Heads instead. (I should have appreciated that recommendation more.)
Alexa isn’t inherently a “music fan,” because she’s a byproduct of company interests and user goals distilled into a human-like personality. My question for Alexa isn’t what kind of music she likes, but why she cares about music at all.
Why does Alexa care about music, anyway?
Today, nearly 88 million people in the U.S., or around 35% of American households, own a smart speaker, and 53% of those devices are made by Amazon.
Aside from asking for the news and the weather, streaming music continues to be one of the most popular use cases for smart speakers. In fact, requests for this category have only increased during the COVID-19 pandemic, especially among owners in the 18–35 age bracket.
Smart speaker manufacturers are well aware of the fact that we listen to music across all our devices. In fact, this has prompted many tech giants like Amazon, Apple and Google to enter the music streaming ecosystem over the past several years, as direct competitors to industry giants like Spotify and Pandora. With smart speakers specifically, the Amazon Echo, Google Home and Apple HomePod all prominently feature music in their marketing campaigns — and often bundle in the companies’ owned-and-operated music services as a discounted perk. Even companies that started primarily as audio hardware companies, like Sonos, now maintain their own proprietary streaming services for their smart devices.
But music streaming is far from the bread and butter of any of these tech giants — Apple Music, for example, accounts for less than 4% of Apple’s revenue — and their respective music features and platforms have been changing rapidly over the last 10 years. Amazon and Google didn’t start as companies that delivered music to consumers (one was a bookseller, the other a search engine), and the need to keep up with competitors and consumer demands has led to many internal reorganizations. Notably, Alphabet executives have expressed disappointment in the subscriber growth of both YouTube Music and Google Play Music, and the company will soon sunset Google Play Music to focus solely on streaming via YouTube.
And merely launching a music service as a big-tech company isn’t enough to beat out the competition. For perspective, the subscriber bases of Amazon Music and YouTube Music combined still pale in comparison to Spotify’s more than 130 million subscribers, a gap that shows just how far these tech giants have to go to win over listeners.
But they show no signs of giving up. Why do streaming platforms matter when it comes to smart speakers, and to these tech companies’ strategies at large?
“Earth’s Most Customer-Centric Company”
In its own review of Alexa’s progress over the past five years, Amazon highlights a specific goal of continuing to develop hyper-personalized experiences for its users. Though Alexa already seems pretty personal today, the vision for the future is based on a growing ability to make decisions about and for Amazon customers based on their data.
For example, Amazon imagines future versions of Alexa to act more like a concierge, whereby you can interact with her more naturally instead of having to engage with individual skills to perform individual tasks. What is normally a disjointed user flow today like asking Alexa to “play a date night playlist,” then “order me an Uber to Nobu,” then “reserve me a table at Nobu with OpenTable” could all be accomplished in one simple interaction: “Alexa, plan my night out.”
In order to fulfill complex requests like this, Alexa needs to make “hunches” about what you need, or assumptions about what you’re asking about based on your past behavior. For your night out, Amazon envisions a world where Alexa could handle everything from understanding you need a dinner reservation, to choosing a trendy spot near a movie theater where you’ll want to check out a new release, to buying a ticket for a showing at the right time, to needing a ride home in an Uber just as the movie is wrapping up — all the while making sure you don’t have intruders at your home with Amazon Ring conveniently monitoring while you’re away.
All in all, music is a small part of being the “most customer-centric company” on the planet. Across that whole night out, you might have listened to music only while you got ready, or to keep the party going when you got back home.
Yet what likely got you interested in buying the smart speaker in the first place — and what you likely spend most of your time doing with your smart speaker today — is listening to music.
In fact, the content you consume is just the bait. The hook, or what makes these speakers such a long-term goldmine for tech companies, is the fact that they present an unprecedented opportunity to sell their customers additional products precisely suited to their personal routines and needs right when they want them.
Sure, you can sort of do this now — it isn’t difficult to take out your phone and quickly add some items to your cart — but imagine a world where you don’t have to ask at all, where this digital concierge seamlessly follows you from your devices at home, in the car, and even in the workplace. This world relies on data, and in this world, everyone is vying for your valuable attention.
How does this impact music listening?
This isn’t just Amazon’s mission. Tech companies of all sizes are moving from simply providing access to services or products to realizing they are custodians of large, valuable bodies of information about their users. Using this data intelligently, and selling it to companies that want it, is key to their financial success.
Spotify is one of these companies. Founded as a music streaming service, Spotify boasts its ability to use “machine listening” to turn music (something sonic, emotional) into categorizable data (genres, timestamps, BPMs). While it helps listeners quickly and efficiently find almost any conceivable musician and song, this kind of data also turns out to be quite valuable to other companies — particularly how users listen, when they listen and what they listen to. As Spotify puts it: “The more they stream, the more we learn.” The company then sells these learnings to brands, to help them make smarter decisions about their customers and to better sell them products based on their interests.
Liz Pelly, contributing editor at The Baffler, often explores these technology companies’ gradual shift from streaming services that produce valuable data to data companies that also happen to stream music.
During her keynote at the by:Larm conference in Oslo earlier this year, Pelly specifically pointed to the role of advertising in Spotify’s streaming ecosystem: “In Spotify’s world, the commodity being sold is not music itself. Rather, it prefers to frame the commodity that they are selling to advertisers as users’ moods and emotional states, as their listening habits, as its behavioral data.”
Music streaming ads play a unique role on voice-enabled devices. In fact, both Spotify and Pandora have launched specific programs where advertisers can target their content specifically at music listeners on smart speakers. This is particularly advantageous for brands looking to inspire and reach customers precisely when they would be interested in their products or services — like when they’re listening during breakfast and realize they’re out of Quaker Oats, or when they’re doing their morning beauty routine and decide to order a sample of Nars makeup.
“The decontextualisation of music, and the recontextualisation into playlists of mood, activity and effect, has caused a paradigm shift in music,” Pelly said during her keynote. “And it should be understood that this shift is also largely made as a strategic part of Spotify’s long-term vision for its advertising business.”
This has had material impacts on how some musicians understand the way their music is consumed. Damon Krukowski of Galaxie 500 wrote about the band’s song “Strange,” which receives notably more streams on Spotify than any other song in their catalog, despite never having been released as a single or ranking among the band’s most popular songs. After doing some research, Krukowski concluded that this is likely because “Strange” is sonically similar to other popular indie songs, and was thus recommended to listeners via Spotify’s Autoplay feature more often than any other song in their discography.
The work of writers and musicians like Pelly and Krukowski shows not only that there is a real shift happening with how music is accessed, but also that artists increasingly have to understand how these complex algorithms work on an intricate level in order to ensure that their music is heard, and that they earn any kind of income.
Is there something valuable we are at risk of losing when music becomes so data-driven? With the advent of Amazon, indie booksellers felt this pain first — many unable to compete with the sheer magnitude of offerings and low cost that the tech giant promised to consumers.
In his podcast (and book) Ways of Hearing, Krukowski explores what we’ve lost in the transition from a world of analog media to one of digital. Everything from our relationship to musicians and to physical space, to the types of stores we shop in to buy music and how we listen to it, has changed radically with the advent of digital recording, digital listening and digital streaming.
Krukowski never says this shift is inherently bad. I can see the positive effects of smart speakers in my own musical life. I have access to more music than I did growing up in a small town in the 90s with only my local radio station to call. I no longer have to wait weeks to pick up a CD to listen at my leisure. I can regularly ask the Google Assistant on my phone to listen to songs and tell me what artists made them — then I stream them, buy their albums on Bandcamp and form relationships with music that would have taken me years before.
But what Krukowski does say is that asking “what are we losing?” is crucial in our increasingly digital world. He points out the intangible aspects of music that are uncategorizable, unable to be quantified by algorithms or data points. The irregularities on a physical LP worn with love that we never would hear in a lossless FLAC file. The thrill of waiting weeks for a new CD and listening to it track by track, just as the creators intended.
Imagine a day in the near future when you’re listening to Lady Gaga in your smart-speaker-enabled car while driving to work. While listening, you decide to ask your digital assistant to pre-order your favorite drink at Starbucks. Your music keeps going, you get your drink quickly and you have a perfect Monday at work.
But now, every Monday when you put on Lady Gaga, you get a notification from your smart speaker to stop by Starbucks, including a discount on your favorite drink. It becomes a ritual, your Monday morning pump-up. Your experience of listening to Lady Gaga is inextricably linked with your desire to drink a latte on your morning drive. You realize there was something nice about listening to music before it became so entangled with recommendations, personalizations and patterns. Can you ever listen to Lady Gaga again and not feel the Pavlovian urge to sip a discounted cup of coffee?
What are other aspects of music listening that streaming and smart speakers can never replicate? And when we find out what they are, will we miss them when they’re gone?