By Dmitri Vietze

The Last Mile to Music Streaming and the Mother-in-Law Test: VOICE!


Photo by Jason Rosewell via Unsplash

Do you remember a couple of years ago when music streaming services started to look viable and we were all still scratching our heads at the staying power of terrestrial radio? Some clever people at music industry conferences told us: even though we are all carrying jukeboxes in our pockets, streaming services still must compete with the simplicity of a knob that you can press to play music instantly. Every car came with one, and if you worked with your hands or your eyes, but not so much with your ears, you might have had an old-school radio keeping you company all day or night.

Those of us dedicated to music, as listeners but especially as professionals, don’t find it so hard to open an app, type an artist’s name or navigate to a playlist, and play the music we want “right away.” But it’s easy for us to forget that most people in the world do not operate that way. For most people, using a streaming service to listen to music involves six steps:

1. Find or take out your phone.

2. Remember which app plays music.

3. Find the app to play music.

4. Click the app to open.

5. Search for an artist, song, or album, or click on a playlist (possibly stopping the music Pandora decided to play for you first).

6. Choose between playing a song, an album, or shuffle play.

Many people who make it to step #5 get stuck when they can’t think of a single thing to play and can’t find an appealing playlist name to click on.

Those of us in the music industry take these steps for granted, but there are many more steps and decision points here than pressing the knob on a car radio and maybe then pressing a different “saved station” button.

But what happens when you do not have to find a device and make multiple clicks? What happens when you speak in your natural voice, either asking for a specific artist, song, or album, or asking for something more general, like a genre, or something to go with your dinner party or your workout or to put you to sleep? And what happens if that works?

I recently met Ian Geller, Spotify’s Head of Hardware, and he told me about the “mother-in-law test.” He kept buying devices for his mother-in-law to give her the joy of music streaming. A tablet. A Bluetooth speaker. A Sonos speaker. None of them did the trick. It was just too many steps to learn or do in the moment of musical desire. Then he bought her an Alexa device, a microphone-enabled speaker that uses Natural Language Processing, which allows the user to order an Uber or a pizza, get information like the weather, or… initiate music streaming. Ta-da! A voice-driven interface passed the mother-in-law test!

For the past few years, brilliant founders and engineers have been working on recommendation and discovery algorithms, lifestyle integrations, hardware integrations, and mobile interfaces to get from the 50-100 million paid subscribers to a larger portion of the 7 billion residents on earth. They’ve made a lot of progress. But it looks like voice is our latest music tectonic, and it may be the equivalent of the “last mile” in the music streaming pipeline. It’s the first interface that surpasses the 100-year-old radio knob in ease of use and agility.

But this is not just a tectonic for music. If you thought things shifted when the iPhone changed the world and computing became mobile, wait until you see what happens when your eyes and hands are freed up with voice.

Author Dmitri Vietze met Ian Geller at the Collision Conference in New Orleans. Vietze co-authored a free new ebook titled “Music Tech Conferences: A Guide for Startups.” Vietze is the founder and CEO of rock paper scissors, a PR firm that specializes in publicity for music tech companies.
