How we made an official GrimesAI banger

By: Yung Spielburg

Published: 2023-06-29

At Water & Music, we’ve poured hundreds of hours into researching how AI will transform creativity and business in the music industry.

What you may not know is that we also have several veteran musicians on the W&M core team — and we’ve recently stepped out of the research lab and into the studio.

W&M founder Cherie Hu and I spent the last few months making a banger using Grimes’ new AI tool Elf Tech, which allows you to process any voice recording through Grimes’ proprietary voice AI model.

With the blessing of Grimes’ team, we’re thrilled to be distributing the song as an official release on DSPs.

Our song “Eggroll” is out TODAY on all major streaming platforms. You can listen to the song on the DSP of your choice by clicking here.

Read on for some behind-the-scenes intel on how we made the track and what it taught us about the role of voice AI in the music industry.

How we made “Eggroll”

For a team that cares deeply about improving the UX around music AI, we were delighted by how easy it was to get started with Elf Tech: turning your voice into Grimes is literally as simple as visiting a web page and uploading an audio file.

On the Elf Tech site, which feels like an old Windows desktop, you’ll find not only the voice model, but all the files used to train it, even some Ableton sessions. As a producer, I found it really interesting to poke around the sessions; I can spot an Ableton wizard’s session immediately from the endless chains of plug-ins. I think this really speaks to Grimes’ commitment to keeping her experiments open.

When it came to creating our own original song with the model, almost everything about my production process remained the same. While GrimesAI ultimately delivers the vocals in the final recording, Cherie and I still had to write and record the melodies, lyrics, and instrumentation ourselves. We even had the track mastered by the tremendous, multiple Grammy-nominated engineer Ryan Schwabe.

The main difference is that “Grimes” is now the main vocalist on our song — and we never had to speak with her. She never even had to record anything!

Ultimately, the process we landed on was Cherie recording herself in NY and sending me the files to convert into Grimes’ AI voice. Elf Tech’s integration into a production workflow is still a bit clunky: since processing only works on short clips, I had to divide the original verse recordings into two-line chunks in my DAW, export and process each of those chunks on the Elf Tech site, and then re-import them into the session. In terms of where Elf Tech sits in the process, I would liken it to using Vocaloid software or any kind of vocal effect, such as a pitch or formant shift.
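For anyone who would rather script the chunking than slice clips by hand, here is a minimal sketch of the split-and-rejoin step using Python's standard `wave` module. To be clear, this is not what we did (the chunks for “Eggroll” were cut and re-imported manually in a DAW). The function names `split_wav` and `join_wavs` and the cut points are hypothetical, the upload to the Elf Tech site still happens by hand in between, and the rejoin assumes the converted clips all come back in the same WAV format as each other.

```python
import wave
from pathlib import Path


def split_wav(src_path, cut_points_sec, out_dir):
    """Split a WAV file at the given cut points (in seconds).

    Returns the chunk file paths in playback order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunk_paths = []

    with wave.open(str(src_path), "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        total = src.getnframes()

        # Convert cut points to frame indices and keep only those that
        # fall strictly inside the file, then add the start and the end.
        cuts = sorted({int(t * rate) for t in cut_points_sec})
        edges = [0] + [c for c in cuts if 0 < c < total] + [total]

        for i, (start, end) in enumerate(zip(edges, edges[1:]), start=1):
            src.setpos(start)
            frames = src.readframes(end - start)
            path = out_dir / f"chunk_{i:02d}.wav"
            with wave.open(str(path), "wb") as dst:
                # The frame count in the header is corrected on close.
                dst.setparams(params)
                dst.writeframes(frames)
            chunk_paths.append(path)

    return chunk_paths


def join_wavs(chunk_paths, dst_path):
    """Concatenate chunks (assumed to share one format) into one WAV."""
    with wave.open(str(dst_path), "wb") as dst:
        for i, path in enumerate(chunk_paths):
            with wave.open(str(path), "rb") as src:
                if i == 0:
                    # Take channel count, sample width and rate from
                    # the first chunk.
                    dst.setparams(src.getparams())
                dst.writeframes(src.readframes(src.getnframes()))
```

In practice you would pick the cut points by ear at the gaps between lines, run `split_wav` on the raw verse, convert each chunk on the site, and then pass the converted files to `join_wavs` in the same order.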

At first, I actually recorded the vocals myself as the source material for the model. The problem was there was just a bit too much of “me” in there; it sounded like a tiny Jewish man singing as Grimes.

Voice models today still seem to work best when the source voice is closer to the model voice. Cherie’s voice translated much better than mine — and then GrimesAI sounded better than both of our voices for what we were trying to achieve. Few things are as important as the sound of the singer’s voice in a production, so while the workflow still has room for improvement, what we were able to create was game-changing for the song. The juice was certainly worth the squeeze.

Eggroll economics

Compared to the rapid, chaotic pace of generative AI developments, the economics of “Eggroll” are actually quite straightforward.

Through Elf Tech, Grimes has created a centralized hub where folks can submit their songs to be blessed by her team and properly registered and distributed on streaming services, including a 50% master royalty split with Grimes. We went through this same manual process to get “Eggroll” approved for distribution; the timeline from initial submission to final release was four weeks.
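To make the split concrete, here is a small sketch of the arithmetic. The 50% figure comes from the terms described above; everything else is an assumption for illustration, including the function name `split_master_royalties`, the idea that the split applies to a single net master royalty figure, and the rounding rule. The dollar amount in the usage note is made up and is not real “Eggroll” revenue.

```python
from decimal import Decimal, ROUND_HALF_UP

# The 50% master royalty share described in the article.
GRIMES_MASTER_SHARE = Decimal("0.50")


def split_master_royalties(net_royalties, grimes_share=GRIMES_MASTER_SHARE):
    """Split a net master royalty amount between Grimes and the creators.

    Assumption: Grimes' share is rounded to the nearest cent and the
    creators receive the remainder, so the two parts always add up to
    the original amount.
    """
    net = Decimal(str(net_royalties))
    cent = Decimal("0.01")
    grimes = (net * grimes_share).quantize(cent, rounding=ROUND_HALF_UP)
    creators = net - grimes
    return {"grimes": grimes, "creators": creators}
```

For example, `split_master_royalties("1000.00")` gives 500.00 to each side. How the creators' half is then divided among collaborators is a separate agreement that this sketch does not model.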

Presumably, this submission process will become more automated down the line as the industry catches up to fan behavior. (Grimes’ recently announced partnership with TuneCore is a step in this direction, scaling the GrimesAI approval process across multiple distributors.) If desired, artists also have the option of making Grimes a secondary or featured artist on their song and distributing it themselves without any red tape. But my sense is most folks will want to try going through Grimes’ official infrastructure, especially knowing that the song will at least be heard by Grimes’ team on the other end.

This is perhaps the most interesting part about working with GrimesAI and Elf Tech: It’s as much about distribution as it is about creation.

As outlined in the previous section, making a song with GrimesAI still feels very similar to a standard music production process, defined by hands-on, human-led iteration and creation. It’s not a full-on “AI record” in the sense of pushing a button and spitting out a commercial-ready song with no effort.

Rather, thanks to the infrastructure Grimes has built around Elf Tech, voice AI acts more as a conduit for us — and any other fan on the Internet — to contribute to and be recognized officially in the Grimes creative universe. Its value-add is as social and participatory as it is aesthetic.

Mat Dryhurst, the artist, technologist, and co-creator of Spawning, speaks often about the incentives that are likely to push folks to want to do things above board with AI, especially the desire for an official blessing from the artist. Speaking from experience, that rings really true to me. When we submitted our song to the hub, I really hoped Grimes might hear it and love it and share it, all the things that Mat foretold. Cherie and I then felt huge satisfaction hearing positive feedback from Grimes’ team.

Here’s to using AI to unlock even more bangers, with fully consented parties and organized teams leading the way.