Why music AI companies are not flirting with copyright infringement

Bit Rate is our member vertical on music and AI. In each issue, we break down a timely music AI development into accessible, actionable language that artists, developers, and rights holders can apply to their careers, backed by our own original research.

This issue was originally sent out under our collab research vertical known as The DAOnload.


Over the past few weeks, our research team has been hard at work interviewing founders of several music companies leveraging creative AI technology.

After a total of seven in-depth interviews, some key differences have emerged in how music AI companies are building out their tech stacks compared to companies building visual- or text-based AI tools.

Many of you may have heard of or played around with visual AI models like Midjourney, DALL-E, and Stable Diffusion, or text AI models like ChatGPT. These models were trained on vast swaths of data scraped from the internet, with no distinction made between openly licensed and copyrighted material.

As it turns out, large portions of these images and texts were copyrighted. The past week alone has seen not one, but two lawsuits brought against visual AI companies: a class-action suit filed by a group of visual artists against Stability AI, Midjourney, and DeviantArt, and a separate suit brought by Getty Images against Stability AI.

Everyone knew such lawsuits were coming; it was just a matter of when, and how exactly they would play out.

In contrast, what we’ve heard over and over again from music-based AI companies is the importance of operating on proprietary or licensed training datasets, and even proprietary AI models in some cases.

Some of the companies we spoke to are focusing on building a better user interface, experience, or product around music/audio AI models (e.g. Never Before Heard Sounds). Others are building new generative models and systems from scratch, trained on their own proprietary libraries (e.g. Authentic Artists), or on music ethically purchased from its original creators with full permission to be used in training datasets (e.g. Harmonai, Mubert).

What none of these companies opted to do was publicly release or license a music AI model trained on large swaths of copyrighted or unlicensed material. Such models do exist, such as OpenAI’s Jukebox, which was trained on 1.2 million songs and can output audio and lyrics in the style of celebrities like Katy Perry and Frank Sinatra. But several interviewees said they had no intention of integrating these kinds of models into their products, because doing so was simply too risky and misaligned with their values.

Here are a few specific reasons that kept on surfacing in our interviews:

Legal defensibility

Internally, we have coined the phrase “RIAA boogeyman” to describe the brutal history of copyright enforcement around music.

Lawyers arguably wield more power and influence in music than in any other creative industry. Everyone remembers what happened with Napster. In general, whenever a new tech company hits scale on the back of unlicensed music content, the RIAA, NMPA, and other music trade orgs will come knocking, and AI will be no different. (The RIAA has already taken a public stance calling certain AI music mixing tools a form of piracy.)

From the startup perspective, building out this legal defensibility certainly isn’t cheap. Some of our interview sources have told us that creating original training data for audio composition and synthesis models, and training models on it, can cost millions of dollars. And even then, subsequent abuses of Content ID systems on platforms like YouTube or TikTok can create content management and monetization headaches for music AI companies, especially when millions of users are releasing songs built from similar-sounding source material.

But starting from the jump with an AI model trained on copyrighted, unlicensed content is essentially a death knell for a music startup.

Ethical and moral responsibilities

Many music AI founders and their teammates are musicians themselves, and find the idea of stealing from or violating the rights of a fellow creator untenable.

In fact, visual artists have pointed to music AI companies’ more permissioned data-sourcing practices as a standard for how visual and text AI companies should operate moving forward. The lawsuits cited above, along with educational resources like Have I Been Trained, may further increase momentum around establishing more ethical and transparent training practices for creative AI models across the board.

Moat creation

If everyone has access to the same large public datasets, how do AI companies meaningfully differentiate?

A company could choose to compete on UI, UX, and network effects, but a surefire way to differentiate is to have data that no one else has. Authentic Artists founder Chris McGarry told us that in order to train the best model and build the best product around it, he felt it was imperative to have handpicked, proprietary data. The theory being put into practice is that the higher the quality and the more targeted the data for a specific product, the better the model will perform.

This philosophy is also reflected in the growing interest in fine-tuning AI models on one’s own music samples and back catalog, a capability that companies like Harmonai and Semilla are building in direct collaboration with independent artists.
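For the developers in our audience, here is a minimal, hypothetical sketch of what fine-tuning on a fully permissioned catalog can look like in PyTorch. The tiny autoencoder, the reconstruction objective, and the my_back_catalog/ folder are placeholder assumptions for illustration only, not the actual Harmonai or Semilla tooling; in practice you would load a real pretrained checkpoint and continue training it on the artist’s own audio.

```python
# Conceptual sketch only: fine-tuning an audio model on an artist's own,
# fully permissioned catalog. The tiny autoencoder below is a stand-in for a
# real pretrained generative model, and "my_back_catalog/" is a hypothetical
# folder of the artist's own WAV files.
from pathlib import Path

import torch
import torchaudio
from torch import nn
from torch.utils.data import DataLoader, Dataset


class CatalogDataset(Dataset):
    """Serves fixed-length mono clips cut from the artist's own recordings."""

    def __init__(self, folder, sample_rate=16000, clip_seconds=4):
        self.files = sorted(Path(folder).glob("*.wav"))
        self.sample_rate = sample_rate
        self.clip_len = sample_rate * clip_seconds

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        audio, sr = torchaudio.load(str(self.files[idx]))
        if sr != self.sample_rate:
            audio = torchaudio.functional.resample(audio, sr, self.sample_rate)
        audio = audio.mean(dim=0)[: self.clip_len]      # mixdown to mono, truncate
        pad = self.clip_len - audio.numel()
        if pad > 0:                                      # pad short clips with silence
            audio = nn.functional.pad(audio, (0, pad))
        return audio.unsqueeze(0)                        # shape: (1, clip_len)


# Stand-in for a pretrained generative audio model; in practice you would load
# an existing checkpoint here and continue training from its weights.
model = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=9, stride=4, padding=4),
    nn.ReLU(),
    nn.ConvTranspose1d(16, 1, kernel_size=9, stride=4, padding=4, output_padding=3),
)

loader = DataLoader(CatalogDataset("my_back_catalog/"), batch_size=4, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)  # small LR: nudge, don't overwrite

model.train()
for epoch in range(3):
    for clips in loader:
        recon = model(clips)                             # reconstruct the artist's audio
        loss = nn.functional.mse_loss(recon, clips)      # simple reconstruction objective
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

The architecture and loss above are throwaway stand-ins; the point of the sketch is simply that every file feeding the optimizer is audio the artist owns and has cleared for training.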

It’s possible that in the future, there will be another large-scale music model trained on millions upon millions of copyrighted songs that is licensed and widely adopted across the industry. But for now — by necessity and as a long-term strategy — music AI companies are choosing a different path.


EVEN MORE RESOURCES

Didn’t have time to drop into our Discord server this week? No worries. Stay up to date right here in your inbox with the best creative AI links and resources that our community members are sharing each week:

Music/audio AI tools

Music/audio AI models and datasets

Industry case studies and news

Follow more updates in Discord