Thought Leaders in association with Partners in Crime

Robot Rock Part II: Does ‘AI’ Spell the End for the Custom Music House?

23/04/2024
Music & Sound
Culver City, USA
Mophonics' Kristoffer Roggemann demystifies new AI tools

Some things just don’t age well, like that unfortunate Aztec-print shirt you purchased at the flea market, or the song ‘Sixteen Going on Seventeen’...

Similarly, when we sat down with LBB last summer to dissect the hype around generative AI music tools, we were confident that our assessment of what these tools could and couldn’t do would last for at least a year, maybe two. At the time, the best generative AI music had to offer was MusicLM from Google’s AI Test Kitchen, which was quite underwhelming.

But within a few months, with a relentless forward march akin to the ever-expanding Star Wars franchise, Moore’s Law struck again. The world was introduced to mind-blowing AI tools like Suno (initially launched 12.20.2023, v3 launched 3.31.24) and now Udio (beta launch 4.10.24).

At first listen, these tools are frighteningly impressive… with a few words as a prompt, you can ‘create’ a fully fleshed-out song with lyrics, vocals, harmonies, and instrumental arrangements. For a lark, I fed Udio the brief ‘write me a 1950s doo wop style song about Mophonics’ and within a couple of minutes it spit out THIS. I mean… you have a pretty believable Drifters-esque pre-chorus, chorus and verse, complete with true-to-the-era vocal stacks and stylings, a legit-sounding backing band, a hooky, memorable melody, etc. My favourite touch is that the ‘singers’ take a breath before one of their lines.

Admittedly I had a bit of an ‘oh f—-’ moment… I stayed up for hours, making silly songs about everything from my son’s soccer skills to how awful the Jets are by simply feeding the prompts basic info… Then I started plugging in some simplified client briefs from the past year to see what would happen. Not gonna lie, there were actually a few decent idea fragments in there that a composer could conceivably use as a starting point.

We get ‘prompted’ by our clients all the time with similar criteria to the inputs we were giving AI: genre, tempo, instrumentation, lyrical subject matter, etc… So naturally, our next thought was ‘could this technology replace what we do?!’

Before we go into full pearl-clutching mode, let’s take a step back and demystify these new AI tools a little bit.


How They Work

Nerdy (but fascinating) explanation here.

TLDR version: Udio and Suno both utilise what’s called a ‘diffusion model’. Essentially, they are trained by ingesting huge amounts of existing popular / copyrighted music from the last 100 years, distilling it into a statistical ‘corpus’, then generating ‘new’ audio by starting from random noise and repeatedly refining it until it matches the end user’s textual description.
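For the extra-nerdy, the core idea can be shown in a heavily simplified toy sketch. This has nothing to do with Udio or Suno’s actual architecture, and it cheats: the ‘model’s estimate’ of the clean signal is handed to the loop directly, whereas a real diffusion model uses a trained neural network (conditioned on your text prompt) to predict it. The point is just the shape of the process: corrupt audio with noise during training, then generate by iteratively denoising.

```python
import math
import random

random.seed(0)

# A toy 1-D 'track': a pure sine wave stands in for a training example.
clean = [math.sin(2 * math.pi * t / 32) for t in range(64)]

def add_noise(signal, noise_level):
    """Forward process: corrupt the signal with random noise."""
    return [s + random.gauss(0, noise_level) for s in signal]

def denoise_step(noisy, estimate_of_clean, step):
    """Reverse process: nudge each sample toward the model's current
    estimate of the clean signal. (A real diffusion model predicts
    this estimate with a neural network; here we cheat and use the
    known clean signal.)"""
    return [n + step * (c - n) for n, c in zip(noisy, estimate_of_clean)]

# Generation: start from pure noise, then iteratively denoise.
x = add_noise([0.0] * 64, 1.0)
for _ in range(50):
    x = denoise_step(x, clean, 0.2)

# After enough reverse steps, x has converged close to the clean signal.
err = max(abs(a - b) for a, b in zip(x, clean))
print(round(err, 4))
```

Scale that loop up to billions of learned parameters and millions of ingested tracks, and you get something like a 50s doo wop song about Mophonics out of thin air.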


What They Can Do

Make pretty realistic-sounding songs quickly. This is evident after playing around with the platforms for a couple of minutes. The engines seem to do best with more formulaic styles like doo wop, blues and hair metal, and struggle with more open-ended styles like ‘Broadway’, jazz, etc. Sure, the songs are not perfect; you can hear weird telltale digital artifacts - the audio equivalent of the mangled six-fingered hands we’ve seen from Midjourney and the like. This is especially apparent with generated lyrics - taking the AI-generated 50s doo wop Mophonics song as an example, it quickly figured out Mophonics is a music company and based its lyrical themes on that… but questionable lyrics like ‘looking for that rhythm / sweet as a phrase’?! Interns have been dismissed for less.


What They Can't Do (Yet)

- Iterate. Udio and Suno can generate impressive content, but it’s next to impossible to iterate on a generated idea in a meaningful way. For example, you can’t say ‘ooh, I like this song you made, but I need a female singer instead’ and have it comply. That said, this type of editing is likely only a few steps away: a model good enough to generate music from scratch, like Udio or Suno, conceivably already has everything it needs to be repurposed for editing and iteration. We saw this on the image generation side when DALL·E 3 rolled out its own editing tool only a few months after its initial release.

So in the meantime, why is lack of iteration a problem? Well, think about it: what advertising client has EVER accepted a first draft idea as final? I’ll wait…

- Recall. Building on the lack of iteration, in this business we need to not only create new versions of tracks but also recall prior versions - this is incredibly limited with these new AI tools.

- Create music with a high emotional quotient. Novelty songs about jock itch are one thing, emotional songs are entirely another. Confirming our intuition, Stephen Arnold Music and SoundOut recently conducted a study of human- and AI-generated music and found that, unsurprisingly, human-made music scored way higher in creating emotional connections with the test audience. [Disclaimer: the test audience was composed entirely of humans… perhaps a mixed human/robot audience would have reacted differently? OK, I’ll stop.]

- Provide transparency on training sources. These platforms are clearly trained to ‘learn’ from existing human artists, but what exactly went into the corpus stays a black box.

  • Vocal impersonation. This is inherently problematic from not only a vocal likeness/right of publicity perspective, but also from a payment perspective. Can you honestly tell me Trey Anastasio from Phish shouldn’t sue over THIS song I made in Udio, or that Liz Phair shouldn’t be paid for THIS?! These are complete vocal rips. Tsk tsk.
  • Legality. Not only are there vocal impersonation issues, but I also quickly noticed that several ‘original’ songs I was creating clearly contained recognisably copyrighted material. Take THIS song I made with the prompt ‘an early 2000’s trash rap song about diaper rash’ (an actual client brief from earlier this year - thank you, Balmex). That backing track in the first few seconds is ABSOLUTELY Muse’s ‘Supermassive Black Hole’. This is 100% infringement that would line a client up for a supermassive lawsuit.

Create copyrightable music. The legal precedent has already been established that AI-created music is not copyrightable. Although one could re-record a song generated by AI and copyright the sound recording, that doesn’t skirt the issue of the song’s publishing still not being copyrightable AND possibly infringing on existing copyrighted works. I find it hard to believe a reputable music house would take this kind of risk with a blue chip client. Not to mention, no copyright means there’s no way to monetise the publishing royalties, or guarantee any kind of exclusivity for your ad clients… things can get messy very quickly.


So, What Does All This Mean for the Custom Music House?

- Faster ideation. Creators will potentially be able to turbocharge the initial stages of brainstorming song ideas (again, with the above caveats and risks). There are no doubt many who will find it icky/distasteful to build a song off of an AI-generated idea (that’s a whole separate op-ed piece), and the fact remains that it will still take clever writing and smart choices to work around the AI’s limitations and customise its output.

- Quicker calibration on music briefs. Many of our clients struggle to articulate their musical vision using words. Just like how a well-curated Spotify playlist functions as a ‘sonic mood board’ to align with our clients on tonality, style and mood more easily, these generative AI tools could allow clients to do their own experimenting and create new songs to reference in their briefs. This may turn out to be a good thing; after all, how many times have we gotten the exact same tired music references over and over? We are collaborators with our own artistic POVs but we are also ultimately working to achieve our client’s creative vision, so racking focus on that vision quicker could help the process. Let’s just hope there’s no ‘demo love’ for an AI reference song…

- Devaluation of music as a commodity. Just because AI can do something doesn’t mean the human version isn’t equally or more valuable. An amazing pencil sketch created by a computer vs. one done by a human... which one has more subjective value to us? Don’t @ me NFTs.

- Increased competition in the novelty song market. Is there a place in this world for cheap, mass-produced, dumbed-down music? Potentially yes - just look at the vacuum created on TikTok when Universal Music pulled their catalogue. We will likely see tons of AI-generated music flooding this medium where, among other things, kids search for 30-second novelty songs about farts. That is, until the novelty wears off.


Conclusion

In the short term, custom music houses have little to fear from diffusion model AI engines like Udio and Suno encroaching on their craft. These programs’ everyday usefulness will likely be somewhat limited until they improve. As usual, true originality and innovation require a human touch.

AI is already a part of our workflow. Say you wanted to mock up a part for a vocalist before your session - you can do that with tools like Kits AI that integrate with your DAW, are trained on actual consenting humans who are paid for their time, and are useful elements in the everyday creative process. We’re likely not far away from tools like Udio and Suno also playing a role.

When leveraged properly, generative AI may well prove to be transformative to the world of music creation. Understanding its strengths, limitations and risks is important to having productive conversations with your clients about the opportunities in this space.

AI is only getting smarter and more powerful; its quality is only going to improve, and rapidly. Can we teach it musicology and legality? We’ll see how well this article ages. BRB….
