Australian technology news, reviews, and guides to help you

Scripting sound: how AI could change what you listen to

You can get text out of nothing and images out of nothing. And now you can make music that way, as well.

For a long time, music has been regarded as one of those things you needed skill or talent to make, but that could well be changing thanks to everyone’s favourite buzzword “AI”.

In the past few years, we’ve already seen how artificial intelligence can be applied to image generation, with some of these images used in real life for stories, and there’s an abundance of AI text, as well. Artificial intelligence can do some exciting things, creative things, and it can even clone your voice and make videos seemingly from just an idea.

Something from nothing is largely the point of all of these concepts, which come as a result of algorithms and models being trained on data, working with prompts to create something new. The data is things created by other people, by artists and makers and creators of all sorts, as AI code joins the dots and finds the patterns in all of the work.

It creates these models to understand how to make something based on the work of others, and it’s how we end up with artificial but seemingly real art and photos and voices and video and more. And it is growing fast.

So what’s next? Music.

How music is traditionally made

When this journalist was young, he learned how to play cello, and then the bass, and he even learned how to sing. He was even in a band, at one point playing for Skulker, sitting in between the Aussie rock band and Laura Imbruglia, Natalie’s less famous sister.

Many people have credits littered like this. They learn an instrument, and either join a band and make music, or do it all solo.

Music is typically one of those things that you work on, first coming up with the song and then recording each element and track, before mixing it together. Home studios have no doubt made a difference, and thanks to the sheer number of microphones, sound interfaces, and excellent software available, literally anyone with the skills and the talent can make music.

It’s an exciting world if you happen to have a song in your head and some skill, not to mention the time to get it all done. All of those things are important when creating music traditionally, though you can also head to a proper studio with an engineer and do it that way, as well.

These are the standard ways of making music, and are very different from the new world of AI music.

Will a robot play a piano for you? Not really, but AI could make music all the same.

How AI music is made

AI music is a little bit different from the traditional process, and feels more like an interpretation from a film.

Back in 2007, Disney/Pixar’s Ratatouille noted that “anyone can cook”, and AI music is a little bit like that. It’s a little like a recipe, or even like scripting or screenwriting: you preface blocks of lyrics with cues for what should be played, and add some text to define the overall musical style.

Anyone can make AI music: all you need is a prompt, much like the image and text generation systems known for creative AI today.

That prompt will do something similar in music, giving an artificial intelligence some sense of direction, regardless of how small it might be.

It’s like puppeteering sound

It can be very much like puppeteering, except with sound. You’re essentially triggering music with structure, and letting the AI work out what to do. You can even get the lyrics automatically written if you so choose, with ChatGPT’s AI text creation services called upon for lyrical generation.

With the automatic tech in play, you’re basically just filling in a prompt and letting the machine do the math for you. Sometimes the results are brilliant, and other times it’s a bit of a mess.

But if you want some semblance of control, you can script sound almost as if you were puppeteering musical tracks.

Imagine a piece of music if it was written down. Think of it like a piece of theatre: it has an intro, a middle, and an end. The intro could be an instrument playing, or it could even get right into the song with words or sounds. The middle is likely the chorus, while the end is possibly a big finish.

These are structural elements to songs, and in the world of Suno’s music AI system, many can actually be triggered based on their position and scripting.

For instance, we generated music using the following code:

An upbeat double bass blues sung by a deep soulful lady of the blues
[Fingerstyle Double Bass Intro]
[Verse 1]
Every day I have juice and oatmeal
And if I'm lucky I read the news
It's a horror show outside in the world
And you can bet I have the blues
[Chorus]
I have a blues worth screaming
I have a blues worth crying, too
I have a blues that makes me angry
I have a blues with a deep shade of blue
[Fingerstyle Slap Double Bass Solo]
[End]

The results were varied, as you can hear below. Some are better than others. Most honour the request for a female vocalist, but the last does not.

We’re not entirely sure all are successful, but they definitely show that there’s no one specific way to make a piece of music. The code is giving us multiple interpretations, and each could be just as valid as another.
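To make the shape of that script easier to see, here is a minimal Python sketch that assembles the same prompt from a style line and a list of bracketed cues. The function name `build_song_script` and its arguments are our own illustration, not part of Suno or any other service, and the code only builds text that you would paste into a prompt box yourself.

```python
def build_song_script(style, sections):
    """Assemble a song script: a style line, then bracketed cues and lyrics.

    `sections` is a list of (cue, lyrics) pairs. `lyrics` is a list of
    lines, or None for an instrumental passage such as an intro or solo.
    This is a hypothetical helper for illustration only.
    """
    lines = [style]
    for cue, lyrics in sections:
        lines.append(f"[{cue}]")      # cues go in square brackets
        if lyrics:
            lines.extend(lyrics)      # lyric lines follow their cue
    return "\n".join(lines)


script = build_song_script(
    "An upbeat double bass blues sung by a deep soulful lady of the blues",
    [
        ("Fingerstyle Double Bass Intro", None),
        ("Verse 1", [
            "Every day I have juice and oatmeal",
            "And if I'm lucky I read the news",
            "It's a horror show outside in the world",
            "And you can bet I have the blues",
        ]),
        ("Chorus", [
            "I have a blues worth screaming",
            "I have a blues worth crying, too",
            "I have a blues that makes me angry",
            "I have a blues with a deep shade of blue",
        ]),
        ("Fingerstyle Slap Double Bass Solo", None),
        ("End", None),
    ],
)
print(script)
```

Keeping the style line, the cues, and the lyrics as separate pieces makes it easy to swap a genre or reorder sections and generate again, which matters when every run gives a different interpretation.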

Musical results

Spend enough time and credits with Suno’s AI music platform, and you’ll hopefully have a few songs you like. They might be instrumentals ideal for background music, particularly if you need tunes for a podcast, but they could also be songs with words.

Suno isn’t alone in what it’s trying to do, either. There are AI music makers in development that do different things, such as the sample generation being built at Beat Shaper, which promises to turn prompts into multi-track audio destined for software like Final Cut Pro.

Another full track generator exists in Stable Audio, which does an interesting job, though the results we found were often more haunting than radio-friendly.

There are also other services working to build music for background tracks, such as Mubert, Loudly, Aiva, and others. However, in our testing, Suno managed to get the closest to a finished piece you could be proud of from prompting. It’s really a totally different experience to creating music by hand.

We’re not at the point where Suno can take your voice and generate music using harmonies supposedly sung by you — we asked — but with enough prompting, you can get close to the voice of artists. Nameless artists likely trained on others the company isn’t talking about, but artists nonetheless.

Getting the same voice each time isn’t exactly easy, bordering on impossible at times, and you may as well expect your results to be an ensemble of musical players.

But the results are highly listenable, and that gave this journalist an idea: could he create an album of songs nobody wrote?

Internet security music

He went to work, inspired by a style of song no one had ever really tackled: internet security songs.

There’s a great reason no one had tried this before, too: music about security was never guaranteed to be catchy or worth listening to. But if anything could try it, AI certainly could.

For this song, the inspiration was the Software Publishers Association horror show of a song “Don’t Copy That Floppy”, a hip hop music video PSA that evokes violent shudders of disgust and embarrassment for anyone who recalls the thing.

It’s not a good example of tech-based tunes… which also makes it a perfect example to try out.

Instead, this journalist worked on the lyrics himself for a security-based song, but skipped the hip-hop, making it more of a dance rock vibe.

Called “Don’t Ask, Don’t Tell”, the result is a surprisingly catchy track that gives you tips to improve security while detailing why they’re important.

So we managed to make security into a catchy riff. Could we do it with more topics not exactly ready for the top 40 lists?

Hip-hop SEO

If security was a strange enough musical topic, search engine optimisation was just as awkward. And this time, we’d try the hip hop vibe of “Don’t Copy That Floppy”, only slightly more modern.

For this song, we tried using a combination of automatic lyrics via ChatGPT, and our own edits and words, massaging the words into a place where they not only made sense, but worked well with the AI, too.

In “Searchin’ For The Keywords (Funky SEO)”, we may have the first song about SEO, explaining some of what SEO specialists do in an also-catchy track using elements of hip-hop and soul.

At this point, we’re surprised at how capable Suno’s AI song generation platform is, but we’re not done.

Reinterpreting sounds

Several tracks in, it’s easy to form a judgement on AI music. Some of it is surprisingly great, and some markedly less so.

Comments about the music included acknowledgements that you could tell it was AI largely because of the auto-tune nature of some of the sounds and the weird way some lyrics worked (or didn’t), while someone else admitted it wasn’t bad, yet hated the automated nature of it.

At this point, we kind of felt like the music producers in “Rachel, Jack and Ashley Too”, an episode of Black Mirror that covered elements of AI music, in which the producers of an artist held in a coma extract her music while she is unconscious. This isn’t quite the same, but there are definite parallels that can be drawn.

In that Black Mirror episode, series creator Charlie Brooker even used music from Nine Inch Nails for an artist played by Miley Cyrus, reusing some of the same lyrics of songs and giving it a pop twist, almost as if new versions of the tunes had been spun out of AI for a totally different artist.

Suno includes a reference to not using someone else’s lyrics in its terms and conditions, so you definitely shouldn’t do that. But the point is that, just as Brooker did in Black Mirror, you technically can take existing lyrics and create a new song out of them.

More interestingly, you can create new sounds with new lyrics similar to other musicians.

For instance, adding “glam rock” created a vibe more like The Darkness, “metal with overlapping guitars” delivered music like Queens of the Stone Age, while “Texas blues” provided a sound more like Jonny Lang. If you fancy a more operatic sound, “opera” works wonders. You can probably guess the sounds of bands you’re trying to make, or even get Google or an AI system to describe songs for you, and use those processes to make the tracks you’re looking for.

Failed experiments

Some results weren’t as good as others, though, and just felt like they droned on. That could be a combination of us not doing a great job as artist, song scriptwriter, and puppeteer, but it could also be the AI not delivering as great a result. Much of this can feel like a crap-shoot at times.

Results can and will vary, and regardless of whether you opt for instrumentals or vocals, some pieces of music sound better than others.

We found jazz didn’t always work, and in fairness, it’s a hard genre to get right in real life, let alone for an AI to understand what’s going on.

Rock is a winner

Each time we tried, however, rock proved to be a winner. It was almost as if it was a reliable genre to get right, even with crazy topics and themes.

One song, about how we’re all addicted to subscriptions, came out in a style as if the Stone Temple Pilots and Nickelback both went into business creating business-style rock, if there even is such a thing. And it works. Almost confusingly so. We can’t get it out of our head.

Another was a taste of metal about marketing messages gone awry, and it seems to work as well.

It’s almost as if rock just can’t go wrong.

Also quite capable for song generation is electronic music, which delivers solid results most of the time, though you’ll need to work out how you want the song to end, otherwise you’ll essentially have a never-ending piece that doesn’t actually loop.

The elements of song structure become crucial in generating music out of Suno AI, and we expect that’s a good thing, because it allows you to exert some control.
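As a rough illustration of checking structure before you spend credits, the Python sketch below pulls the bracketed cues out of a script and flags one that has no ending. The helper names, the `[Build]` and `[Drop]` cues, and the rule that a script should finish on `[End]` are our assumptions for the example, based only on the prompt shown earlier, not on any documented list of tags.

```python
import re

# Assumption: "[End]" is the only ending cue we know from our own prompt.
# Add other labels here if you find they work for you.
ENDING_CUES = {"end"}


def section_cues(script):
    """Return the bracketed cues in the order they appear in the script."""
    return re.findall(r"\[([^\]]+)\]", script)


def has_ending(script):
    """True if the final cue in the script is a known ending cue."""
    cues = section_cues(script)
    return bool(cues) and cues[-1].strip().lower() in ENDING_CUES


# Hypothetical electronic instrumental with no ending cue.
looping = "Driving synthwave instrumental\n[Intro]\n[Build]\n[Drop]"
finished = looping + "\n[End]"

for name, text in (("looping", looping), ("finished", finished)):
    status = "ends cleanly" if has_ending(text) else "may never end"
    print(f"{name}: {section_cues(text)} -> {status}")
```

A check like this cannot tell you whether the AI will honour the cue, only whether you remembered to ask.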

AI music’s uses now

Strangely, because some of these tracks are easy to generate and listen to, it also makes it possible for you to throw them onto a music service, such as Spotify or Apple Music. That’ll depend on the service’s terms, as not all allow commercial use like this, but some do.

What you won’t be able to do is reverse engineer the tracks, at least not by ripping apart the sounds from each layer. In the world of Suno, and indeed some of the others we looked at, the tracks are premixed, almost like a cocktail from a bar. You can’t take the tequila out of a margarita someone has made for you, and just like that, you also can’t take the vocals out of an AI generated track. It is pre-made.

However, you may be able to use plugins in an audio workstation to remove vocals if needed, and if you’re skilled enough, you can go back to the old school approach of making the music by hand by essentially recreating the music AI has generated for you.

Music writing

In that respect, AI music becomes a perfectly capable song idea generator where you can have it be your amazeballs song writing team. You might not be Holland-Dozier-Holland of the Motown era, but you could still build some inspiration with ease and produce songs from there.

Based on our experience, we’d probably write our own lyrics and generate from that, however, as the AI generated lyrics don’t always deliver a rhythmic verse or chorus. It can take more effort than you might expect.

Background music

One obvious approach for AI music makes a lot of sense right now, and that’s background tracks for podcasts and other programs.

This is probably the easiest approach, since you just need a prompt to set out a vibe, some instruments, a genre, and so on, and can leave the track as an instrumental. You’ll still want to use song structure in your composition, otherwise the song may not ever end.

But if you’re looking for inexpensive background music, AI might have the answer. A musician and composer will do a better job with more customisation and control, but AI could deliver at a lower cost.

Entertainment

There’s also the entertainment angle, which allows anyone to make music without too much effort.

We made songs for our kids simply with a prompt, and even made some cute little comedic jingles akin to wrestling opening anthems.

“Entertainment value” is one value that works, but don’t expect AI to help you to recreate songs. You can’t just say “make me a David Bowie song”, because that’s not going to work.

Case in point, we tried to recreate David Bowie’s “I’m Afraid of Americans” by describing it to a T.

The prompt seemed to match the description of the song:

Hard electronic industrial rock sung by a monotone British voice and mixed by Brian Eno

The result wasn’t quite the same, but the Bowie influence can definitely be heard in the vocals, even if the sound is a little more like hard rock.

Clearly, you’re not going to be recreating music here, nor will you find your inner Weird Al Yankovic by making comedic versions of the same songs. You still need talent and skill for that.

But you can definitely make your own music and share it with the web.
