How to make an AI music track?

April 30, 2023

The song "Heart on My Sleeve", featuring AI-generated voices that sound like Drake and The Weeknd, went viral on TikTok with over 15 million listens. It was subsequently removed by streaming platforms after a complaint from Universal Music Group (UMG). Since then, we've seen an explosion of tracks created using the voices of famous artists such as Tupac, Biggie Smalls and Kanye. There are plenty of fake tutorials out there, so how do you actually do it? 🕵️‍♂️

Step 1: Get So-Vits-SVC

So-Vits-SVC is open-source Singing Voice Conversion (SVC) software. You can download it from its GitHub repository.

If you're looking for a tutorial on how to install it, here are some resources to help you get started 👇

Step 2: Get The AI Voice

Next, we need to get the AI singing voice model 🤖🎙

There are several open-source repositories with artificial singing voices made to resemble famous artists and public figures (e.g. Alicia Keys, Drake, Nas, Kanye, Tupac, Biggie, Michael Jackson, Donald Trump and even Homer Simpson). For example, this repository on Hugging Face had collected almost 200 voices at the time of writing: AI Music Voices. Make sure to download both the model file and the config file for the voice you pick.
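
Forgetting one of the two files is an easy mistake, so here is a minimal sketch for checking a downloaded voice folder before moving on. It assumes the model is a PyTorch checkpoint ending in .pth and the config is a .json file; those extensions, the function name find_voice_files and the folder name in the usage note are assumptions for illustration, not something the repository guarantees.

```python
from pathlib import Path


def find_voice_files(voice_dir):
    """List candidate model and config files in a downloaded voice folder.

    Assumption: models end in .pth and configs end in .json.
    Raises FileNotFoundError if either kind of file is missing.
    """
    folder = Path(voice_dir)
    models = sorted(folder.rglob("*.pth"))
    configs = sorted(folder.rglob("*.json"))
    if not models:
        raise FileNotFoundError(f"No .pth model file found in {folder}")
    if not configs:
        raise FileNotFoundError(f"No .json config file found in {folder}")
    return {"models": models, "configs": configs}
```

For example, find_voice_files("voices/my_voice") (a hypothetical folder) returns the paths you would later enter as the model path and config path. If a folder holds more than one checkpoint, you still need to pick the right one yourself.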

Alternatively, you can train a new model on your own voice, or on any other voice, by uploading audio recordings of it to So-Vits-SVC. Training should take several hours. ⏳

Step 3: Get the vocals and instrumentals

If the track you want to reproduce is your own, this is easy. If not, you can upload the track to AI software like Vocalremover.org to separate the vocals and the backing track into two files.

Step 4: Create the new sound

Now it's time for the magic 🪄✨

Open up So-Vits-SVC (instructions found in Step 1), and set the model path and config path to the model and config files of your AI voice from Step 2. Then set the input audio path to the vocals you want to convert, and select the "Infer" button at the bottom to create your new sound. Give it a few minutes, and it should convert the original vocals into the voice of the selected model. 🤯
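
The steps above stop at the converted vocals. If you want to lay them back over the backing track from Step 3, the sketch below shows one simple way to do it with only the Python standard library. It is not part of So-Vits-SVC. It assumes both files are 16-bit PCM WAV files with the same sample rate and channel count (you may need to convert them first), and the function name mix_wavs and the gain parameters are made up for this example.

```python
import array
import sys
import wave


def mix_wavs(vocals_path, backing_path, out_path, vocal_gain=1.0, backing_gain=1.0):
    """Mix two 16-bit PCM WAV files sample by sample and write the result."""
    with wave.open(str(vocals_path), "rb") as v, wave.open(str(backing_path), "rb") as b:
        if v.getsampwidth() != 2 or b.getsampwidth() != 2:
            raise ValueError("Both files must be 16-bit PCM")
        if (v.getframerate(), v.getnchannels()) != (b.getframerate(), b.getnchannels()):
            raise ValueError("Sample rate and channel count must match")
        rate, channels = v.getframerate(), v.getnchannels()
        vocals = array.array("h", v.readframes(v.getnframes()))
        backing = array.array("h", b.readframes(b.getnframes()))

    # WAV samples are little-endian; swap bytes on big-endian machines.
    if sys.byteorder == "big":
        vocals.byteswap()
        backing.byteswap()

    # The shorter file is treated as silence once it runs out.
    length = max(len(vocals), len(backing))
    mixed = array.array("h", bytes(2 * length))
    for i in range(length):
        sample = 0.0
        if i < len(vocals):
            sample += vocals[i] * vocal_gain
        if i < len(backing):
            sample += backing[i] * backing_gain
        # Clip to the 16-bit range so loud passages do not wrap around.
        mixed[i] = max(-32768, min(32767, int(round(sample))))

    if sys.byteorder == "big":
        mixed.byteswap()
    with wave.open(str(out_path), "wb") as out:
        out.setnchannels(channels)
        out.setsampwidth(2)
        out.setframerate(rate)
        out.writeframes(mixed.tobytes())
```

Lowering vocal_gain or backing_gain below 1.0 is a crude way to balance the two parts and avoid clipping. A proper audio editor will give you far more control.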

Step 5: Additional tips

Here are a few tips to get the best sound:

Play with the So-Vits-SVC settings to improve the sound. For example, I would change the default pitch from +12 to 0.
Use input vocals that don't have Auto-Tune - otherwise, the result can sound weird.
Try to match the genres (e.g. don't make Eminem sing opera).
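
To see why the pitch setting matters so much, here is a tiny sketch of the arithmetic. It assumes the pitch value is measured in semitones, which is the usual convention but is not stated above. In equal temperament, each semitone multiplies the frequency by the twelfth root of two, so +12 doubles every frequency (a full octave up) and 0 leaves the voice where it is.

```python
def pitch_ratio(semitones):
    """Frequency multiplier for a shift of `semitones` in equal temperament."""
    return 2.0 ** (semitones / 12.0)


print(pitch_ratio(12))  # 2.0, one octave up
print(pitch_ratio(0))   # 1.0, pitch unchanged
```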