Royal Cities
@royalcities.bsky.social
46 followers 72 following 78 posts
Research, Solutions and Model Design @ Audialab. | Stream my new release here! https://open.spotify.com/album/3FnXFAI7IGJukQ94zRWsN2
royalcities.bsky.social
Y'all haven't lived until you start watching nature documentaries.

They spend like an hour building an entire emotional arc around some random crab and it ALWAYS ends in triumph or total annihilation - no in-betweens.
royalcities.bsky.social
The .webp image format has become the bane of my existence.
royalcities.bsky.social
I hope you enjoy using it!

If you have any suggestions for future models (genres, sound selections, etc.) feel free to hit me up.

With all that said I hope you all have a great day!

Also please consider sharing this with other producers who may put it to good use!
😊
royalcities.bsky.social
Of course, this also DOES have VST compatibility in the free Audialab Engine for anyone looking to use it in DAWs!

🎉
audialab.com/products/de...
royalcities.bsky.social
I've also taken the liberty of updating the RC Gradio to take full advantage of it, so for those of you using it, make sure you update to the latest version.

Just some UI enhancements, 16-bit support & better random prompt management.

👇

github.com/RoyalCities...
GitHub - RoyalCities/RC-stable-audio-tools: Generative models for conditional audio generation
royalcities.bsky.social
For convenience I've also quantized it down to 16-bit as well. I've noticed literally NO audio quality loss compared with the full model, so for anyone using lower spec hardware or just trying to save some VRAM, definitely go for the 16-bit version. 😊
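The memory saving from 16-bit quantization is easy to ballpark, and half-precision rounding barely moves values in the typical weight range, which is why quality holds up. A minimal sketch with an assumed parameter count (the model's real size isn't stated in the thread):

```python
import struct

# A 16-bit (half-precision) copy of a model uses half the memory of
# the full 32-bit version. Hypothetical ~1.2B-parameter model below -
# the count is an assumption for illustration, not an official figure.
num_params = 1_200_000_000
vram_fp32 = num_params * 4 / 1e9    # 4 bytes per float32 param -> GB
vram_fp16 = num_params * 2 / 1e9    # 2 bytes per float16 param -> GB
print(f"fp32: ~{vram_fp32:.1f} GB, fp16: ~{vram_fp16:.1f} GB")

# Rounding a single weight through fp16 ('e' struct format) loses
# very little precision for values in the usual [-1, 1] range.
w = 0.123456789
w16 = struct.unpack("e", struct.pack("e", w))[0]  # fp16 round-trip
print(f"original: {w:.9f}, fp16 round-trip: {w16:.9f}")
```

The round-trip error here is on the order of 1e-5, far below anything audible in the rendered audio.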
royalcities.bsky.social
But of course this is ALSO a LOCAL model and you can download it and use it to your heart's content if you have around 8-10 gigs of VRAM.

It's available for free via its HF page (which has further details and examples)
😀

huggingface.co/adlb/Audial...
adlb/Audialab_EDM_Elements · Hugging Face
royalcities.bsky.social
So with that said, the model is live RIGHT NOW as we test API-gen and you can begin playing with it here!

I HIGHLY suggest using the random prompt to get a feel for the metadata structure.

All samples and MIDI are yours to keep 😀
👇
audialab.com/edm/
royalcities.bsky.social
Side note - as a bonus, AI samples never run the risk of copyright strikes or licensing fees.

😉😍
royalcities.bsky.social
We need more tools that work WITH producers, and to me a sample is JUST A SAMPLE - AI or not - and it's what YOU DO WITH IT that sets you apart as a producer.

Mangle them, keep them as is, chop them up, pitch / formant shift them - it's up to you as an artist - just have fun!
royalcities.bsky.social
But as time goes on there WILL be more capable all-in-one models with greater sound selections - until then, though, the models will be focused.

Think of it sort of like sample packs - but instead of packs of samples you'll get AI packs with sound and melody palettes.

🎨🎵
royalcities.bsky.social
The main goal of this is really to keep experimenting and slowly work up towards larger models. While it's difficult making good models that don't use outside samples, it's still a learning process, and every model released provides insights into future ones. 😄
royalcities.bsky.social
It's entirely possible to build GOOD sample generators that don't rip off other creators' work, but it requires much more time, effort and engineering!
royalcities.bsky.social
This isn't a matter of lack of technical expertise - frankly it would be VERY "easy" to do this the wrong way and follow in the footsteps of every other AI solution, but it's just not a road worth going down.
royalcities.bsky.social
Now...while testing I've gotten some feedback saying "it doesn't know X genre or Y instrument" and this is BY design!

Since the samples are internal data creations (and weren't vacuumed up from across the internet like many VC-centric audio-gen models) it's not a Swiss Army knife.
royalcities.bsky.social
Just 3 gated examples below.

All effects (reverb, gate, and filter sweeps) can be prompted together or separately of course.

And yes all gates ARE Tempo Synced 😎
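Tempo-synced just means the gate's on/off period is derived from the track's BPM rather than set as a free-running rate. A rough sketch of that math, with assumed numbers (128 BPM, 44.1 kHz, a 1/16-note gate) chosen for illustration:

```python
# Tempo-synced gate: the chop rate locks to a note subdivision of the
# BPM instead of an arbitrary Hz value. Assumed example parameters:
bpm = 128
sample_rate = 44_100                # Hz
beat_sec = 60 / bpm                 # one quarter note in seconds
gate_sec = beat_sec / 4             # a 1/16 note = quarter of a beat
gate_samples = round(gate_sec * sample_rate)
print(f"1/16-note gate at {bpm} BPM: {gate_sec * 1000:.2f} ms "
      f"({gate_samples} samples at {sample_rate} Hz)")
```

Because the period is computed from the BPM in the prompt, the chops stay on-grid at any tempo the model is asked for.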
royalcities.bsky.social
I included the gate effect just to see if it could be done, and they're DEFINITELY context specific while writing, but they're there if you want them!

I'll often put gate modules in post for FX but when they're baked in they can be useful for interesting risers or fallers.
royalcities.bsky.social
Not bad!

As you can also tell, the model DOES know effects too. It has the usual reverb controls, but also rising + falling filter sweeps and some trance gate effects for good measure.

😊
royalcities.bsky.social
Supersaw Chord Progressions 👇
royalcities.bsky.social
Now this is all fun, but how about musicality? Well, I've shown off a few in the prior videos, BUT to cover it more and speak to both musicality and its knowledge of triplet time, let's just generate some identical prompts (with triplet and without) and see what comes out!

Arps/Bass 👇
royalcities.bsky.social
Here is this in action. 😀

Note how "fast speed" just means MORE chord progressions and melody switch ups in the same time intervals while slow is fewer.

We're slowly getting better control with the AI rather than just hoping for the best!

(Both are in G minor / 128BPM btw)
royalcities.bsky.social
While this may seem small, it marks the start of usable controls over the AI.

Further it also works for chord progressions and melodies!

The key difference here is you will often find the melodies or chord changes happen at longer intervals for say a "slow speed" vs "fast".
royalcities.bsky.social
For example, let's generate a saw arp at 140BPM.

The prompt is IDENTICAL minus one change - the words slow, medium & fast speed.

As long as we include the same BPM the model understands the difference relates to how often the notes are being subdivided.

So let's see that in action!
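The experiment above boils down to plain prompt templating - vary one keyword, hold everything else fixed. A sketch of that setup; the prompt wording here is hypothetical, not the model's actual metadata format:

```python
# Build three otherwise-identical prompts that differ only in the
# speed keyword. Prompt structure is a made-up placeholder.
base_prompt = "saw arp, {speed} speed, 140BPM, G minor"
prompts = [base_prompt.format(speed=s) for s in ("slow", "medium", "fast")]
for p in prompts:
    print(p)
```

Every variant carries the same BPM, so any difference in the output comes down to the model's learned notion of speed (note subdivision), not tempo.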
royalcities.bsky.social
Well SPEED and BPM are NOT ALWAYS the same thing.

You can be writing in 140BPM but want a "slow" arp and ALSO want a fast arp for say drops or risers etc.

This model has learned BPM independent of SPEED and it can be prompted to take advantage of this!