AI Retakes - More Control

something i’ve noticed with any synth, ai or concatenative, is that it CANNOT handle rap or that kind of stylized, speech-like singing.
being able to tune retakes a bit more toward a specific tone like “exhale” or “release” or “vocal fry” would be kind of groundbreaking in how flexible it would make the engine. though, i’m no programmer, so i don’t know how much work would actually go into adding that to the program


Based on my understanding of the AI process, it’s likely a developer would have to go through certain recorded samples (“learning materials”) and denote which exact portions represented each of the desired expressions - exhale/release/vocal fry/etc.

Once a human developer has indicated enough of these expressions, presumably the AI (specifically, the machine learning algorithm) could scan the remaining learning materials, find other examples of each expression, and so build a better model of how to synthesize that sound.
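To make that workflow concrete, here's a toy sketch (purely hypothetical, not Dreamtonics' actual pipeline): a developer hand-labels a few segments with an expression tag, and a simple nearest-centroid classifier then scans "unlabeled" segments to propose more candidates for each expression. The two features and all the numbers are made up for illustration.

```python
# Hypothetical weak-labeling workflow: hand-label a few segments,
# then scan unlabeled material for more examples of each expression.
from statistics import mean

# Each segment is reduced to an invented 2-feature vector:
# (breathiness, pitch_instability) -- illustrative values only.
labeled = {
    "vocal_fry": [(0.20, 0.10), (0.25, 0.15)],
    "exhale":    [(0.90, 0.80), (0.85, 0.90)],
}

def centroid(points):
    """Average each feature axis across the labeled examples."""
    return tuple(mean(axis) for axis in zip(*points))

centroids = {tag: centroid(pts) for tag, pts in labeled.items()}

def classify(segment):
    """Return the expression tag whose centroid is closest to the segment."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda tag: dist2(segment, centroids[tag]))

# Scan "unlabeled" segments and propose tags for developer review.
unlabeled = [(0.22, 0.12), (0.88, 0.85)]
proposals = [classify(seg) for seg in unlabeled]
print(proposals)  # ['vocal_fry', 'exhale']
```

A real system would use learned audio features and a proper model rather than two hand-picked numbers, but the loop is the same: label a little, propose candidates, have a human confirm, repeat.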

So based on that supposition, and the fact that the existing system already lets you cycle through retakes until you find one that fits, it’s probably possible. I would expect the main obstacles to be isolating the expressions in a form the AI can learn from, and having training material with enough instances of each expression to build an adequate profile.

I’m always in favor of better user control over the synthesized output, so hopefully Dreamtonics can pull it off.


if anything i know about ai applies here, i think it wouldn’t be any harder to pull off than vocal modes were, since it’s the same basic premise of having a specific set of data the ai needs to learn from. anything that gives more control over retakes is something i’m down with, since as it is, you often need to cycle through ~10 retakes to get the sound you’re looking for.