Hi. I saw VocaListener for the YAMAHA Vocaloid. I believe it listens to a real singing voice and, with the help of the lyrics, estimates the parameters needed to configure Vocaloid so that it sings a better approximation of that particular singing style and voice. Is it possible for Synthesizer V to have the same kind of helper function or app?
You could use VocaListener to create a VSQ in Vocaloid, then convert the VSQ to SVP with UtaFormatix, since it does a better job of converting pitch bends than the default VSQ import function.
hataori has a toolkit for doing just that. Here’s an example of what’s been done with it:
Check out this playlist for details:
Hi. Thanks for that. It looks promising. I will check it out!
Hi. I checked the videos for the hataori scripts and Praat. It seems they work very well for getting better singing out of the AIs. However, VocaListener was the wrong example for my purpose; I did not know exactly what it did. So to speak, those parameters are applied POST-synthesis to improve the singing.

What I was actually looking for was a way to tweak the PRE-synthesis parameters of the voicebanks, such as Loudness, Tension, Breathiness, Gender, Tone Shift, pitch transition patterns (Left, Right, Depth and durations), and vibrato patterns (Start, Left, Right, Depth, Freq., etc.). The aim is to give the voice different characteristics so it suits a particular style of singing, e.g. classical, choir, blues, jazz, Latin, belted, sotto voce, etc. Having no idea how, I was doing it by trial and error, and it is frustrating. So I thought there must be a more objective way to set these parameters, so I can get the desired voice, closer to the particular singing voice I have in mind. If you know how, any help would be appreciated! Cheers.
Unfortunately, apart from the automatic "Pitch Tuning", there is no way to set the other parameters automatically. You are expected to tweak them yourself until the result matches your idea (no listener can read your mind - yet )
But I think imitating a particular style could possibly be done by AI in the future (as the automatic pitch tuning already is). AI needs a lot of training data, so first we need to analyse real singing voices to obtain it.
This is why I am trying to copy real voices to synth voices with as much similarity as possible.
I think this POST vs. PRE distinction is a little confusing. The synthesis is done FROM the parameters, so they are always PRE-synthesis, aren't they?
The way I did it with Synth V 1st edition was to take an "analog" vocal recording in a DAW, convert the melody to MIDI, then import the resulting MIDI file into Synth V. Worked a treat. I then replaced the la-las with words and voilà! A song.
I found that, starting from scratch, it was very easy to get the note timing wrong, and one line could become a psych job. The MIDI import, while needing a tweak here and there, did all the heavy lifting.
All that said, it would be nice if you could make Synth V record a realtime string of la-la blobs, tracking the pitch and timing of an incoming signal. I have found that pitch is waaay easier to hand-correct than the timing of phrases, so we wouldn't have to be right on pitch but could provide the all-important timing into a mic, like so, with our mouth.
The way I do it is completely the opposite: no analog recording of singing is involved. I make a digital transcript of existing scores, produce MIDI files from them, and then import those into Synthesizer V to turn them into singing. The first problem is that, even though the MIDI files contain lyrics, Synthesizer V ignores them; I use Cmd-A & Cmd-L to batch-enter the lyrics. The next problem is that the default voicebank settings aren't suitable for classical singing. That's where the tweaking is necessary. So I wanted to know whether it is possible to estimate the parameters that should be adjusted to customise the synthesized voice, using some analysis software on analog recordings of real singers.
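As a workaround for the ignored lyrics, the lyric meta-events can at least be pulled out of the MIDI file programmatically, ready to paste into the batch-lyrics dialog. Here is a minimal stdlib-only Python sketch (the helper name is mine; it naively scans for the 0xFF 0x05 lyric-event pattern, assumes single-byte event lengths, and does not fully parse the MIDI stream, so it can misfire on binary payloads that happen to contain the same bytes):

```python
def extract_midi_lyrics(data: bytes) -> list[str]:
    """Scan raw MIDI bytes for lyric meta-events (0xFF 0x05 <len> <text>).

    Simplistic sketch: assumes each length fits in one byte (< 128)
    and does not disambiguate against other events' payload bytes.
    """
    lyrics = []
    i = 0
    while True:
        i = data.find(b"\xff\x05", i)
        if i == -1 or i + 2 >= len(data):
            break
        length = data[i + 2]
        text = data[i + 3:i + 3 + length]
        lyrics.append(text.decode("latin-1"))
        i += 3 + length
    return lyrics


# Example: two fabricated lyric events ("la-" and "la") with delta times
raw = b"\x00\xff\x05\x03la-\x10\xff\x05\x02la"
print(extract_midi_lyrics(raw))  # -> ['la-', 'la']
```

A proper MIDI library would be more robust, but for pasting syllables into the lyrics dialog a quick scan like this is often enough.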
If you don’t want to do the timing with Praat, look at one script from the suite, quantizePitch.lua - I think this is what you are talking about.
But it needs the pitch of the WAV file to be analyzed by Praat first, as described in the Project Preparation video (see above).
It converts the pitch curve to notes.
I am trying something - could you send me by PM a short example (~5 seconds) of a recording of your own voice singing classically?
You can analyse the rendered WAV file and the original vocal separately in VocalShifter, and then compare the pitch and loudness curves of the two.
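For the pitch half of that comparison, the visual check in VocalShifter can also be reduced to a single number: the frame-wise deviation in cents between the two F0 curves. A rough stdlib sketch (the helper is mine, and it assumes both curves are already time-aligned and sampled at the same frame rate):

```python
import math

def cent_error(f0_ref, f0_syn):
    """Mean absolute pitch deviation in cents between two aligned
    F0 curves (Hz per frame); unvoiced frames (0.0) are skipped."""
    cents = [abs(1200.0 * math.log2(s / r))
             for r, s in zip(f0_ref, f0_syn) if r > 0 and s > 0]
    return sum(cents) / len(cents) if cents else 0.0

# A synth take that sits a quarter tone (50 cents) sharp everywhere
ref = [220.0, 440.0, 0.0, 330.0]
syn = [f * 2 ** (50 / 1200) if f else 0.0 for f in ref]
print(round(cent_error(ref, syn), 2))  # -> 50.0
```

The same per-frame differencing works for loudness curves, just with dB instead of cents.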
Hataori, your tutorials and scripts are amazing - I have learned a lot from you.
I hope that in the future tension can also be transcribed in SynthV.
For the moment I work with this…
and I’m waiting for the Cangqiong AI and Muxin AI updates, just wanted to say
PS: My English is very bad, I’m French
Aren’t we all?
Your English is fine.
Look at this video, where I tried to auto-track the tension, gender, tone shift and loudness parameters to match the original voices - see the video’s description.
Compare it with the original (if you can listen to J-POP )
In short: comparing vowel spectra against a pre-synthesized database. It only works for the voices I have, and it is difficult to do.
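The matching step hataori describes could, in principle, look something like this: keep a small database of vowel spectra pre-rendered at known parameter settings, then pick the setting whose spectrum is closest (here by cosine similarity) to the real singer's vowel. Everything below is a toy illustration under that assumption, not hataori's actual code; the spectra and parameter values are made up:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length magnitude spectra."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def best_match(target_spectrum, database):
    """database: {param_setting: spectrum}; returns the setting whose
    pre-synthesized spectrum is most similar to the target."""
    return max(database, key=lambda k: cosine(target_spectrum, database[k]))

# Toy database: vowel spectra rendered at two Tension settings (fabricated)
db = {
    ("tension", -0.4): [1.0, 0.8, 0.3, 0.1],
    ("tension", 0.5): [1.0, 0.5, 0.6, 0.4],
}
real_vowel = [1.0, 0.55, 0.55, 0.35]  # measured from the singer (made up)
print(best_match(real_vowel, db))  # -> ('tension', 0.5)
```

In practice each database entry would cover several parameters at once and many vowels per voice, which is presumably why it is hard to squeeze into a single script.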
I don’t know yet how to make it simple enough to get it into one script.
Great, great work, Hataori!!!
I saw your latest video on YouTube too, amazing!!! You make SynthV even better, and I would pay for what you do, sincerely
@Kanru: please look at his work!!!
Thanks for your work, Hataori; you and Kanru could make SynthV very powerful
Definitely not me
And Kanru apparently has different interests.
Ok, I understand
Amazing work again
Hataori, don’t you have a script to make the Cangqiong and Muxin AI updates arrive faster??? OK, I tried