How to make Saki AI sing in English/German? Do I need a dictionary?

Tim78 · July 25, 2022, 4:11 PM

Hi, I’m trying to understand how to take an AI voicebank and use it with another language. Do I need to create the dictionaries myself, or are there premade ones available? For example, using Saki with English and German.

I’m not sure I understand the workflow to accomplish this. (Apologies if the answer exists in another thread.)

claire · July 25, 2022, 4:39 PM

Custom dictionaries only change the word-to-phoneme mapping. That is to say, when you enter a word in a note the software converts it to phonemes that represent the individual sounds that make up the word.

For example, entering “hello” is converted to hh ax l ow with the default English dictionary.

By entering a user dictionary entry for “hello”, I can change which phonemes are used when I enter that word as a lyric.
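As a rough illustration of that lookup order (this is a sketch, not how SynthV Studio is actually implemented), a user dictionary can be thought of as an override layer on top of the default one. The function name `to_phonemes`, the dictionary contents, and the replacement phonemes `hh eh l ow` are all assumptions made for the example; only the default `hh ax l ow` mapping comes from the explanation above.

```python
# Hypothetical default dictionary; only the "hello" entry comes from the thread.
DEFAULT_DICT = {"hello": ["hh", "ax", "l", "ow"]}


def to_phonemes(word, user_dict=None):
    """Return the phonemes for a lyric, preferring a user dictionary entry."""
    key = word.lower()
    if user_dict and key in user_dict:
        return user_dict[key]
    # Fall back to the default dictionary; None means the word is unknown.
    return DEFAULT_DICT.get(key)


# Illustrative user dictionary that swaps the second phoneme.
my_dict = {"hello": ["hh", "eh", "l", "ow"]}

print(to_phonemes("hello"))           # ['hh', 'ax', 'l', 'ow']
print(to_phonemes("hello", my_dict))  # ['hh', 'eh', 'l', 'ow']
```

Either way the output is still a list of phonemes from the same language, which is why the next point matters.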

It is important to note that each language has a different phoneme list, i.e. a different list of sounds that the voice is capable of producing. This means that user dictionaries made to sing in other languages can only approximate another language, since they are still limited to the phoneme list of the default/native language and are therefore missing some of the necessary sounds.
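To make that limitation concrete, here is a small sketch that checks a dictionary entry against a phoneme list. The set below is a partial, assumed subset of English phonemes, the helper `unsupported` is hypothetical, and `"C"` is just a placeholder symbol for the German “ich” consonant, which is assumed to be absent from the English list.

```python
# Partial, illustrative subset of an English phoneme list (not the full list).
ENGLISH_PHONEMES = {"hh", "ax", "eh", "l", "ow", "ih", "k", "sh"}


def unsupported(phonemes, phoneme_list):
    """Return the phonemes that are not in the given phoneme list."""
    return [p for p in phonemes if p not in phoneme_list]


# German "ich" needs a consonant English lacks; "C" is a placeholder for it.
print(unsupported(["ih", "C"], ENGLISH_PHONEMES))   # ['C']

# An approximation built only from available sounds passes the check,
# but it will not sound exactly like the German original.
print(unsupported(["ih", "sh"], ENGLISH_PHONEMES))  # []
```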

Cross-lingual synthesis is a feature that can be used with non-lite AI voices and requires the Pro edition of SynthV Studio. It allows AI voices to access the phoneme lists of languages other than the voice database’s “native” language by using a large amount of machine learning analysis to fill in the gaps (such as sounds the voice provider never actually recorded). Please note that AI voices still have a “native” language and it is normal for them to have an accent when singing in a different language.

Keep in mind this is not a seamless process. While the cross-lingual synthesis feature is capable of producing sounds from various languages, it has not been “trained” to seamlessly transition between each language (this would require a lot more machine learning analysis). As a result, you can only select one language per track or note group.

This setting can be found in the Voice panel.

Once the “Sing in the following language” setting is changed you will see that you gain access to the selected language’s phoneme list.

Currently, the only languages supported are Japanese, Mandarin Chinese, and English. You can attempt to use existing phonemes in these three languages to produce German-sounding results via a user dictionary and a large number of manual phoneme edits, but you will likely have to make compromises where the correct German pronunciation cannot be accurately produced.

Tim78 · July 25, 2022, 4:56 PM

Amazingly clear response, thank you! I am happy that I have the option of attempting to create alternate languages through the available phonemes.

So just to be clear, English would be supported on all AI voicebanks even if it is non-native through Synth V Pro?

claire · July 25, 2022, 5:00 PM

As long as you are using a full/paid AI voice (not “Standard” or “lite”) and using the Pro edition, then all three supported languages will be available. If you don’t see the language dropdown from the screenshot above, make sure you’re not using an old version of the product. If you need help updating, check the “Updating your products” section of this tutorial:

Tim78 · July 25, 2022, 5:03 PM

Understood, thank you!

Saiden · April 6, 2023, 3:44 PM

Does that mean if DT later decided to add, say, French (which I would definitely appreciate), it would theoretically not be necessary to update voice banks, but just let the cross-lingual engine do the heavy lifting?

claire · April 6, 2023, 4:07 PM

In theory, yes. Cantonese cross-lingual capability is currently in development and as far as we know that capability will be added to existing AI voice databases, with varying levels of accent of course.

That said, Cantonese is the first new language to be added to the software, so we don’t know for certain whether or not each voice database will need to be updated to support it, and there’s no way to know exactly how much work will be required (if any) to do those updates from a development perspective.

We at least know that AHS and Eclipsed Sounds are most closely partnered with Dreamtonics, so presumably once the update is released those will be the first third-party voice databases to be updated if necessary.