Tutorial Request: Chinese Voicebanks singing in English

Chinese voicebanks singing English tutorials wanted

To anyone who has had success in using a Chinese voice to sing an English song, could you please make tutorials on how you do it and what helped you do so?

Did you learn a lot of Chinese phonetics before you made the song? If you’ve used multiple Chinese voice banks for English songs, which one is your favorite (or list the ones you like using most, difficulty of tuning doesn’t matter)? What are some challenges we should expect when using the voice bank(s) you use? Do you have any preferences in tuning and function?

It would be nice to get tips or have references. I’ve heard Cangqiong, Stardust, Haiyi, Longya, Moke and Muxin sing in English. They sound so good and the accents are pleasing to hear! I’m sure it’s not easy but it seems worth the effort. Thank you in advance and thank you also for the songs you’ve made!


The commercial AI voices have a feature called “cross-lingual synthesis”. You can select the target language (Japanese, Chinese or English) and work as if the voice were native in that language.


There’s a slight accent to the voices, but you only have to work in the target language.


The aforementioned holds true for AI voicebanks only, doesn’t it? I was desperately trying to make Genbu sing in English and thoroughly failed. Same goes for Muxin, as his upgraded voicebank still hasn’t been released (yet?) for international customers. :sleepy:

I should start by making it clear that this method doesn’t result in perfect pronunciation. That said, it’s the only reasonable method I’ve found for non-AI voices, and you can usually get a recognizable result.

Some words or phoneme combinations will be especially difficult due to phonemes that exist in English but not in Japanese or Chinese.

Just be aware that, especially with JP to EN, the best you’ll get is an approximate pronunciation for a lot of words. Don’t expect it to be as easy or as understandable as what an AI voice could do, or you’ll probably be frustrated and/or disappointed.

  1. Download an interlingual dictionary (there are multiple available; I usually use these: GitHub - Slidingwall/synthv-dictionaries: Interlingual user dictionary of Synthesizer V)
  2. Enter your lyrics as normal, and the dictionary will get you somewhere close to the desired pronunciation. The phonemes from the dictionary probably won’t sound great, but they’re at least a starting point.
  3. Go through note by note and adjust the pronunciation, using the phoneme list in “mandarin-xsampa-phones.txt” as a reference.
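
If it helps to picture what steps 2 and 3 amount to, here is a rough Python sketch of the idea, not of how Synthesizer V or the linked dictionaries actually work internally. The dictionary entries and the function names are made up for illustration; the phoneme strings are placeholders and are not taken from the real dictionary or from “mandarin-xsampa-phones.txt”. It looks up each lyric word and lists the words that have no entry, which are the ones you would have to enter by hand before the note-by-note pass.

```python
# Toy stand-in for an interlingual dictionary: English word -> phoneme string.
# ASSUMPTION: these entries are invented placeholders, not real dictionary data
# and not verified against mandarin-xsampa-phones.txt.
TOY_DICTIONARY = {
    "love": "l a f",
    "me": "m i",
}


def lookup_lyrics(lyrics, dictionary):
    """Return (word, phonemes) pairs; phonemes is None when the word is missing."""
    pairs = []
    for raw in lyrics.split():
        # Normalise the word the way a simple lookup would: no punctuation, lowercase.
        word = raw.strip(".,!?;:\"'").lower()
        if not word:
            continue
        pairs.append((word, dictionary.get(word)))
    return pairs


def words_needing_manual_entry(pairs):
    """List each word without a dictionary entry once, in order of first appearance."""
    missing = []
    for word, phonemes in pairs:
        if phonemes is None and word not in missing:
            missing.append(word)
    return missing


if __name__ == "__main__":
    pairs = lookup_lyrics("Love me, love you!", TOY_DICTIONARY)
    for word, phonemes in pairs:
        print(word, "->", phonemes if phonemes is not None else "(enter by hand)")
    print("Missing:", words_needing_manual_entry(pairs))
```

Even words that do get a match still need the manual pass from step 3, since the dictionary output is only a starting point.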

That’s really all there is to it. Personally I find this much easier to achieve with Chinese voices than Japanese.

I’m far from the most proficient producer but you can hear an example of what I’ve done with this process here: Afraid of Everything【SynthV cover ft. Cangqiong / 苍穹】 - YouTube