Custom voicebanks

Rocchi · July 7, 2020, 2:33 AM

The possibility to create custom voicebanks would be greatly appreciated. Before reading the opinions here I thought there should be no discussion, just do it. I see now that there are a lot of caveats to publishing an SDK, so I’m going to talk about a few options I think could be viable. Ultimately, it’ll be up to Kanru to decide based on how likely it is to succeed, of course.

As for the engineering budget, it’s probably best not to have one. Creating custom banks should be free, and community-created banks should not be marketable:

Maybe release the SDK just the way it is used internally, providing only some documentation and letting the interested parties figure out the rest. I, myself, likely wouldn’t need a GUI to work with.
A similar option would be to open source the SDK and let a community grow around it. I’d be willing to bet people would come up with a GUI soon enough.

Maybe using a custom bank could require a premium version of synthv. Just enough to offset the cost of writing the aforementioned documentation, since there wouldn’t be ongoing support for the SDK.

Someone raised the issue of the amount of low quality banks that could flood the space. That’s another very valid concern, but I think the solution to it is already implemented. Similarly to Vocaloid: Pro/Licensed banks should be few, feature-rich, and adequately priced, so that when browsing through synthv songs I know to only click songs that tag Eleanor Forte or the other official voices (hence why they should be few, I will want to remember all official singers). Basically: when you see the title of a song what guarantees the recording is good is the name of the digital singer instead of what software was used.

I don’t know enough to have an opinion on the workings of phonemes, but I think going through with releasing an SDK that is not language-agnostic is a huge waste of potential regardless of its eventual funding model. Having community-created voicebanks in many languages opens up the market for Synthesizer V to the whole world, just like that. I arrived at this thread exactly because I want to make a Br Portuguese bank, not because I’m unsatisfied with the voices that are available. Let’s say I did make my own voicebank (regardless of quality): I have quite a few friends who would want to play around with synthv just for that.

Anyways. I’m really impressed to see how Kanru deals with this community. I know that even if synthv goes with the exact opposite of what I suggested, it’ll be because I failed to understand something about this environment, and not because of greed or lack of imagination by the dev.

WintemintP · December 19, 2020, 10:00 PM

My stance on it remains unchanged: there should not be a public SDK at all, period.

The main problem with low quality banks flooding the space is the flooding itself; it has nothing to do with making people guess which ones are good and which ones aren’t. The reason UTAU gets less attention than Vocaloid mainly has to do with the fact that the only actual official UTAU is Defoko. The rest are all fan-made, so sure, you have some stars like Teto, but there’s a large number of other “loids”, you wouldn’t have heard of half of them, and the rabbit hole of UTAUloids goes down further than the other side of the planet. On top of that, voicebanks recorded by individuals are never going to be that good unless you DIY your own anechoic chamber and spend $5,000,000,000 on various pro-level microphones, because every voice calls for a completely different mic altogether (I have a diploma on this subject). Most people will settle for something like the Blue Yeti, which is not enough; you need something a lot more substantial for recording (and even if you’re only trying to record yourself, you’re still going to end up buying a bunch of different mics because you have to figure out which one works for you, so good luck). Not only that, one website isn’t going to have the budget to store links to nine septillion different voicebanks, even if each of them had its own site/portal. It just costs too much for all that space. You’ll need a supercomputer more than 9,000 times bigger than the ones they have at NASA. Nobody has the budget for that, not even the Queen or the President of the United States.

(TL;DR - Even if you don’t wind up spending a ton of money getting studio time and getting the banks recorded professionally, which is how you should be doing this if you were to, the space on the web that these banks have to be made available on is going to cost way too much, and nobody has the money for that.)

As for different languages, there are only four voicebanks that aren’t Chinese. Three are Japanese and one is English. The rest are all Chinese banks, and while I’m not against any language in particular, that gives me the impression that the developers are going to focus strictly on the Chinese language alone and nothing else, and there’s no hope of any globalisation, so… yea, you might as well give up on that too.

LinR_PN · December 20, 2020, 4:38 AM

That’s because mainland China’s 「Quadimension」 is too powerful… They have a whole set of plans for characters, and more singers will come out in the future. And 「AHS (AH-Software)」 in Japan also has a lot of singer characters that need to be iterated on.
(Translated with a translator)

corasundae · December 20, 2020, 8:29 AM

Was there any point at all in reviving this dead thread? It’s obviously not going to happen.

WintemintP · December 22, 2020, 7:12 PM

I just never saw a notification on it until today. And good riddance on the idea.

corasundae · December 23, 2020, 5:06 AM

It costs zero cents to not be rude about something other people would’ve liked.

kai8780 · November 8, 2021, 2:01 AM

I am very late

I have an idea for how a custom voicebank system could be implemented. Of course, Kanru will not get into this topic, but it would be nice to know if this is possible.

My idea is that a separate Synthesizer V could be developed that would be the same as Studio Basic, but would only support custom voicebanks.
I think this would make it possible to draw a line between paid voicebanks and custom ones.
To host custom voicebanks, there could be a separate shared site where people upload their voicebanks and a rating system shows which voicebanks are of higher quality.

The idea of a single site is meant to avoid sending voicebanks to the developers for verification, because the ability to create your own voicebanks could attract a lot of people, and Dreamtonics physically cannot check each one.

But this is difficult to implement, since there is still the question of an interface for creating custom voicebanks that lets the user understand how to configure them, and the costs of developing a separate version of Synthesizer V Studio for custom voicebanks and of a site for hosting them also remain.

dcuny · November 8, 2021, 6:25 PM

Information about voicebanks is considered proprietary information. For example, Eclipsed Sounds had to sign NDAs about the process. I’m sure the format and technologies are also considered proprietary, and won’t be released.

You’re not even going to get a reclist. Even the price of creating a voicebank is considered proprietary.

So I doubt that Dreamtonics will be supporting free voicebanks in the near future.

Kris_Miller · February 18, 2023, 8:09 AM

TLDNR most… Would love the ability to make my own custom banks. NI Kontakt model comes to mind. Weeding out third party garbage would be as simple as Dreamtonics providing an official list of banks.

dcuny · February 18, 2023, 6:01 PM

TL;DR: Not going to happen, for reasons you apparently didn’t read.

Why would Dreamtonics make their tools available? At a minimum, it would remove a proprietary advantage they have over competitors, and provide little in terms of increased revenue.

Weeding out third party garbage is as simple as Dreamtonics not allowing them to be built in the first place.

claire · February 18, 2023, 6:24 PM

It’s important to note that Dreamtonics is very protective of their proprietary tech.

Third party developers like Eclipsed Sounds, Audiologie, etc. do a lot of manual work when creating a voice database, but all of the machine learning is done by Dreamtonics themselves. They don’t even give their partners access to that process.

Additionally, there’s not really anything to gain by opening the platform up. They’d only be creating competition for themselves without any real benefit.

As much as it would be great for the vocal synthesis field and userbase as a whole, it’s somewhat unrealistic.

dcuny · February 18, 2023, 6:26 PM

So Dreamtonics should willingly lose market share, competitive advantage, release their source openly, all for something that ultimately provides no benefit to them.

Good luck with that.

As for your criticism of the Ninezero library, unless you’re in a position to actually provide a better product yourself, you might want to hold off on that.

I don’t see anyone else creating a similar voicebank, which might provide a clue that it’s a bit harder to accomplish than you imagine.

AleDZMusicProd · September 11, 2023, 4:23 AM

Would be great if a Development Kit already existed…

Scott_Mire · September 19, 2023, 3:04 AM

I think it’s unrealistic to expect custom voicebanks. Having said that, I think what people want is a virtual singer that sounds how they want it to sound. I think the perfect solution is available now. The solution is to create a vocal in Synth V and then convert it using RVC voice models. This is super super easy in something like Kits.ai or in one of the Google Colab RVC options (which I hate btw). If you aren’t familiar with this tech, here’s an example…Billie Eilish singing Radiohead:

Now, what would be really slick is if Synthesizer V would allow you to load and apply your own RVC voice models to a Synth V track. My understanding is that there are realtime RVC voice changer solutions.

AleDZMusicProd · September 19, 2023, 4:56 AM

I haven’t given up on the idea of custom voicebanks. I would also offer what is in my huge database to be part of DT’s store products, for more Latino songs…

stoac · September 21, 2023, 3:41 PM

I have been doing exactly that for the past few months, and it sometimes works great and sometimes not at all. It always depends on the quality of the data the model has been trained on and also on the vocal range and style of the original voice. Some notes cannot be converted correctly.
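
To illustrate the vocal range point, here is a minimal sketch of a pre-check one could run before conversion. It is not part of RVC or Synthesizer V: the function names, the MIDI pitch representation, and the example range are all assumptions made for illustration. It flags melody notes that fall outside the range a voice model was trained on and suggests the smallest transposition that would fit the whole melody.

```python
from typing import List, Optional


def out_of_range_notes(melody: List[int], low: int, high: int) -> List[int]:
    """Return the MIDI pitches in `melody` that fall outside [low, high].

    `low` and `high` are hypothetical bounds describing the range covered
    by the recordings a voice model was trained on.
    """
    return [pitch for pitch in melody if pitch < low or pitch > high]


def suggest_transposition(melody: List[int], low: int, high: int) -> Optional[int]:
    """Return the shift in semitones, closest to zero, that places every
    note inside [low, high], or None if the melody is wider than the range.
    """
    if not melody:
        return 0
    min_shift = low - min(melody)   # smallest shift that lifts the lowest note into range
    max_shift = high - max(melody)  # largest shift that keeps the highest note in range
    if min_shift > max_shift:
        return None                 # melody spans more than the available range
    if min_shift <= 0 <= max_shift:
        return 0                    # already fits, no transposition needed
    return min_shift if min_shift > 0 else max_shift


# Hypothetical example: a model trained on G3 (55) to D5 (74).
melody = [57, 60, 64, 72, 76]
print(out_of_range_notes(melody, 55, 74))
print(suggest_transposition(melody, 55, 74))
```

A check like this only covers pitch range; it says nothing about training data quality or singing style, which also affect the result.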

AleDZMusicProd · September 21, 2023, 8:14 PM

I’m fortunate that my real singer has an incredible vocal range, from low to high, and I prepare the SVP as perfectly as possible…

eusti · January 3, 2024, 8:11 AM

Could you elaborate on this a bit?

The background to my interest is that I would love to basically have my voice as a voicebank in Synth V. As I understand it, that is not possible and not likely to happen. (I would like that so I can figure out other ways to come up with interesting melodies, test out lyrics, etc., and create guide vocals that I can use to get a better vocal performance from myself…)

So, is my understanding correct that a workaround would be to create an RVC voice model of my voice, make a vocal performance in Synth V with one of the existing voicebanks, and then use an RVC voice changer to convert Synth V’s voice to mine?

AleDZMusicProd · January 3, 2024, 11:32 AM

@eusti, yes! This is how I work in collaboration with real singers.

None of the songs will be released with this technique, but it is a really good help for the singers who are working with me.

It has turned out to be a great way for us to save on production costs, rehearsal overheads, and travel from their homes to the studio ahead of the final recordings.

The reason I do it this way is because my productions are purely in Italian and the voice we use is from our vocalist in charge of preparing the auditions, who unfortunately is currently unable to do so for health reasons, but he is delighted that we are able to use his timbre, with excellent results.

So our lifelong dream would be to be able to have our Custom Voice in the software, either with the help of Dreamtonics or outside developers…

Maybe I should start a crowdfunding campaign, who knows, and hope that a lot of people out there would join in.

eusti · January 3, 2024, 11:59 AM

Thank you so much @AleDZMusicProd ! Appreciated.

Great. That’s a lot of new info to consider. Will see how far I get.