Or more accurately, would it be possible to release voicebanks that simulate how it would sound if animals, such as cats or wolves, could sing in human languages?
No. The voice databases are based on real humans who are contracted to record a number of songs.
The recorded songs need to cover enough of the target language's sounds for the software to resynthesize them. This is why we get audible accents with cross-lingual synthesis, and why those accents are less noticeable when the original recordings include multiple languages (as in the case of Weina).
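To make the coverage idea concrete, here is a rough sketch of how you might picture it. This is not how the software actually works internally; the `coverage` function and the phoneme sets are made up purely for illustration and are not real language inventories.

```python
def coverage(recorded, target):
    """Return (fraction of target sounds covered, sorted list of missing sounds).

    Hypothetical helper for illustration only.
    """
    recorded, target = set(recorded), set(target)
    if not target:
        return 1.0, []
    missing = sorted(target - recorded)
    return 1 - len(missing) / len(target), missing


# Toy, invented inventories (assumption: not real phoneme sets).
recorded_mono = {"a", "i", "u", "e", "o", "k", "s", "t", "n", "m"}
recorded_multi = recorded_mono | {"l", "r", "th", "ae"}
target_lang = {"a", "i", "u", "e", "o", "k", "s", "t", "n", "m", "l", "th", "ae"}

# Recordings in a single language leave gaps in the target language.
print(coverage(recorded_mono, target_lang))
# Recordings spanning several languages fill those gaps.
print(coverage(recorded_multi, target_lang))
```

The sounds reported as missing are the ones that would have to be approximated from the nearest recorded sound, which is roughly where an accent comes from.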
Applying a generalized “voice print” or timbre to the wide variety of sounds that can be synthesized is beyond the scope of the software.
It’s a singing synthesizer, not a voice changer.