I’ll start with Instant Mode because it’s a bit more esoteric, and then move on to the parameters.
Instant Mode adds AI-generated pitch changes to your entire project as you work. It can be turned on at any time, so you don’t need to worry about enabling it right at the start. If you don’t like the results after turning it on, you can revert them immediately with undo (Ctrl+Z). However, once Instant Mode is enabled and you’ve started making other changes, turning it off with a button click will write all of the AI-generated pitch changes into the Pitch Deviation parameter panel (as a way of “locking them in”). This usually isn’t a problem, but it can mean some additional “cleanup” work later if you don’t like what Instant Mode did.
Instant Mode uses AI to mimic the original voice provider’s singing style. This means it can be a quick way to get a reasonably good-sounding result, but you don’t have as much control over it as you would when “tuning” the voice manually.
It also unlocks the AI Retakes feature, which lets you generate multiple versions of the same note and pick your favorite. Note that AI tuning is context-specific, so you shouldn’t generate any retakes until you’re certain the notes before and after the selected one aren’t going to change (this includes adding br notes for breaths!).
Instant Mode can be very useful, but it’s purely optional. If you want to learn how to add pitch deviations manually and develop a personal “tuning style”, it might get in your way. On the other hand, if you’d rather focus on other parts of music production, you might find it produces a good result with less time investment, freeing you up to learn those other skills.
I’ve written more about Instant Mode and Pitch Deviation here:
I expect your main question is about where and when to use these parameters, and honestly nobody can tell you that because it will be different for every genre and song, and each individual producer can have a personal “tuning style” just like vocalists have a personal “singing style”.
That said, the important thing is that you know how each of the parameters behaves. Generally speaking, it’s better to simply leave them alone unless you’re implementing a specific and intentional change, or fixing a problem. It’s a common mistake for people to arbitrarily change things just because they think they need to.
The most notable parameter is Pitch Deviation, which is covered in the link above. Put simply, it’s used to introduce more “human-like” behavior, since real singers don’t sing exactly on pitch.
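For a sense of scale: pitch deviation is typically measured in cents, where 100 cents equals one semitone. This is generic equal-temperament math, not anything specific to the application’s internals, but it shows what a given deviation actually does to the sounding frequency:

```python
def note_freq_hz(midi_note, deviation_cents=0.0):
    """Frequency of a MIDI note with a pitch deviation applied.

    Equal temperament: A4 (MIDI note 69) = 440 Hz, 100 cents = 1 semitone.
    """
    return 440.0 * 2 ** ((midi_note - 69) / 12 + deviation_cents / 1200)
```

A deviation of +100 cents is exactly one semitone up, so `note_freq_hz(69, 100)` equals `note_freq_hz(70)`; smaller deviations of a few dozen cents are the kind of subtle drift that makes a performance sound human rather than out of tune.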
Vibrato Envelope will change the amplitude (depth) of any slider-based vibrato set in the Voice or Note Properties panels (accessible from the right side of the application). This will have no effect on “vibrato” that is AI-generated, since those curves aren’t treated any differently from normal pitch deviations.
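Conceptually, the envelope is just a multiplier on the vibrato’s depth. A minimal sketch of that relationship (generic signal math, assuming a simple sinusoidal vibrato, not the application’s actual implementation):

```python
import math

def vibrato_offset_cents(t, depth_cents=50.0, rate_hz=5.5, envelope=1.0):
    """Pitch offset (in cents) of a sinusoidal vibrato at time t (seconds).

    The envelope scales the depth: 1.0 leaves the slider-set vibrato
    unchanged, 0.0 flattens it entirely, and 2.0 doubles its amplitude.
    """
    return envelope * depth_cents * math.sin(2 * math.pi * rate_hz * t)
```

Because AI-generated “vibrato” is baked into the pitch curve itself rather than produced by this depth/rate model, there is no `envelope` term for the Vibrato Envelope parameter to scale, which is why it has no effect there.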
Loudness will simply increase or decrease the volume of the track. Using this to modify dynamics is usually better than simply changing the volume in your DAW since it emulates a singer singing different parts louder or quieter, rather than having an entire recording at the same loudness and adjusting later.
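If you do end up comparing this parameter against gain changes in your DAW, keep in mind that level controls are logarithmic. The standard decibel-to-linear-gain conversion (general audio math, not specific to this application):

```python
def db_to_gain(db):
    """Convert a decibel value to a linear amplitude multiplier.

    0 dB is unity gain; every +6 dB roughly doubles the amplitude.
    """
    return 10 ** (db / 20)
```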
Tension represents how relaxed or tense the vocal cords are. This can be a bit hard to describe since several factors are involved, so play around with it a bit. Some producers raise and lower tension throughout a phrase; others change it between phrases but not within them; others simply set it once for the entire track (in the Voice panel) and leave the parameter alone.
Breathiness is probably self-explanatory: if you want a more breathy sound, increase it; if you want a “cleaner” sound, reduce it.
Voicing controls how strongly the vocal cords are engaged. A “devoiced” sound is what happens if you make the same mouth shape and produce the same airflow, but without engaging the vocal cords. This isn’t as useful for English lyrics, but devoicing い (“i”) and う (“u”) sounds is a normal part of Japanese speech, depending on the phoneme’s context.
Gender will shift the voice between a “cool/masculine” tone and a “cute/feminine” tone. It’s essentially a formant slider, but applied at the time of synthesis rather than afterward in the DAW. You can use this to subtly change the sound of a voice, but you usually won’t want to go any further than 0.2 in either direction (depending on the voice you’re using).
Tone Shift is Pro-edition only, but offers a simple way to adjust the tone of a voice. That’s a bit of a vague way to put it, but it’s another example where it’s usually easier to just try it out and see what changes.