This is going to vary a lot with each song and voice, and also with which kind of emotion is being expressed. A single set of settings is unlikely to work for an entire track; a truly emotive result will likely require much more surgical, moment-to-moment adjustment.
Additionally, “sad” isn’t just one thing; is the vocalist quiet, or shouting? Are they crying? Crying and shouting at the same time? Is their voice trembling? There are a number of widely varying sounds that could be interpreted as “sad”.
I can’t provide a single “do this to make it sound sad”, but this is a general list of things you can consider:
One clear option is to use any Dark or Bright vocal mode available to you. Those will be the simplest approach since they’re specifically mimicking the original voice provider’s tonal flexibility.
Brightness (or lack thereof) beyond vocal modes
When people are excited they will often unconsciously adjust the resonance of their vocal tract to sound more “bright”.
In addition to vocal modes, this can be accomplished with subtle use of Tone Shift, Gender, and different EQ in the mixing phase.
Higher Tone Shift and lower Gender can often sound more energetic or positive, while the opposite can sound more melancholy. These are things that correlate with the natural changes to vocal tract resonance that tend to happen when people are either excited or sad.
Also, a lot of modern mixing focuses on brightening a voice, so try to find a different EQ curve that suits the mood.
Breathiness can play a big part too. If a vocalist is emotional or crying, they’ll probably run out of breath sooner.
This doesn’t mean you should reduce the breathiness slider for the entire track, mind you. Instead, pay attention to the breathiness at the end of phrases or sustained notes, and maybe increase the loudness of your breath notes (and include more of them). You could also have a sudden increase in breathiness at the end of a sustained note to sound like the vocalist is expending their last bit of breath.
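If your editor lets you draw or script parameter curves, the "last bit of breath" idea can be sketched roughly like this. It is only a minimal illustration in plain Python, not any particular editor's scripting API: the function name breath_ramp, its arguments, and the 0.0 to 0.6 value range are all made up for the example, so rescale to whatever your software actually uses.

```python
def breath_ramp(note_start, note_end, base=0.0, peak=0.6,
                tail_fraction=0.25, steps=4):
    """Return (time, value) control points for a breathiness curve.

    The curve stays flat at `base`, then rises to `peak` over the last
    `tail_fraction` of the note. All names and ranges are hypothetical.
    """
    duration = note_end - note_start
    tail_start = note_end - duration * tail_fraction
    points = [(note_start, base), (tail_start, base)]
    for i in range(1, steps + 1):
        frac = i / steps
        time = tail_start + frac * (note_end - tail_start)
        # Squaring keeps the rise gentle at first and steep at the very end,
        # which reads as a sudden loss of breath rather than a slow fade.
        value = base + (peak - base) * frac ** 2
        points.append((time, value))
    return points


# A two-second sustained note: flat until 1.5 s, then a quick rise.
for time, value in breath_ramp(0.0, 2.0):
    print(f"{time:.3f}s -> {value:.3f}")
```

A shorter tail_fraction makes the increase feel more abrupt; treat the numbers as starting points for trial and error rather than recommended values.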
Tension and loudness
Tension is pretty self-explanatory, but if you’ve never experimented with your own vocals I’d recommend saying “aaaaahhhh” or other vowel sounds with varying levels of relaxedness. Listen to the results, both in terms of airflow and in how it changes the tone.
If it’s an emotion that could cause a “wavering” voice, consider adding some fluctuations in both Tension and Loudness at certain points.
Also, if the vocalist is shouting in an emotional manner, they might have more aggressive onsets which could benefit from a slight increase in loudness at the start of the note.
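For the "wavering" idea above, here is a rough sketch of an irregular wobble that could be added on top of a Tension or Loudness curve. It is plain Python with invented names (waver, depth, jitter) and arbitrary units, not a real plugin API. The wobble rate is randomly nudged at each step so it does not turn into a clean, regular vibrato, and it fades in so the voice sounds like it is gradually losing stability.

```python
import math
import random


def waver(start, end, depth=0.15, rate_hz=5.0, jitter=0.3,
          step=0.02, seed=0):
    """Return (time, offset) points for an unsteady wobble.

    Offsets are meant to be added to an existing parameter curve.
    All names, units, and defaults are hypothetical.
    """
    rng = random.Random(seed)  # seeded so the result is repeatable
    n = max(1, int(round((end - start) / step)))
    points = []
    phase = 0.0
    for i in range(n + 1):
        time = start + i * step
        envelope = i / n  # grows from 0 to 1 across the span
        points.append((time, depth * envelope * math.sin(phase)))
        # Nudge the rate a little each step so the wobble is irregular.
        rate = rate_hz * (1.0 + rng.uniform(-jitter, jitter))
        phase += 2 * math.pi * rate * step
    return points


offsets = waver(0.0, 1.0)
print(len(offsets), "points, largest offset:",
      round(max(abs(v) for _, v in offsets), 3))
```

Using a different seed for Tension and for Loudness keeps the two from moving in lockstep, which tends to sound less mechanical.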
But of course pitch is going to be one of the biggest factors. It is more difficult to sing accurately while emotional, so a vocalist might be more prone to overshooting a note and needing to correct it, or the pitch might go more flat at the end of a phrase.
If you have any vocal samples for reference, drop them into Melodyne and look for those sorts of pitch patterns.
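As a rough picture of those two patterns (overshooting then correcting, and sagging flat at the end of a phrase), here is a small sketch that builds a pitch deviation curve in cents relative to the written note. The function name, the default amounts, and the exponential settle are all my own assumptions for illustration; real singers are messier, so use reference samples to pick the actual shapes.

```python
import math


def overshoot_curve(note_start, note_end, overshoot_cents=40.0,
                    settle_time=0.12, end_drop_cents=-25.0,
                    drop_fraction=0.2, step=0.01):
    """Return (time, cents) points: overshoot at the onset, flat sag at the end.

    Hypothetical helper; cents are relative to the written pitch.
    """
    duration = note_end - note_start
    drop_start = note_end - duration * drop_fraction
    n = max(1, int(round(duration / step)))
    points = []
    for i in range(n + 1):
        time = note_start + i * step
        elapsed = time - note_start
        # The overshoot decays toward the target pitch and is mostly
        # gone after roughly `settle_time` seconds.
        cents = overshoot_cents * math.exp(-elapsed / (settle_time / 3))
        if time > drop_start:
            # Linear sag toward `end_drop_cents` over the tail of the note.
            cents += end_drop_cents * (time - drop_start) / (note_end - drop_start)
        points.append((time, cents))
    return points


curve = overshoot_curve(0.0, 1.0)
print("onset:", round(curve[0][1], 1), "cents")
print("end:", round(curve[-1][1], 1), "cents")
```

A negative overshoot_cents gives a scoop up into the note instead, which is another pattern worth looking for in reference samples.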
And for sad songs, consider whether the voice breaks or cracks at points. You can usually achieve these with quick spikes in pitch or voicing, but it can be rather situational, so give it a bit of trial and error. Again, if you have any vocal samples of this sort of thing, dropping them into Melodyne or another audio editor can help identify the traits of that sort of sound.
Of course vibrato is one of the biggest parts of what makes music sound emotional. Despite the classic robotic sound, many Vocaloid and UTAU songs and covers touted as sounding “emotional” get there primarily through vibrato used at appropriate moments, along with various types of pitch transition. (Cillia’s cover of Kokoro Nashi is a clear example of an UTAU cover that has a sad/melancholic sound to the vocals.)
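One possible reading of "appropriate moments" is vibrato that only arrives partway through a held note and fades in rather than switching on. That interpretation is mine, and the sketch below is plain Python with invented names and defaults (delayed_vibrato, a 5.5 Hz rate, 30 cents of depth), not any editor's built-in vibrato settings.

```python
import math


def delayed_vibrato(note_start, note_end, onset_delay=0.3, rate_hz=5.5,
                    depth_cents=30.0, fade_in=0.2, step=0.01):
    """Return (time, cents) points for vibrato that starts late and fades in.

    Hypothetical helper; cents are relative to the written pitch.
    """
    n = max(1, int(round((note_end - note_start) / step)))
    points = []
    for i in range(n + 1):
        time = note_start + i * step
        since_onset = time - note_start - onset_delay
        if since_onset <= 0:
            cents = 0.0  # hold the note straight before the vibrato begins
        else:
            envelope = min(1.0, since_onset / fade_in)
            cents = depth_cents * envelope * math.sin(
                2 * math.pi * rate_hz * since_onset)
        points.append((time, cents))
    return points


curve = delayed_vibrato(0.0, 1.0)
print("largest deviation:", round(max(abs(c) for _, c in curve), 1), "cents")
```

Varying onset_delay and depth_cents from note to note, instead of reusing one setting, is usually what keeps it from sounding pasted on.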
Study reference material
The best thing you can do is train your ear to pick up on all these things in actual music, so that you can then translate those things into software terms. “Ear training” is commonly done to learn to transcribe music faster or more accurately, but usually this is focused on the notes as a whole rather than the nuances within each note. It’s rare that people pay attention to the specific factors relevant to vocal synthesis like moment-to-moment pitch changes and vocal tension.
Find a (human) vocalist you like who has released both happy and sad songs, listen to each, and try to pay attention to each of these points and how the two styles differ.