In my opinion, the most important element in developing pronunciation is training your ears well enough that you can self-correct when you say things out loud. If you can hear the differences between, say, “caro”, “carro” and “Car-O”, then when you say them out loud, you’re going to *notice* when you say “caro” but meant to say “carro”, and you can correct yourself. Once you develop that skill, everything else can follow, primarily through mimicry.

In practice, I go through the trainer and repeat each example word out loud as I learn it from the spelling rules. Every time I repeat those example words in future repetitions, I’m going to be doing it more and more accurately, since my ability to self-correct is getting better.
As for minimal pairs, those can be harder to repeat at first. I may have some trouble hearing the differences (perhaps it’s the first day I’m encountering that pair, or for a hard pair, the 2nd or 3rd day that I’ve seen it). If that’s happening, then I won’t repeat those pairs out loud until the 2nd, 3rd or 4th repetition, but I’ll start repeating them as soon as they become somewhat recognizable to my ears.

For my own studies, that’s all I do, but I’m also very familiar with the IPA. For someone less familiar with what the IPA means (in terms of mouth/tongue positions), it might be valuable to go through the videos one more time, 10-14 days after starting the trainer, just to review that content.
