First thing to clarify: you are making separate cards in Anki for each word, and you’re using new images to learn those words (from Google Images, doing that Spot-the-Differences thing that’s discussed in the book).
Given that, your follow-up question is totally reasonable: how on earth do I link these words together into this story, and what’s the point of that image on top?
If you’re using my model deck (http://fluent-forever.com/gallery/), then you’re going to be encountering new cards “In order added,” basically meaning that if you spend 15 minutes creating some flashcards for Tierra, cielo, arriba, luna, uno, blanco (/-a), punto, and estrella, then the first new flashcard you’re going to encounter the next time you study is going to be a flashcard for Tierra. Then you’ll run into the one for cielo, then arriba, etc. Even if you spent 5 hours creating hundreds of flashcards, you’re still going to encounter new cards during your next study session in that particular order.
Practically, that means that on whatever day you actually study those flashcards, you’re going to learn “Tierra”, “cielo”, “arriba”, etc. on the same day. Also, once you mark them as correct, they’re going to come back - together - 2 days later. Those two repetitions of those words are going to be enough to link them together, just by virtue of the fact that you’re learning them at the same time.
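If it helps to see that scheduling logic spelled out, here’s a rough sketch in Python. This is not Anki’s actual code or API: the `Card` class, `study_new_cards`, `due_on`, and the `limit` parameter are all made up for illustration, and the 2-day gap is just the number mentioned above (the real interval depends on your deck settings).

```python
from dataclasses import dataclass
from typing import List, Optional

# Taken from the text above; an assumption, since real intervals depend on deck settings.
FIRST_REVIEW_GAP_DAYS = 2


@dataclass
class Card:
    """Hypothetical stand-in for a flashcard (not Anki's data model)."""
    word: str
    added_order: int                # position in which the card was created
    due_day: Optional[int] = None   # None means the card has not been studied yet


def study_new_cards(cards: List[Card], today: int, limit: int) -> List[Card]:
    """Show up to `limit` new cards in the order they were added.

    Every card shown is treated as marked correct, so it is scheduled
    to come back FIRST_REVIEW_GAP_DAYS later.
    """
    new_cards = sorted(
        (c for c in cards if c.due_day is None),
        key=lambda c: c.added_order,
    )
    studied = new_cards[:limit]
    for card in studied:
        card.due_day = today + FIRST_REVIEW_GAP_DAYS
    return studied


def due_on(cards: List[Card], day: int) -> List[str]:
    """Return the words whose review falls on the given day."""
    return [c.word for c in cards if c.due_day == day]


words = ["Tierra", "cielo", "arriba", "luna", "uno", "blanco", "punto", "estrella"]
deck = [Card(word, i) for i, word in enumerate(words)]

# Day 0: the new cards show up in exactly the order they were created.
first_session = [c.word for c in study_new_cards(deck, today=0, limit=8)]
print(first_session)

# Day 2: the same group of words comes back together.
print(due_on(deck, 2))
```

The point of the sketch is only that cards created together get studied together and then reviewed together, which is what ties the words in a story to one another.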
In addition, the fact that they fit together into a story creates further associations between those words. The illustration is designed to reinforce those associations by showing you that story and how all the words can interact. You’re not going to be constantly reviewing that image, but the fact that you looked at it, processed what was going on in that image and how it related to the words above it, and then continued to see it as you looked through your word list to make your flashcards - all of that stuff helps build associations between your words.