What AI Learning to Drive in Trackmania Can Tell Us About Human Behavior at Work
I watch weird YouTube videos at night so you don’t have to. Since every second video or article is now about AI, I stumbled upon one that fascinates me with an almost hidden message that has nothing to do with AI. Hear me out: it’s about an AI being trained to drive in a computer game called Trackmania. Sounds kinda…meh, right? Buckle up.
Watching this AI evolve from a clueless beginner to a confident racer feels like watching a child take their first steps. It's a fascinating journey that draws parallels between machine learning and human learning. Right off the bat, the video introduces several AIs competing against each other, all going through a sort of “natural selection.” But here’s the catch: this method only compares “winners” and “losers” at the end of each race, and none of the AIs progresses nearly far enough to complete the track. So a new approach is introduced, one built on the magic of feedback: reinforcement learning. Instead of juggling multiple AIs, the focus is on one AI, giving it constant feedback and pushing it to improve.
The reason I find this remarkable is that it is exactly what many leaders (and HR teams) miss when dealing with new talent. The concept of “hire more, filter out” might not be so beneficial after all, and the AI is here to show that instead of spreading resources thin across a wide array of employees, the better way is to maximize each one’s potential with proper guidance and feedback. The AI in the video is given corrective feedback literally every second and has a “feel free to fail” card while in its exploration stage. Yet even AIs don’t get far in their learning without a proper reward system. Yes, even an AI needs incentives to keep going. Without a reward system, all the corrective feedback in the world won’t cut it. The AI in the video gets rewards based on the distance it travels, which encourages it to drive faster and more efficiently. This idea mirrors our need for motivation, whether it’s praise, promotions, or just a pat on the back.
Then comes the exploration stage. The AI starts with random actions, just like a newbie trying out different things without a clue. Imagine a new employee or a fresh entrepreneur: full of enthusiasm but making tons of mistakes. The AI is no different. It stumbles, falls off the road, and makes wrong turns. But here’s the key: maintaining a positive attitude and encouraging persistence. This phase is all about patience. The AI, much like a human, learns from its mistakes, gradually improving its skills. It is also crucial because it helps the AI (or a person in a workplace) gather data about what works and what doesn’t. After thousands of runs, the AI has collected a wealth of information. Now it’s time for the reinforcement learning algorithm to shine: the neural network begins to predict the expected reward for each action based on the data collected. This is where it gets interesting, because the AI isn’t just looking at immediate rewards but also considering long-term consequences. For instance, slowing down before a sharp turn might mean sacrificing short-term gains for a bigger payoff down the line.
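For the curious, here is a tiny sketch of that "short-term sacrifice, long-term payoff" idea. It is not the code from the video. The reward numbers and the discount factor `gamma` are made up for illustration, and `discounted_return` is just a name I picked. The only real assumption is the standard reinforcement learning trick of adding up future rewards while shrinking each later one a little.

```python
def discounted_return(rewards, gamma=0.95):
    """Sum rewards over time, weighting each later step by another factor of gamma."""
    total = 0.0
    # Walk backwards so each step adds its reward plus the discounted future.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical per-step rewards (distance gained each step), invented for this example.
full_speed = [10, 10, 0, 0, 0, 0]     # flies into the turn, crashes, gains nothing after
brake_first = [10, 4, 8, 10, 10, 10]  # gives up distance on step two, then keeps going

options = {"full_speed": full_speed, "brake_first": brake_first}
best = max(options, key=lambda name: discounted_return(options[name]))
print(best)
```

Braking looks worse if you only compare the second step (4 versus 10), but it wins once the whole run is counted. That is the kind of trade-off the network has to learn to predict.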
One of my favorite parts of the video is when the AI just stops, almost as if it’s scared to keep going. This must sound familiar. It’s a lot like us humans when we settle into our comfort zone and want to hold on to our gains without pushing forward. After achieving some success, we often become hesitant to take risks. The AI, having learned the initial part of the track, starts slowing down, taking more time to adapt to new challenges. New parts of the track look and almost “feel” scary and unattainable. The solution to this problem? Randomization. By randomizing the starting point on the track in each race, the AI is forced to quickly adapt to new situations. This approach is genius: it teaches the AI to handle a variety of situations, just as exposing people to diverse new experiences outside their comfort zone helps them become more adaptable and learn faster.
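Here is a rough toy model of why random starting points help. Again, this is my own sketch and not what the video's author wrote. The track length, the number of segments an early AI survives per run, and the function name `segments_seen` are all hypothetical. It simply counts how much of the track the AI ever gets to practice on.

```python
import random

TRACK_LENGTH = 100   # hypothetical number of track segments
STEPS_PER_RUN = 15   # hypothetical: how far an early AI gets before crashing


def segments_seen(num_runs, random_start, seed=0):
    """Count the distinct track segments visited across all runs."""
    rng = random.Random(seed)  # fixed seed so the toy example is repeatable
    seen = set()
    for _ in range(num_runs):
        # Either always start at the line, or drop the car somewhere random.
        start = rng.randrange(TRACK_LENGTH) if random_start else 0
        end = min(start + STEPS_PER_RUN, TRACK_LENGTH)
        seen.update(range(start, end))
    return len(seen)


fixed = segments_seen(200, random_start=False)
randomized = segments_seen(200, random_start=True)
print(fixed, randomized)
```

With a fixed start, the AI in this toy model only ever sees the first 15 segments, no matter how many runs it gets. With random starts it gets experience all over the track, including the "scary" parts it would otherwise never reach.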
And then, the AI reaches a breakthrough: after more than 20,000 attempts, it finally completes the track, a testament to the power of persistence, learning and grit. But the journey doesn’t stop there. The AI continues to improve, albeit slowly. It’s now capable of handling different surfaces (road, grass, dirt) even though it didn’t encounter them during training. The AI has truly “mastered its craft.” This ability to generalize is what sets a well-trained AI (or person) apart from the rest. It’s about knowing how to transfer and apply knowledge from one situation to another, and using it to adapt much more quickly.
For me, the powerful message behind this video is hidden in plain sight. It’s not just about racing in Trackmania, nor is it really about AI in the first place. It’s about the parallels between machine learning and human behavior: continuous progress, adaptability, and the willingness to step out of comfort zones are key to achieving greatness. Whether it’s in the realm of AI development or personal growth, the principles of feedback, rewards, and continuous adaptation remain crucial and basically the same. By embracing new experiences and pushing beyond comfort zones, both AIs and humans can unlock their full potential and achieve remarkable success. So, let’s take a page out of this AI’s book and keep pushing forward, one step (or race) at a time.