About the talk
Reinforcement learning (RL) is already used successfully for playtesting games without human intervention, allowing game studios to reduce the significant time and cost spent on this important part of game development. We present a case study on testing the goalie AI in the hockey game NHL26, which uses a traditional rule-based algorithm, by training a forward player with RL to find exploits in the goalie’s behavior. We show that out-of-the-box RL algorithms provide only limited value in this kind of testing scenario, because they converge to a single exploit strategy. With some simple steps, we can enhance the algorithm to instead provide a set of high-quality, diverse solutions, which offers additional value to game designers: it allows them to find multiple fix-worthy exploits in a single experiment iteration. In the first deployment of our approach, a single experiment found six exploit strategies qualitatively similar to those that game testers had found in hour-long manual testing sessions.

Takeaway
– Reinforcement learning is good at learning one solution to a problem by overfitting to the environment. With some simple tricks, we can instead create a diverse set of solutions, which is more valuable for game testing: game designers can, for example, find multiple game exploits in a single development iteration.
– My presentation shows that simple solutions are often the better choice for AI development. Specifically, for the case of NHL26, I show that an easy trick allows us to learn a diverse set of high-quality solutions without resorting to cutting-edge Quality Diversity algorithms, which are known to be unstable and hard to tune and debug.
– RL for playtesting can serve as a non-invasive way to start a research relationship with a game studio. RL for in-game AI, on the other hand, requires more trust, because such agents still lack authorial control, removing some of the control that game AI designers are used to from more traditional algorithms.
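The diversity trick hinted at above can be sketched in miniature. The following toy example (not the actual NHL26 setup, and with made-up strategy names and rewards) illustrates one common recipe: after each training run, archive the found behavior and add a bonus for being far from everything already in the archive, so later runs are pushed toward different exploits instead of reconverging on the best-scoring one.

```python
import math

# Toy stand-in for trained policies: each candidate "strategy" has a
# 2-D behavior descriptor and a task reward (how reliably it scores).
# Names and numbers are purely illustrative.
CANDIDATES = {
    "wrap_around_left":  ((0.0, 1.0), 0.90),
    "wrap_around_right": ((0.0, -1.0), 0.88),
    "five_hole_deke":    ((1.0, 0.0), 0.95),
    "slap_shot_far":     ((2.0, 2.0), 0.60),
}

def min_dist(behavior, archive):
    """Distance to the nearest already-found behavior (infinite if none)."""
    if not archive:
        return float("inf")
    return min(math.dist(behavior, a) for a in archive)

def best_strategy(archive, weight=0.5):
    """Pick the candidate maximizing task reward plus a capped diversity bonus."""
    def score(item):
        _, (behavior, reward) = item
        bonus = min(min_dist(behavior, archive), 1.0)  # cap the bonus
        return reward + weight * bonus
    return max(CANDIDATES.items(), key=score)[0]

def find_exploits(n):
    """Run n 'training iterations', archiving each result to force diversity."""
    archive, found = [], []
    for _ in range(n):
        name = best_strategy(archive)
        found.append(name)
        archive.append(CANDIDATES[name][0])
    return found

print(find_exploits(3))
# Without the archive (weight=0), every run would return "five_hole_deke".
```

With the bonus weight set to zero, the loop mimics the single-exploit convergence described in the abstract; with it enabled, each iteration surfaces a different strategy.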

Experience level needed: Beginner, Intermediate