Description: Training RL agents in games requires collecting large amounts of data across many game states and trajectories. As games grow in complexity, unintended functionality can easily skew the distribution of data an RL agent is trained on. The agent's behaviour then shifts in turn, and that shift can serve as a signal that some property of the game does not match intentions. This fits the criteria for a property-based test and is an inspiration for future game-testing mechanisms.
Takeaways:
- An intuition for property-based testing (see the sketch after this list).
- A concrete example of how RL training helped identify a bug.
- An inspiration for how RL training could be incorporated more deeply into property-based tests.
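As a rough illustration of that intuition, here is a minimal property-based test written with the Python Hypothesis library. The `GridGame` environment and its stay-in-bounds invariant are hypothetical stand-ins; the point is that the test asserts a property over many generated trajectories rather than over a handful of hand-written cases, much as RL training surfaces states a human tester would not think to pick.

```python
# Minimal property-based testing sketch using Hypothesis.
# GridGame and its invariant are hypothetical, for illustration only.
from hypothesis import given, strategies as st


class GridGame:
    """Toy game: an agent moves on a 5x5 grid and must stay in bounds."""

    SIZE = 5
    MOVES = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}

    def __init__(self):
        self.x, self.y = 0, 0

    def step(self, move):
        dx, dy = self.MOVES[move]
        # Clamp to the grid, so the in-bounds property should always hold.
        self.x = min(max(self.x + dx, 0), self.SIZE - 1)
        self.y = min(max(self.y + dy, 0), self.SIZE - 1)


# Rather than hand-picking trajectories, generate arbitrary action
# sequences and check the property in every visited state. A failing
# case would signal that some game property does not match intentions.
@given(st.lists(st.sampled_from(sorted(GridGame.MOVES)), max_size=200))
def test_agent_never_leaves_grid(moves):
    game = GridGame()
    for move in moves:
        game.step(move)
        assert 0 <= game.x < GridGame.SIZE
        assert 0 <= game.y < GridGame.SIZE
```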