Author:
Victor Ulisses Pugliese
Affiliation:
Federal University of São Paulo, Avenida Cesare Mansueto Giulio Lattes, 1201, São José dos Campos, Brazil
Keyword(s):
Reinforcement Learning, Proximal Policy Optimization, Curriculum Learning, Video Games.
Abstract:
We investigate Policy Gradient methods combined with Curriculum Learning in video games. The study was motivated by a customized SoccerTwos environment that professors at the Federal University of Goiás created to evaluate the Machine Learning agents of students in a Reinforcement Learning course. We employed Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC) as state-of-the-art on-policy and off-policy methods, respectively. A curriculum could improve performance, on the premise that it is easier to teach people in a gradual order of increasing complexity than in a random order. By combining these methods, we aimed to train agents that win more matches than their adversaries. We measured the results at each checkpoint by the minimum, maximum, and mean rewards and the mean episode length. PPO achieved the best result with Curriculum Learning, which modified the players' settings (position and rotation) and the ball's settings (speed and position) at time intervals. It also required fewer training hours than the other experiments.
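As a rough illustration of a curriculum that changes player and ball settings at time intervals, and of the per-checkpoint metrics, the following Python sketch may help. It is not the authors' implementation: the `Stage` fields, the step thresholds, and every numeric value are hypothetical placeholders chosen only to show the mechanism.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Stage:
    """One curriculum stage; all fields and values are hypothetical."""
    start_step: int               # training step at which the stage begins
    player_position_noise: float  # spread of the players' start positions
    player_rotation_noise: float  # spread of the players' start rotations (degrees)
    ball_speed: float             # initial ball speed
    ball_position_noise: float    # spread of the ball's start position


# Stages ordered from easiest to hardest (illustrative values only).
STAGES = (
    Stage(0, 0.0, 0.0, 0.5, 0.0),
    Stage(100_000, 0.5, 45.0, 1.0, 0.5),
    Stage(300_000, 1.0, 180.0, 2.0, 1.0),
)


def stage_for(step, stages=STAGES):
    """Return the latest stage whose start_step has been reached."""
    if step < 0:
        raise ValueError("step must be non-negative")
    current = stages[0]
    for stage in stages:
        if step >= stage.start_step:
            current = stage
    return current


def summarize_checkpoint(episode_rewards, episode_lengths):
    """Compute the four metrics named in the abstract for one checkpoint."""
    return {
        "min_reward": min(episode_rewards),
        "max_reward": max(episode_rewards),
        "mean_reward": mean(episode_rewards),
        "mean_length": mean(episode_lengths),
    }
```

In such a setup, the training loop would call `stage_for(step)` before each environment reset and apply the returned settings, then call `summarize_checkpoint` on the episodes collected since the previous checkpoint.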