PPO

Original paper: https://arxiv.org/abs/1707.06347
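
For orientation, the core idea of the paper, the clipped surrogate objective, can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the code in this repository: the function name `ppo_clip_loss`, its argument names, and the default `clip_eps=0.2` are assumptions chosen for the example.

```python
import math


def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Negative clipped surrogate objective, averaged over a batch.

    Hypothetical helper for illustration only.
    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy; advantages: advantage estimates.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio r = pi_new(a|s) / pi_old(a|s)
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps]
        clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic choice: the smaller of the two surrogate terms
        total += min(ratio * adv, clipped_ratio * adv)
    # Negated so that minimising the loss maximises the objective
    return -total / len(advantages)
```

When the new and old policies agree, the ratio is 1 and the loss reduces to the negative mean advantage. The clipping only takes effect once the ratio leaves the interval [1 - clip_eps, 1 + clip_eps].
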
To run the algorithm with default parameters, cd into playground and run python3 learn.py configs/data/ppo-cartpole-v1.json.
P.S. This is a simplified version of the OpenAI Baselines implementation. Some classes are also borrowed from https://github.com/lilianweng/deep-reinforcement-learning-gym