LazyZero
Jacob Marshall edited this page Jan 21, 2024 · 1 revision
Before implementing vectorized AlphaZero, I built a simpler algorithm called LazyZero. LazyZero skips full Monte Carlo Tree Search because of its complexity and applies the PUCT algorithm only at the root node. From the root, exploration consists of n policy-based rollouts, each of fixed depth d.
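To make the idea concrete, here is a minimal pure-Python sketch of root-only PUCT with fixed-depth policy rollouts. It is not the code from this repository: the names `puct_select`, `lazy_search`, `noisy_step`, and `uniform_policy` are hypothetical, and several details are assumptions on my part (each rollout is scored by its undiscounted sum of rewards, non-root actions are sampled from the policy, and the exploration term uses `sqrt(total + 1)` so the prior still matters on the first visit).

```python
import math
import random


def puct_select(priors, visits, value_sums, c_puct=1.0):
    """Return the root action with the highest PUCT score Q(a) + U(a)."""
    total = sum(visits)
    best_action, best_score = 0, -math.inf
    for a, p in enumerate(priors):
        # Mean return of rollouts that started with action a (0 if unvisited).
        q = value_sums[a] / visits[a] if visits[a] > 0 else 0.0
        # Exploration bonus; the +1 under the root is an assumed variant.
        u = c_puct * p * math.sqrt(total + 1) / (1 + visits[a])
        if q + u > best_score:
            best_action, best_score = a, q + u
    return best_action


def lazy_search(state, step, policy, n, d, rng, c_puct=1.0):
    """Run n rollouts of depth d from `state`; return root visit counts.

    step(state, action, rng) -> (next_state, reward, done)   (assumed interface)
    policy(state) -> list of action probabilities            (assumed interface)
    """
    priors = policy(state)
    visits = [0] * len(priors)
    value_sums = [0.0] * len(priors)
    for _ in range(n):
        # PUCT is applied only here, at the root.
        root_action = puct_select(priors, visits, value_sums, c_puct)
        s, reward, done = step(state, root_action, rng)
        ret, depth = reward, 1
        # Below the root, simply follow the policy until depth d.
        while not done and depth < d:
            probs = policy(s)
            a = rng.choices(range(len(probs)), weights=probs)[0]
            s, reward, done = step(s, a, rng)
            ret += reward
            depth += 1
        visits[root_action] += 1
        value_sums[root_action] += ret
    return visits


def noisy_step(state, action, rng):
    # Hypothetical toy dynamics: action 1 pays off 90% of the time, action 0 only 10%.
    reward = 1.0 if rng.random() < (0.9 if action == 1 else 0.1) else 0.0
    return state + 1, reward, False


def uniform_policy(state):
    # Stand-in for a learned policy network.
    return [0.5, 0.5]


visits = lazy_search(0, noisy_step, uniform_policy, n=200, d=3, rng=random.Random(0))
best_action = max(range(len(visits)), key=visits.__getitem__)
```

In this sketch the move to play is taken to be the root action with the most visits, which mirrors the usual AlphaZero convention; the actual implementation may choose differently.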
LazyZero can be effective in stochastic environments like 2048, because a large enough number of policy rollouts samples the environment's randomness well enough to capture it. That said, more sophisticated techniques such as stochastic AlphaZero are much more effective.
You can find the source code for LazyZero and LazyMCTS at: