- Policy gradients in TensorFlow for the CartPole environment in OpenAI Gym
- Partial implementation of the paper Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning (https://arxiv.org/pdf/1708.02596.pdf)
- Implementation of the bit-flipping example in the paper Hindsight Experience Replay (https://arxiv.org/pdf/1707.01495.pdf)
- Proximal Policy Optimization (PPO) Algorithms
- Conservative Q-Learning for Offline Reinforcement Learning (https://arxiv.org/abs/2006.04779) and Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning (https://arxiv.org/abs/2303.05479)
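To make the bit-flipping item concrete, here is a minimal sketch of that environment and of hindsight relabeling, written from the description in the Hindsight Experience Replay paper rather than taken from this repository's code. The names `step`, `rollout`, and `relabel_final` are hypothetical. The sketch assumes the sparse reward is computed on the state reached after the flip, and that the "final" relabeling strategy is used (the last reached state is substituted as the goal).

```python
import random


def step(state, goal, action):
    """Flip the bit at index `action`; reward is 0 if the goal is reached, else -1."""
    next_state = list(state)
    next_state[action] ^= 1
    next_state = tuple(next_state)
    reward = 0 if next_state == goal else -1
    return next_state, reward


def rollout(n_bits, policy, rng):
    """Run one episode of at most `n_bits` steps and record its transitions."""
    state = tuple(rng.randint(0, 1) for _ in range(n_bits))
    goal = tuple(rng.randint(0, 1) for _ in range(n_bits))
    episode = []
    for _ in range(n_bits):
        action = policy(state, goal)
        next_state, reward = step(state, goal, action)
        episode.append((state, action, reward, next_state, goal))
        state = next_state
        if reward == 0:  # goal reached, stop early
            break
    return episode


def relabel_final(episode):
    """Assumed 'final' strategy: treat the last reached state as the goal."""
    new_goal = episode[-1][3]
    relabeled = []
    for state, action, _, next_state, _ in episode:
        reward = 0 if next_state == new_goal else -1
        relabeled.append((state, action, reward, next_state, new_goal))
    return relabeled


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical stand-in policy: pick a random bit to flip.
    random_policy = lambda state, goal: rng.randrange(len(state))
    episode = rollout(8, random_policy, rng)
    replay_buffer = episode + relabel_final(episode)
    print(len(episode), len(replay_buffer))
```

With a random policy most original transitions carry a reward of -1, while the relabeled copy always ends in a transition with reward 0. That guaranteed success signal is what lets an off-policy learner make progress on long bit strings.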
mehdimashayekhi/Some-RL-Implementation