abcdRL (Implement an RL algorithm in four simple steps)
abcdRL is a Modular Single-file RL Algorithms Library🗄 that provides a modular🏗 design without strict🚥 constraints and a clean single-file📜 implementation of each algorithm.
When reading📖 the code, you can quickly understand the full implementation details of an algorithm from its single file📜. When modifying🖌 an algorithm, the lightweight🍃 modular design means you only need to focus on a small number of modules.
Note
abcdRL mainly references the single-file design philosophy of vwxyzjn/cleanrl and the module design of PaddlePaddle/PARL.
🗽 Design Philosophy
- "Copy📋", not "Inheritance🧬"
- "Single-file📜", not "Multi-file📚"
- "Features reuse🛠", not "Algorithms reuse🖨"
- "Unified logic🤖", not "Unified interface🔌"
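As a rough illustration of these principles, the sketch below shows one way a single file might be organized into a few small, copyable modules. It is not taken from abcdRL: the `ChainEnv` toy environment, the `Model` / `Algorithm` / `Agent` class names (loosely borrowed from PARL's three-layer abstraction), the `train` and `greedy_policy` helpers, and the use of tabular Q-learning as a stand-in for a deep RL algorithm are all assumptions made for this example.

```python
import random


class ChainEnv:
    """Toy environment (hypothetical): walk right along a chain to reach the goal."""

    def __init__(self, n_states=4):
        self.n_states = n_states
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Action 0 moves left, action 1 moves right; the left wall is reflecting.
        move = 1 if action == 1 else -1
        self.state = max(0, self.state + move)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


class Model:
    """Holds the learnable parameters; here, just a Q-table."""

    def __init__(self, n_states, n_actions):
        self.q = [[0.0] * n_actions for _ in range(n_states)]


class Algorithm:
    """Defines the update rule (one-step tabular Q-learning)."""

    def __init__(self, model, lr=0.5, gamma=0.9):
        self.model, self.lr, self.gamma = model, lr, gamma

    def learn(self, s, a, r, s_next, done):
        q = self.model.q
        target = r if done else r + self.gamma * max(q[s_next])
        q[s][a] += self.lr * (target - q[s][a])


class Agent:
    """Chooses actions (epsilon-greedy with random tie-breaking)."""

    def __init__(self, algorithm, epsilon=0.1, seed=0):
        self.alg, self.epsilon = algorithm, epsilon
        self.rng = random.Random(seed)

    def act(self, s):
        values = self.alg.model.q[s]
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(values))
        best = max(values)
        return self.rng.choice([a for a, v in enumerate(values) if v == best])


def train(episodes=500, max_steps=100):
    """Trainer: the interaction loop, written out in full rather than inherited."""
    env = ChainEnv()
    model = Model(env.n_states, n_actions=2)
    agent = Agent(Algorithm(model))
    for _ in range(episodes):
        s = env.reset()
        for _ in range(max_steps):
            a = agent.act(s)
            s_next, r, done = env.step(a)
            agent.alg.learn(s, a, r, s_next, done)
            s = s_next
            if done:
                break
    return model


def greedy_policy(model):
    """Greedy action for every non-terminal state."""
    return [max(range(len(row)), key=row.__getitem__) for row in model.q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

Under this layout, trying a different update rule would mean copying the file and editing only `Algorithm.learn`, while the training loop stays visible in the same file instead of being hidden behind a shared base class.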
✅ Implemented Algorithms
- Deep Q Network (DQN)
- Deep Deterministic Policy Gradient (DDPG)
- Twin Delayed Deep Deterministic Policy Gradient (TD3)
- Soft Actor-Critic (SAC)
- Proximal Policy Optimization (PPO)