NeatRL

Deep Reinforcement Learning Algorithms Library

One-file implementations of deep RL algorithms in PyTorch. Each algorithm is self-contained — readable, runnable, and stripped of unnecessary abstraction.

NeatRL Library

The neatrl/ package provides reusable training utilities built on top of the individual implementations. Install via pip:

pip install "neatrl[classic,box2d,atari]"

from neatrl import train_dqn

model = train_dqn(env_id="CartPole-v1", total_timesteps=10000, seed=42)

Full source: github.com/YuvrajSingh-mist/NeatRL/tree/master/neatrl

Implementations

Value-Based

DQN — Deep Q-Network for CartPole and LunarLander
DQN Atari — DQN with conv nets on Breakout
DQN Flappy — DQN on Flappy Bird
DQN Lunar — DQN tuned for Lunar Lander
DQN Taxi — DQN for discrete Taxi-v3
DQN FrozenLake — DQN on FrozenLake
Dueling DQN — Separate value and advantage streams
Q-Learning — Tabular Q-Learning and Value Iteration
VizDoom RL — DQN in a 3D first-person environment
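
As a rough illustration of two ideas from the list above, the sketch below shows a tabular Q-learning update and the way a dueling head might combine its value and advantage streams. It is plain Python rather than PyTorch to stay dependency-free. The function names, the default `alpha` and `gamma`, and the mean-centred aggregation are assumptions made for this example and are not taken from the NeatRL source.

```python
def q_learning_update(q_table, state, action, reward, next_state, done,
                      alpha=0.1, gamma=0.99):
    """One tabular Q-learning step: Q(s,a) += alpha * (target - Q(s,a)).

    Hypothetical helper for illustration; not part of the neatrl package.
    """
    # Bootstrap from the greedy value of the next state,
    # or use the reward alone when the episode has ended.
    best_next = 0.0 if done else max(q_table[next_state])
    target = reward + gamma * best_next
    q_table[state][action] += alpha * (target - q_table[state][action])
    return q_table[state][action]


def dueling_q_values(state_value, advantages):
    """Combine a scalar V(s) with per-action A(s,a).

    Assumes the mean-centred form Q = V + A - mean(A); other
    aggregations (e.g. max-centred) are also possible.
    """
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


# Toy table with 2 states and 2 actions.
q_table = [[0.0, 0.0], [0.0, 1.0]]
new_q = q_learning_update(q_table, state=0, action=1, reward=1.0,
                          next_state=1, done=False, alpha=0.5, gamma=0.9)
q_values = dueling_q_values(2.0, [1.0, 3.0])
```

Subtracting the mean advantage keeps the value and advantage streams identifiable: without it, a constant could be shifted between the two streams without changing the resulting Q-values.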

Policy-Based

REINFORCE — Monte Carlo policy gradient
A2C — Advantage Actor-Critic
PPO — Proximal Policy Optimization
FlappyBird PPO — PPO on Flappy Bird
GRPO — Group Relative Policy Optimization (the method used to train DeepSeek-R1)
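
To make the PPO and GRPO entries above a little more concrete, here is a minimal sketch of the per-sample clipped surrogate and a group-relative advantage computation. It uses only the standard library. The function names, the `clip_eps` default, and the choice of population standard deviation are assumptions for this example; the NeatRL implementations may differ.

```python
import statistics


def clipped_surrogate(ratio, advantage, clip_eps=0.2):
    """PPO per-sample objective: min(r * A, clip(r, 1 - eps, 1 + eps) * A).

    `ratio` is pi_new(a|s) / pi_old(a|s). Hypothetical helper for illustration.
    """
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    return min(ratio * advantage, clipped_ratio * advantage)


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: standardize rewards within one sampled group.

    Assumes population standard deviation; implementations vary on this.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# A ratio above 1 + eps is clipped when the advantage is positive.
positive_case = clipped_surrogate(ratio=1.5, advantage=2.0)
# A ratio below 1 - eps is clipped when the advantage is negative.
negative_case = clipped_surrogate(ratio=0.5, advantage=-1.0)
# Four sampled completions for one prompt, scored 1 or 0.
advantages = group_relative_advantages([1.0, 0.0, 1.0, 0.0])
```

The group-relative form replaces a learned value baseline with the mean reward of the group, which is why a GRPO setup typically needs several samples per prompt or state.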

Continuous Control

DDPG — Deep Deterministic Policy Gradient
TD3 — Twin Delayed DDPG
SAC — Soft Actor-Critic
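
The three algorithms above share a few building blocks. The sketch below shows two of them in plain Python: the soft (Polyak) target-network update and the clipped double-Q target that TD3 uses. Parameters are represented as flat lists of floats for simplicity. The function names and default values of `tau` and `gamma` are assumptions for this example, not values read from the NeatRL code.

```python
def soft_update(target_params, online_params, tau=0.005):
    """Polyak averaging: target <- (1 - tau) * target + tau * online.

    Hypothetical helper; parameters are flat lists of floats here.
    """
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_params, online_params)]


def clipped_double_q_target(reward, done, q1_next, q2_next, gamma=0.99):
    """TD3-style target: bootstrap from the smaller of two critic estimates.

    `done` is 1.0 for terminal transitions and 0.0 otherwise.
    """
    return reward + gamma * (1.0 - done) * min(q1_next, q2_next)


new_target = soft_update([0.0, 1.0], [1.0, 1.0], tau=0.5)
y_continue = clipped_double_q_target(1.0, 0.0, q1_next=2.0, q2_next=3.0, gamma=0.5)
y_terminal = clipped_double_q_target(1.0, 1.0, q1_next=2.0, q2_next=3.0, gamma=0.5)
```

Taking the minimum of the two critics counteracts the overestimation bias that a single critic tends to accumulate. SAC uses the same trick and adds an entropy term to the target, which is omitted here.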

Exploration & Multi-Agent

RND — Random Network Distillation + PPO
Imitation Learning — Behavioral cloning
MARL — Multi-Agent RL (IPPO, MAPPO, Self-Play)
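
For the RND entry above, the core idea is that a predictor is trained to match a fixed, randomly initialized target network, and the prediction error serves as an exploration bonus. The sketch below uses bias-free linear maps as a stand-in for the neural networks. All names, dimensions, and the seed are assumptions chosen for this example.

```python
import random


def linear_features(weights, observation):
    """Apply a bias-free linear layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, observation)) for row in weights]


def intrinsic_reward(target_weights, predictor_weights, observation):
    """Mean squared error between predictor and fixed target features.

    Hypothetical helper; a real RND setup would use neural networks
    and train the predictor to reduce this error on visited states.
    """
    target = linear_features(target_weights, observation)
    predicted = linear_features(predictor_weights, observation)
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


rng = random.Random(0)
obs_dim, feat_dim = 4, 3
# The target stays frozen; only the predictor would be trained.
target_weights = [[rng.uniform(-1, 1) for _ in range(obs_dim)]
                  for _ in range(feat_dim)]
predictor_weights = [[rng.uniform(-1, 1) for _ in range(obs_dim)]
                     for _ in range(feat_dim)]

observation = [0.5, -0.2, 0.1, 0.9]
bonus = intrinsic_reward(target_weights, predictor_weights, observation)
perfect = intrinsic_reward(target_weights, target_weights, observation)
```

States the agent has visited often end up with a small bonus because the predictor has been fitted on them, while unfamiliar states keep a large error and therefore attract exploration.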

References

Sutton & Barto — Reinforcement Learning: An Introduction
CleanRL — primary inspiration for the one-file style

Yuvraj Singh