An interactive, browser-based playground for experimenting with and visualizing Reinforcement Learning (RL) algorithms in grid-world environments.
- Dynamic Environment Editor: Customize grid worlds by placing walls, start/end states, rewards, and penalties.
- Multi-Algorithm Support:
  - Q-Learning: Off-policy Temporal Difference learning.
  - SARSA: On-policy Temporal Difference learning.
  - Monte Carlo: Episodic updates based on full returns.
- Real-time Visualization:
  - Grid World: See the agent move in real-time.
  - Value Heatmaps: Color-coded visualization of Q-values (Green for positive, Red for negative).
  - Policy Arrows: Visual indicators of the current best action for each state.
  - Path Tracing: Visual trail showing the agent's path.
  - Episode Statistics: Real-time line chart of cumulative rewards per episode.
- Interactive Controls:
  - Play, Pause, Step-by-step execution.
  - Playback: Replay the last finished episode to analyze behavior.
  - Adjustable simulation speed.
  - Resettable agent and environment.
- Hyperparameter Tuning: Configurable Learning Rate (Alpha), Discount Factor (Gamma), and Exploration Rate (Epsilon).
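
As a rough illustration of how the three algorithms and the hyperparameters above relate, here is a minimal sketch in plain JavaScript. It is not taken from this project's source: the function names, the `Map`-based Q-table, and the four-action layout are all assumptions made for the example, and the Monte Carlo helper assumes returns are discounted by Gamma.

```javascript
// Hypothetical Q-table: a Map from a state key to an array of four action values.
const ACTIONS = ["up", "right", "down", "left"];

function getQ(Q, state) {
  if (!Q.has(state)) Q.set(state, new Array(ACTIONS.length).fill(0));
  return Q.get(state);
}

// Epsilon-greedy: explore with probability epsilon, otherwise act greedily.
// Returns an index into ACTIONS; ties go to the first best action.
function chooseAction(Q, state, epsilon, rand = Math.random) {
  if (rand() < epsilon) return Math.floor(rand() * ACTIONS.length);
  const q = getQ(Q, state);
  return q.indexOf(Math.max(...q));
}

// Q-Learning (off-policy): bootstrap from the best action in the next state.
function qLearningUpdate(Q, s, a, reward, sNext, done, alpha, gamma) {
  const q = getQ(Q, s);
  const target = done ? reward : reward + gamma * Math.max(...getQ(Q, sNext));
  q[a] += alpha * (target - q[a]);
}

// SARSA (on-policy): bootstrap from the action actually chosen in the next state.
function sarsaUpdate(Q, s, a, reward, sNext, aNext, done, alpha, gamma) {
  const q = getQ(Q, s);
  const target = done ? reward : reward + gamma * getQ(Q, sNext)[aNext];
  q[a] += alpha * (target - q[a]);
}

// Monte Carlo: discounted return for every step of a finished episode,
// accumulated backwards from the final reward.
function discountedReturns(rewards, gamma) {
  const returns = new Array(rewards.length);
  let g = 0;
  for (let t = rewards.length - 1; t >= 0; t--) {
    g = rewards[t] + gamma * g;
    returns[t] = g;
  }
  return returns;
}
```

The only difference between the two Temporal Difference updates is the bootstrap term: Q-Learning uses the maximum Q-value of the next state, while SARSA uses the Q-value of the action the agent actually takes next. Monte Carlo waits until the episode ends and works from the full returns instead.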
To run the project locally:
- Clone or download the repository.
- Serve the files using a local web server. Since this project uses ES modules, you cannot simply open the `index.html` file directly in the browser.

  Using Python (if installed):

      # Run in the project root directory
      python3 -m http.server 8000

  Using Node.js (via `http-server` or similar; `-p 8000` is needed because `http-server` defaults to port 8080):

      npx http-server . -p 8000

- Open your browser and navigate to `http://localhost:8000`.
- Edit Environment:
  - Use the panel on the left to select a tool (Wall, Start, End, Penalty, Reward, Eraser).
  - Click or drag on the grid to modify the environment.
- Configure Agent:
  - Select an algorithm (Q-Learning, SARSA, Monte Carlo).
  - Adjust hyperparameters if desired.
- Run Simulation:
  - Click Start Learning to begin the training process.
  - Use the Speed slider to control execution speed.
  - Use Step for frame-by-frame analysis (when paused).
- Analyze:
  - Observe the heatmap developing.
  - Watch the episode statistics chart.
  - Click Replay Last Episode to review the most recent run.
  - Hover over cells to see detailed info (Coordinates, Q-Values).
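
To help interpret the heatmap and policy arrows while analyzing a run, the sketch below shows one plausible way a cell's Q-values could be turned into a color and an arrow. It is an illustration only: `cellValue`, `valueToColor`, `policyArrow`, the specific RGB values, and the up/right/down/left action order are assumptions, not the project's actual rendering code.

```javascript
// Assumed action order: up, right, down, left.
const ARROWS = ["↑", "→", "↓", "←"];

// Assumption: a cell is summarized by its best (maximum) Q-value.
function cellValue(qValues) {
  return Math.max(...qValues);
}

// Green for positive values, red for negative, transparent for zero.
// Opacity scales with |value| relative to the largest magnitude on the grid.
function valueToColor(value, maxAbs) {
  if (maxAbs <= 0 || value === 0) return "rgba(0, 0, 0, 0)";
  const alpha = Math.min(Math.abs(value) / maxAbs, 1).toFixed(2);
  return value > 0
    ? `rgba(0, 160, 0, ${alpha})`
    : `rgba(200, 0, 0, ${alpha})`;
}

// Policy arrow: the arrow for the action with the highest Q-value.
// Ties resolve to the first such action.
function policyArrow(qValues) {
  return ARROWS[qValues.indexOf(Math.max(...qValues))];
}
```

Under these assumptions, a cell with Q-values `[0, 0.2, 1.5, -1]` would show a downward arrow, and its shade of green would deepen as its best Q-value approaches the largest magnitude on the grid.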
- `index.html`: Main entry point and layout.
- `src/`: Source code.
  - `core/`: Core RL logic (Environment, Agent, Algorithms).
  - `vis/`: Visualization engine (Canvas rendering).
  - `ui/`: UI management and interaction logic.
- `assets/`: Static assets (CSS).
- JavaScript (ES6+): Core logic and DOM manipulation.
- HTML5 Canvas: High-performance grid rendering.
- CSS3: Styling and layout.
- No external runtime dependencies (Vanilla JS).