Warehouse Agents Navigation

A comparison of the DQN and PPO algorithms in a warehouse environment. Switch between algorithms, adjust parameters, and visualize training results in real time.

TensorFlow

Deep Q-Network

PPO Algorithm

Training Controls

Configure and run your RL algorithms

Environment

Algorithm

Dimension Selection

Episodes: 1000

Learning Rate: 0.001

Package Delivery Goal: 146

Default: 146 packages (all packages currently in this warehouse)
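The controls above can be thought of as one small configuration object. The sketch below is illustrative only: `TrainingConfig` and its field names are hypothetical and not taken from the application; only the default values (1000 episodes, learning rate 0.001, goal of 146 packages) and the two algorithm choices come from the panel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical container for the values shown in the Training Controls panel."""

    algorithm: str = "dqn"         # "dqn" or "ppo", the two algorithms offered
    episodes: int = 1000           # default shown in the panel
    learning_rate: float = 0.001   # default shown in the panel
    package_goal: int = 146        # default: every package in the warehouse

    def __post_init__(self):
        # Reject values the panel would not allow (assumed constraints).
        if self.algorithm not in ("dqn", "ppo"):
            raise ValueError(f"unknown algorithm: {self.algorithm!r}")
        if self.episodes <= 0 or self.package_goal <= 0:
            raise ValueError("episodes and package_goal must be positive")
        if self.learning_rate <= 0:
            raise ValueError("learning_rate must be positive")


# Switching algorithms while keeping the other defaults.
ppo_config = TrainingConfig(algorithm="ppo")
```

Making the object immutable (`frozen=True`) is a design choice for the sketch: a run's settings cannot change halfway through training.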

Deep Q-Network uses experience replay and target networks for stable learning.
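As a rough illustration of those two ideas, the sketch below uses only the Python standard library rather than the TensorFlow model behind this page. All names (`ReplayBuffer`, `td_target`, `sync_target`) and the discount factor are assumptions made for the example, not the application's actual code.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of past transitions, sampled uniformly at random."""

    def __init__(self, capacity):
        # A bounded deque silently drops the oldest transition when full.
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Random sampling breaks the correlation between consecutive steps.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_target(reward, next_q_values, done, gamma=0.99):
    """Learning target built from the *target* network's Q-values.

    next_q_values: Q-value estimates for the next state, one per action,
    assumed to come from the slowly updated target network.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def sync_target(online_weights, target_weights):
    """Hard update: copy the online network's weights into the target network.

    Weights are modeled as plain dicts here purely for illustration.
    """
    target_weights.clear()
    target_weights.update(online_weights)
```

In a training loop, the agent would add one transition per step, sample a mini-batch once the buffer holds enough data, and call `sync_target` only every fixed number of steps so the targets stay stable between syncs.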

Environment Visualization

Real-time view of warehouse robots managing packages

Ready to Start Training

Click "Start Training" to begin the DQN simulation

Training Metrics

Real-time performance metrics during training

Current: 0.0

Updated per episode

Current: 0.000

Current: 1.000