Deep Q-Networks (DQN) Algorithm
Definition:
Deep Q-Networks (DQN) is a reinforcement learning algorithm that extends Q-Learning by using a deep neural network to approximate the Q-function. This lets DQN handle high-dimensional state spaces, such as raw image pixels, where a tabular Q-Learning lookup table would be infeasible. The approach was popularized by DeepMind, whose DQN agent learned to play many Atari 2600 games at or above human level directly from screen pixels.
Characteristics:
- Combines Deep Learning with Reinforcement Learning: DQN leverages neural networks to estimate Q-values, enabling agents to make decisions in environments with complex, high-dimensional state representations.
- Experience Replay: To improve training stability, DQN stores experiences (state, action, reward, next state) in a replay buffer and samples mini-batches from this buffer to train the network. This reduces correlations between consecutive experiences.
- Fixed Target Network: DQN uses a separate target network to provide stable Q-value updates. This network is periodically updated with the weights of the main Q-network, preventing harmful feedback loops during training.
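As a rough illustration of experience replay and the periodic target update, here is a minimal sketch using only the Python standard library. The names `Transition`, `ReplayBuffer`, and `sync_target` are hypothetical, the `done` flag is an added assumption for marking episode ends, and plain dictionaries stand in for real network weights.

```python
import copy
import random
from collections import deque, namedtuple

# One stored experience; `done` is an added assumption marking episode ends.
Transition = namedtuple(
    "Transition", ["state", "action", "reward", "next_state", "done"]
)


class ReplayBuffer:
    """Fixed-capacity buffer; the oldest transitions are evicted first."""

    def __init__(self, capacity):
        self.memory = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.memory.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement mixes old and new experiences,
        # which reduces the correlation between items in a mini-batch.
        return random.sample(self.memory, batch_size)

    def __len__(self):
        return len(self.memory)


def sync_target(online_weights, target_weights, step, sync_every):
    """Copy the online weights into the target every `sync_every` steps.

    Both arguments are plain dicts here, standing in for network parameters.
    Returns True when a copy happened.
    """
    if step % sync_every == 0:
        target_weights.clear()
        target_weights.update(copy.deepcopy(online_weights))
        return True
    return False
```

In a real agent the dictionaries would be replaced by the framework's own parameter containers, and `sync_every` is a tuning choice rather than a fixed value.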
How It Works:
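DQN follows the same principles as Q-Learning but uses a deep neural network parameterized by weights $\theta$ to approximate the Q-values $Q(s, a; \theta)$. The network is trained to minimize the loss function:

$$L(\theta) = \mathbb{E}_{(s, a, r, s')}\left[\left(r + \gamma \max_{a'} Q(s', a'; \theta^{-}) - Q(s, a; \theta)\right)^{2}\right]$$

- $\theta$: Weights of the current Q-network
- $\theta^{-}$: Weights of the target network (held fixed for stability)
- $\gamma$: Discount factor for future rewards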
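In the standard DQN formulation, the loss is the mean squared gap between a bootstrapped target (the reward plus the discounted maximum Q-value of the next state under the target network) and the current network's Q-value for the action taken. The sketch below computes it over a mini-batch in plain Python. The function name `dqn_loss`, the callables `q_online` and `q_target`, and the `done` flag for terminal transitions are all assumptions made for illustration, not part of any particular library.

```python
def dqn_loss(batch, q_online, q_target, gamma):
    """Mean squared TD error over a mini-batch.

    batch: iterable of (state, action, reward, next_state, done) tuples
    q_online, q_target: callables mapping a state to a list of Q-values,
        one per action (hypothetical stand-ins for the two networks)
    gamma: discount factor for future rewards
    """
    squared_errors = []
    for state, action, reward, next_state, done in batch:
        # The bootstrap term uses the frozen target network. Terminal
        # transitions (an added assumption) contribute the reward only.
        bootstrap = 0.0 if done else gamma * max(q_target(next_state))
        td_target = reward + bootstrap
        td_error = td_target - q_online(state)[action]
        squared_errors.append(td_error ** 2)
    return sum(squared_errors) / len(squared_errors)


# Toy stand-ins: both "networks" ignore the state and return fixed Q-values.
q_online = lambda s: [1.0, 2.0]
q_target = lambda s: [0.5, 3.0]
batch = [("s0", 1, 1.0, "s1", False), ("s1", 0, 0.0, "s2", True)]

loss = dqn_loss(batch, q_online, q_target, gamma=0.5)
print(loss)  # → 0.625
```

For the first transition the target is 1.0 + 0.5 × 3.0 = 2.5 against a prediction of 2.0, giving a squared error of 0.25. The second is terminal, so the target is 0.0 against a prediction of 1.0, giving 1.0. The mean is 0.625. In practice only the weights of the current Q-network receive gradients from this loss, while the target network is treated as a constant.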
Steps Involved:
- Initialize Replay Buffer and Networks: Initialize the replay buffer, the Q-network with weights