Deep Reinforcement Learning Hands-On

Our first examples won't be about speeding up the baseline. Instead, they show one common, and not always obvious, situation that can cost you performance. In Chapter 3, Deep Learning with PyTorch, we discussed the way PyTorch calculates gradients: it builds a graph of all the operations that you perform on tensors, and when you call the backward() method of the final loss, the gradients of all the model parameters are calculated automatically.
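As a reminder of that mechanism, here is a minimal sketch. The tensors w and x and the toy loss are made up for illustration and are not taken from the book's examples. It shows that operations on a tracked tensor are recorded in the graph, that backward() fills in the gradients, and that the same computation wrapped in torch.no_grad() records nothing.

```python
import torch

# Hypothetical stand-in for model parameters: one tensor tracked by autograd.
w = torch.tensor([2.0, 3.0], requires_grad=True)
x = torch.tensor([1.0, 4.0])

# Every operation involving w is recorded in the computation graph.
loss = ((w * x).sum() - 10.0) ** 2
print(loss.grad_fn is not None)  # the result remembers how it was produced

# Walk the graph backward and accumulate d(loss)/d(w) into w.grad.
loss.backward()
print(w.grad)

# The same forward computation without graph tracking:
# nothing is recorded, so no gradient can flow through y.
with torch.no_grad():
    y = (w * x).sum()
print(y.requires_grad)
```

Recording the graph takes extra time and memory, so a forward pass whose gradients are never needed does work that is thrown away.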
This works well, but RL code is normally much more complex than traditional supervised learning code: the model that we are currently training is also applied to get the actions that the agent needs to perform in the environment. The target network discussed in Chapter 6 makes this even trickier. So, in DQN, a neural network (NN) is normally used in three different situations: