
The Reinforcement Learning Workshop

In the previous sections, we learned about the state-value function, which tells us how rewarding it is for an agent to be in a particular state. Now we will learn about another function, one that combines states with actions. The action-value function tells us how good it is for the agent to take a given action from a given state. The action value is also called the Q value. The equation can be written as follows:
Figure 9.13: Expression for the Q value function
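The figure itself is not reproduced here. As a sketch under the usual conventions, the action-value function is commonly defined as the expected discounted return after taking action a in state s. The symbols below (π for the policy, γ for the discount factor, and r_t for the reward received at step t) are assumptions and may differ from the notation in the figure:

```latex
Q^{\pi}(s, a) = \mathbb{E}_{\pi}\left[ \sum_{k=0}^{\infty} \gamma^{k} \, r_{t+k+1} \,\middle|\, s_t = s,\ a_t = a \right]
```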
The preceding equation can be written in an iterative fashion, as follows:
Figure 9.14: Expression for the Q value function with iterations
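The figure itself is not reproduced here. A common way to write the iterative form splits the return into the immediate reward and the discounted Q value of the next state-action pair. The notation is an assumption (policy π, discount factor γ, time-indexed states, actions, and rewards) and may differ from the figure:

```latex
Q^{\pi}(s, a) = \mathbb{E}_{\pi}\left[ r_{t+1} + \gamma \, Q^{\pi}(s_{t+1}, a_{t+1}) \,\middle|\, s_t = s,\ a_t = a \right]
```

Here r_{t+1} is the immediate reward, written simply as r in the description that follows. When the agent is assumed to act greedily in the next state, the inner term is often written as \max_{a'} Q(s_{t+1}, a') instead.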
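This equation is also known as the Bellman equation. From the equation, we can express the Q value of a state-action pair recursively, in terms of the Q value of the state-action pair that follows it. The Bellman equation can be described as follows: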
"The total expected reward being in state s and taking action a is the sum of two components: the reward (which is r) that we can...
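To make the two-component idea concrete, here is a minimal sketch of a Bellman backup on a tiny deterministic environment. Everything in it is hypothetical and chosen only for illustration: the transition table, the discount factor of 0.9, and the function names are not from the book. It also assumes the greedy form of the backup, where the second component is the largest Q value available in the next state.

```python
# Hypothetical toy environment, for illustration only.
# TRANSITIONS[state][action] = (reward, next_state); state 2 is terminal.
GAMMA = 0.9  # assumed discount factor
TERMINAL = 2
TRANSITIONS = {
    0: {0: (0.0, 0), 1: (1.0, 1)},
    1: {0: (0.0, 0), 1: (5.0, 2)},
}


def bellman_backup(q, state, action, gamma=GAMMA):
    """Return r + gamma * max over a' of Q(s', a') for one state-action pair."""
    reward, next_state = TRANSITIONS[state][action]
    if next_state == TERMINAL:
        # No future reward can be collected after a terminal state.
        return reward
    return reward + gamma * max(q[next_state].values())


def q_value_iteration(sweeps=50):
    """Apply the backup to every state-action pair, repeatedly."""
    q = {s: {a: 0.0 for a in actions} for s, actions in TRANSITIONS.items()}
    for _ in range(sweeps):
        # Synchronous update: every new value is computed from the old table.
        q = {
            s: {a: bellman_backup(q, s, a) for a in actions}
            for s, actions in TRANSITIONS.items()
        }
    return q


if __name__ == "__main__":
    for state, action_values in q_value_iteration().items():
        print(state, action_values)
```

In this toy setup, the value of taking action 1 in state 0 works out to 1 + 0.9 × 5 = 5.5: the immediate reward of 1 plus the discounted best value reachable from state 1. Once the table stops changing, applying the backup again returns the same numbers, which is what it means for the Q values to satisfy the Bellman equation.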