Topic | v1 | created by janarez |

Q-learning is a model-free reinforcement learning algorithm that learns the value of taking an action in a particular state. It does not require a model of the environment (hence "model-free"), and it handles stochastic transitions and rewards without any adaptation. For any finite Markov decision process (FMDP), Q-learning finds an optimal action-selection policy, in the sense of maximizing the expected total reward over all successive steps starting from the current state, provided it is given infinite exploration time and a partly random policy. "Q" refers to the function that the algorithm computes: the expected reward for an action taken in a given state.
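
The description above can be made concrete with a small sketch. The code below is a minimal illustration, not a reference implementation: it applies the standard tabular update Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)), which this page does not spell out, to a hypothetical five-state corridor in which only reaching the rightmost state is rewarded. The environment, the function name `q_learning`, and the values of alpha, gamma, and epsilon are all assumptions chosen for illustration.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor (illustrative only).

    States are 0..n_states-1. Action 0 moves left (blocked at state 0),
    action 1 moves right. Entering the last state yields reward 1 and
    ends the episode; every other step yields reward 0.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    goal = n_states - 1

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Partly random (epsilon-greedy) policy; ties are broken at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            # The agent only observes the outcome; it never reads a model.
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0

            # The terminal state has no future value to bootstrap from.
            future = 0.0 if next_state == goal else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + future - q[state][action])
            state = next_state

    return q


q_table = q_learning()
# Greedy policy for the non-terminal states: index of the best action.
greedy_policy = [row.index(max(row)) for row in q_table[:-1]]
print(greedy_policy)  # [1, 1, 1, 1], i.e. "move right" everywhere
```

The transition rule appears only inside the simulation step, which is what "model-free" means here: the update itself uses nothing but the observed reward and next state. The epsilon-greedy choice plays the role of the partly random policy mentioned above.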


used by Reinforcement learning

Reinforcement learning (RL) is an area of machine learning concerned with how intelligent agents ough...


treated in Intro to Q-learning in RL

Rating 10.0, level 1.0, clarity 10.0, background 4.0 (1 rating)

In this tutorial, we aim to provide readers with a high-level overview of the fundamentals of RL as w...