Actor Critic Methods

The core idea is to split the model into two parts: an actor that chooses an action given a state, and a critic that estimates the Q-values of that action.

The actor takes the state as input and outputs an action (or a probability distribution over actions). It controls how the agent behaves by learning the policy directly (policy-based). The critic, on the other hand, evaluates the chosen action by estimating a value function (value-based), and that evaluation is fed back to the actor. The two models are trained together, and each gets better at its own role over time. As a result, the combined architecture typically learns to play the game more efficiently than either method on its own.
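The interplay between the two models can be sketched with a small tabular example. This is a minimal illustration under stated assumptions, not the contents of sample.py: the five-state chain environment, the `step`, `policy` and `train` helpers, and all hyperparameters are hypothetical. The actor is a softmax over per-state action preferences, and the critic is a Q-table updated with a SARSA-style TD target. The actor is weighted by Q(s, a) minus the policy-averaged Q as a baseline, a common variance-reduction choice rather than something the text above prescribes.

```python
import math
import random

N_STATES = 5        # states 0..4; state 4 is the terminal goal (toy setup)
ACTIONS = (0, 1)    # 0 = move left, 1 = move right
GAMMA = 0.99        # discount factor (assumed value)


def step(state, action):
    """Hypothetical chain environment: every move costs -1 until the goal."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, -1.0, done


def policy(theta, state):
    """Actor: softmax over the action preferences of one state."""
    prefs = theta[state]
    top = max(prefs)
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(episodes=500, actor_lr=0.1, critic_lr=0.2, seed=0):
    rng = random.Random(seed)
    theta = [[0.0, 0.0] for _ in range(N_STATES)]  # actor parameters
    q = [[0.0, 0.0] for _ in range(N_STATES)]      # critic: Q(s, a) table
    for _ in range(episodes):
        state = 0
        action = rng.choices(ACTIONS, weights=policy(theta, state))[0]
        for _ in range(100):  # cap the episode length
            next_state, reward, done = step(state, action)
            if done:
                next_action, target = None, reward
            else:
                next_probs = policy(theta, next_state)
                next_action = rng.choices(ACTIONS, weights=next_probs)[0]
                target = reward + GAMMA * q[next_state][next_action]

            # Actor update: follow the gradient of log pi(a|s), weighted by
            # the critic's estimate of how much better this action is than
            # the policy's average action in this state.
            probs = policy(theta, state)
            baseline = sum(p * q[state][a] for a, p in zip(ACTIONS, probs))
            advantage = q[state][action] - baseline
            for a in ACTIONS:
                grad_log_pi = (1.0 if a == action else 0.0) - probs[a]
                theta[state][a] += actor_lr * advantage * grad_log_pi

            # Critic update: move Q(s, a) toward the one-step TD target.
            q[state][action] += critic_lr * (target - q[state][action])

            if done:
                break
            state, action = next_state, next_action
    return theta, q


if __name__ == "__main__":
    theta, q = train()
    for s in range(N_STATES - 1):
        print(s, [round(p, 3) for p in policy(theta, s)])
```

Running the script prints the learned action probabilities for each non-terminal state. In this toy setup the actor should come to prefer moving right, because the critic scores that action higher in every state.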

Code

python sample.py

Useful Resources:

https://towardsdatascience.com/understanding-actor-critic-methods-931b97b6df3f
https://www.youtube.com/watch?v=LawaN3BdI00
https://theaisummer.com/Actor_critics/
