J-Neuwford/RL-Maze

RL Maze (WIP)

A reinforcement learning project for learning the fundamentals and underlying algorithms behind my favourite form of AI.

- Screenshots

- Todo

- The Q-Learning Algorithm

Screenshots

heatmap example

Todo

  • refactor the Q-Learning script into a MonoBehaviour to enable simultaneous agents.
  • update readme
  • add an arrow visualisation to show preferred direction at each state.

The Q-Learning Algorithm

This took me a long time to understand, and I might still have it wrong, but here is my understanding:

Q-Learning algorithm
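
For reference, this is the standard textbook form of the Q-learning update rule, written out as a sketch rather than copied from this project's code. The symbols s' (the state reached after taking action a) and a' (any action available in s') are introduced here and are not part of the list below; \gamma is the discount factor that the list calls g.

$$
Q(s,a) \leftarrow Q(s,a) + \alpha \left[ r + \gamma \max_{a'} Q(s',a') - Q(s,a) \right]
$$

Read it as: nudge the old estimate Q(s,a) towards the target r + \gamma \max_{a'} Q(s',a'), moving a fraction \alpha of the way there.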

Q = Q-value. The quality score.

Q(s,a) = The Q-value of a state and action pair.

s = current state. In this project, position on the grid.

a = action. One of the actions available to the agent in the current state.

α = alpha. The learning rate (0-1): how quickly the agent learns, or how strongly each new experience overwrites the existing Q-value. At 0 nothing is learned; at 1 the old value is replaced entirely.

r = reward. The reward received for taking action a in state s.

g = gamma (γ). The discount factor (0-1). How much does the agent value future rewards (closer to 1) over immediate rewards (closer to 0)? Or, the battle between watching YT shorts about RL vs. actually finishing this project.
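
To make the definitions above concrete, here is a minimal Python sketch of a single tabular Q-learning update. It is an illustration under assumptions, not this project's script: the names `ACTIONS` and `q_update`, the default values for `alpha` and `g`, the (x, y) state tuples, and the use of `s_next` for the state reached after the action are all hypothetical. It also ignores terminal states, which a real maze agent would need to handle.

```python
from collections import defaultdict

# Hypothetical action set for a grid maze, as (dx, dy) moves.
ACTIONS = [(0, 1), (0, -1), (-1, 0), (1, 0)]


def q_update(q, s, a, r, s_next, alpha=0.5, g=0.9):
    """Apply one Q-learning update to q[(s, a)] in place and return the new value."""
    # Best Q-value the agent could get from the next state.
    best_next = max(q[(s_next, a2)] for a2 in ACTIONS)
    # Move the old estimate a fraction alpha of the way towards the target.
    q[(s, a)] += alpha * (r + g * best_next - q[(s, a)])
    return q[(s, a)]


# Q-table: unseen (state, action) pairs start at 0.0.
q = defaultdict(float)

# The agent moves right from (0, 0), lands on (1, 0), and receives a reward of 1.
q_update(q, (0, 0), (1, 0), 1.0, (1, 0))
```

A dictionary keyed by (state, action) keeps the sketch short; a 2D grid project could just as well store the table as an array indexed by position and action.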

About

Grid-based pathfinding reinforcement learning project.
