Maze_Flow

MAZE USING REINFORCEMENT LEARNING

This project uses the Bellman equation, a Dynamic Programming recurrence that defines the value of each state in terms of the value of the state that follows it.

\[ V(s) = \max_{a} \left( R(s, a) + \gamma V(s') \right) \]

\(V(s)\): State value function of the current state \(s\)
\(V(s')\): State value function of the next state \(s'\)
\(R(s, a)\): Reward obtained on performing action \(a\) in state \(s\)
\(\gamma\): Discount factor, a hyperparameter that determines how much importance is given to future rewards
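
As an illustration of how the Bellman equation can be applied to a maze, the sketch below runs value iteration on a small grid. It is not taken from this project's code: the cell codes, the step reward of -1, the terminal values of +10 and -10, and the function name `value_iteration` are all assumptions made for the example.

```python
# Hypothetical cell codes and rewards; the project's config.py may differ.
FREE, WALL, DANGER, GOAL = 0, 1, 2, 3
STEP_REWARD = -1.0                            # assumed R(s, a) for every move
TERMINAL_VALUE = {GOAL: 10.0, DANGER: -10.0}  # assumed fixed values of end cells
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def value_iteration(grid, gamma=0.9, tol=1e-6, max_sweeps=1000):
    """Return a matrix of state values computed by repeated Bellman updates."""
    rows, cols = len(grid), len(grid[0])
    # Terminal cells start at their fixed value; every other cell starts at 0.
    V = [[TERMINAL_VALUE.get(cell, 0.0) for cell in row] for row in grid]
    for _ in range(max_sweeps):
        delta = 0.0
        for r in range(rows):
            for c in range(cols):
                # Only free cells are updated; walls and terminal cells are skipped.
                if grid[r][c] != FREE:
                    continue
                best = float("-inf")
                for dr, dc in ACTIONS:
                    nr, nc = r + dr, c + dc
                    # Moving off the grid or into a wall leaves the agent in place.
                    off_grid = not (0 <= nr < rows and 0 <= nc < cols)
                    if off_grid or grid[nr][nc] == WALL:
                        nr, nc = r, c
                    # Bellman update: V(s) = max_a (R(s, a) + gamma * V(s'))
                    best = max(best, STEP_REWARD + gamma * V[nr][nc])
                delta = max(delta, abs(best - V[r][c]))
                V[r][c] = best
        # Stop once a full sweep changes no value by more than tol.
        if delta < tol:
            break
    return V
```

With these assumed rewards, values increase toward the destination, so a cell one step from the goal ends up with a higher value than a cell two steps away.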

WHITE: Agent | GREEN: Final Destination | BLUE: Wall | RED: Danger

VISUALIZATION OF VALUE FUNCTION MATRIX

We visualize the value function matrix as a heatmap using the matplotlib library. To reach the destination, the agent moves toward cells with hotter colors, which correspond to higher state values.
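
A heatmap of this kind can be drawn with `imshow`. The snippet below is a minimal sketch, not the project's plotting code: the value matrix is made up, and the `"hot"` colormap, the `Agg` backend, and the output file name are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt

# Made-up value matrix for a 3x4 maze; larger values lie closer to the destination.
values = [
    [4.6, 6.2, 8.0, 10.0],
    [3.1, 0.0, 6.2, -10.0],
    [1.8, 3.1, 4.6, 3.1],
]

fig, ax = plt.subplots()
image = ax.imshow(values, cmap="hot")      # one colored square per maze cell
fig.colorbar(image, ax=ax, label="V(s)")   # legend mapping color to state value
ax.set_title("State value function")
fig.savefig("value_function.png")          # hypothetical output file name
```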

The purple blocks trace the pathway to the destination.
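
One way such a pathway could be derived from the value matrix is greedy ascent: from each cell, step to the neighbouring cell with the highest value. The function below is a hedged sketch of that idea; the name `trace_path`, its parameters, and the rule of never revisiting a cell are assumptions, not the project's implementation.

```python
def trace_path(V, start, goal, walls=frozenset()):
    """Follow the highest-valued neighbour from start until goal is reached.

    V is a matrix of state values, start and goal are (row, col) tuples,
    and walls is a set of (row, col) cells that cannot be entered.
    Returns the list of visited cells, or None if greedy ascent gets stuck.
    """
    rows, cols = len(V), len(V[0])
    path = [start]
    current = start
    while current != goal:
        r, c = current
        neighbours = [
            (r + dr, c + dc)
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
            if 0 <= r + dr < rows and 0 <= c + dc < cols
            and (r + dr, c + dc) not in walls
            and (r + dr, c + dc) not in path  # never revisit a cell
        ]
        if not neighbours:
            return None  # dead end: no unvisited neighbour left
        # Greedy step: move to the neighbour with the largest state value.
        current = max(neighbours, key=lambda cell: V[cell[0]][cell[1]])
        path.append(current)
    return path
```

The returned list of cells is what would be highlighted on the grid. Greedy ascent only finds the destination when the values actually increase along some route to it, which is the case once the value function has converged.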
