Q-learning and RL implementation

Aim: Train a model to properly play vintage video games...

Deep Q-learning algorithm

A very brief summary of the notation:

{A (set of actions), pi (policy), Q (quality of an action at a state), R(s, a, s') (reward: in state s, taking action a leads to state s' and yields a specific reward r)}
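
To make the notation concrete, here is a minimal tabular sketch of the standard Q-learning update, Q(s, a) <- Q(s, a) + alpha * (r + gamma * max over a' of Q(s', a') - Q(s, a)). The learning rate alpha, the discount gamma, the toy action names, and the function names are assumptions for illustration only. Deep Q-learning would replace the table with a neural network, which this sketch does not attempt.

```python
from collections import defaultdict


def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.99):
    """Apply one Q-learning update in place and return the new Q(s, a).

    alpha (learning rate) and gamma (discount) are assumed values.
    """
    # Value of the best action available in the next state s'.
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    # Temporal-difference target: immediate reward plus discounted future value.
    td_target = r + gamma * best_next
    # Move Q(s, a) a fraction alpha toward the target.
    Q[(s, a)] += alpha * (td_target - Q[(s, a)])
    return Q[(s, a)]


def greedy_policy(Q, s, actions):
    """pi(s): pick the action with the highest Q value in state s."""
    return max(actions, key=lambda a: Q[(s, a)])


# Toy example: unseen (state, action) pairs default to a Q value of 0.0.
Q = defaultdict(float)
actions = ["left", "right"]  # hypothetical action set A
q_update(Q, s=0, a="right", r=1.0, s_next=1, actions=actions)
```

After this single update, only the pair (0, "right") has a non-zero value, so the greedy policy in state 0 prefers "right".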

So, if we want to train a model to play a video game like a master, the following modules must be implemented at a minimum:

a class that captures enough frames (typically consecutive ones) to analyze the game environment -> the frames might need preprocessing to lower the memory overhead
a class for the NN-based model used in training, covering weight init/update/storage/write/fork/reset; the actions taken in a single play are also recorded for optimization
a class that enumerates the possible actions and abstracts them to a simple level, so the agent can do anything a player would do without generality issues at the beginning (it can become more general once the model has matured)
a game-to-model / pre-processing module
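
As a rough illustration of the frame-capture, action, and recording pieces listed above, here is a minimal sketch using only the standard library. Every name in it (`FrameBuffer`, `Episode`, `ACTIONS`, the buffer size of 4, the downsampling factor of 2) is a hypothetical choice rather than something the notes prescribe, and frames are assumed to be nested lists of (r, g, b) tuples.

```python
from collections import deque

# Hypothetical small, fixed action set the agent may choose from.
ACTIONS = ("noop", "left", "right", "fire")


def to_grayscale(frame):
    """Convert an RGB frame (rows of (r, g, b) tuples) to integer grayscale."""
    return [[(r + g + b) // 3 for (r, g, b) in row] for row in frame]


def downsample(frame, factor=2):
    """Keep every `factor`-th pixel in both dimensions to save memory."""
    return [row[::factor] for row in frame[::factor]]


class FrameBuffer:
    """Holds the last `size` preprocessed frames; together they form a state."""

    def __init__(self, size=4, factor=2):
        self.size = size
        self.factor = factor
        self.frames = deque(maxlen=size)  # the oldest frame is dropped automatically

    def push(self, rgb_frame):
        self.frames.append(downsample(to_grayscale(rgb_frame), self.factor))

    def ready(self):
        """True once enough consecutive frames have been captured."""
        return len(self.frames) == self.size

    def state(self):
        return list(self.frames)


class Episode:
    """Records (state, action, reward) steps of a single play-through."""

    def __init__(self):
        self.steps = []

    def record(self, state, action, reward):
        self.steps.append((state, action, reward))

    def total_reward(self):
        return sum(reward for _, _, reward in self.steps)


# Usage with a dummy 4x4 frame of one colour.
buffer = FrameBuffer(size=4, factor=2)
dummy_frame = [[(30, 60, 90)] * 4 for _ in range(4)]
for _ in range(4):
    buffer.push(dummy_frame)

episode = Episode()
episode.record(buffer.state(), ACTIONS[2], 1.0)
```

Converting to grayscale and downsampling before storing is one common way to cut memory use. A real implementation would more likely work on arrays than on nested lists.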

This is just the minimum...