gym

Project maintained by masalskyi Hosted on GitHub Pages — Theme by mattgraham

Cart pole v1

Cart pole is a classic control environment. The player's goal is to keep the pole balanced upright on the cart. The only available actions are moving the cart left or right, so the action space is discrete. An episode ends when:

Pole angle exceeds ±12° from vertical
Cart position exceeds ±2.4 (the center of the cart reaches the edge of the display)
Episode length exceeds 500 steps
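
The end-of-episode conditions above can be sketched as a small check. This is only an illustration, not the environment's own code: the function name `episode_done` and the constant names are hypothetical, the pole angle is assumed to be given in radians, and the episode is assumed to be over once 500 steps have been taken.

```python
import math

# Thresholds taken from the list above; the names are hypothetical.
MAX_POLE_ANGLE_RAD = math.radians(12)  # 12 degrees, converted to radians
MAX_CART_POSITION = 2.4
MAX_EPISODE_STEPS = 500


def episode_done(cart_position, pole_angle_rad, steps_taken):
    """Return True if any of the three end-of-episode conditions holds."""
    pole_fell = abs(pole_angle_rad) > MAX_POLE_ANGLE_RAD
    cart_left_track = abs(cart_position) > MAX_CART_POSITION
    # Assumption: the step limit is reached once 500 steps have been taken.
    time_limit_hit = steps_taken >= MAX_EPISODE_STEPS
    return pole_fell or cart_left_track or time_limit_hit
```

For example, `episode_done(0.0, 0.0, 10)` is `False`, while a cart position of 2.5 or a pole angle of 13° ends the episode.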

Video results:

The solution is based on the deep Q-learning (DQN) algorithm. The model has the following structure:

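from tensorflow.keras.models import Sequential
import tensorflow.keras.layers as layers


def build_model(states, actions):
    """Build the Q-network: one Q-value output per action."""
    model = Sequential()
    # Input shape (1, states): a window of one observation, flattened to a vector.
    model.add(layers.Flatten(input_shape=(1, states)))
    model.add(layers.Dense(64, activation="relu"))
    model.add(layers.Dense(32, activation="relu"))
    # Linear output layer: unbounded Q-value estimates, one per action.
    model.add(layers.Dense(actions, activation="linear"))
    return model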
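
To get a feel for the size of this network, the number of trainable parameters can be worked out by hand. The sketch below assumes the usual CartPole dimensions (4 observation values, 2 actions); the helper names `dense_params` and `count_params` are hypothetical and not part of the project.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of one fully connected layer."""
    return n_inputs * n_units + n_units


def count_params(states, actions, hidden=(64, 32)):
    """Count trainable parameters of the Flatten -> Dense stack above."""
    total = 0
    n_inputs = states  # Flatten turns the (1, states) input into `states` values
    for units in (*hidden, actions):
        total += dense_params(n_inputs, units)
        n_inputs = units
    return total


# 4*64 + 64 = 320, 64*32 + 32 = 2080, 32*2 + 2 = 66
print(count_params(4, 2))  # 2466
```

With so few parameters, the network is small enough to train comfortably on a CPU.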

The model was trained for 100,000 steps with the Adam optimizer (lr=1e-3) and BoltzmannQPolicy. A checkpoint callback saved the model that achieved the best reward. Callbacks were taken from here.
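
A Boltzmann policy picks actions at random, with probabilities given by a softmax over the predicted Q-values, so higher-valued actions are chosen more often while others are still explored. The following is a minimal pure-Python sketch of that idea, not the library's implementation: the function names, the `rng` parameter, and the temperature default `tau=1.0` are assumptions made for illustration.

```python
import math
import random


def boltzmann_probabilities(q_values, tau=1.0):
    """Softmax over Q-values with temperature tau (assumed default of 1.0)."""
    if tau <= 0:
        raise ValueError("tau must be positive")
    scaled = [q / tau for q in q_values]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scaled)
    weights = [math.exp(s - top) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def select_action(q_values, tau=1.0, rng=random):
    """Sample an action index in proportion to its Boltzmann probability."""
    probs = boltzmann_probabilities(q_values, tau)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Equal Q-values give equal probabilities. A larger `tau` flattens the distribution (more exploration), and a smaller `tau` makes the choice closer to greedy.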