Gymnasium Basics | RL Core Theory
Introduction to Reinforcement Learning
Gymnasium Basics

Gymnasium is an open-source toolkit designed for developing and evaluating reinforcement learning (RL) agents. It provides a collection of standard environments for testing algorithms and training agents efficiently.

Key Features

  • Standardized API: ensures compatibility across different environments;
  • Variety of environments: supports classic control problems, Atari games, and robotics simulations;
  • Easy integration: compatible with deep learning frameworks like TensorFlow and PyTorch.

Workflow

A typical workflow in Gymnasium looks like this:

1. Importing a Library

```python
import gymnasium as gym
```

gym is a common alias for this library.

2. Creating an Environment

```python
# "CartPole-v1" is one of the classic control environments
env = gym.make("CartPole-v1")
```

gym.make() takes an environment ID or specification and can accept additional keyword arguments that further configure the environment.

3. Resetting the Environment

```python
observation, info = env.reset()
```

env.reset() must be called before taking the first step. It resets the environment to its initial state and returns the initial observation along with an info dictionary.

4. Taking an Action

```python
action = env.action_space.sample()
observation, reward, terminated, truncated, info = env.step(action)
```

In the first line, a random action is sampled from the action space using env.action_space.sample(). The action space defines the set of all possible actions the agent can take in the environment. The environment also exposes an observation space, accessible via env.observation_space, which represents the set of all possible observations (states) the agent can encounter.

In the second line, the chosen action is passed to env.step(action), which executes the action and returns the following:

  • observation: the agent's new state after taking the action;
  • reward: the reward received for the action taken;
  • terminated: a boolean indicating whether the episode has ended (i.e., the task is complete);
  • truncated: a boolean indicating whether the episode was prematurely stopped (due to time or other constraints);
  • info: additional diagnostic information, often used for debugging or logging purposes.

5. Closing the Environment

```python
env.close()
```

Use env.close() when the environment is no longer needed, so that any resources it holds are released.

Section 1. Chapter 7