Introduction to Reinforcement Learning
Gymnasium Basics
Gymnasium is an open-source toolkit designed for developing and evaluating reinforcement learning (RL) agents. It provides a collection of standard environments for testing algorithms and training agents efficiently.
Key Features
Standardized API: ensures compatibility across different environments;
Variety of environments: supports classic control problems, Atari games, and robotics simulations;
Easy integration: compatible with deep learning frameworks like TensorFlow and PyTorch.
Workflow
A typical workflow in Gymnasium looks like this:
1. Import the Library
After the original gym library was discontinued, it is now recommended to use gymnasium, a well-maintained and actively developed fork of gym. Despite the change in name, the library is still commonly imported using the alias gym for backward compatibility and convenience.
2. Create an Environment
The gym.make() function instantiates an environment using its unique identifier (e.g., "CartPole-v1"). You can also pass additional configuration parameters depending on the environment's requirements.
3. Reset the Environment
Before interacting with the environment, you must reset it to its initial state using env.reset(). This returns:
observation: the initial observation, which describes the environment's starting state;
info: auxiliary data that may include metadata or state-specific configuration.
4. Interact with the Environment
First, a random action is sampled from the action space using env.action_space.sample(). The action space defines the set of all possible actions the agent can take in the environment. Additionally, the environment provides the observation space, which can be accessed via env.observation_space and represents the set of all possible observations (states) the agent can encounter.
Next, the chosen action is passed to env.step(action), which executes the action and returns the following:
observation: the agent's new state after taking the action;
reward: the reward received for the action taken;
terminated: a boolean indicating whether the episode reached a terminal state (for example, the task was completed or failed);
truncated: a boolean indicating whether the episode was stopped prematurely (due to a time limit or other constraints);
info: additional diagnostic information, often used for debugging or logging purposes.
5. Close the Environment
If your environment consumes external resources (e.g., rendering windows or simulations), you should close it using env.close().
To learn more about the features Gymnasium provides, visit the official Gymnasium website.