gym_gz.base#

gym_gz.base.runtime#

class gym_gz.base.runtime.Runtime(task, agent_rate)#

Bases: Env, ABC

Base class for defining executors of Task objects.

Task classes are meant to be generic and are not tied to any specific runtime. A runtime implementation contains all the logic that defines how the task is executed, which allows the same Task class to run on different simulators or in a real-time setting.

Runtimes are the real gym.Env objects returned to the users when they call the gym.make factory method.

Parameters:
  • task (Task) – the Task object to handle.

  • agent_rate (float) – the rate at which the environment will be called. Sometimes tasks need to know this information.

Example

Here is a minimal example of how the Runtime, gym.Env and Task could be integrated:

class FooRuntime(Runtime):

    def __init__(self, task, agent_rate):
        # agent_rate must be a constructor argument so it can be forwarded.
        super().__init__(task=task, agent_rate=agent_rate)
        self.action_space, self.observation_space = self.task.create_spaces()

    def reset(self, seed=None, options=None, **kwargs):
        # options defaults to None to avoid a shared mutable default dict.
        self.task.reset_task()
        return self.task.get_observation(), self.task.get_info()

    def step(self, action):
        self.task.set_action(action)

        # [...] code that performs the real step [...]

        terminated = self.task.is_terminated()
        truncated = self.task.is_truncated()
        reward = self.task.get_reward()
        observation = self.task.get_observation()

        # Return the task info, consistently with reset().
        return observation, reward, terminated, truncated, self.task.get_info()

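    def timestamp(self):
        # Abstract in Runtime: a concrete runtime must return its current time.
        raise NotImplementedError

    def close(self):
        pass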

Note

Runtimes can handle only one Task object.

agent_rate#

Rate of environment execution.

task: Task#

Task handled by the runtime.

abstract timestamp()#

Return the timestamp associated with the execution of the environment.

In real-time environments, the timestamp is the time read from the host system. In simulated environments, the timestamp is the simulated time, which might not match real time when the real-time factor differs from 1.

Return type:

float

Returns:

The current environment timestamp.
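The two timestamp sources described above can be sketched as follows. This is only an illustration that does not use gym_gz: `RealTimeClock` and `SimulatedClock` are hypothetical helper names, and the fixed `step_size` is an assumption about how a simulator advances its time.

```python
import time


class RealTimeClock:
    """Hypothetical helper: the timestamp is read from the host system."""

    def timestamp(self) -> float:
        # Monotonic host time in seconds, unaffected by wall-clock adjustments.
        return time.monotonic()


class SimulatedClock:
    """Hypothetical helper: the timestamp is derived from simulated steps."""

    def __init__(self, step_size: float):
        self.step_size = step_size  # simulated seconds per step (assumed fixed)
        self.steps = 0

    def step(self) -> None:
        self.steps += 1

    def timestamp(self) -> float:
        # Simulated time depends only on the number of steps taken, so it
        # diverges from host time whenever the real-time factor is not 1.
        return self.steps * self.step_size


sim_clock = SimulatedClock(step_size=0.001)
for _ in range(500):
    sim_clock.step()

sim_time = sim_clock.timestamp()  # 500 steps of 1 ms of simulated time
host_time = RealTimeClock().timestamp()
```

The simulated clock reports the same value no matter how long the loop took on the host, which is the mismatch the description refers to.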

gym_gz.base.task#

class gym_gz.base.task.Task(agent_rate)#

Bases: ABC

Interface to define a decision-making task.

The Task is the central interface of each environment implementation. It defines the logic of the environment in a format that is agnostic of both the runtime (either simulated or real-time) and the models it operates on.

Runtime instances are the real objects returned to the users when they call gym.make. Depending on the type of the runtime, it could contain one or more Task objects. The Runtime is a relay class that calls the logic of the Task from its interface methods and implements the real gym.Env.step(). Simulated runtimes step the physics engine, whereas real-time runtimes enforce real-time execution.

A Task object is meant to be:

  • Independent from the selected Runtime. In fact, it defines only the decision-making logic;

  • Independent from the Model objects it operates on. This is achieved thanks to the model abstraction provided by scenario::core::Model.

Populating the world in which the task operates is delegated to a gym.Wrapper object that acts as an environment randomizer.
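The independence of a Task from its Runtime can be illustrated without gym_gz installed. In the sketch below every name is hypothetical: `CounterTask` only mimics the method names of the Task interface, `CounterWorld` stands in for the world, and `run_episode` plays the role of a runtime's reset() and step() sequence.

```python
class CounterWorld:
    """Hypothetical stand-in for the world a task operates on."""

    def __init__(self):
        self.value = 0
        self.command = 0

    def step(self) -> None:
        # Stand-in for a physics step or one real-time control period.
        self.value += self.command


class CounterTask:
    """Hypothetical class that only mimics the Task method names."""

    def __init__(self, world: CounterWorld, target: int, max_steps: int):
        self.world = world
        self.target = target
        self.max_steps = max_steps
        self.steps = 0

    def reset_task(self) -> None:
        self.world.value = 0
        self.world.command = 0
        self.steps = 0

    def set_action(self, action: int) -> None:
        self.world.command = action
        self.steps += 1

    def get_observation(self) -> int:
        return self.world.value

    def get_reward(self) -> float:
        return -float(abs(self.target - self.world.value))

    def is_terminated(self) -> bool:
        return self.world.value == self.target

    def is_truncated(self) -> bool:
        return self.steps >= self.max_steps


def run_episode(task: CounterTask, world: CounterWorld, policy):
    """Relay loop standing in for Runtime.reset() followed by repeated step()."""
    task.reset_task()
    observation = task.get_observation()
    total_reward = 0.0
    while True:
        task.set_action(policy(observation))  # beginning of the step
        world.step()                          # the runtime-specific part
        observation = task.get_observation()  # end of the step
        total_reward += task.get_reward()
        if task.is_terminated() or task.is_truncated():
            return observation, total_reward


world = CounterWorld()
task = CounterTask(world, target=3, max_steps=10)
final_observation, total_reward = run_episode(task, world, policy=lambda obs: 1)
```

Only `world.step()` would change between a simulated and a real-time executor; the task methods are called in the same order either way.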

action_space: Space = None#

agent_rate#

Rate of the agent. It matches the rate at which the gym.Env methods are called.

abstract create_spaces()#

Create the action and observation spaces.

Note

This method does not currently have access to the models that are part of the environment. If the Task is meant to work with different models, we recommend extracting the information you need (e.g. number of DoFs, joint position limits, etc.) from their URDF / SDF description. Since actions and observations are often normalized, in many cases little information needs to be extracted from the model file.

Raises:

RuntimeError – In case of failure.

Return type:

Tuple[Space, Space]

Returns:

A tuple containing the action and observation spaces.
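As a rough sketch of the recommendation in the note above, joint position limits can be read from a URDF description with the Python standard library alone. The inline `URDF` string, the `joint_limits` helper, and the choice to skip fixed joints are assumptions made for illustration. Building a gym space from the limits is left out so the snippet stays dependency-free.

```python
import xml.etree.ElementTree as ET

# Tiny URDF fragment used only for this illustration.
URDF = """
<robot name="pendulum">
  <link name="base"/>
  <link name="arm"/>
  <link name="tip"/>
  <joint name="pivot" type="revolute">
    <parent link="base"/>
    <child link="arm"/>
    <limit lower="-1.57" upper="1.57" effort="10" velocity="5"/>
  </joint>
  <joint name="tip_mount" type="fixed">
    <parent link="arm"/>
    <child link="tip"/>
  </joint>
</robot>
"""


def joint_limits(urdf_text: str) -> dict:
    """Map each movable, limited joint name to its (lower, upper) limits."""
    limits = {}
    for joint in ET.fromstring(urdf_text.strip()).iter("joint"):
        limit = joint.find("limit")
        if joint.get("type") == "fixed" or limit is None:
            continue  # fixed joints add no degree of freedom
        limits[joint.get("name")] = (
            float(limit.get("lower", "0")),
            float(limit.get("upper", "0")),
        )
    return limits


limits = joint_limits(URDF)
num_dofs = len(limits)
```

The resulting bounds could then be used inside create_spaces() to size and, if desired, normalize the action and observation spaces.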

get_info()#

Return the info dictionary.

Return type:

Dict

Returns:

A dict with extra information about the task.

abstract get_observation()#

Return the task observation.

This method contains the logic for constructing the environment observation. It is called at the end of both the gym.Env.reset() and gym.Env.step() methods.

Raises:

RuntimeError – In case of failure.

Return type:

ndarray

Returns:

The task observation.

abstract get_reward()#

Return the task reward.

This method contains the logic for computing the environment reward. It is called at the end of the gym.Env.step() method.

Raises:

RuntimeError – In case of failure.

Return type:

float

Returns:

The scalar reward.

has_world()#

Check if the world was stored.

Return type:

bool

Returns:

True if the task has a valid world, False otherwise.

abstract is_terminated()#

Return the task termination flag.

This method contains the logic for deciding when the environment has terminated. After termination, the task should be reset through Task.reset_task() before any further call to Task.set_action().

It is called at the end of the gym.Env.step() method.

Raises:

RuntimeError – In case of failure.

Return type:

bool

Returns:

True if the environment terminated, False otherwise.

abstract is_truncated()#

Return the task truncation flag.

This method contains the logic for deciding when the environment has been truncated. After truncation, the task should be reset through Task.reset_task() before any further call to Task.set_action().

It is called at the end of the gym.Env.step() method.

Raises:

RuntimeError – In case of failure.

Return type:

bool

Returns:

True if the environment was truncated, False otherwise.
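The difference between the termination and truncation flags can be sketched as follows. It follows the usual Gymnasium convention that termination reflects the task's own end condition, while truncation reflects an external cutoff such as a time limit. The `EpisodeFlags` class, the position bound, and the step budget are hypothetical and not part of gym_gz.

```python
class EpisodeFlags:
    """Hypothetical helper tracking the two end-of-episode conditions."""

    def __init__(self, position_limit: float, max_steps: int):
        self.position_limit = position_limit
        self.max_steps = max_steps
        self.position = 0.0
        self.steps = 0

    def reset_task(self) -> None:
        self.position = 0.0
        self.steps = 0

    def advance(self, delta: float) -> None:
        self.position += delta
        self.steps += 1

    def is_terminated(self) -> bool:
        # End condition of the task itself: the position left the allowed range.
        return abs(self.position) > self.position_limit

    def is_truncated(self) -> bool:
        # External cutoff: the step budget ran out.
        return self.steps >= self.max_steps


flags = EpisodeFlags(position_limit=1.0, max_steps=5)

# Five small moves stay within bounds but exhaust the step budget.
for _ in range(5):
    flags.advance(0.1)
after_time_limit = (flags.is_terminated(), flags.is_truncated())

# After a reset, a single large move leaves the allowed range immediately.
flags.reset_task()
flags.advance(2.0)
after_out_of_bounds = (flags.is_terminated(), flags.is_truncated())
```

In both cases the episode is over and the task has to be reset before another action is set.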

np_random: RandomState#

RNG available to the object to ensure reproducibility. Use it for every source of randomness in the task.

observation_space: Space = None#

abstract reset_task()#

Reset the task.

This method contains the logic for resetting the task. It is called in the gym.Env.reset() method of the corresponding environment.

Raises:

RuntimeError – In case of failure.

Return type:

None

seed: int#

The seed of the task.

seed_task(seed=None)#

Seed the task.

This method configures the Task.np_random RNG.

Parameters:

seed (Optional[int]) – The seed number.

Return type:

list[int]

Returns:

The list of seeds used by the task.
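The reproducibility that seeding is meant to provide can be shown with a dependency-free stand-in. The sketch below uses Python's `random.Random` instead of the NumPy generator that the Task stores in np_random, and `SeededTask` with its `sample_initial_angle` method is hypothetical. How gym_gz picks a seed when `None` is passed is not documented here, so that branch is an assumption.

```python
import random
from typing import List, Optional


class SeededTask:
    """Hypothetical stand-in that mimics seed_task() with the stdlib RNG."""

    def __init__(self):
        self.rng = random.Random()
        self.seed: Optional[int] = None

    def seed_task(self, seed: Optional[int] = None) -> List[int]:
        if seed is None:
            # Assumption: draw a fresh seed from the operating system.
            seed = random.SystemRandom().randrange(2**32)
        self.seed = seed
        self.rng = random.Random(seed)
        return [seed]

    def sample_initial_angle(self) -> float:
        # All randomness goes through the task RNG, never the global one.
        return self.rng.uniform(-0.1, 0.1)


task_a, task_b = SeededTask(), SeededTask()
seeds_a = task_a.seed_task(42)
seeds_b = task_b.seed_task(42)

samples_a = [task_a.sample_initial_angle() for _ in range(3)]
samples_b = [task_b.sample_initial_angle() for _ in range(3)]
```

Two tasks seeded with the same value draw identical sequences, which is why every random quantity should come from the task's own generator.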

abstract set_action(action)#

Set the task action.

This method contains the logic for setting the environment action. It is called at the beginning of the gym.Env.step() method.

Parameters:

action (ndarray | number) – The action to set.

Raises:

RuntimeError – In case of failure.

Return type:

None

property world: World#

Get the world where the task is operating.

Returns:

The world object.