gym_gz.base#
gym_gz.base.runtime#
- class gym_gz.base.runtime.Runtime(task, agent_rate)#
Bases: Env, ABC
Base class for defining executors of Task objects.
Task classes are supposed to be generic and are not tied to any specific runtime. Implementations of a runtime class contain all the logic that defines how to execute the task, which allows the same Task class to run on different simulators or in a real-time setting.
Runtimes are the real gym.Env objects returned to users when they call the gym.make factory method.
- Parameters:
Example
Here is a minimal example of how the Runtime, gym.Env and Task could be integrated:

    class FooRuntime(Runtime):

        def __init__(self, task, agent_rate):
            # agent_rate must be a constructor argument to be forwarded here
            super().__init__(task=task, agent_rate=agent_rate)
            self.action_space, self.observation_space = self.task.create_spaces()

        def reset(self, seed=None, options=None, **kwargs):
            self.task.reset_task()
            return self.task.get_observation(), self.task.get_info()

        def step(self, action):
            self.task.set_action(action)

            # [...] code that performs the real step [...]

            terminated = self.task.is_terminated()
            truncated = self.task.is_truncated()
            reward = self.task.get_reward()
            observation = self.task.get_observation()

            return observation, reward, terminated, truncated, {}

        def close(self):
            pass
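The call order used in the example can be checked without a simulator. The sketch below is not gym_gz code: `RecordingTask`, `RelayRuntime` and `run_episode` are hypothetical stand-ins that mimic only the method names documented here, and the "real step" is replaced by a simple counter.

```python
class RecordingTask:
    """Hypothetical stand-in for a Task; it logs every call it receives."""

    def __init__(self, max_steps=3):
        self.calls = []
        self.max_steps = max_steps
        self.steps = 0

    def reset_task(self):
        self.calls.append("reset_task")
        self.steps = 0

    def set_action(self, action):
        self.calls.append("set_action")

    def get_observation(self):
        self.calls.append("get_observation")
        return [float(self.steps)]

    def get_reward(self):
        self.calls.append("get_reward")
        return 1.0

    def is_terminated(self):
        self.calls.append("is_terminated")
        return False

    def is_truncated(self):
        self.calls.append("is_truncated")
        return self.steps >= self.max_steps

    def get_info(self):
        return {}


class RelayRuntime:
    """Hypothetical runtime that mirrors the structure of FooRuntime."""

    def __init__(self, task):
        self.task = task

    def reset(self):
        self.task.reset_task()
        return self.task.get_observation(), self.task.get_info()

    def step(self, action):
        self.task.set_action(action)
        self.task.steps += 1  # placeholder for the real step
        terminated = self.task.is_terminated()
        truncated = self.task.is_truncated()
        reward = self.task.get_reward()
        observation = self.task.get_observation()
        return observation, reward, terminated, truncated, {}


def run_episode(runtime, action=0.0):
    """Step until the episode ends and return the accumulated reward."""
    runtime.reset()
    total, done = 0.0, False
    while not done:
        _, reward, terminated, truncated, _ = runtime.step(action)
        total += reward
        done = terminated or truncated
    return total


task = RecordingTask(max_steps=3)
episode_return = run_episode(RelayRuntime(task))
```

Inspecting `task.calls` afterwards shows `set_action` at the start of every step and the termination, truncation, reward and observation queries at the end.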
Note
Runtimes can handle only one Task object.
- agent_rate#
Rate of environment execution.
- task: Task#
Task handled by the runtime.
- abstract timestamp()#
Return the timestamp associated to the execution of the environment.
In real-time environments, the timestamp is the time read from the host system. In simulated environments, the timestamp is the simulated time, which might not match real time when the real-time factor differs from 1.
- Return type:
float
- Returns:
The current environment timestamp.
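As a rough illustration of the two cases, the sketch below contrasts a simulated clock with a real-time one. `SimulatedClock` and `RealTimeClock` are hypothetical helpers, not part of gym_gz, and the choice of `time.time()` as the host clock is an assumption.

```python
import time


class SimulatedClock:
    """Hypothetical clock for a simulated runtime.

    Time advances only when the physics is stepped, regardless of wall-clock time.
    """

    def __init__(self, physics_dt):
        self.physics_dt = physics_dt  # simulated seconds per physics step
        self.physics_steps = 0

    def step(self, n=1):
        self.physics_steps += n

    def timestamp(self):
        return self.physics_steps * self.physics_dt


class RealTimeClock:
    """Hypothetical clock for a real-time runtime: reads the host system time."""

    def timestamp(self):
        return time.time()


sim = SimulatedClock(physics_dt=0.001)
wall_start = time.monotonic()
sim.step(500)
wall_elapsed = time.monotonic() - wall_start

# Ratio of simulated to wall-clock time; guard against a zero denominator.
real_time_factor = sim.timestamp() / wall_elapsed if wall_elapsed > 0 else float("inf")
```

Because no physics is actually computed here, the simulated clock runs far ahead of the wall clock, which is the mismatch described above.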
gym_gz.base.task#
- class gym_gz.base.task.Task(agent_rate)#
Bases: ABC
Interface to define a decision-making task.
The Task is the central interface of each environment implementation. It defines the logic of the environment in a format that is agnostic of both the runtime (either simulated or real-time) and the models it operates on.
Runtime instances are the real objects returned to users when they call gym.make. Depending on the type of runtime, it could contain one or more Task objects. The Runtime is a relay class that calls the logic of the Task from its interface methods and implements the real gym.Env.step(). Simulated runtimes step the physics engine, whereas real-time runtimes enforce real-time execution.
A Task object is meant to be:
- Independent from the selected Runtime. In fact, it defines only the decision-making logic;
- Independent from the Model objects it operates on. This is achieved thanks to the model abstraction provided by scenario::core::Model.
The population of the world where the task operates is delegated to a gym.Wrapper object that acts as an environment randomizer.
- action_space: Space = None#
- agent_rate#
Rate of the agent. It matches the rate at which the gym.Env methods are called.
- abstract create_spaces()#
Create the action and observation spaces.
Note
This method does not currently have access to the Model objects that are part of the environment. If the Task is meant to work on different models, we recommend using their URDF / SDF description to extract the information you need (e.g. number of DoFs, joint position limits, etc.). Since actions and observations are often normalized, in many cases there is no need to extract much information from the model file.
- Raises:
RuntimeError – In case of failure.
- Return type:
Tuple[Space, Space]
- Returns:
A tuple containing the action and observation spaces.
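The note above suggests reading DoFs and joint position limits from the model file. A minimal sketch of that step, using only the standard library and an inline URDF invented for illustration (the `URDF` string and the `joint_position_limits` helper are hypothetical, not part of gym_gz):

```python
import xml.etree.ElementTree as ET

# Hypothetical two-joint model, written inline to keep the example self-contained.
URDF = """<robot name="pendulum">
  <link name="base"/>
  <link name="pole"/>
  <link name="tip"/>
  <joint name="pivot" type="revolute">
    <parent link="base"/>
    <child link="pole"/>
    <limit lower="-1.57" upper="1.57" effort="10" velocity="5"/>
  </joint>
  <joint name="tip_mount" type="fixed">
    <parent link="pole"/>
    <child link="tip"/>
  </joint>
</robot>"""


def joint_position_limits(urdf_xml):
    """Return {joint name: (lower, upper)} for revolute and prismatic joints."""
    root = ET.fromstring(urdf_xml)
    limits = {}
    for joint in root.findall("joint"):
        if joint.get("type") not in ("revolute", "prismatic"):
            continue  # fixed joints add no DoF; continuous joints have no position limits
        limit = joint.find("limit")
        if limit is None:
            continue
        limits[joint.get("name")] = (float(limit.get("lower")), float(limit.get("upper")))
    return limits


limits = joint_position_limits(URDF)
limited_dofs = len(limits)
```

The resulting bounds could then be used as the low and high values of the spaces returned by create_spaces(), or only for normalization if the spaces themselves are fixed to a unit range.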
- get_info()#
Return the info dictionary.
- Return type:
Dict
- Returns:
A dict with extra information about the task.
- abstract get_observation()#
Return the task observation.
This method contains the logic for constructing the environment observation. It is called at the end of both the gym.Env.reset() and gym.Env.step() methods.
- Raises:
RuntimeError – In case of failure.
- Return type:
ndarray
- Returns:
The task observation.
- abstract get_reward()#
Return the task reward.
This method contains the logic for computing the environment reward. It is called at the end of the gym.Env.step() method.
- Raises:
RuntimeError – In case of failure.
- Return type:
float
- Returns:
The scalar reward.
- has_world()#
Check if the world was stored.
- Return type:
bool
- Returns:
True if the task has a valid world, False otherwise.
- abstract is_terminated()#
Return the task termination flag.
This method contains the logic for defining when the environment has terminated. Once it has, subsequent calls to Task.set_action() should be preceded by a task reset through Task.reset_task().
It is called at the end of the gym.Env.step() method.
- Raises:
RuntimeError – In case of failure.
- Return type:
bool
- Returns:
True if the environment terminated, False otherwise.
- abstract is_truncated()#
Return the task truncation flag.
This method contains the logic for defining when the environment has been truncated. Once it has, subsequent calls to Task.set_action() should be preceded by a task reset through Task.reset_task().
It is called at the end of the gym.Env.step() method.
- Raises:
RuntimeError – In case of failure.
- Return type:
bool
- Returns:
True if the environment truncated, False otherwise.
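The two flags are easy to confuse. The sketch below assumes the common Gym convention, which this page does not spell out: termination reflects an outcome of the task itself, while truncation reflects an external limit such as a step budget. The thresholds and the free functions are hypothetical and chosen only for illustration.

```python
import math

MAX_POLE_ANGLE = math.radians(50)  # hypothetical failure threshold
MAX_EPISODE_STEPS = 200            # hypothetical time limit


def is_terminated(pole_angle):
    """The task itself ended: the pole fell past the allowed angle."""
    return abs(pole_angle) > MAX_POLE_ANGLE


def is_truncated(steps_done):
    """The episode was cut short by a limit external to the task outcome."""
    return steps_done >= MAX_EPISODE_STEPS
```

In a Task subclass these would be methods reading the state from the model rather than functions taking it as an argument.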
- np_random: RandomState#
RNG available to the object to ensure reproducibility. Use it for all the random resources.
- observation_space: Space = None#
- abstract reset_task()#
Reset the task.
This method contains the logic for resetting the task. It is called in the gym.Env.reset() method of the corresponding environment.
- Raises:
RuntimeError – In case of failure.
- Return type:
None
- seed: int#
The seed of the task.
- seed_task(seed=None)#
Seed the task.
This method configures the Task.np_random RNG.
- Parameters:
seed (Optional[int]) – The seed number.
- Return type:
list[int]
- Returns:
The list of seeds used by the task.
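The point of seeding is that two tasks given the same seed draw the same random values. The sketch below illustrates this with the standard library only: `SeededTask` is a hypothetical stand-in, and `random.Random` replaces the NumPy RandomState that the real Task.np_random exposes.

```python
import random


class SeededTask:
    """Hypothetical stand-in; the real Task exposes a NumPy RandomState as np_random."""

    def __init__(self):
        self.seed = None
        self.rng = None

    def seed_task(self, seed=None):
        # Draw a seed from the OS when none is given (an assumption of this sketch).
        if seed is None:
            seed = random.SystemRandom().randrange(2**32)
        self.seed = seed
        self.rng = random.Random(seed)
        return [seed]

    def sample_initial_angle(self):
        # All randomness goes through the task's own RNG, never the global one.
        return self.rng.uniform(-0.1, 0.1)


a, b = SeededTask(), SeededTask()
a.seed_task(42)
b.seed_task(42)
```

After this, `a.sample_initial_angle()` and `b.sample_initial_angle()` return the same value, because both draw from identically seeded generators.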
- abstract set_action(action)#
Set the task action.
This method contains the logic for setting the environment action. It is called at the beginning of the gym.Env.step() method.
- Parameters:
action (ndarray | number) – The action to set.
- Raises:
RuntimeError – In case of failure.
- Return type:
None
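Since actions are often normalized, set_action() is a natural place to map them back to physical units. A minimal sketch, assuming a scalar action in [-1, 1] and a hypothetical force-controlled task (`denormalize`, `ForceTask` and `max_force` are illustrative names, not part of gym_gz):

```python
def denormalize(action, low, high):
    """Map an action from [-1, 1] to [low, high], clipping out-of-range inputs."""
    clipped = max(-1.0, min(1.0, float(action)))
    return low + (clipped + 1.0) * 0.5 * (high - low)


class ForceTask:
    """Hypothetical task fragment that stores the force to apply at the next step."""

    def __init__(self, max_force=50.0):
        self.max_force = max_force
        self.pending_force = 0.0

    def set_action(self, action):
        # Only record the command; the runtime performs the actual step afterwards.
        self.pending_force = denormalize(action, -self.max_force, self.max_force)
```

Keeping the method limited to recording the command matches its position at the beginning of gym.Env.step(), before the runtime advances the simulation or the real system.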