Environment templates

Both MultiROS and RealROS ship template files you can copy as the starting point for a new environment. Each follows the two-layer Robot env → Task env pattern that the framework expects.

The two-layer pattern

Each reinforcement-learning environment is split into two layers, each with its own responsibility:

Robot env — the layer that knows about the robot.

What controllers it exposes (joint position, joint velocity, end-effector pose).
How to read its joint states / sensors.
How to wait for it to be ready.

This layer is reusable across tasks: the same robot env can back a reach task, a push task, a pick-and-place task.

Task env — the layer that knows about the task.

The gymnasium observation_space and action_space.
The reward function.
When an episode terminates.
What “reset to a sensible starting state” means for this task.

It builds on the robot env by inheriting from it and adds task-specific behaviour on top.
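As a rough illustration of the split, the sketch below uses plain Python with no ROS or gymnasium dependency. Every name in it (MyRobotEnv, MyReachTaskEnv, move_joints, read_joint_positions, wait_until_ready) is hypothetical and chosen for the example; the real hooks are the ones marked in the templates.

```python
"""Minimal sketch of the Robot env -> Task env split (no ROS, no gymnasium).

All class and method names are hypothetical; the real base classes and
hooks are defined by the framework templates.
"""
import math


class MyRobotEnv:
    """Robot layer: knows how to command the robot and read its state."""

    def __init__(self, n_joints=2):
        self.joint_positions = [0.0] * n_joints

    def wait_until_ready(self):
        # A real robot env would block on controllers and sensor topics here.
        return True

    def move_joints(self, targets):
        # Stand-in for a joint-position controller call.
        self.joint_positions = list(targets)

    def read_joint_positions(self):
        return list(self.joint_positions)


class MyReachTaskEnv(MyRobotEnv):
    """Task layer: adds reward, termination and reset on top of the robot."""

    def __init__(self, goal=(0.5, -0.5), tolerance=0.05):
        super().__init__(n_joints=len(goal))
        self.goal = list(goal)
        self.tolerance = tolerance

    def _distance_to_goal(self):
        return math.dist(self.read_joint_positions(), self.goal)

    def reset(self):
        self.wait_until_ready()
        self.move_joints([0.0] * len(self.goal))  # task-specific start state
        return self.read_joint_positions()

    def step(self, action):
        self.move_joints(action)
        distance = self._distance_to_goal()
        reward = -distance                      # closer to the goal is better
        terminated = distance < self.tolerance  # episode ends on success
        return self.read_joint_positions(), reward, terminated


env = MyReachTaskEnv()
start = env.reset()
obs, reward, terminated = env.step([0.5, -0.5])
```

Because the task class only calls the robot layer's methods, a second task (push, pick-and-place) could subclass the same MyRobotEnv and change only the goal, reward and termination logic.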

Base classes that wire this together live in the framework.

Template files

MultiROS (simulation)

Located at multiros/src/multiros/templates/:

robot_envs/MyRobotEnv.py — non-goal robot env template.
robot_envs/MyRobotGoalEnv.py — goal-conditioned robot env.
task_envs/MyTaskEnv.py — non-goal task env.
task_envs/MyTaskGoalEnv.py — goal-conditioned task env (use with HER algorithms via sb3_ros_support).

RealROS (real hardware)

Located at realros/src/realros/templates/:

robot_envs/MyRealRobotEnv.py
robot_envs/MyRealRobotGoalEnv.py
task_envs/MyRealTaskEnv.py
task_envs/MyRealTaskGoalEnv.py

Recommended workflow

Pick the layer. Start from a robot env if you’re adding a new robot; start from a task env if you’re adding a new task on an existing robot.
Copy, don’t symlink. Templates evolve over time; you want a snapshot you can edit freely.
Rename the class. Match the file name to keep imports tidy.
Fill in the abstract methods. The template comments mark the methods you must implement. Each base class documents what each hook is expected to do.

Register with gymnasium. In your package’s __init__.py:

from gymnasium.envs.registration import register

register(
    id="MyTask-v0",
    entry_point="my_pkg.task_envs.MyTaskEnv:MyTask",
    max_episode_steps=1000,
)

Test under uniros. Use import uniros as gym to get the multiprocessing proxy, so that each env runs in its own worker.

For concrete examples of the pattern in production, look under rl_environments/src/rl_environments/rx200/ — the working RX200 reach environment is built from the same templates.

Common pitfalls

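Quaternion default. The base envs default to robot_ori_w=1.0 (identity orientation). Don’t pass 0.0: if the x, y and z components are also zero, that gives an all-zero quaternion, which is not a valid rotation. Gazebo will normalise it, but that behaviour is not portable across physics engines.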
Seeded RNG. Use self.np_random (populated by super().reset(seed=seed)) for any random sampling. Don’t use the global np.random — that’s not seedable from the env API and breaks reproducibility.
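MoveIt timeouts. rospy.wait_for_service calls in the framework already have 30 s timeouts (see uniros.utils.ros_controllers). If you add your own service calls in a task env, follow the same pattern and pass an explicit timeout; without one, rospy.wait_for_service blocks indefinitely if the service never appears.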