Environment templates
Both MultiROS and RealROS ship template files you can copy as the starting point for a new environment. Each follows the two-layer Robot env → Task env pattern that the framework expects.
The two-layer pattern
Reinforcement-learning environments are split into two responsibilities:
Robot env — the layer that knows about the robot.
What controllers it exposes (joint position, joint velocity, end-effector pose).
How to read its joint states / sensors.
How to wait for it to be ready.
This layer is reusable across tasks: the same robot env can back a reach task, a push task, a pick-and-place task.
Task env — the layer that knows about the task.
The gymnasium observation_space and action_space.
The reward function.
When an episode terminates.
What “reset to a sensible starting state” means for this task.
It composes the robot env (via inheritance) and adds task-specific behaviour on top.
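The split above can be sketched in plain Python. This is only an illustration of the layering, not the framework's base classes: the names SketchRobotEnv and SketchReachTaskEnv, the hook names compute_reward and is_done, and the two-joint state are all assumptions made for the example, and no ROS or gymnasium calls are involved.

```python
from abc import ABC, abstractmethod


class SketchRobotEnv(ABC):
    """Robot layer (hypothetical): knows the robot, not the task."""

    def __init__(self):
        # Stand-in for joint state a real robot env would read from ROS.
        self.joint_positions = [0.0, 0.0]

    def move_joints(self, targets):
        # Stand-in for a joint-position controller.
        self.joint_positions = list(targets)

    def read_joints(self):
        return list(self.joint_positions)

    # Hooks the task layer must supply (hypothetical names).
    @abstractmethod
    def compute_reward(self, observation): ...

    @abstractmethod
    def is_done(self, observation): ...

    def step(self, action):
        self.move_joints(action)
        obs = self.read_joints()
        return obs, self.compute_reward(obs), self.is_done(obs)


class SketchReachTaskEnv(SketchRobotEnv):
    """Task layer (hypothetical): reward, termination and reset."""

    def __init__(self, goal=(0.5, -0.5), tolerance=0.05):
        super().__init__()
        self.goal = list(goal)
        self.tolerance = tolerance

    def _distance(self, obs):
        return sum((o - g) ** 2 for o, g in zip(obs, self.goal)) ** 0.5

    def compute_reward(self, observation):
        # Negative distance to the goal: closer is better.
        return -self._distance(observation)

    def is_done(self, observation):
        return self._distance(observation) <= self.tolerance

    def reset(self):
        # "Sensible starting state" for this task: all joints at zero.
        self.move_joints([0.0, 0.0])
        return self.read_joints()


env = SketchReachTaskEnv()
start = env.reset()
obs, reward, done = env.step([0.5, -0.5])
far_obs, far_reward, far_done = env.step([0.0, 0.0])
```

A push or pick-and-place task would subclass the same robot layer and change only the reward, termination and reset logic.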
Base classes that wire this together live in the framework:
Template files
MultiROS (simulation)
Located at multiros/src/multiros/templates/:
robot_envs/MyRobotEnv.py: non-goal robot env template.
robot_envs/MyRobotGoalEnv.py: goal-conditioned robot env.
task_envs/MyTaskEnv.py: non-goal task env.
task_envs/MyTaskGoalEnv.py: goal-conditioned task env (use with HER algorithms via sb3_ros_support).
RealROS (real hardware)
Located at realros/src/realros/templates/:
robot_envs/MyRealRobotEnv.py
robot_envs/MyRealRobotGoalEnv.py
task_envs/MyRealTaskEnv.py
task_envs/MyRealTaskGoalEnv.py
Recommended workflow
Pick the layer. Start from a robot env if you’re adding a new robot; start from a task env if you’re adding a new task on an existing robot.
Copy, don’t symlink. Templates evolve over time; you want a snapshot you can edit freely.
Rename the class. Match the file name to keep imports tidy.
Fill in the abstract methods. The template comments mark the methods you must implement. Each base class documents what each hook is expected to do.
Register with gymnasium. In your package’s __init__.py:
    from gymnasium.envs.registration import register

    register(
        id="MyTask-v0",
        entry_point="my_pkg.task_envs.MyTaskEnv:MyTask",
        max_episode_steps=1000,
    )
Test under uniros. Use import uniros as gym for the multiprocessing proxy so each env runs in its own worker.
For concrete examples of the pattern in production, look under
rl_environments/src/rl_environments/rx200/ — the working RX200
reach environment is built from the same templates.
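The copy and rename steps of the workflow above could be scripted along these lines. This is a minimal sketch using only the standard library. The helper copy_template, the destination package my_pkg, the new name ReachEnv and the one-line placeholder template body are all hypothetical; the real templates contain more than a class statement and may need further manual edits.

```python
import re
import shutil
import tempfile
from pathlib import Path


def copy_template(template: Path, destination: Path,
                  old_class: str, new_class: str) -> Path:
    """Copy a template file and rename its class in the copy (hypothetical helper)."""
    destination.parent.mkdir(parents=True, exist_ok=True)
    # copyfile writes an independent file, so later template changes
    # never reach the copy (unlike a symlink).
    shutil.copyfile(template, destination)
    source = destination.read_text()
    # Whole-word match so renaming "MyTask" leaves "MyTaskGoal" alone.
    renamed = re.sub(rf"\b{re.escape(old_class)}\b", new_class, source)
    destination.write_text(renamed)
    return destination


# Demo in a throwaway directory with a placeholder template body.
with tempfile.TemporaryDirectory() as tmp:
    template = Path(tmp) / "templates" / "task_envs" / "MyTaskEnv.py"
    template.parent.mkdir(parents=True)
    template.write_text("class MyTaskEnv:\n    pass\n")

    copied = copy_template(
        template,
        Path(tmp) / "my_pkg" / "task_envs" / "ReachEnv.py",
        old_class="MyTaskEnv",
        new_class="ReachEnv",
    )
    copied_text = copied.read_text()
    is_symlink = copied.is_symlink()
    template_text = template.read_text()
```

The template itself is left unmodified, so it can be reused as the starting point for the next environment.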
Common pitfalls
Quaternion default. The base envs default robot_ori_w=1.0 (identity orientation). Don’t pass 0.0: that’s a zero quaternion, which Gazebo will normalise but which is non-portable across physics engines.
Seeded RNG. Use self.np_random (populated by super().reset(seed=seed)) for any random sampling. Don’t use the global np.random: it is not seedable from the env API and breaks reproducibility.
MoveIt timeouts. rospy.wait_for_service calls in the framework already have 30 s timeouts (see uniros.utils.ros_controllers). If you add your own service calls in a task env, follow the same pattern.
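The seeded-RNG pitfall can be illustrated without ROS or gymnasium installed. In the sketch below, SeededSketchEnv is a hypothetical stand-in that imitates how gymnasium.Env.reset(seed=...) populates self.np_random; the sample_goal method and the goal range are assumptions for the example. In a real task env you would call super().reset(seed=seed) rather than creating the generator yourself.

```python
import numpy as np


class SeededSketchEnv:
    """Hypothetical stand-in for an env whose reset() seeds self.np_random."""

    def __init__(self):
        # Unseeded generator until reset(seed=...) is called.
        self.np_random = np.random.default_rng()

    def reset(self, seed=None):
        # Imitates gymnasium: only re-create the generator when a seed is given.
        if seed is not None:
            self.np_random = np.random.default_rng(seed)
        return self.sample_goal()

    def sample_goal(self):
        # All sampling goes through the env's own generator,
        # never through the global np.random functions.
        return self.np_random.uniform(low=-0.5, high=0.5, size=3)


env_a, env_b = SeededSketchEnv(), SeededSketchEnv()
goal_a = env_a.reset(seed=42)
goal_b = env_b.reset(seed=42)

# Two envs reset with the same seed draw the same goal.
same_goal = bool(np.array_equal(goal_a, goal_b))
```

Because each env owns its generator, parallel workers seeded differently do not interfere with one another, which global np.random state cannot guarantee.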