MuJoCo backend (experimental)
Available (experimental)
The MuJoCo simulation backend is available but experimental. It is usable today (reach, push, and pick-and-place envs run on it), but it is not yet in a stable release, and its APIs, defaults, and structure may still change. The default, supported backend remains Gazebo (see Creating a simulation environment). For install steps, see Trying it below.
MultiROS is gaining a MuJoCo simulation backend (via mujoco_ros_pkgs) as a drop-in alternative to Gazebo. The goal is architectural parity: the MuJoCo tooling mirrors the Gazebo tooling so that a robot/task environment is built the same way regardless of which simulator runs underneath. If you know how to build a Gazebo env (Creating a simulation environment), the MuJoCo workflow should feel familiar — only the simulator-specific pieces differ.
How it maps onto the Gazebo design
Each Gazebo utility/base class has a MuJoCo sibling with the same role:
Gazebo |
MuJoCo |
Role |
|---|---|---|
|
|
Launch the server, pause/unpause, reset, step. |
|
|
Robot bring-up, scene management. |
|
|
Real-time factor, gravity, timestep. |
|
Standard env base class. |
|
|
Goal-conditioned (HER) base class. |
The shared, simulator-agnostic utilities (ros_common,
ros_controllers, ros_kinematics, ros_markers, the
gym-proxy and the SB3 wrappers) are used unchanged. As with Gazebo,
a robot/task package only carries what is genuinely robot- or
task-specific: the URDF, the MJCF scene, controller/plugin configs,
and the action/observation/reward logic.
What differs from Gazebo
MuJoCo is not Gazebo, so a few backend-specific points are worth knowing when porting an env:
Scene model. Gazebo spawns models into a running world; MuJoCo loads a single MJCF scene at server start. The robot geometry comes from that MJCF;
mujoco_modelsbrings up the ROS interfaces (robot description,robot_state_publisher, controllers) on top.Control via
mujoco_ros_control. Joints are driven through the usualros_controlinterfaces, so the trajectory-controller API (/<ns>/arm_controller/command) matches the Gazebo setup.Transmission filtering.
mujoco_ros_controlaborts if the URDF declares a<transmission>for a joint that is not in the MJCF (e.g. a gripper or mimic fingers absent from the model). The backend strips those automatically — a robot env passescontrolled_jointsto the base class, and the launch helperscripts/mujoco_filtered_description.pydoes the same for launch files. The manufacturer URDF is reused unmodified.Stepping regimes.
MujocoBaseEnvsupports the same real-time loop as Gazebo (physics never pauses, a timer refreshes observations) and a paused-MDP loop, plus a deterministic fast-step mode (sim_step_mode=2) that advances the simulation explicitly with no wall-clock sleep, for training faster than real time.
Trying it
Install the backend with the MultiROS installer’s opt-in flag (it
fetches a prebuilt MuJoCo and clones mujoco_ros_pkgs plus the
example vx300s_mujoco_envs):
./install_uniros_stack.sh -m
Or, for the Docker image, bake the backend in at build time:
cd UniROS/docker && ./build.sh --mujoco
A worked example lives in the standalone package vx300s_mujoco_envs, which validates the backend on the Trossen VX300S reach, push, and pick-and-place tasks. It mirrors the Gazebo VX300S envs but is built entirely from the MuJoCo tooling. The one-command workflow matches the Gazebo envs:
# self-launches roscore + the MuJoCo server + controllers
rosrun vx300s_mujoco_envs vx300s_mujoco_reach_test.py
or in Python:
import uniros as gym
# Importing the task module registers its env id with gymnasium.
from vx300s_mujoco_envs.task_envs.reach import vx300s_mujoco_reach # noqa: F401
env = gym.make("VX300SMujocoReacherSim-v0")
obs, info = env.reset(seed=0)
for _ in range(50):
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
if terminated or truncated:
obs, info = env.reset()
env.close()
Training reuses the same sb3_ros_support wrappers as the Gazebo
envs (SAC, TD3, …), with optional TensorBoard and Weights & Biases
monitoring.
Status and limitations
Not yet in a stable release; available on an experimental MultiROS branch (see install steps above).
Implemented: VX300S reach, push, and pick-and-place, each with a goal-conditioned (HER) variant.
Additional robots/tasks may follow.
APIs may change before this lands in a release.
Differences from the Gazebo backend
The MuJoCo and Gazebo backends are deliberately implemented as
sibling sub-systems (mirroring the way RealROS sits next to
MultiROS). They share the public Task/Robot env hooks
(_set_action, _get_observation, compute_reward,
_compute_terminated, _compute_truncated,
_set_init_params, _check_connection_and_readiness), so a
Task env that only overrides those ports unchanged between
backends. The differences below are at the utility layer
underneath, where each backend talks to its own simulator. They
mirror real differences in what each simulator offers, not parity
gaps to be papered over.
Object spawn / remove.
Gazebo exposes module-level free functions
(gazebo_models.spawn_sdf_model_gazebo,
gazebo_models.remove_model_gazebo) backed by stateless service
calls. MuJoCo has no runtime spawn / delete service; the only way to
change scene topology is to regenerate the MJCF and call
~reload. The framework wraps this in a stateful
MujocoSceneManager class
(multiros.utils.mujoco_models.MujocoSceneManager): construct it
with your base MJCF scene and call spawn_primitive,
spawn_model, and remove_model on the instance. Each call
re-initialises the whole simulation, so it is suited to per-episode
scene changes rather than per-step updates. To reposition an object
that is already in the scene (declared with a free joint), use
mujoco_set_body_state — much cheaper than a reload.
Model state vs body state.
Gazebo’s accessors are named gazebo_get_model_state /
gazebo_set_model_state because the Gazebo service operates on
named models with a configurable reference frame. MuJoCo’s
underlying service operates on bodies (with the constraint that
set_pose only applies to bodies whose parent joint is a free
joint), so the framework’s accessors are
mujoco_get_body_state / mujoco_set_body_state. The names
match the simulator vocabulary; do not assume one maps to the
other transparently.
Concurrency / per-instance isolation.
Gazebo’s parallelism uses distinct ROS_MASTER_URI and
GAZEBO_MASTER_URI ports per instance. MuJoCo’s parallelism
uses a single ROS_MASTER_URI per instance plus a server_name
graph identifier (mujoco_server by default) — the MuJoCo server
has no master-URI concept. Code that threads gazebo_port to
identify a Gazebo instance has to thread server_name for
MuJoCo.
No model enumeration service.
Gazebo’s /gazebo/get_world_properties returns the list of model
names in the world; the framework uses this for spawn / remove
verify loops. mujoco_ros_pkgs exposes no equivalent service —
the scene topology is fixed in the MJCF and known to the env code.
Query a body’s state by name with mujoco_get_body_state; do not
attempt to enumerate the scene.
Robot placement.
Gazebo’s spawn_robot_in_gazebo accepts the robot’s spawn pose
as pos_x/y/z + ori_w/x/y/z kwargs because the robot is
spawned at runtime. MuJoCo loads the robot from the MJCF, so the
robot’s pose lives in the MJCF <worldbody>. The
spawn_robot_in_mujoco signature does not take placement kwargs;
edit the MJCF to change the pose.
Reset semantics.
Gazebo distinguishes reset_world (data only, time preserved)
and reset_simulation (data + time) via the reset_mode
kwarg. MuJoCo has a single ~reset that calls mj_resetData
and re-applies the configured initial joint states — there is no
world / simulation distinction. The MuJoCo backend therefore has no
reset_mode kwarg; pass nothing.