MuJoCo backend (experimental) ============================= .. admonition:: Available (experimental) :class: warning The MuJoCo simulation backend is **available but experimental**. It is usable today (reach, push, and pick-and-place envs run on it), but it is not yet in a stable release, and its APIs, defaults, and structure may still change. The default, supported backend remains **Gazebo** (see :doc:`env_creation_sim`). For install steps, see :ref:`mujoco-install-steps` below. MultiROS is gaining a `MuJoCo `_ simulation backend (via `mujoco_ros_pkgs `_) as a drop-in alternative to Gazebo. The goal is **architectural parity**: the MuJoCo tooling mirrors the Gazebo tooling so that a robot/task environment is built the same way regardless of which simulator runs underneath. If you know how to build a Gazebo env (:doc:`env_creation_sim`), the MuJoCo workflow should feel familiar — only the simulator-specific pieces differ. How it maps onto the Gazebo design ---------------------------------- Each Gazebo utility/base class has a MuJoCo sibling with the same role: .. list-table:: :widths: 40 40 20 :header-rows: 1 * - Gazebo - MuJoCo - Role * - ``multiros.utils.gazebo_core`` - ``multiros.utils.mujoco_core`` - Launch the server, pause/unpause, reset, step. * - ``multiros.utils.gazebo_models`` - ``multiros.utils.mujoco_models`` - Robot bring-up, scene management. * - ``multiros.utils.gazebo_physics`` - ``multiros.utils.mujoco_physics`` - Real-time factor, gravity, timestep. * - :class:`multiros.envs.GazeboBaseEnv.GazeboBaseEnv` - ``multiros.envs.MujocoBaseEnv.MujocoBaseEnv`` - Standard env base class. * - :class:`multiros.envs.GazeboGoalEnv.GazeboGoalEnv` - ``multiros.envs.MujocoGoalEnv.MujocoGoalEnv`` - Goal-conditioned (HER) base class. The shared, simulator-agnostic utilities (``ros_common``, ``ros_controllers``, ``ros_kinematics``, ``ros_markers``, the gym-proxy and the SB3 wrappers) are used unchanged. As with Gazebo, a robot/task package only carries what is genuinely robot- or task-specific: the URDF, the MJCF scene, controller/plugin configs, and the action/observation/reward logic. What differs from Gazebo ------------------------ MuJoCo is not Gazebo, so a few backend-specific points are worth knowing when porting an env: * **Scene model.** Gazebo spawns models into a running world; MuJoCo loads a single MJCF scene at server start. The robot geometry comes from that MJCF; ``mujoco_models`` brings up the ROS interfaces (robot description, ``robot_state_publisher``, controllers) on top. * **Control via** ``mujoco_ros_control``. Joints are driven through the usual ``ros_control`` interfaces, so the trajectory-controller API (``//arm_controller/command``) matches the Gazebo setup. * **Transmission filtering.** ``mujoco_ros_control`` aborts if the URDF declares a ```` for a joint that is not in the MJCF (e.g. a gripper or mimic fingers absent from the model). The backend strips those automatically — a robot env passes ``controlled_joints`` to the base class, and the launch helper ``scripts/mujoco_filtered_description.py`` does the same for launch files. The manufacturer URDF is reused unmodified. * **Stepping regimes.** ``MujocoBaseEnv`` supports the same real-time loop as Gazebo (physics never pauses, a timer refreshes observations) and a paused-MDP loop, plus a deterministic fast-step mode (``sim_step_mode=2``) that advances the simulation explicitly with no wall-clock sleep, for training faster than real time. .. _mujoco-install-steps: Trying it --------- Install the backend with the MultiROS installer's opt-in flag (it fetches a prebuilt MuJoCo and clones ``mujoco_ros_pkgs`` plus the example ``vx300s_mujoco_envs``):: ./install_uniros_stack.sh -m Or, for the Docker image, bake the backend in at build time:: cd UniROS/docker && ./build.sh --mujoco A worked example lives in the standalone package `vx300s_mujoco_envs `_, which validates the backend on the Trossen VX300S reach, push, and pick-and-place tasks. It mirrors the Gazebo VX300S envs but is built entirely from the MuJoCo tooling. The one-command workflow matches the Gazebo envs:: # self-launches roscore + the MuJoCo server + controllers rosrun vx300s_mujoco_envs vx300s_mujoco_reach_test.py or in Python:: import uniros as gym # Importing the task module registers its env id with gymnasium. from vx300s_mujoco_envs.task_envs.reach import vx300s_mujoco_reach # noqa: F401 env = gym.make("VX300SMujocoReacherSim-v0") obs, info = env.reset(seed=0) for _ in range(50): obs, reward, terminated, truncated, info = env.step(env.action_space.sample()) if terminated or truncated: obs, info = env.reset() env.close() Training reuses the same ``sb3_ros_support`` wrappers as the Gazebo envs (SAC, TD3, …), with optional TensorBoard and Weights & Biases monitoring. Status and limitations ---------------------- * Not yet in a stable release; available on an experimental MultiROS branch (see install steps above). * Implemented: VX300S reach, push, and pick-and-place, each with a goal-conditioned (HER) variant. * Additional robots/tasks may follow. * APIs may change before this lands in a release. Differences from the Gazebo backend ----------------------------------- The MuJoCo and Gazebo backends are deliberately implemented as sibling sub-systems (mirroring the way RealROS sits next to MultiROS). They share the public Task/Robot env hooks (``_set_action``, ``_get_observation``, ``compute_reward``, ``_compute_terminated``, ``_compute_truncated``, ``_set_init_params``, ``_check_connection_and_readiness``), so a Task env that only overrides those ports unchanged between backends. The differences below are at the **utility layer** underneath, where each backend talks to its own simulator. They mirror real differences in what each simulator offers, not parity gaps to be papered over. **Object spawn / remove.** Gazebo exposes module-level free functions (``gazebo_models.spawn_sdf_model_gazebo``, ``gazebo_models.remove_model_gazebo``) backed by stateless service calls. MuJoCo has no runtime spawn / delete service; the only way to change scene topology is to regenerate the MJCF and call ``~reload``. The framework wraps this in a stateful ``MujocoSceneManager`` class (``multiros.utils.mujoco_models.MujocoSceneManager``): construct it with your base MJCF scene and call ``spawn_primitive``, ``spawn_model``, and ``remove_model`` on the instance. Each call re-initialises the whole simulation, so it is suited to per-episode scene changes rather than per-step updates. To reposition an object that is already in the scene (declared with a free joint), use ``mujoco_set_body_state`` — much cheaper than a reload. **Model state vs body state.** Gazebo's accessors are named ``gazebo_get_model_state`` / ``gazebo_set_model_state`` because the Gazebo service operates on named models with a configurable reference frame. MuJoCo's underlying service operates on bodies (with the constraint that ``set_pose`` only applies to bodies whose parent joint is a free joint), so the framework's accessors are ``mujoco_get_body_state`` / ``mujoco_set_body_state``. The names match the simulator vocabulary; do not assume one maps to the other transparently. **Concurrency / per-instance isolation.** Gazebo's parallelism uses distinct ``ROS_MASTER_URI`` *and* ``GAZEBO_MASTER_URI`` ports per instance. MuJoCo's parallelism uses a single ``ROS_MASTER_URI`` per instance plus a ``server_name`` graph identifier (``mujoco_server`` by default) — the MuJoCo server has no master-URI concept. Code that threads ``gazebo_port`` to identify a Gazebo instance has to thread ``server_name`` for MuJoCo. **No model enumeration service.** Gazebo's ``/gazebo/get_world_properties`` returns the list of model names in the world; the framework uses this for spawn / remove verify loops. ``mujoco_ros_pkgs`` exposes no equivalent service — the scene topology is fixed in the MJCF and known to the env code. Query a body's state by name with ``mujoco_get_body_state``; do not attempt to enumerate the scene. **Robot placement.** Gazebo's ``spawn_robot_in_gazebo`` accepts the robot's spawn pose as ``pos_x/y/z`` + ``ori_w/x/y/z`` kwargs because the robot is spawned at runtime. MuJoCo loads the robot from the MJCF, so the robot's pose lives in the MJCF ````. The ``spawn_robot_in_mujoco`` signature does not take placement kwargs; edit the MJCF to change the pose. **Reset semantics.** Gazebo distinguishes ``reset_world`` (data only, time preserved) and ``reset_simulation`` (data + time) via the ``reset_mode`` kwarg. MuJoCo has a single ``~reset`` that calls ``mj_resetData`` and re-applies the configured initial joint states — there is no world / simulation distinction. The MuJoCo backend therefore has no ``reset_mode`` kwarg; pass nothing.