VX300S — Push
=============

The arm must push a 4 cm cube across the cafe-table to a goal point on the
table top. The closed gripper acts as a flat paddle, so the gripper is not
part of the action vector. The achieved goal is the cube position.

Env IDs: ``VX300SPushSim-v0`` / ``VX300SPushGoalSim-v0`` /
``VX300SPushReal-v0`` / ``VX300SPushGoalReal-v0``.

Description
-----------

The VX300S is flush-mounted on the cafe-table (z = 0.78 m). At reset the arm
moves to its home pose, the gripper closes
(``init_close_gripper = [0.025, -0.025]`` m), and a red cube spawns on the
table top at z = 0.795 m. The agent commands joint-space or EE-space deltas;
contact with the closed gripper slides the cube toward the goal.

Action Space
------------

**Joint mode** (default). Box(6,): the same 6-joint command as :doc:`reach`
(waist / shoulder / elbow / forearm_roll / wrist_angle / wrist_rotate).
The gripper is NOT in the action vector for push.

**EE mode** (``ee_action_type=True``). Box(3,): Δ EE position.

Observation Space
-----------------

**Standard env** (``VX300SPushSim-v0`` / ``VX300SPushReal-v0``). Box.
Extends the reach observation with cube state. Index ranges are listed for
joint mode (6-D previous action).

.. list-table::
   :widths: 8 14 36 30 12
   :header-rows: 1

   * - Idx
     - Dim
     - Component
     - Source
     - Unit
   * - 0–2
     - 3
     - EE position
     - MoveIt FK
     - m
   * - 3–5
     - 3
     - EE rpy
     - MoveIt
     - rad
   * - 6–8
     - 3
     - Unit vector cube → goal
     - normalized
     - unitless
   * - 9
     - 1
     - Distance cube → goal
     - ‖goal − cube‖
     - m
   * - 10–18
     - 9
     - Current joint positions
     - ``/vx300s/joint_states``
     - rad / m
   * - 19–24
     - 6 (or 3)
     - Previous action
     - cached
     - matches action space
   * - 25–33
     - 9
     - Current joint velocities
     - ``/vx300s/joint_states``
     - rad/s / m/s
   * - 34–36
     - 3
     - Cube position (base frame)
     - Gazebo / ``/cube_pose``
     - m
   * - 37–39
     - 3
     - Cube rpy
     - same source
     - rad
   * - 40–42
     - 3
     - Cube linear velocity (finite-diff)
     - cached + dt
     - m/s
   * - 43–45
     - 3
     - Cube angular velocity
     - cached + dt
     - rad/s
   * - 46–48
     - 3
     - Cube relative to EE
     - cube − ee
     - m

**Goal env** (``VX300SPushGoalSim-v0`` / ``VX300SPushGoalReal-v0``). Dict.

* ``desired_goal`` = Box(3,) on the table top: ``x ∈ [0.20, 0.35]``,
  ``y ∈ [-0.15, 0.15]``, ``z ≈ 0.015``. Goals are sampled in the
  ``vx300s/base_link`` frame, so ``z`` is measured from the on-table base.
* ``achieved_goal`` = Box(3,) cube XYZ.

Rewards
-------

**Sparse**: ``0.0`` if ``‖cube − goal‖ < reach_tolerance``, else ``-1.0``.

**Dense**: the same reward components as :doc:`reach`, but distance is
measured from the **cube** to the goal (not from the EE to the goal).

Defaults come from ``config/vx300s_push_task_config.yaml``. They match the
reach defaults and add push-specific reset and shaping parameters.

Starting State
--------------

Joint pose: zeros (Interbotix URDF home; safe on-table mount). The gripper is
closed at ``init_close_gripper = [0.025, -0.025]`` m.

**Cube spawn.** 4 cm red cube at default
``cube_init_pos = [0.25, 0.0, 0.015]`` in the base frame
(= ``[0.45, 0.0, 0.795]`` in world). Randomised within the
``random_cube_spawn`` box if enabled.

**Goal sampling.** Push goal ∈ Box(3,) on the table top within
``position_goal_min/max``. ``goal_x_min`` was raised from 0.15 to 0.20
because the VX300S's longer reach pushes the near-base dead-zone further out
than the RX200's.

Episode End
-----------

**Truncation.** After ``max_episode_steps`` (default 100). The real env also
truncates on stale ``/joint_states``.

**Termination.** ``‖cube − goal‖ < reach_tolerance`` (sparse only).

Arguments
---------

Inherits the :doc:`reach` kwargs and adds push-specific ones:

* ``random_cube_spawn`` (bool)
* ``random_goal`` (bool)
* ``cube_pose_topic`` (real only, default ``/cube_pose``)
* ``cube_pose_timeout_s`` (real only, default 1.0 s)
* ``auto_launch_cube_tracker`` / ``cube_tracker_camera`` /
  ``cube_tracker_target_frame`` (real only)

Version History
---------------

* ``v0``: first release (``rl_environments`` v0.1.0). Closed-gripper paddle.
  ``goal_x_min=0.20`` (raised from 0.15 due to the VX300S near-base
  dead-zone).
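
Example Sketch
--------------

The sparse reward rule and the base-to-world relationship given for the cube
spawn can be illustrated with a minimal sketch. It is not the package's
implementation. The function names, the ``REACH_TOLERANCE`` value, and the
assumption that the base and world frames differ by a pure translation are
hypothetical; the offset is only inferred from the documented pair
``[0.25, 0.0, 0.015]`` (base) and ``[0.45, 0.0, 0.795]`` (world).

```python
import math

# Hypothetical value: the docs name reach_tolerance but do not state its default.
REACH_TOLERANCE = 0.02

# Assumed pure translation from vx300s/base_link to world, inferred from the
# documented cube spawn: [0.25, 0.0, 0.015] (base) = [0.45, 0.0, 0.795] (world).
BASE_TO_WORLD_OFFSET = (0.20, 0.0, 0.78)


def base_to_world(p_base):
    """Shift a base-frame point into the world frame (no rotation assumed)."""
    return tuple(p + o for p, o in zip(p_base, BASE_TO_WORLD_OFFSET))


def sparse_push_reward(cube_pos, goal_pos, tol=REACH_TOLERANCE):
    """Return 0.0 when the cube is within tol of the goal, else -1.0."""
    dist = math.dist(cube_pos, goal_pos)  # Euclidean distance cube -> goal
    return 0.0 if dist < tol else -1.0


if __name__ == "__main__":
    cube = (0.25, 0.0, 0.015)   # default cube_init_pos (base frame)
    goal = (0.30, 0.10, 0.015)  # a goal inside the documented sampling box
    print(base_to_world(cube))
    print(sparse_push_reward(cube, goal))
```

Both positions must be expressed in the same frame before the distance is
taken; the goal box above is documented in the ``vx300s/base_link`` frame.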