VX300S — Push

The arm must push a 4 cm cube across the cafe-table to a goal point on the table top. The closed gripper acts as a flat paddle (gripper not in the action vector). Achieved goal is the cube position.

Env IDs: VX300SPushSim-v0 / VX300SPushGoalSim-v0 / VX300SPushReal-v0 / VX300SPushGoalReal-v0.

Description

VX300S flush-mounted on the cafe-table (z = 0.78). At reset the arm goes to home pose, the gripper closes (init_close_gripper = [0.025, -0.025] m), and a red cube spawns on the table top at z = 0.795. The agent commands joint-space or EE-space deltas; contact with the closed gripper slides the cube toward the goal.

Action Space

Joint mode (default). Box(6,) — same 6-joint command as VX300S — Reach (waist / shoulder / elbow / forearm_roll / wrist_angle / wrist_rotate). Gripper is NOT in the action vector for push.

EE mode (ee_action_type=True). Box(3,) — Δ EE position.

Observation Space

Standard env (``VX300SPushSim-v0`` / ``VX300SPushReal-v0``). Box. Extends the reach obs with cube state:

=====  ========  ==================================  ====================  ====================
Idx    Dim       Component                           Source                Unit
=====  ========  ==================================  ====================  ====================
0–2    3         EE position                         MoveIt FK             m
3–5    3         EE rpy                              MoveIt                rad
6–8    3         Unit vector cube → goal             normalized            unitless
9      1         Distance cube → goal                ‖goal − cube‖         m
10–18  9         Current joint positions             /vx300s/joint_states  rad / m
19–24  6 (or 3)  Previous action                     cached                matches action space
25–33  9         Current joint velocities            /vx300s/joint_states  rad/s / m/s
34–36  3         Cube position (base frame)          Gazebo / /cube_pose   m
37–39  3         Cube rpy                            same source           rad
40–42  3         Cube linear velocity (finite-diff)  cached + dt           m/s
43–45  3         Cube angular velocity               cached + dt           rad/s
46–48  3         Cube relative to EE                 cube − ee             m
=====  ========  ==================================  ====================  ====================
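The index ranges listed above can be turned into named slices for debugging or logging. The sketch below is not part of the package: ``OBS_LAYOUT``, ``OBS_DIM`` and ``split_obs`` are hypothetical names, and the layout assumes joint mode (6-wide previous action, 49 values in total). In EE mode the previous-action block is 3 wide, so the later indices would presumably shift down by 3; that is an inference from the listing, not something this page states.

```python
# Joint-mode observation layout, copied from the index ranges on this page.
# All names here are illustrative and not provided by the environment.
OBS_LAYOUT = {
    "ee_pos": slice(0, 3),
    "ee_rpy": slice(3, 6),
    "cube_to_goal_unit": slice(6, 9),
    "cube_to_goal_dist": slice(9, 10),
    "joint_pos": slice(10, 19),
    "prev_action": slice(19, 25),   # 6 wide in joint mode (assumption: not EE mode)
    "joint_vel": slice(25, 34),
    "cube_pos": slice(34, 37),
    "cube_rpy": slice(37, 40),
    "cube_lin_vel": slice(40, 43),
    "cube_ang_vel": slice(43, 46),
    "cube_rel_ee": slice(46, 49),
}
OBS_DIM = 49


def split_obs(obs):
    """Split a flat joint-mode observation into named components."""
    if len(obs) != OBS_DIM:
        raise ValueError(f"expected {OBS_DIM} values, got {len(obs)}")
    return {name: list(obs[s]) for name, s in OBS_LAYOUT.items()}
```

Calling ``split_obs(obs)["cube_pos"]`` then returns the three cube-position values at indices 34 to 36.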

Goal env (``VX300SPushGoalSim-v0`` / ``VX300SPushGoalReal-v0``). Dict. desired_goal = Box(3,) on the table top, sampled in the vx300s/base_link frame: x ∈ [0.20, 0.35], y ∈ [-0.15, 0.15], z = 0.015 (m). achieved_goal = Box(3,) cube XYZ.

Rewards

Sparse: 0.0 if ‖cube − goal‖ < reach_tolerance else -1.0.
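The sparse rule can be written out as a small function. This is a minimal sketch, not the environment's implementation: ``sparse_reward`` is a hypothetical helper, and the default ``reach_tolerance`` of 0.02 m is a placeholder, since this page does not state the configured value.

```python
import math


def sparse_reward(cube_pos, goal_pos, reach_tolerance=0.02):
    """Return 0.0 when the cube is within tolerance of the goal, else -1.0.

    reach_tolerance=0.02 is an illustrative placeholder, not the documented default.
    """
    dist = math.dist(cube_pos, goal_pos)  # Euclidean distance cube -> goal
    return 0.0 if dist < reach_tolerance else -1.0
```

Note that the distance is taken from the cube, not the end effector, to the goal.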

Dense: same reward components as VX300S — Reach but distance is measured from the cube to the goal (not EE to goal). Defaults from config/vx300s_push_task_config.yaml match the reach defaults plus push-specific reset / shaping.

Starting State

Joint pose: zeros (Interbotix URDF home; safe on-table mount). Gripper closed at init_close_gripper = [0.025, -0.025] m.

Cube spawn. 4 cm red cube at default cube_init_pos = [0.25, 0.0, 0.015] in the base frame (= [0.45, 0.0, 0.795] in world). Randomised within random_cube_spawn box if enabled.
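The base-frame and world-frame coordinates quoted above differ by a fixed offset of [0.20, 0.0, 0.78]. The sketch below assumes the two frames are related by a pure translation with no rotation, which is inferred from that single pair of coordinates rather than stated here; ``BASE_IN_WORLD`` and ``base_to_world`` are hypothetical names.

```python
# Offset of vx300s/base_link in the world frame, derived from the documented pair
# [0.25, 0.0, 0.015] (base) and [0.45, 0.0, 0.795] (world).
# Assumption: pure translation, no rotation between the frames.
BASE_IN_WORLD = (0.20, 0.0, 0.78)


def base_to_world(point_base):
    """Translate an (x, y, z) point from the base frame into the world frame."""
    return tuple(p + o for p, o in zip(point_base, BASE_IN_WORLD))
```

The z offset of 0.78 matches the table height given in the description.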

Goal sampling. Push goal ∈ Box(3,) on the table top within position_goal_min/max. The goal_x_min was bumped from 0.15 to 0.20 — the VX300S’s longer reach pushes the near-base dead-zone further out than the RX200’s.
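As a rough illustration of goal sampling, the sketch below draws a goal inside the base-frame bounds documented for the goal env (x in [0.20, 0.35], y in [-0.15, 0.15], z fixed at 0.015). Uniform sampling per axis is an assumption, and ``GOAL_MIN``, ``GOAL_MAX`` and ``sample_goal`` are hypothetical names standing in for position_goal_min/max.

```python
import random

# Base-frame goal bounds in metres, taken from the goal-env description.
GOAL_MIN = (0.20, -0.15, 0.015)
GOAL_MAX = (0.35, 0.15, 0.015)


def sample_goal(rng=random):
    """Draw a goal per axis from the bounds (uniform sampling is an assumption)."""
    return tuple(rng.uniform(lo, hi) for lo, hi in zip(GOAL_MIN, GOAL_MAX))
```

Passing a seeded ``random.Random`` instance as ``rng`` makes the draw reproducible.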

Episode End

Truncation. max_episode_steps (default 100). The real env also truncates when /joint_states goes stale.

Termination. ‖cube − goal‖ < reach_tolerance (sparse only).
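The two end conditions can be combined as in the sketch below. ``episode_end`` is a hypothetical helper; the ``reach_tolerance`` default is a placeholder, the ``step >= max_episode_steps`` comparison is an assumption about how the step limit is counted, and the stale /joint_states check of the real env is not modelled.

```python
def episode_end(dist, step, reach_tolerance=0.02, max_episode_steps=100, sparse=True):
    """Return (terminated, truncated) for one step.

    dist is the cube-to-goal distance in metres; step counts completed steps.
    reach_tolerance=0.02 is an illustrative placeholder value.
    """
    # Termination applies to the sparse reward only.
    terminated = sparse and dist < reach_tolerance
    # Assumption: the episode is truncated once the step limit is reached.
    truncated = step >= max_episode_steps
    return terminated, truncated
```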

Arguments

Inherits VX300S — Reach kwargs plus push-specific: random_cube_spawn (bool), random_goal (bool), cube_pose_topic (real only, default ``/cube_pose``), cube_pose_timeout_s (real only, default 1.0 s), auto_launch_cube_tracker / cube_tracker_camera / cube_tracker_target_frame (real only).

Version History

  • v0 — first release (rl_environments v0.1.0). Closed-gripper paddle. goal_x_min=0.20 (bumped from 0.15 due to VX300S near-base dead-zone).