rl_environments API reference
Pre-built gymnasium environments for the robots and tasks the
ecosystem currently supports. The package is organised as
rl_environments/{robot}/{sim,real}/{robot_envs,task_envs}/....
For the user-facing list of working environments see Ready-made environments.
Top-level package
RX200 — Simulation (Gazebo)
Robot envs
- class rl_environments.rx200.sim.robot_envs.rx200_robot_sim.RX200RobotEnv(ros_port=None, gazebo_port=None, gazebo_pid=None, seed=None, real_time=False, action_cycle_time=0.0, load_cube=False, load_table=False, use_kinect=False)[source]
Bases:
GazeboBaseEnvSuperclass for all RX200 Robot environments.
Initializes a new Robot Environment
Describe the robot and the sensors used in the env.
- Sensor Topic List:
MoveIt: To get the pose and rpy of the robot. /joint_states: JointState received for the joints of the robot /head_mount_kinect2/depth/image_raw: Depth image from the kinect sensor /head_mount_kinect2/rgb/image_raw: RGB image from the kinect sensor
- Actuators Topic List:
MoveIt: Send the joint positions to the robot. /rx200/arm_controller/command: Send the joint positions to the robot. /rx200/gripper_controller/command: Send the joint positions to the robot.
- Parameters:
- ros_kin
Finished __init__ method
- get_model_pose(model_name='red_cube', rpy=True)[source]
Get the pose of an object in Gazebo
- Parameters:
model_name – name of the object whose pose is to be retrieved
rpy – True if the orientation is to be returned as euler angles (default: True)
- Returns:
True if the pose is retrieved successful position: position of the object as a numpy array orientation: orientation of the object as a numpy array (rpy or quaternion)
- Return type:
success
- spawn_cube_in_gazebo(model_pos_x, model_pos_y)[source]
Spawn a cube in Gazebo
- Parameters:
model_pos_x – x-coordinate of the cube
model_pos_y – y-coordinate of the cube
- Returns:
True if the cube is spawned successfully
- Return type:
done
- fk_pykdl(action)[source]
Function to calculate the forward kinematics of the robot arm. We are using pykdl_utils.
- Parameters:
action – joint positions of the robot arm (in radians)
- Returns:
end-effector position as a numpy array
- Return type:
ee_position
- calculate_fk(joint_positions, euler=True)[source]
Calculate the forward kinematics of the robot arm using the ros_kinematics package.
- Parameters:
joint_positions – joint positions of the robot arm (in radians)
euler – True if the orientation is to be returned as euler angles (default: True)
- Returns:
True if the FK calculation is successful ee_position: end-effector position as a numpy array ee_rpy: end-effector orientation as a list of rpy or quaternion values
- Return type:
done
- calculate_ik(target_pos, ee_ori=array([0., 0., 0., 1.]))[source]
Calculate the inverse kinematics of the robot arm using the ros_kinematics package.
- Parameters:
target_pos – target end-effector position as a numpy array
ee_ori – end-effector orientation as a list of quaternion values (default: [0.0, 0.0, 0.0, 1.0])
- Returns:
True if the IK calculation is successful joint_positions: joint positions of the robot arm (in radians)
- Return type:
done
- move_arm_joints(q_positions, time_from_start=0.5)[source]
Set a joint position target only for the arm joints using low-level ros controllers.
- move_gripper_joints(q_positions, time_from_start=0.5)[source]
Set a joint position target only for the gripper joints using low-level ros controllers.
- smooth_trajectory(q_positions, time_from_start, multiplier=100)[source]
Smooth the trajectory by interpolating between the current and target positions.
- Parameters:
q_positions – target joint positions
time_from_start – time from start of the trajectory (set the speed to complete the trajectory)
multiplier – number of steps to interpolate between the current and target positions
- publish_trajectory(trajectory_points)[source]
Publish the entire trajectory at once.
- Parameters:
trajectory_points – List of tuples containing joint positions and time_from_start
- set_trajectory_joints(q_positions)[source]
Set a joint position target only for the arm joints using moveit.
- set_trajectory_ee(pos)[source]
Set a pose target for the end effector of the robot arm using moveit.
- get_ee_pose()[source]
Returns the end-effector pose as a geometry_msgs/PoseStamped message
This gives us the best pose if we are using the moveit config of the ReactorX repo They are getting the pose with ee_gripper_link
- get_ee_rpy()[source]
Returns the end-effector orientation as a list of roll, pitch, and yaw angles.
- class rl_environments.rx200.sim.robot_envs.rx200_robot_goal_sim.RX200RobotGoalEnv(ros_port=None, gazebo_port=None, gazebo_pid=None, seed=None, real_time=False, action_cycle_time=0.0, load_cube=False, load_table=False, use_kinect=False)[source]
Bases:
GazeboGoalEnvSuperclass for all RX200 Robot Goal environments.
Initializes a new Robot Goal Environment
Describe the robot and the sensors used in the env.
- Sensor Topic List:
MoveIt: To get the pose and rpy of the robot. /joint_states: JointState received for the joints of the robot /head_mount_kinect2/depth/image_raw: Depth image from the kinect sensor /head_mount_kinect2/rgb/image_raw: RGB image from the kinect sensor
- Actuators Topic List:
MoveIt: Send the joint positions to the robot. /rx200/arm_controller/command: Send the joint positions to the robot. /rx200/gripper_controller/command: Send the joint positions to the robot.
- Parameters:
- get_model_pose(model_name='red_cube')[source]
Get the pose of an object in Gazebo
- Parameters:
model_name – name of the object whose pose is to be retrieved
- Returns:
pose of the object as a geometry_msgs/PoseStamped message
- Return type:
pose
- spawn_cube_in_gazebo(model_pos_x, model_pos_y)[source]
Spawn a cube in Gazebo
- Parameters:
model_pos_x – x-coordinate of the cube
model_pos_y – y-coordinate of the cube
- Returns:
True if the cube is spawned successfully
- Return type:
done
- fk_pykdl(action)[source]
Function to calculate the forward kinematics of the robot arm. We are using pykdl_utils.
- Parameters:
action – joint positions of the robot arm (in radians)
- Returns:
end-effector position as a numpy array
- Return type:
ee_position
- calculate_fk(joint_positions, euler=True)[source]
Calculate the forward kinematics of the robot arm using the ros_kinematics package.
- Parameters:
joint_positions – joint positions of the robot arm (in radians)
euler – True if the orientation is to be returned as euler angles (default: True)
- Returns:
True if the FK calculation is successful ee_position: end-effector position as a numpy array ee_rpy: end-effector orientation as a list of rpy or quaternion values
- Return type:
done
- calculate_ik(target_pos, ee_ori=array([0., 0., 0., 1.]))[source]
Calculate the inverse kinematics of the robot arm using the ros_kinematics package.
- Parameters:
target_pos – target end-effector position as a numpy array
ee_ori – end-effector orientation as a list of quaternion values (default: [0.0, 0.0, 0.0, 1.0])
- Returns:
True if the IK calculation is successful joint_positions: joint positions of the robot arm (in radians)
- Return type:
done
- move_arm_joints(q_positions, time_from_start=0.5)[source]
Set a joint position target only for the arm joints using low-level ros controllers.
- move_gripper_joints(q_positions, time_from_start=0.5)[source]
Set a joint position target only for the gripper joints using low-level ros controllers.
- smooth_trajectory(q_positions, time_from_start, multiplier=100)[source]
Smooth the trajectory by interpolating between the current and target positions.
- Parameters:
q_positions – target joint positions
time_from_start – time from start of the trajectory (set the speed to complete the trajectory)
multiplier – number of steps to interpolate between the current and target positions
- publish_trajectory(trajectory_points)[source]
Publish the entire trajectory at once.
- Parameters:
trajectory_points – List of tuples containing joint positions and time_from_start
- set_trajectory_joints(q_positions)[source]
Set a joint position target only for the arm joints using moveit.
- set_trajectory_ee(pos)[source]
Set a pose target for the end effector of the robot arm using moveit.
- get_ee_pose()[source]
Returns the end-effector pose as a geometry_msgs/PoseStamped message
This gives us the best pose if we are using the moveit config of the ReactorX repo They are getting the pose with ee_gripper_link
- get_ee_rpy()[source]
Returns the end-effector orientation as a list of roll, pitch, and yaw angles.
Vision-augmented robot envs
- class rl_environments.rx200.sim.robot_envs.rx200_robot_sim_zed2.RX200RobotEnv(ros_port=None, gazebo_port=None, gazebo_pid=None, seed=None, real_time=False, action_cycle_time=0.0, load_cube=False, load_table=False, use_zed2=False)[source]
Bases:
GazeboBaseEnv- Superclass for all RX200 Robot environments.
Uses a ZED2 camera for RGB and Depth images
Initializes a new Robot Environment
Describe the robot and the sensors used in the env.
- Sensor Topic List:
MoveIt: To get the pose and rpy of the robot. /joint_states: JointState received for the joints of the robot /rx200/zed2/depth/depth_registered: Depth image from ZED2 camera /rx200/zed2/left/image_rect_color: RGB image from ZED2 camera
- Actuators Topic List:
MoveIt: Send the joint positions to the robot. /rx200/arm_controller/command: Send the joint positions to the arm controller /rx200/gripper_controller/command: Send the joint positions to the gripper controller
- Parameters:
- ros_kin
Finished __init__ method
- get_model_pose(model_name='red_cube')[source]
Get the pose of an object in Gazebo
- Parameters:
model_name – name of the object whose pose is to be retrieved
- Returns:
pose of the object as a geometry_msgs/PoseStamped message
- Return type:
pose
- spawn_cube_in_gazebo(model_pos_x, model_pos_y)[source]
Spawn a cube in Gazebo
- Parameters:
model_pos_x – x-coordinate of the cube
model_pos_y – y-coordinate of the cube
- Returns:
True if the cube is spawned successfully
- Return type:
done
- fk_pykdl(action)[source]
Function to calculate the forward kinematics of the robot arm. We are using pykdl_utils.
- Parameters:
action – joint positions of the robot arm (in radians)
- Returns:
end-effector position as a numpy array
- Return type:
ee_position
- calculate_fk(joint_positions, euler=True)[source]
Calculate the forward kinematics of the robot arm using the ros_kinematics package.
- Parameters:
joint_positions – joint positions of the robot arm (in radians)
euler – True if the orientation is to be returned as euler angles (default: True)
- Returns:
True if the FK calculation is successful ee_position: end-effector position as a numpy array ee_rpy: end-effector orientation as a list of rpy or quaternion values
- Return type:
done
- calculate_ik(target_pos, ee_ori=array([0., 0., 0., 1.]))[source]
Calculate the inverse kinematics of the robot arm using the ros_kinematics package.
- Parameters:
target_pos – target end-effector position as a numpy array
ee_ori – end-effector orientation as a list of quaternion values (default: [0.0, 0.0, 0.0, 1.0])
- Returns:
True if the IK calculation is successful joint_positions: joint positions of the robot arm (in radians)
- Return type:
done
- move_arm_joints(q_positions, time_from_start=0.5)[source]
Set a joint position target only for the arm joints using low-level ros controllers.
- move_gripper_joints(q_positions, time_from_start=0.5)[source]
Set a joint position target only for the gripper joints using low-level ros controllers.
- set_trajectory_joints(q_positions)[source]
Set a joint position target only for the arm joints using moveit.
- set_trajectory_ee(pos)[source]
Set a pose target for the end effector of the robot arm using moveit.
- get_ee_pose()[source]
Returns the end-effector pose as a geometry_msgs/PoseStamped message
This gives us the best pose if we are using the moveit config of the ReactorX repo They are getting the pose with ee_gripper_link
- get_ee_rpy()[source]
Returns the end-effector orientation as a list of roll, pitch, and yaw angles.
- class rl_environments.rx200.sim.robot_envs.rx200_robot_goal_sim_zed2.RX200RobotGoalEnv(ros_port=None, gazebo_port=None, gazebo_pid=None, seed=None, real_time=False, action_cycle_time=0.0, load_cube=False, load_table=False, use_zed2=False)[source]
Bases:
GazeboGoalEnv- Superclass for all RX200 Robot Goal environments.
Uses a ZED2 camera for RGB and Depth images
Initializes a new Robot Goal Environment
Describe the robot and the sensors used in the env.
- Sensor Topic List:
MoveIt: To get the pose and rpy of the robot. /joint_states: JointState received for the joints of the robot /rx200/zed2/depth/depth_registered: Depth image from the ZED2 camera /rx200/zed2/left/image_rect_color: RGB image from the ZED2 camera
- Actuators Topic List:
MoveIt: Send the joint positions to the robot. /rx200/arm_controller/command: Send the joint positions to the robot. /rx200/gripper_controller/command: Send the joint positions to the robot.
- Parameters:
- get_model_pose(model_name='red_cube')[source]
Get the pose of an object in Gazebo
- Parameters:
model_name – name of the object whose pose is to be retrieved
- Returns:
pose of the object as a geometry_msgs/PoseStamped message
- Return type:
pose
- spawn_cube_in_gazebo(model_pos_x, model_pos_y)[source]
Spawn a cube in Gazebo
- Parameters:
model_pos_x – x-coordinate of the cube
model_pos_y – y-coordinate of the cube
- Returns:
True if the cube is spawned successfully
- Return type:
done
- fk_pykdl(action)[source]
Function to calculate the forward kinematics of the robot arm. We are using pykdl_utils.
- Parameters:
action – joint positions of the robot arm (in radians)
- Returns:
end-effector position as a numpy array
- Return type:
ee_position
- calculate_fk(joint_positions, euler=True)[source]
Calculate the forward kinematics of the robot arm using the ros_kinematics package.
- Parameters:
joint_positions – joint positions of the robot arm (in radians)
euler – True if the orientation is to be returned as euler angles (default: True)
- Returns:
True if the FK calculation is successful ee_position: end-effector position as a numpy array ee_rpy: end-effector orientation as a list of rpy or quaternion values
- Return type:
done
- calculate_ik(target_pos, ee_ori=array([0., 0., 0., 1.]))[source]
Calculate the inverse kinematics of the robot arm using the ros_kinematics package.
- Parameters:
target_pos – target end-effector position as a numpy array
ee_ori – end-effector orientation as a list of quaternion values (default: [0.0, 0.0, 0.0, 1.0])
- Returns:
True if the IK calculation is successful joint_positions: joint positions of the robot arm (in radians)
- Return type:
done
- move_arm_joints(q_positions, time_from_start=0.5)[source]
Set a joint position target only for the arm joints using low-level ros controllers.
- move_gripper_joints(q_positions, time_from_start=0.5)[source]
Set a joint position target only for the gripper joints using low-level ros controllers.
- set_trajectory_joints(q_positions)[source]
Set a joint position target only for the arm joints using moveit.
- set_trajectory_ee(pos)[source]
Set a pose target for the end effector of the robot arm using moveit.
- get_ee_pose()[source]
Returns the end-effector pose as a geometry_msgs/PoseStamped message
This gives us the best pose if we are using the moveit config of the ReactorX repo They are getting the pose with ee_gripper_link
- get_ee_rpy()[source]
Returns the end-effector orientation as a list of roll, pitch, and yaw angles.
Reach task envs (Kinect v2 vision)
- class rl_environments.rx200.sim.task_envs.reach.rx200_kinect_reach_sim.RX200ReacherEnv(launch_gazebo=True, new_roscore=True, roscore_port=None, gazebo_paused=False, gazebo_gui=False, seed=None, reward_type='Dense', delta_action=True, delta_coeff=0.05, ee_action_type=False, environment_loop_rate=10, action_cycle_time=0.1, use_smoothing=False, rgb_obs_only=False, normal_obs_only=True, rgb_plus_normal_obs=False, rgb_plus_depth_plus_normal_obs=False, load_table=True, debug=False, action_speed=0.5, simple_dense_reward=True, log_internal_state=False)[source]
Bases:
RX200RobotEnvThis Task env is for a simple Reach Task with the RX200 robot.
- The task is done if
The robot reached the goal
- Here
Action Space - Continuous (5 actions for joints or 3 xyz position of the end effector)
Observation - Continuous (28 obs or rgb/depth image or a combination)
- Init Args:
launch_gazebo: Whether to launch Gazebo or not. If False, it is assumed that Gazebo is already running.
new_roscore: Whether to launch a new roscore or not. If False, it is assumed that a roscore is already running.
roscore_port: Port of the roscore to be launched. If None, a random port is chosen.
gazebo_paused: Whether to launch Gazebo in a paused state or not.
gazebo_gui: Whether to launch Gazebo with the GUI or not.
seed: Seed for the random number generator.
reward_type: Type of reward to be used. It Can be “Sparse” or “Dense”.
delta_action: Whether to use the delta actions or the absolute actions.
delta_coeff: Coefficient to be used for the delta actions.
ee_action_type: Whether to use the end effector action space or the joint action space.
environment_loop_rate: Rate at which the environment should run. (in Hz) - default 10 Hz (default operating frequency of the robot)
action_cycle_time: Time to wait between two consecutive actions. (in seconds) - default 100 ms (should be equal to larger than the environment loop time “1/environment_loop_rate”)
use_smoothing: Whether to use smoothing for actions or not.
rgb_obs_only: Whether to use only the RGB image as the observations or not.
normal_obs_only: Whether to use only the traditional observations or not.
rgb_plus_normal_obs: Whether to use RGB image and traditional observations or not.
rgb_plus_depth_plus_normal_obs: Whether to use RGB image, Depth image and traditional observations or not.
load_table: Whether to load the table model or not.
debug: Whether to print debug messages or not.
action_speed: set the speed to complete the trajectory. default in 0.5 seconds
simple_dense_reward: Whether to use a simple dense reward or not.
log_internal_state: Whether to log the internal state of the environment or not.
variables to keep track of ros, gazebo ports and gazebo pid
- Parameters:
launch_gazebo (bool)
new_roscore (bool)
roscore_port (str)
gazebo_paused (bool)
gazebo_gui (bool)
seed (int)
reward_type (str)
delta_action (bool)
delta_coeff (float)
ee_action_type (bool)
environment_loop_rate (float)
action_cycle_time (float)
use_smoothing (bool)
rgb_obs_only (bool)
normal_obs_only (bool)
rgb_plus_normal_obs (bool)
rgb_plus_depth_plus_normal_obs (bool)
load_table (bool)
debug (bool)
action_speed (float)
simple_dense_reward (bool)
log_internal_state (bool)
- log_internal_state
Reward Architecture * Dense - Default * Sparse - -1 if not done and 1 if done * All the others and misspellings - default to “Dense” reward
- simple_dense_reward
Use action as deltas
- delta_coeff
Use smoothing for actions
- action_cycle_time
Action speed - Time to complete the trajectory
- action_speed
Observation space * RGB image only * traditional observations only (default) * RGB image and traditional observations (combined) * RGB image, Depth image and traditional observations (combined)
- rgb_plus_depth_plus_normal_obs
Action space * Joint action space (default) * End effector action space
- ee_action_type
Debug
- debug
Load YAML param file
- goal_space
Workspace so we can check if the action is within the workspace
- workspace_space
Define subscribers/publishers and Markers as needed.
- goal_marker
Init super class.
- within_goal_space
Finished __init__ method
- execute_action(action)[source]
Function to apply an action to the robot.
This method should be implemented here to apply the given action to the robot. The action could be a joint position command, a velocity command, or any other type of command that can be applied to the robot.
- Parameters:
action – The action to be applied to the robot.
- sample_observation()[source]
Function to get an observation from the environment.
# traditional observations 01. EE pos - 3 02. Vector to the goal (normalized linear distance) - 3 03. Euclidian distance (ee to reach goal)- 1 04. Current Joint values - 8 05. Previous action - 5 or 3 (joint or ee) 06. Joint velocities - 8
total: (3x2) + 1 + (5 or 3) + (8x2) = 28 or 26
# depth image 480x640 32FC1
# rgb image 480x640X3 RGB images
- Returns:
An observation representing the current state of the environment.
- calculate_reward()[source]
Function to get a reward from the environment.
Sparse Reward: float => 1.0 for success, -1.0 for failure
- Dense Reward:
if reached: self.reached_goal_reward (positive reward) else: - self.mult_dist_reward * distance_to_the_goal
and as always, negative rewards for each step, non-execution and actions not within joint limits
- Returns:
A scalar reward value representing how well the agent is doing in the current episode.
- check_if_done()[source]
Function to check if the episode is done.
The Task is done if the EE is close enough to the goal
- Returns:
A boolean value indicating whether the episode has ended (e.g., because a goal has been reached or a failure condition has been triggered)
- get_random_goal(max_tries=100)[source]
Function to get a reachable goal
- Parameters:
max_tries (int)
- check_action_within_goal_space_fk(action)[source]
Function to check if the given action is within the goal space
- class rl_environments.rx200.sim.task_envs.reach.rx200_kinect_reach_goal_sim.RX200ReacherGoalEnv(launch_gazebo=True, new_roscore=True, roscore_port=None, gazebo_paused=False, gazebo_gui=False, seed=None, reward_type='sparse', delta_action=True, delta_coeff=0.05, ee_action_type=False, environment_loop_rate=10.0, action_cycle_time=0.8, use_smoothing=False, rgb_obs_only=False, normal_obs_only=True, rgb_plus_normal_obs=False, rgb_plus_depth_plus_normal_obs=False, load_table=True, debug=False, log_internal_state=False, action_speed=0.5, simple_dense_reward=True)[source]
Bases:
RX200RobotGoalEnvThis Task env is for a simple Reach Task with the RX200 robot. This env is made to work in a real-time environment.
- The task is done if
The robot reached the goal
- Here
Action Space - Continuous (5 actions for joints or 3 xyz position of the end effector)
Observation - Continuous (28 obs or rgb/depth image or a combination)
Desired Goal - Goal we are trying to reach
Achieved Goal - Position of the EE
- Init Args:
launch_gazebo: Whether to launch Gazebo or not. If False, it is assumed that Gazebo is already running.
new_roscore: Whether to launch a new roscore or not. If False, it is assumed that a roscore is already running.
roscore_port: Port of the roscore to be launched. If None, a random port is chosen.
gazebo_paused: Whether to launch Gazebo in a paused state or not.
gazebo_gui: Whether to launch Gazebo with the GUI or not.
seed: Seed for the random number generator.
reward_type: Type of reward to be used. It Can be “Sparse” or “Dense”.
delta_action: Whether to use the delta actions or the absolute actions.
delta_coeff: Coefficient to be used for the delta actions.
ee_action_type: Whether to use the end effector action space or the joint action space.
environment_loop_rate: Rate at which the environment should run. (in Hz) default is 10 Hz - (default operating frequency of the robot)
action_cycle_time: Time to wait between two consecutive actions. (in seconds) - default 100 ms (should be equal to larger than the environment loop time “1/environment_loop_rate”)
use_smoothing: Whether to use smoothing for actions or not.
rgb_obs_only: Whether to use only the RGB image as the observations or not.
normal_obs_only: Whether to use only the traditional observations or not.
rgb_plus_normal_obs: Whether to use RGB image and traditional observations or not.
rgb_plus_depth_plus_normal_obs: Whether to use RGB image, Depth image and traditional observations or not.
load_table: Whether to load the table model or not.
debug: Whether to print debug messages or not.
log_internal_state: Whether to log the internal state of the environment or not.
action_speed: set the speed to complete the trajectory. default in 0.5 seconds
simple_dense_reward: Whether to use a simple dense reward or not.
variables to keep track of ros, gazebo ports and gazebo pid
- Parameters:
launch_gazebo (bool)
new_roscore (bool)
roscore_port (str)
gazebo_paused (bool)
gazebo_gui (bool)
seed (int)
reward_type (str)
delta_action (bool)
delta_coeff (float)
ee_action_type (bool)
environment_loop_rate (float)
action_cycle_time (float)
use_smoothing (bool)
rgb_obs_only (bool)
normal_obs_only (bool)
rgb_plus_normal_obs (bool)
rgb_plus_depth_plus_normal_obs (bool)
load_table (bool)
debug (bool)
log_internal_state (bool)
action_speed (float)
simple_dense_reward (bool)
- log_internal_state
Reward Architecture * Dense - Default * Sparse - -1 if not done and 1 if done * All the others and misspellings - default to “Dense” reward
- simple_dense_reward
Use action as deltas
- delta_coeff
Use smoothing for actions
- action_cycle_time
Action speed - Time to complete the trajectory
- action_speed
Observation space * RGB image only * traditional observations only (default) * RGB image and traditional observations (combined) * RGB image, Depth image and traditional observations (combined)
- rgb_plus_depth_plus_normal_obs
Action space * Joint action space (default) * End effector action space
- ee_action_type
Debug
- debug
Load YAML param file
- rgb_image_space
Achieved goal (EE pose) - 3
- achieved_goal_space
Desired goal (Goal pose) - 3
- desired_goal_space
Define the overall observation space
- goal_space
Workspace so we can check if the action is within the workspace
- workspace_space
Define subscribers/publishers and Markers as needed.
- goal_marker
Init super class.
- within_goal_space
Finished __init__ method
- compute_reward(achieved_goal, desired_goal, info)[source]
Compute the reward for achieving a given goal.
- compute_terminated(achieved_goal, desired_goal, info)[source]
Function to check if the episode is terminated.
- Parameters:
achieved_goal – EE position
desired_goal – Reach Goal
info (dict) – Additional information about the environment.
- Returns:
A boolean value indicating whether the episode has ended (e.g. because a goal has been reached or a failure condition has been triggered)
- compute_truncated(achieved_goal, desired_goal, info)[source]
Function to check if the episode is truncated.
Mainly hard coded here since we are using a wrapper that sets the max number of steps and truncates the episode.
- Parameters:
achieved_goal – EE position
desired_goal – Reach Goal
info (dict) – Additional information about the environment.
- Returns:
A boolean value indicating whether the episode has been truncated (e.g. because the maximum number of steps has been reached)
- execute_action(action)[source]
Function to apply an action to the robot.
This method should be implemented here to apply the given action to the robot. The action could be a joint position command, a velocity command, or any other type of command that can be applied to the robot.
- Parameters:
action – The action to be applied to the robot.
- sample_observation()[source]
Function to get an observation from the environment.
# traditional observations 01. EE pos - 3 02. Vector to the goal (normalized linear distance) - 3 03. Euclidian distance (ee to reach goal)- 1 04. Current Joint values - 8 05. Previous action - 5 or 3 (joint or ee) 06. Joint velocities - 8
total: (3x2) + 1 + (5 or 3) + (8x2) = 28 or 26
# depth image 480x640 32FC1
# rgb image 480x640X3 RGB images
- Returns:
An observation representing the current state of the environment.
- calculate_reward(achieved_goal=None, desired_goal=None, info=None)[source]
Compute the reward for achieving a given goal.
Sparse Reward: float => 1.0 for success, -1.0 for failure
- Dense Reward:
if reached: self.reached_goal_reward (positive reward) else: - self.mult_dist_reward * distance_to_the_goal
And as always negative rewards for each step, non-execution and actions not within joint limits
- check_if_done()[source]
Function to check if the episode is done.
The Task is done if the EE is close enough to the goal
- Returns:
A boolean value indicating whether the episode has ended (e.g. because a goal has been reached or a failure condition has been triggered)
- get_random_goal(max_tries=100)[source]
Function to get a reachable goal
- Parameters:
max_tries (int)
- check_action_within_goal_space_fk(action)[source]
Function to check if the given action is within the goal space
Reach task envs (ZED 2 vision)
- class rl_environments.rx200.sim.task_envs.reach.rx200_zed2_reach_sim.RX200ReacherEnv(launch_gazebo=True, new_roscore=True, roscore_port=None, gazebo_paused=False, gazebo_gui=False, seed=None, reward_type='Dense', delta_action=True, delta_coeff=0.05, ee_action_type=False, environment_loop_rate=10, action_cycle_time=0.1, use_smoothing=False, rgb_obs_only=False, normal_obs_only=True, rgb_plus_normal_obs=False, rgb_plus_depth_plus_normal_obs=False, load_table=True, debug=False, action_speed=0.5, simple_dense_reward=True, log_internal_state=False)[source]
Bases:
RX200RobotEnvThis Task env is for a simple Reach Task with the RX200 robot.
- The task is done if
The robot reached the goal
- Here
Action Space - Continuous (5 actions for joints or 3 xyz position of the end effector)
Observation - Continuous (28 obs or rgb/depth image or a combination)
- Init Args:
launch_gazebo: Whether to launch Gazebo or not. If False, it is assumed that Gazebo is already running.
new_roscore: Whether to launch a new roscore or not. If False, it is assumed that a roscore is already running.
roscore_port: Port of the roscore to be launched. If None, a random port is chosen.
gazebo_paused: Whether to launch Gazebo in a paused state or not.
gazebo_gui: Whether to launch Gazebo with the GUI or not.
seed: Seed for the random number generator.
reward_type: Type of reward to be used. It Can be “Sparse” or “Dense”.
delta_action: Whether to use the delta actions or the absolute actions.
delta_coeff: Coefficient to be used for the delta actions.
ee_action_type: Whether to use the end effector action space or the joint action space.
environment_loop_rate: Rate at which the environment should run. (in Hz) - default 10 Hz (default operating frequency of the robot)
action_cycle_time: Time to wait between two consecutive actions. (in seconds) - default 100 ms (should be equal to larger than the environment loop time “1/environment_loop_rate”)
use_smoothing: Whether to use smoothing for actions or not.
rgb_obs_only: Whether to use only the RGB image as the observations or not.
normal_obs_only: Whether to use only the traditional observations or not.
rgb_plus_normal_obs: Whether to use RGB image and traditional observations or not.
rgb_plus_depth_plus_normal_obs: Whether to use RGB image, Depth image and traditional observations or not.
load_table: Whether to load the table model or not.
debug: Whether to print debug messages or not.
action_speed: set the speed to complete the trajectory. default in 0.5 seconds
simple_dense_reward: Whether to use a simple dense reward or not.
log_internal_state: Whether to log the internal state of the environment or not.
variables to keep track of ros, gazebo ports and gazebo pid
- Parameters:
launch_gazebo (bool)
new_roscore (bool)
roscore_port (str)
gazebo_paused (bool)
gazebo_gui (bool)
seed (int)
reward_type (str)
delta_action (bool)
delta_coeff (float)
ee_action_type (bool)
environment_loop_rate (float)
action_cycle_time (float)
use_smoothing (bool)
rgb_obs_only (bool)
normal_obs_only (bool)
rgb_plus_normal_obs (bool)
rgb_plus_depth_plus_normal_obs (bool)
load_table (bool)
debug (bool)
action_speed (float)
simple_dense_reward (bool)
log_internal_state (bool)
- log_internal_state
Reward Architecture * Dense - Default * Sparse - -1 if not done and 1 if done * All the others and misspellings - default to “Dense” reward
- simple_dense_reward
Use action as deltas
- delta_coeff
Use smoothing for actions
- action_cycle_time
Action speed - Time to complete the trajectory
- action_speed
Observation space * RGB image only * traditional observations only (default) * RGB image and traditional observations (combined) * RGB image, Depth image and traditional observations (combined)
- rgb_plus_depth_plus_normal_obs
Action space * Joint action space (default) * End effector action space
- ee_action_type
Debug
- debug
Load YAML param file
- goal_space
Workspace so we can check if the action is within the workspace
- workspace_space
Define subscribers/publishers and Markers as needed.
- goal_marker
Init super class.
- within_goal_space
Finished __init__ method
- execute_action(action)[source]
Function to apply an action to the robot.
This method should be implemented here to apply the given action to the robot. The action could be a joint position command, a velocity command, or any other type of command that can be applied to the robot.
- Parameters:
action – The action to be applied to the robot.
- sample_observation()[source]
Function to get an observation from the environment.
# traditional observations 01. EE pos - 3 02. Vector to the goal (normalized linear distance) - 3 03. Euclidian distance (ee to reach goal)- 1 04. Current Joint values - 8 05. Previous action - 5 or 3 (joint or ee) 06. Joint velocities - 8
total: (3x2) + 1 + (5 or 3) + (8x2) = 28 or 26
# depth image 480x640 32FC1
# rgb image 480x640X3 RGB images
- Returns:
An observation representing the current state of the environment.
- calculate_reward()[source]
Function to get a reward from the environment.
Sparse Reward: float => 1.0 for success, -1.0 for failure
- Dense Reward:
if reached: self.reached_goal_reward (positive reward) else: - self.mult_dist_reward * distance_to_the_goal
and as always, negative rewards for each step, non-execution and actions not within joint limits
- Returns:
A scalar reward value representing how well the agent is doing in the current episode.
- check_if_done()[source]
Function to check if the episode is done.
The Task is done if the EE is close enough to the goal
- Returns:
A boolean value indicating whether the episode has ended (e.g., because a goal has been reached or a failure condition has been triggered)
- get_random_goal(max_tries=100)[source]
Function to get a reachable goal
- Parameters:
max_tries (int)
- check_action_within_goal_space_fk(action)[source]
Function to check if the given action is within the goal space
- class rl_environments.rx200.sim.task_envs.reach.rx200_zed2_reach_goal_sim.RX200ReacherGoalEnv(launch_gazebo=True, new_roscore=True, roscore_port=None, gazebo_paused=False, gazebo_gui=False, seed=None, reward_type='sparse', delta_action=True, delta_coeff=0.05, ee_action_type=False, environment_loop_rate=10, action_cycle_time=0.8, use_smoothing=False, rgb_obs_only=False, normal_obs_only=True, rgb_plus_normal_obs=False, rgb_plus_depth_plus_normal_obs=False, load_table=True, debug=False, log_internal_state=False, action_speed=0.5, simple_dense_reward=True)[source]
Bases:
RX200RobotGoalEnvThis Task env is for a simple Reach Task with the RX200 robot. This env is made to work in a real-time environment.
- The task is done if
The robot reached the goal
- Here
Action Space - Continuous (5 actions for joints or 3 xyz position of the end effector)
Observation - Continuous (28 obs or rgb/depth image or a combination)
Desired Goal - Goal we are trying to reach
Achieved Goal - Position of the EE
- Init Args:
launch_gazebo: Whether to launch Gazebo or not. If False, it is assumed that Gazebo is already running.
new_roscore: Whether to launch a new roscore or not. If False, it is assumed that a roscore is already running.
roscore_port: Port of the roscore to be launched. If None, a random port is chosen.
gazebo_paused: Whether to launch Gazebo in a paused state or not.
gazebo_gui: Whether to launch Gazebo with the GUI or not.
seed: Seed for the random number generator.
reward_type: Type of reward to be used. It Can be “Sparse” or “Dense”.
delta_action: Whether to use the delta actions or the absolute actions.
delta_coeff: Coefficient to be used for the delta actions.
ee_action_type: Whether to use the end effector action space or the joint action space.
environment_loop_rate: Rate at which the environment should run. (in Hz) default is 10 Hz - (default operating frequency of the robot)
action_cycle_time: Time to wait between two consecutive actions. (in seconds) - default 100 ms (should be equal to larger than the environment loop time “1/environment_loop_rate”)
use_smoothing: Whether to use smoothing for actions or not.
rgb_obs_only: Whether to use only the RGB image as the observations or not.
normal_obs_only: Whether to use only the traditional observations or not.
rgb_plus_normal_obs: Whether to use RGB image and traditional observations or not.
rgb_plus_depth_plus_normal_obs: Whether to use RGB image, Depth image and traditional observations or not.
load_table: Whether to load the table model or not.
debug: Whether to print debug messages or not.
log_internal_state: Whether to log the internal state of the environment or not.
action_speed: set the speed to complete the trajectory. default in 0.5 seconds
simple_dense_reward: Whether to use a simple dense reward or not.
variables to keep track of ros, gazebo ports and gazebo pid
- Parameters:
launch_gazebo (bool)
new_roscore (bool)
roscore_port (str)
gazebo_paused (bool)
gazebo_gui (bool)
seed (int)
reward_type (str)
delta_action (bool)
delta_coeff (float)
ee_action_type (bool)
environment_loop_rate (float)
action_cycle_time (float)
use_smoothing (bool)
rgb_obs_only (bool)
normal_obs_only (bool)
rgb_plus_normal_obs (bool)
rgb_plus_depth_plus_normal_obs (bool)
load_table (bool)
debug (bool)
log_internal_state (bool)
action_speed (float)
simple_dense_reward (bool)
- log_internal_state
Reward Architecture * Dense - Default * Sparse - -1 if not done and 1 if done * All the others and misspellings - default to “Dense” reward
- simple_dense_reward
Use action as deltas
- delta_coeff
Use smoothing for actions
- action_cycle_time
Action speed - Time to complete the trajectory
- action_speed
Observation space * RGB image only * traditional observations only (default) * RGB image and traditional observations (combined) * RGB image, Depth image and traditional observations (combined)
- rgb_plus_depth_plus_normal_obs
Action space * Joint action space (default) * End effector action space
- ee_action_type
Debug
- debug
Load YAML param file
- rgb_image_space
Achieved goal (EE pose) - 3
- achieved_goal_space
Desired goal (Goal pose) - 3
- desired_goal_space
Define the overall observation space
- goal_space
Workspace so we can check if the action is within the workspace
- workspace_space
Define subscribers/publishers and Markers as needed.
- goal_marker
Init super class.
- within_goal_space
Finished __init__ method
- compute_reward(achieved_goal, desired_goal, info)[source]
Compute the reward for achieving a given goal.
- compute_terminated(achieved_goal, desired_goal, info)[source]
Function to check if the episode is terminated.
- Parameters:
achieved_goal – EE position
desired_goal – Reach Goal
info (dict) – Additional information about the environment.
- Returns:
A boolean value indicating whether the episode has ended (e.g. because a goal has been reached or a failure condition has been triggered)
- compute_truncated(achieved_goal, desired_goal, info)[source]
Function to check if the episode is truncated.
Mainly hard coded here since we are using a wrapper that sets the max number of steps and truncates the episode.
- Parameters:
achieved_goal – EE position
desired_goal – Reach Goal
info (dict) – Additional information about the environment.
- Returns:
A boolean value indicating whether the episode has been truncated (e.g. because the maximum number of steps has been reached)
- execute_action(action)[source]
Function to apply an action to the robot.
This method should be implemented here to apply the given action to the robot. The action could be a joint position command, a velocity command, or any other type of command that can be applied to the robot.
- Parameters:
action – The action to be applied to the robot.
- sample_observation()[source]
Function to get an observation from the environment.
# traditional observations 01. EE pos - 3 02. Vector to the goal (normalized linear distance) - 3 03. Euclidian distance (ee to reach goal)- 1 04. Current Joint values - 8 05. Previous action - 5 or 3 (joint or ee) 06. Joint velocities - 8
total: (3x2) + 1 + (5 or 3) + (8x2) = 28 or 26
# depth image 480x640 32FC1
# rgb image 480x640X3 RGB images
- Returns:
An observation representing the current state of the environment.
- calculate_reward(achieved_goal=None, desired_goal=None, info=None)[source]
Compute the reward for achieving a given goal.
Sparse Reward: float => 1.0 for success, -1.0 for failure
- Dense Reward:
if reached: self.reached_goal_reward (positive reward) else: - self.mult_dist_reward * distance_to_the_goal
And as always negative rewards for each step, non-execution and actions not within joint limits
- check_if_done()[source]
Function to check if the episode is done.
The Task is done if the EE is close enough to the goal
- Returns:
A boolean value indicating whether the episode has ended (e.g. because a goal has been reached or a failure condition has been triggered)
- get_random_goal(max_tries=100)[source]
Function to get a reachable goal
- Parameters:
max_tries (int)
- check_action_within_goal_space_fk(action)[source]
Function to check if the given action is within the goal space
RX200 — Real hardware
Robot envs
- class rl_environments.rx200.real.robot_envs.rx200_robot_real.RX200RobotEnv(ros_port=None, seed=None, close_env_prompt=False, action_cycle_time=0.0, use_kinect=False, use_zed2=False)[source]
Bases:
RealBaseEnvSuperclass for all real RX200 Robot environments.
Initializes a new Robot Environment
Describe the robot and the sensors used in the env.
- Sensor Topic List:
MoveIt: To get the pose and rpy of the robot. /joint_states: JointState received for the joints of the robot /head_mount_kinect2/depth/image_raw: Depth image from the kinect sensor /head_mount_kinect2/rgb/image_raw: RGB image from the kinect sensor
- Actuators Topic List:
MoveIt!: Send the joint positions to the robot. /rx200/arm_controller/command: Send the joint positions to the robot. /rx200/gripper_controller/command: Send the joint positions to the robot
- ros_kin
Finished __init__ method
- fk_pykdl(action)[source]
Function to calculate the forward kinematics of the robot arm. We are using pykdl_utils.
- Parameters:
action – joint positions of the robot arm (in radians)
- Returns:
end-effector position as a numpy array
- Return type:
ee_position
- calculate_fk(joint_positions, euler=True)[source]
Calculate the forward kinematics of the robot arm using the ros_kinematics package.
- Parameters:
joint_positions – joint positions of the robot arm (in radians)
euler – True if the orientation is to be returned as euler angles (default: True)
- Returns:
True if the FK calculation is successful ee_position: end-effector position as a numpy array ee_rpy: end-effector orientation as a list of rpy or quaternion values
- Return type:
done
- calculate_ik(target_pos, ee_ori=array([0., 0., 0., 1.]))[source]
Calculate the inverse kinematics of the robot arm using the ros_kinematics package.
- Parameters:
target_pos – target end-effector position as a numpy array
ee_ori – end-effector orientation as a list of quaternion values (default: [0.0, 0.0, 0.0, 1.0])
- Returns:
True if the IK calculation is successful joint_positions: joint positions of the robot arm (in radians)
- Return type:
done
- move_arm_joints(q_positions, time_from_start=0.5)[source]
Set a joint position target only for the arm joints using low-level ros controllers.
- move_gripper_joints(q_positions, time_from_start=0.5)[source]
Set a joint position target only for the gripper joints using low-level ros controllers.
- smooth_trajectory(q_positions, time_from_start, multiplier=100)[source]
Smooth the trajectory by interpolating between the current and target positions.
- Parameters:
q_positions – target joint positions
time_from_start – time from start of the trajectory (set the speed to complete the trajectory)
multiplier – number of steps to interpolate between the current and target positions
- publish_trajectory(trajectory_points)[source]
Publish the entire trajectory at once.
- Parameters:
trajectory_points – List of tuples containing joint positions and time_from_start
- get_ee_pose()[source]
Returns the end-effector pose as a geometry_msgs/PoseStamped message
This gives us the best pose if we are using the moveit config of the ReactorX repo They are getting the pose with ee_gripper_link
- get_ee_rpy()[source]
Returns the end-effector orientation as a list of roll, pitch, and yaw angles.
- class rl_environments.rx200.real.robot_envs.rx200_robot_goal_real.RX200RobotGoalEnv(ros_port=None, seed=None, close_env_prompt=False, action_cycle_time=0.0, use_kinect=False, use_zed2=False)[source]
Bases:
RealGoalEnvSuperclass for all real RX200 Robot environments. - For goal-conditioned tasks
Initializes a new Robot Goal Environment
Describe the robot and the sensors used in the env.
- Sensor Topic List:
MoveIt: To get the pose and rpy of the robot. /joint_states: JointState received for the joints of the robot /head_mount_kinect2/depth/image_raw: Depth image from the kinect sensor /head_mount_kinect2/rgb/image_raw: RGB image from the kinect sensor
- Actuators Topic List:
MoveIt!: Send the joint positions to the robot. /rx200/arm_controller/command: Send the joint positions to the robot. /rx200/gripper_controller/command: Send the joint positions to the robot
- ros_kin
Finished __init__ method
- fk_pykdl(action)[source]
Function to calculate the forward kinematics of the robot arm. We are using pykdl_utils.
- Parameters:
action – joint positions of the robot arm (in radians)
- Returns:
end-effector position as a numpy array
- Return type:
ee_position
- calculate_fk(joint_positions, euler=True)[source]
Calculate the forward kinematics of the robot arm using the ros_kinematics package.
- Parameters:
joint_positions – joint positions of the robot arm (in radians)
euler – True if the orientation is to be returned as euler angles (default: True)
- Returns:
True if the FK calculation is successful ee_position: end-effector position as a numpy array ee_rpy: end-effector orientation as a list of rpy or quaternion values
- Return type:
done
- calculate_ik(target_pos, ee_ori=array([0., 0., 0., 1.]))[source]
Calculate the inverse kinematics of the robot arm using the ros_kinematics package.
- Parameters:
target_pos – target end-effector position as a numpy array
ee_ori – end-effector orientation as a list of quaternion values (default: [0.0, 0.0, 0.0, 1.0])
- Returns:
True if the IK calculation is successful joint_positions: joint positions of the robot arm (in radians)
- Return type:
done
- move_arm_joints(q_positions, time_from_start=0.5)[source]
Set a joint position target only for the arm joints using low-level ros controllers.
- move_gripper_joints(q_positions, time_from_start=0.5)[source]
Set a joint position target only for the gripper joints using low-level ros controllers.
- smooth_trajectory(q_positions, time_from_start, multiplier=100)[source]
Smooth the trajectory by interpolating between the current and target positions.
- Parameters:
q_positions – target joint positions
time_from_start – time from start of the trajectory (set the speed to complete the trajectory)
multiplier – number of steps to interpolate between the current and target positions
- publish_trajectory(trajectory_points)[source]
Publish the entire trajectory at once.
- Parameters:
trajectory_points – List of tuples containing joint positions and time_from_start
- get_ee_pose()[source]
Returns the end-effector pose as a geometry_msgs/PoseStamped message
This gives us the best pose if we are using the moveit config of the ReactorX repo They are getting the pose with ee_gripper_link
- get_ee_rpy()[source]
Returns the end-effector orientation as a list of roll, pitch, and yaw angles.
Niryo Ned2 — Simulation
Robot envs
- class rl_environments.ned2.sim.robot_envs.ned2_robot_sim.NED2RobotEnv(ros_port=None, gazebo_port=None, gazebo_pid=None, seed=None, real_time=False, action_cycle_time=0.0, load_cube=False, load_table=False, use_camera=False, gripper=False)[source]
Bases:
GazeboBaseEnvSuperclass for all NED2 Robot environments - gazebo-based.
Initializes a new Robot Environment
Describe the robot and the sensors used in the env.
- Sensor Topic List:
MoveIt: To get the pose and rpy of the robot. /joint_states: JointState received for the joints of the robot /gazebo_camera/image_raw: RGB image from the robot camera.
- Actuators Topic List:
MoveIt: Send the joint positions to the robot. /ned2/niryo_robot_follow_joint_trajectory_controller/command: Send the joint positions to the robot. /ned2/niryo_robot_tools_commander/action_server: Action server to control the robot tools.
- Parameters:
- ros_kin
Finished __init__ method
- get_model_pose(model_name='red_cube', rpy=True)[source]
Get the pose of an object in Gazebo
- Parameters:
model_name – name of the object whose pose is to be retrieved
rpy – True if the orientation is to be returned as euler angles (default: True)
- Returns:
True if the pose is retrieved successful position: position of the object as a numpy array orientation: orientation of the object as a numpy array (rpy or quaternion)
- Return type:
success
- spawn_cube_in_gazebo(model_pos_x, model_pos_y)[source]
Spawn a cube in Gazebo
- Parameters:
model_pos_x – x-coordinate of the cube
model_pos_y – y-coordinate of the cube
- Returns:
True if the cube is spawned successfully
- Return type:
done
- fk_pykdl(action)[source]
Function to calculate the forward kinematics of the robot arm. We are using pykdl_utils.
- Parameters:
action – joint positions of the robot arm (in radians)
- Returns:
end-effector position as a numpy array
- Return type:
ee_position
- calculate_fk(joint_positions, euler=True)[source]
Calculate the forward kinematics of the robot arm using the ros_kinematics package.
- Parameters:
joint_positions – joint positions of the robot arm (in radians)
euler – True if the orientation is to be returned as euler angles (default: True)
- Returns:
True if the FK calculation is successful ee_position: end-effector position as a numpy array ee_rpy: end-effector orientation as a list of rpy or quaternion values
- Return type:
done
- calculate_ik(target_pos, ee_ori=array([0., 0., 0., 1.]))[source]
Calculate the inverse kinematics of the robot arm using the ros_kinematics package.
- Parameters:
target_pos – target end-effector position as a numpy array
ee_ori – end-effector orientation as a list of quaternion values (default: [0.0, 0.0, 0.0, 1.0])
- Returns:
True if the IK calculation is successful joint_positions: joint positions of the robot arm (in radians)
- Return type:
done
- move_arm_joints(q_positions, time_from_start=0.5)[source]
Set a joint position target only for the arm joints using low-level ros controllers.
- move_gripper_joints(action)[source]
Set a joint position target only for the gripper joints using low-level ros controllers. - ros action server
- smooth_trajectory(q_positions, time_from_start, multiplier=100)[source]
Smooth the trajectory by interpolating between the current and target positions.
- Parameters:
q_positions – target joint positions
time_from_start – time from start of the trajectory (set the speed to complete the trajectory)
multiplier – number of steps to interpolate between the current and target positions
- publish_trajectory(trajectory_points)[source]
Publish the entire trajectory at once.
- Parameters:
trajectory_points – List of tuples containing joint positions and time_from_start
- set_trajectory_joints(q_positions)[source]
Set a joint position target only for the arm joints using moveit.
- set_trajectory_ee(pos)[source]
Set a pose target for the end effector of the robot arm using moveit.
- get_ee_pose()[source]
Returns the end-effector pose as a geometry_msgs/PoseStamped message
This gives us the best pose if we are using the moveit config of the ReactorX repo They are getting the pose with ee_gripper_link
- get_ee_rpy()[source]
Returns the end-effector orientation as a list of roll, pitch, and yaw angles.
- class rl_environments.ned2.sim.robot_envs.ned2_robot_goal_sim.NED2RobotGoalEnv(ros_port=None, gazebo_port=None, gazebo_pid=None, seed=None, real_time=False, action_cycle_time=0.0, load_cube=False, load_table=False, use_camera=False, gripper=False)[source]
Bases:
GazeboGoalEnvSuperclass for all NED2 Robot Goal environments.
Initializes a new Robot Goal Environment
Describe the robot and the sensors used in the env.
- Sensor Topic List:
MoveIt: To get the pose and rpy of the robot. /joint_states: JointState received for the joints of the robot /gazebo_camera/image_raw: RGB image from the robot camera.
- Actuators Topic List:
MoveIt: Send the joint positions to the robot. /ned2/niryo_robot_follow_joint_trajectory_controller/command: Send the joint positions to the robot. /ned2/niryo_robot_tools_commander/action_server: Action server to control the robot tools.
- Parameters:
- ros_kin
Finished __init__ method
- get_model_pose(model_name='red_cube')[source]
Get the pose of an object in Gazebo
- Parameters:
model_name – name of the object whose pose is to be retrieved
- Returns:
pose of the object as a geometry_msgs/PoseStamped message
- Return type:
pose
- spawn_cube_in_gazebo(model_pos_x, model_pos_y)[source]
Spawn a cube in Gazebo
- Parameters:
model_pos_x – x-coordinate of the cube
model_pos_y – y-coordinate of the cube
- Returns:
True if the cube is spawned successfully
- Return type:
done
- fk_pykdl(action)[source]
Function to calculate the forward kinematics of the robot arm. We are using pykdl_utils.
- Parameters:
action – joint positions of the robot arm (in radians)
- Returns:
end-effector position as a numpy array
- Return type:
ee_position
- calculate_fk(joint_positions, euler=True)[source]
Calculate the forward kinematics of the robot arm using the ros_kinematics package.
- Parameters:
joint_positions – joint positions of the robot arm (in radians)
euler – True if the orientation is to be returned as euler angles (default: True)
- Returns:
True if the FK calculation is successful ee_position: end-effector position as a numpy array ee_rpy: end-effector orientation as a list of rpy or quaternion values
- Return type:
done
- calculate_ik(target_pos, ee_ori=array([0., 0., 0., 1.]))[source]
Calculate the inverse kinematics of the robot arm using the ros_kinematics package.
- Parameters:
target_pos – target end-effector position as a numpy array
ee_ori – end-effector orientation as a list of quaternion values (default: [0.0, 0.0, 0.0, 1.0])
- Returns:
True if the IK calculation is successful joint_positions: joint positions of the robot arm (in radians)
- Return type:
done
- move_arm_joints(q_positions, time_from_start=0.5)[source]
Set a joint position target only for the arm joints using low-level ros controllers.
- move_gripper_joints(action)[source]
Set a joint position target only for the gripper joints using low-level ros controllers. - ros action server
- smooth_trajectory(q_positions, time_from_start, multiplier=100)[source]
Smooth the trajectory by interpolating between the current and target positions.
- Parameters:
q_positions – target joint positions
time_from_start – time from start of the trajectory (set the speed to complete the trajectory)
multiplier – number of steps to interpolate between the current and target positions
- publish_trajectory(trajectory_points)[source]
Publish the entire trajectory at once.
- Parameters:
trajectory_points – List of tuples containing joint positions and time_from_start
- set_trajectory_joints(q_positions)[source]
Set a joint position target only for the arm joints using moveit.
- set_trajectory_ee(pos)[source]
Set a pose target for the end effector of the robot arm using moveit.
- get_ee_pose()[source]
Returns the end-effector pose as a geometry_msgs/PoseStamped message
This gives us the best pose if we are using the moveit config of the ReactorX repo They are getting the pose with ee_gripper_link
- get_ee_rpy()[source]
Returns the end-effector orientation as a list of roll, pitch, and yaw angles.
Reach task envs
- class rl_environments.ned2.sim.task_envs.reach.ned2_reach_sim.NED2ReacherEnv(launch_gazebo=True, new_roscore=True, roscore_port=None, gazebo_paused=False, gazebo_gui=False, seed=None, reward_type='Dense', delta_action=True, delta_coeff=0.05, ee_action_type=False, environment_loop_rate=10, action_cycle_time=0.1, use_smoothing=False, rgb_obs_only=False, normal_obs_only=True, rgb_plus_normal_obs=False, rgb_plus_depth_plus_normal_obs=False, load_table=True, debug=False, action_speed=0.5, simple_dense_reward=True, log_internal_state=False, extra_smoothing=False)[source]
Bases:
NED2RobotEnvThis Task env is for a simple Reach Task with the NED2 robot.
- The task is done if
The robot reached the goal
- Here
Action Space - Continuous (6 actions for joints or 3 xyz position of the end effector)
Observation - Continuous (simp obs or rgb/depth image or a combination)
- Init Args:
launch_gazebo: Whether to launch Gazebo or not. If False, it is assumed that Gazebo is already running.
new_roscore: Whether to launch a new roscore or not. If False, it is assumed that a roscore is already running.
roscore_port: Port of the roscore to be launched. If None, a random port is chosen.
gazebo_paused: Whether to launch Gazebo in a paused state or not.
gazebo_gui: Whether to launch Gazebo with the GUI or not.
seed: Seed for the random number generator.
reward_type: Type of reward to be used. It Can be “Sparse” or “Dense”.
delta_action: Whether to use the delta actions or the absolute actions.
delta_coeff: Coefficient to be used for the delta actions.
ee_action_type: Whether to use the end effector action space or the joint action space.
environment_loop_rate: Rate at which the environment should run. (in Hz) - default 10 Hz (default operating frequency of the robot)
action_cycle_time: Time to wait between two consecutive actions. (in seconds) - default 100 ms (should be equal to larger than the environment loop time “1/environment_loop_rate”)
use_smoothing: Whether to use smoothing for actions or not.
rgb_obs_only: Whether to use only the RGB image as the observations or not.
normal_obs_only: Whether to use only the traditional observations or not.
rgb_plus_normal_obs: Whether to use RGB image and traditional observations or not.
rgb_plus_depth_plus_normal_obs: Whether to use RGB image, Depth image and traditional observations or not.
load_table: Whether to load the table model or not.
debug: Whether to print debug messages or not.
action_speed: set the speed to complete the trajectory. default in 0.5 seconds
simple_dense_reward: Whether to use a simple dense reward or not.
log_internal_state: Whether to log the internal state of the environment or not.
extra_smoothing: Whether to use extra smoothing for actions or not.
variables to keep track of ros, gazebo ports and gazebo pid
- Parameters:
launch_gazebo (bool)
new_roscore (bool)
roscore_port (str)
gazebo_paused (bool)
gazebo_gui (bool)
seed (int)
reward_type (str)
delta_action (bool)
delta_coeff (float)
ee_action_type (bool)
environment_loop_rate (float)
action_cycle_time (float)
use_smoothing (bool)
rgb_obs_only (bool)
normal_obs_only (bool)
rgb_plus_normal_obs (bool)
rgb_plus_depth_plus_normal_obs (bool)
load_table (bool)
debug (bool)
action_speed (float)
simple_dense_reward (bool)
log_internal_state (bool)
extra_smoothing (bool)
- log_internal_state
Reward Architecture * Dense - Default * Sparse - -1 if not done and 1 if done * All the others and misspellings - default to “Dense” reward
- simple_dense_reward
Use action as deltas
- delta_coeff
Use smoothing for actions
- action_cycle_time
Action speed - Time to complete the trajectory
- action_speed
Observation space * RGB image only * traditional observations only (default) * RGB image and traditional observations (combined) * RGB image, Depth image and traditional observations (combined)
- rgb_plus_depth_plus_normal_obs
Action space * Joint action space (default) * End effector action space
- ee_action_type
Debug
- debug
Load YAML param file
- goal_space
Workspace so we can check if the action is within the workspace
- workspace_space
Define subscribers/publishers and Markers as needed.
- goal_marker
Init super class.
- within_goal_space
Finished __init__ method
- execute_action(action)[source]
Function to apply an action to the robot.
This method should be implemented here to apply the given action to the robot. The action could be a joint position command, a velocity command, or any other type of command that can be applied to the robot.
- Parameters:
action – The action to be applied to the robot.
- sample_observation()[source]
Function to get an observation from the environment.
# traditional observations 01. EE pos - 3 02. Vector to the goal (normalized linear distance) - 3 03. Euclidian distance (ee to reach goal)- 1 04. Current Joint values - 6 05. Previous action - 6 or 3 (joint or ee) 06. Joint velocities - 6
total: (3x2) + 1 + (6 or 3) + (6x2)
# depth image 480x640 32FC1
# rgb image 480x640X3 RGB images
- Returns:
An observation representing the current state of the environment.
- calculate_reward()[source]
Function to get a reward from the environment.
Sparse Reward: float => 1.0 for success, -1.0 for failure
- Dense Reward:
if reached: self.reached_goal_reward (positive reward) else: - self.mult_dist_reward * distance_to_the_goal
and as always, negative rewards for each step, non-execution and actions not within joint limits
- Returns:
A scalar reward value representing how well the agent is doing in the current episode.
- check_if_done()[source]
Function to check if the episode is done.
The Task is done if the EE is close enough to the goal
- Returns:
A boolean value indicating whether the episode has ended (e.g., because a goal has been reached or a failure condition has been triggered)
- get_random_goal(max_tries=100)[source]
Function to get a reachable goal
- Parameters:
max_tries (int)
- check_action_within_goal_space_fk(action)[source]
Function to check if the given action is within the goal space
- class rl_environments.ned2.sim.task_envs.reach.ned2_reach_goal_sim.NED2ReacherGoalEnv(launch_gazebo=True, new_roscore=True, roscore_port=None, gazebo_paused=False, gazebo_gui=False, seed=None, reward_type='sparse', delta_action=True, delta_coeff=0.05, ee_action_type=False, environment_loop_rate=10.0, action_cycle_time=0.8, use_smoothing=False, rgb_obs_only=False, normal_obs_only=True, rgb_plus_normal_obs=False, rgb_plus_depth_plus_normal_obs=False, load_table=True, debug=False, log_internal_state=False, action_speed=0.5, simple_dense_reward=True)[source]
Bases:
NED2RobotGoalEnvThis Task env is for a simple Reach Task with the NED2 robot. This env is made to work in a real-time environment.
- The task is done if
The robot reached the goal
- Here
Action Space - Continuous (6 actions for joints or 3 xyz position of the end effector)
Observation - Continuous (simp obs or rgb/depth image or a combination)
Desired Goal - Goal we are trying to reach
Achieved Goal - Position of the EE
- Init Args:
launch_gazebo: Whether to launch Gazebo or not. If False, it is assumed that Gazebo is already running.
new_roscore: Whether to launch a new roscore or not. If False, it is assumed that a roscore is already running.
roscore_port: Port of the roscore to be launched. If None, a random port is chosen.
gazebo_paused: Whether to launch Gazebo in a paused state or not.
gazebo_gui: Whether to launch Gazebo with the GUI or not.
seed: Seed for the random number generator.
reward_type: Type of reward to be used. It Can be “Sparse” or “Dense”.
delta_action: Whether to use the delta actions or the absolute actions.
delta_coeff: Coefficient to be used for the delta actions.
ee_action_type: Whether to use the end effector action space or the joint action space.
environment_loop_rate: Rate at which the environment should run. (in Hz) default is 10 Hz - (default operating frequency of the robot)
action_cycle_time: Time to wait between two consecutive actions. (in seconds) - default 100 ms (should be equal to larger than the environment loop time “1/environment_loop_rate”)
use_smoothing: Whether to use smoothing for actions or not.
rgb_obs_only: Whether to use only the RGB image as the observations or not.
normal_obs_only: Whether to use only the traditional observations or not.
rgb_plus_normal_obs: Whether to use RGB image and traditional observations or not.
rgb_plus_depth_plus_normal_obs: Whether to use RGB image, Depth image and traditional observations or not.
load_table: Whether to load the table model or not.
debug: Whether to print debug messages or not.
log_internal_state: Whether to log the internal state of the environment or not.
action_speed: set the speed to complete the trajectory. default in 0.5 seconds
simple_dense_reward: Whether to use a simple dense reward or not.
variables to keep track of ros, gazebo ports and gazebo pid
- Parameters:
launch_gazebo (bool)
new_roscore (bool)
roscore_port (str)
gazebo_paused (bool)
gazebo_gui (bool)
seed (int)
reward_type (str)
delta_action (bool)
delta_coeff (float)
ee_action_type (bool)
environment_loop_rate (float)
action_cycle_time (float)
use_smoothing (bool)
rgb_obs_only (bool)
normal_obs_only (bool)
rgb_plus_normal_obs (bool)
rgb_plus_depth_plus_normal_obs (bool)
load_table (bool)
debug (bool)
log_internal_state (bool)
action_speed (float)
simple_dense_reward (bool)
- log_internal_state
Reward Architecture * Dense - Default * Sparse - -1 if not done and 1 if done * All the others and misspellings - default to “Dense” reward
- simple_dense_reward
Use action as deltas
- delta_coeff
Use smoothing for actions
- action_cycle_time
Action speed - Time to complete the trajectory
- action_speed
Observation space * RGB image only * traditional observations only (default) * RGB image and traditional observations (combined) * RGB image, Depth image and traditional observations (combined)
- rgb_plus_depth_plus_normal_obs
Action space * Joint action space (default) * End effector action space
- ee_action_type
Debug
- debug
Load YAML param file
- rgb_image_space
Achieved goal (EE pose) - 3
- achieved_goal_space
Desired goal (Goal pose) - 3
- desired_goal_space
Define the overall observation space
- goal_space
Workspace so we can check if the action is within the workspace
- workspace_space
Define subscribers/publishers and Markers as needed.
- goal_marker
Init super class.
- within_goal_space
Finished __init__ method
- compute_reward(achieved_goal, desired_goal, info)[source]
Compute the reward for achieving a given goal.
- compute_terminated(achieved_goal, desired_goal, info)[source]
Function to check if the episode is terminated.
- Parameters:
achieved_goal – EE position
desired_goal – Reach Goal
info (dict) – Additional information about the environment.
- Returns:
A boolean value indicating whether the episode has ended (e.g. because a goal has been reached or a failure condition has been triggered)
- compute_truncated(achieved_goal, desired_goal, info)[source]
Function to check if the episode is truncated.
Mainly hard coded here since we are using a wrapper that sets the max number of steps and truncates the episode.
- Parameters:
achieved_goal – EE position
desired_goal – Reach Goal
info (dict) – Additional information about the environment.
- Returns:
A boolean value indicating whether the episode has been truncated (e.g. because the maximum number of steps has been reached)
- execute_action(action)[source]
Function to apply an action to the robot.
This method should be implemented here to apply the given action to the robot. The action could be a joint position command, a velocity command, or any other type of command that can be applied to the robot.
- Parameters:
action – The action to be applied to the robot.
- sample_observation()[source]
Function to get an observation from the environment.
# traditional observations 01. EE pos - 3 02. Vector to the goal (normalized linear distance) - 3 03. Euclidian distance (ee to reach goal)- 1 04. Current Joint values - 6 05. Previous action - 6 or 3 (joint or ee) 06. Joint velocities - 6
total: (3x2) + 1 + (6 or 3) + (6x2)
# depth image 480x640 32FC1
# rgb image 480x640X3 RGB images
- Returns:
An observation representing the current state of the environment.
- calculate_reward(achieved_goal=None, desired_goal=None, info=None)[source]
Compute the reward for achieving a given goal.
Sparse Reward: float => 1.0 for success, -1.0 for failure
- Dense Reward:
if reached: self.reached_goal_reward (positive reward) else: - self.mult_dist_reward * distance_to_the_goal
And as always negative rewards for each step, non-execution and actions not within joint limits
- check_if_done()[source]
Function to check if the episode is done.
The Task is done if the EE is close enough to the goal
- Returns:
A boolean value indicating whether the episode has ended (e.g. because a goal has been reached or a failure condition has been triggered)
- get_random_goal(max_tries=100)[source]
Function to get a reachable goal
- Parameters:
max_tries (int)
- check_action_within_goal_space_fk(action)[source]
Function to check if the given action is within the goal space
Niryo Ned2 — Real hardware
- class rl_environments.ned2.real.robot_envs.ned2_robot_real.NED2RobotEnv(ros_port=None, seed=None, close_env_prompt=False, action_cycle_time=0.0, use_kinect=False, use_zed2=False, remote_ip=None, local_ip=None, multi_device_mode=True)[source]
Bases:
RealBaseEnvSuperclass for all real NED2 Robot environments.
Initializes a new Robot Environment
Describe the robot and the sensors used in the env.
- Sensor Topic List:
MoveIt: To get the pose and rpy of the robot. /joint_states: JointState received for the joints of the robot /head_mount_kinect2/depth/image_raw: Depth image from the kinect sensor /head_mount_kinect2/rgb/image_raw: RGB image from the kinect sensor
- Actuators Topic List:
MoveIt!: Send the joint positions to the robot. /niryo_robot_follow_joint_trajectory_controller/command: Send the joint positions to the robot. /niryo_robot_tools_commander/action_server: Send the joint positions to the robot gripper.
- Parameters:
- ros_kin
Finished __init__ method
- fk_pykdl(action)[source]
Function to calculate the forward kinematics of the robot arm. We are using pykdl_utils.
- Parameters:
action – joint positions of the robot arm (in radians)
- Returns:
end-effector position as a numpy array
- Return type:
ee_position
- calculate_fk(joint_positions, euler=True)[source]
Calculate the forward kinematics of the robot arm using the ros_kinematics package.
- Parameters:
joint_positions – joint positions of the robot arm (in radians)
euler – True if the orientation is to be returned as euler angles (default: True)
- Returns:
True if the FK calculation is successful ee_position: end-effector position as a numpy array ee_rpy: end-effector orientation as a list of rpy or quaternion values
- Return type:
done
- calculate_ik(target_pos, ee_ori=array([0., 0., 0., 1.]))[source]
Calculate the inverse kinematics of the robot arm using the ros_kinematics package.
- Parameters:
target_pos – target end-effector position as a numpy array
ee_ori – end-effector orientation as a list of quaternion values (default: [0.0, 0.0, 0.0, 1.0])
- Returns:
True if the IK calculation is successful joint_positions: joint positions of the robot arm (in radians)
- Return type:
done
- move_arm_joints(q_positions, time_from_start=0.5)[source]
Set a joint position target only for the arm joints using low-level ros controllers.
- move_gripper_joints(action)[source]
Set a joint position target only for the gripper joints using low-level ros controllers. - ros action server
- smooth_trajectory(q_positions, time_from_start, multiplier=100)[source]
Smooth the trajectory by interpolating between the current and target positions.
- Parameters:
q_positions – target joint positions
time_from_start – time from start of the trajectory (set the speed to complete the trajectory)
multiplier – number of steps to interpolate between the current and target positions
- publish_trajectory(trajectory_points)[source]
Publish the entire trajectory at once.
- Parameters:
trajectory_points – List of tuples containing joint positions and time_from_start
- get_ee_rpy()[source]
Returns the end-effector orientation as a list of roll, pitch, and yaw angles.
- class rl_environments.ned2.real.robot_envs.ned2_robot_goal_real.NED2RobotGoalEnv(ros_port=None, seed=None, close_env_prompt=False, action_cycle_time=0.0, use_kinect=False, use_zed2=False, remote_ip=None, local_ip=None, multi_device_mode=True)[source]
Bases:
RealGoalEnvSuperclass for all real NED2 Robot environments. - For goal-conditioned tasks
Initializes a new Robot Goal Environment
Describe the robot and the sensors used in the env.
- Sensor Topic List:
MoveIt: To get the pose and rpy of the robot. /joint_states: JointState received for the joints of the robot /head_mount_kinect2/depth/image_raw: Depth image from the kinect sensor /head_mount_kinect2/rgb/image_raw: RGB image from the kinect sensor
- Actuators Topic List:
MoveIt!: Send the joint positions to the robot. /niryo_robot_follow_joint_trajectory_controller/command: Send the joint positions to the robot. /niryo_robot_tools_commander/action_server: Send the joint positions to the robot gripper.
- Parameters:
- ros_kin
Finished __init__ method
- fk_pykdl(action)[source]
Function to calculate the forward kinematics of the robot arm. We are using pykdl_utils.
- Parameters:
action – joint positions of the robot arm (in radians)
- Returns:
end-effector position as a numpy array
- Return type:
ee_position
- calculate_fk(joint_positions, euler=True)[source]
Calculate the forward kinematics of the robot arm using the ros_kinematics package.
- Parameters:
joint_positions – joint positions of the robot arm (in radians)
euler – True if the orientation is to be returned as euler angles (default: True)
- Returns:
True if the FK calculation is successful ee_position: end-effector position as a numpy array ee_rpy: end-effector orientation as a list of rpy or quaternion values
- Return type:
done
- calculate_ik(target_pos, ee_ori=array([0., 0., 0., 1.]))[source]
Calculate the inverse kinematics of the robot arm using the ros_kinematics package.
- Parameters:
target_pos – target end-effector position as a numpy array
ee_ori – end-effector orientation as a list of quaternion values (default: [0.0, 0.0, 0.0, 1.0])
- Returns:
True if the IK calculation is successful joint_positions: joint positions of the robot arm (in radians)
- Return type:
done
- move_arm_joints(q_positions, time_from_start=0.5)[source]
Set a joint position target only for the arm joints using low-level ros controllers.
- move_gripper_joints(action)[source]
Set a joint position target only for the gripper joints using low-level ros controllers. - ros action server
- smooth_trajectory(q_positions, time_from_start, multiplier=100)[source]
Smooth the trajectory by interpolating between the current and target positions.
- Parameters:
q_positions – target joint positions
time_from_start – time from start of the trajectory (set the speed to complete the trajectory)
multiplier – number of steps to interpolate between the current and target positions
- publish_trajectory(trajectory_points)[source]
Publish the entire trajectory at once.
- Parameters:
trajectory_points – List of tuples containing joint positions and time_from_start
- get_ee_rpy()[source]
Returns the end-effector orientation as a list of roll, pitch, and yaw angles.