Camera System for MuJoCo G1 Simulator

Overview

The simulator has two cameras defined:

1. head_camera - Robot Ego-View

  • Location: Attached to torso_link body
  • Position: [0.06, 0.0, 0.45] relative to torso (6 cm forward, 45 cm up)
  • Orientation: euler="0 -0.8 -1.57" (facing forward, slightly tilted down)
  • FOV: 90 degrees
  • Purpose: First-person view from the robot's perspective (like a head-mounted camera)

2. global_view - Third-Person View

  • Location: Fixed in world coordinates
  • Position: [2.910, -5.040, 3.860] (behind and above the robot)
  • Purpose: External observer view for visualization

How Camera Publishing Works

The camera system uses a zero-copy architecture with three components:

┌─────────────────────┐
│   MuJoCo Simulator  │
│  (Main Process)     │
│                     │
│  1. Render cameras  │
│  2. Copy to shmem   │──┐
└─────────────────────┘  │
                         │ Shared Memory
                         │ (fast IPC)
                    ┌────▼────────────────┐
                    │  Image Publisher    │
                    │  (Subprocess)       │
                    │                     │
                    │  3. Encode images   │
                    │  4. ZMQ publish     │
                    └─────────┬───────────┘
                              │
                              │ TCP (ZMQ)
                              │ port 5555
                              ▼
                    ┌─────────────────────┐
                    │  Your Policy/Client │
                    │  (Subscribe)        │
                    └─────────────────────┘

Key Technologies:

  • MuJoCo Renderer: Captures RGB images from virtual cameras
  • Shared Memory (multiprocessing.shared_memory): Zero-copy transfer between processes
  • ZMQ (ZeroMQ): Network socket for publishing images (TCP)
  • No ROS2 required! Pure Python multiprocessing
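
As a rough sketch of how that shared-memory handoff works (the block name, frame shape, and variable names here are illustrative, not the simulator's actual internals):

import numpy as np
from multiprocessing import shared_memory

# Producer side (simulator process): allocate a block sized for one RGB frame
shape, dtype = (480, 640, 3), np.uint8
shm = shared_memory.SharedMemory(create=True, name="head_camera_frame",
                                 size=int(np.prod(shape)))
frame = np.ndarray(shape, dtype=dtype, buffer=shm.buf)
frame[:] = 0  # the renderer would write pixels here each tick

# Consumer side (publisher subprocess): attach to the same block by name
shm_view = shared_memory.SharedMemory(name="head_camera_frame")
latest = np.ndarray(shape, dtype=dtype, buffer=shm_view.buf)
# encode `latest` and publish over ZMQ; no pickling or pipe copy involved

# Cleanup when done: shm_view.close(); shm.close(); shm.unlink()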

Usage

Basic Simulation (No Camera Publishing)

python run_sim.py

With Camera Publishing

# Publish head camera on default port 5555
python run_sim.py --publish-images

# Publish multiple cameras
python run_sim.py --publish-images --cameras head_camera global_view

# Custom port
python run_sim.py --publish-images --camera-port 6000

Viewing Camera Streams

In a separate terminal, run the camera viewer:

# Basic usage (default: localhost:5555)
python view_cameras.py

# Custom host/port
python view_cameras.py --host 192.168.1.100 --port 6000

# Save images to directory
python view_cameras.py --save ./camera_recordings

# Adjust display rate
python view_cameras.py --fps 60

Keyboard Controls:

  • q: Quit viewer
  • s: Save snapshot of current frame

Example Workflow:

# Terminal 1: Start simulator with camera publishing
python run_sim.py --publish-images

# Terminal 2: View the camera feed
python view_cameras.py

Receiving Images in Your Code

import zmq
from sim.sensor_utils import ImageUtils

# Connect to camera publisher
context = zmq.Context()
socket = context.socket(zmq.SUB)
socket.connect("tcp://localhost:5555")
socket.setsockopt(zmq.SUBSCRIBE, b"")  # Subscribe to all messages

while True:
    # Receive serialized image data
    data = socket.recv_pyobj()

    # Decode images
    if "head_camera" in data:
        # Decode image (returns numpy array HxWx3 uint8)
        img = ImageUtils.decode_image(data["head_camera"])

        # Use image for your policy (process_observation is your own function)
        process_observation(img)
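
If your policy only needs the freshest frame, a standard ZMQ option (not specific to this simulator) is to conflate the subscription so stale frames are dropped. A variation of the snippet above:

# Keep only the newest message in the receive queue.
# ZMQ requires CONFLATE to be set before connect().
socket = context.socket(zmq.SUB)
socket.setsockopt(zmq.CONFLATE, 1)
socket.connect("tcp://localhost:5555")
socket.setsockopt(zmq.SUBSCRIBE, b"")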

Camera Configuration

Edit config.yaml to change camera settings:

IMAGE_DT: 0.033333  # Publishing interval in seconds (1/30 s, i.e. 30 Hz)
ENABLE_OFFSCREEN: false  # Set to true to enable offscreen camera rendering
MP_START_METHOD: "spawn"  # Multiprocessing start method

Or programmatically in run_sim.py:

camera_configs = {
    "head_camera": {
        "height": 480,
        "width": 640
    },
    "custom_camera": {
        "height": 224,
        "width": 224
    }
}
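
For reference, offscreen capture in MuJoCo's Python bindings looks roughly like this; a minimal sketch using the standard mujoco.Renderer API rather than this repo's wrappers, and assuming the scene XML pulls in the robot model that defines head_camera:

import mujoco

model = mujoco.MjModel.from_xml_path("assets/scene_43dof.xml")
data = mujoco.MjData(model)
mujoco.mj_forward(model, data)  # compute poses before rendering

# One renderer per resolution; the camera is chosen at update time
renderer = mujoco.Renderer(model, height=480, width=640)
renderer.update_scene(data, camera="head_camera")
rgb = renderer.render()  # numpy array (480, 640, 3), uint8, RGB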

Adding Custom Cameras

Edit assets/g1_29dof_with_hand.xml or assets/scene_43dof.xml:

<!-- Camera attached to robot body -->
<body name="torso_link" pos="0 0 0.019">
  <camera name="my_camera" pos="0.1 0.0 0.5" euler="0 0 0" fovy="60"/>
</body>

<!-- Camera in world coordinates -->
<worldbody>
  <camera name="side_view" pos="0 -3.0 1.5" xyaxes="1 0 0 0 0.5 0.866"/>
</worldbody>

Then publish it:

python run_sim.py --publish-images --cameras my_camera
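
To confirm the new camera actually made it into the compiled model before publishing, a quick check with the standard MuJoCo bindings (sketch):

import mujoco

model = mujoco.MjModel.from_xml_path("assets/scene_43dof.xml")

# mj_name2id returns -1 when the name is not in the model
cam_id = mujoco.mj_name2id(model, mujoco.mjtObj.mjOBJ_CAMERA, "my_camera")
print("my_camera id:", cam_id)

# List every camera the model defines
for i in range(model.ncam):
    print(mujoco.mj_id2name(model, mujoco.mjtObj.mjOBJ_CAMERA, i))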

Performance Notes

  • Rendering overhead: ~5-10 ms per camera per frame at 640x480
  • Publishing overhead: ~2-3 ms for encoding + network
  • Image publishing runs in a separate subprocess so it does not block the simulation
  • Shared memory keeps inter-process image transfer fast
  • Target: 30 FPS camera publishing while maintaining the 500 Hz simulation (see the decimation sketch below)
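
The 30 FPS target falls out of the rate ratio: at a 500 Hz physics step (2 ms), one frame is rendered roughly every IMAGE_DT / SIM_DT ≈ 17 steps. A sketch of that decimation loop, with illustrative constants (the repo's actual step timing may differ):

SIM_DT = 0.002       # 500 Hz physics step
IMAGE_DT = 0.033333  # 30 Hz camera publishing

render_every = round(IMAGE_DT / SIM_DT)  # ~17 physics steps per frame

for step in range(10_000):
    # mujoco.mj_step(model, data)  # advance physics here
    if step % render_every == 0:
        pass  # render cameras and copy frames into shared memory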

Troubleshooting

No images received?

  1. Check that offscreen rendering is enabled (pass the --publish-images flag)
  2. Verify the ZMQ port is not blocked (see the connectivity sketch at the end of this section)
  3. Check camera exists in scene XML

Images are delayed?

  • Reduce IMAGE_DT in config
  • Lower camera resolution
  • Use fewer cameras

"Camera not found" error?

  • Verify camera name in XML matches config
  • Check XML syntax is valid
  • Ensure MuJoCo model loads successfully
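
When debugging "no images received", a quick way to tell a dead publisher from a slow one is a ZMQ poll with a timeout (a self-contained sketch, assuming the default port):

import zmq

context = zmq.Context()
socket = context.socket(zmq.SUB)
socket.connect("tcp://localhost:5555")
socket.setsockopt(zmq.SUBSCRIBE, b"")

poller = zmq.Poller()
poller.register(socket, zmq.POLLIN)

# Wait up to 5 seconds for a single message
if poller.poll(timeout=5000):
    print("publisher is alive:", type(socket.recv_pyobj()))
else:
    print("no data in 5 s: check --publish-images and the port")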

Quick Reference

File Structure

mujoco_sim_g1/
├── run_sim.py              # Simulator launcher
├── view_cameras.py         # Camera viewer
├── config.yaml             # Simulator config
├── assets/
│   ├── scene_43dof.xml     # Scene with global_view camera
│   └── g1_29dof_with_hand.xml  # Robot model with head_camera
└── sim/
    ├── base_sim.py         # MuJoCo environment
    ├── sensor_utils.py     # ZMQ camera server/client
    └── image_publish_utils.py  # Multiprocessing image publisher

Camera Definitions

Edit these files to modify cameras:

assets/g1_29dof_with_hand.xml - Robot-attached cameras:

<body name="torso_link" pos="0 0 0.019">
  <camera name="head_camera" pos="0.06 0.0 0.45" euler="0 -0.8 -1.57" fovy="90"/>
</body>

assets/scene_43dof.xml - World-frame cameras:

<worldbody>
  <camera name="global_view" pos="2.910 -5.040 3.860" xyaxes="0.866 0.500 0.000 -0.250 0.433 0.866" fovy="45"/>
</worldbody>

Complete Example

# Terminal 1: Start simulator with camera publishing
cd mujoco_sim_g1
python run_sim.py --publish-images --cameras head_camera global_view

# Terminal 2: View cameras in real-time
python view_cameras.py

# Terminal 3: Use in your policy (Python code)
from sim.sensor_utils import SensorClient, ImageUtils
client = SensorClient()
client.start_client("localhost", 5555)
data = client.receive_message()
img = ImageUtils.decode_image(data["head_camera"])
# img is now numpy array (H, W, 3) in BGR format
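
Since decoded frames arrive in BGR (OpenCV's channel order), a final conversion step is common for policies that expect RGB; either line below works (the cv2 route assumes OpenCV is installed):

import cv2
rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
# or, without OpenCV: reverse the channel axis
rgb = img[..., ::-1]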