---
library_name: transformers
pipeline_tag: robotics
tags:
  - robotics
  - foundation-model
  - gr00t
  - dual-camera
  - robot-learning
  - manipulation
  - embodied-ai
model_type: gr00t
datasets:
  - so101_wave_300k_dualcam
language:
  - en
base_model_relation: finetune
widget:
  - example_title: Robot Manipulation
    text: Dual camera robotics control for manipulation tasks
---

# GR00T Wave: Dual Camera Robotics Foundation Model

## Model Overview

GR00T Wave is a robotics foundation model fine-tuned on dual-camera manipulation data from the SO101 Wave dataset. By conditioning on two synchronized camera views instead of one, it aims to give manipulation policies a richer spatial picture of the workspace.

## Key Features

- **Dual Camera Input**: Processes two synchronized camera feeds for enhanced spatial understanding
- **Foundation Model Architecture**: Built on the GR00T framework for robotics applications
- **300K Training Steps**: Trained for 300,000 steps on manipulation demonstrations
- **Manipulation Focused**: Optimized for robotic manipulation and control tasks

## Model Details

- **Model Type**: GR00T robotics foundation model
- **Training Data**: SO101 Wave 300K Dual Camera dataset
- **Architecture**: Transformer-based with dual camera encoders
- **Training Steps**: 300,000 steps, with checkpoints at 150K and 300K
- **Input Modalities**: Dual RGB cameras, robot state
- **Output**: Robot actions and control commands
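
To make the input/output contract concrete, here is a minimal sketch of what one observation and the resulting action might look like. All key names and shapes below are illustrative assumptions, not the repository's actual schema; consult the dataset's modality configuration for the real layout.

```python
import numpy as np

# Hypothetical observation for a single timestep: two synchronized RGB
# frames plus proprioceptive robot state. Key names and shapes are
# assumptions for illustration only.
observation = {
    "video.front": np.zeros((480, 640, 3), dtype=np.uint8),   # camera 1
    "video.wrist": np.zeros((480, 640, 3), dtype=np.uint8),   # camera 2
    "state.joints": np.zeros(6, dtype=np.float32),            # joint positions
}

# The model maps observations to low-level control commands, typically a
# short action chunk (horizon x action_dim); the shape here is illustrative.
action_chunk = np.zeros((16, 6), dtype=np.float32)
```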

## Usage

```python
from transformers import AutoModel

# Load the model (trust_remote_code is required for the custom GR00T code)
model = AutoModel.from_pretrained("cagataydev/gr00t-wave", trust_remote_code=True)

# The model is now ready for robotics inference.
# Note: driving a real robot additionally requires a specialized
# robotics inference pipeline (see the sketch below).
```
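
The specialized pipeline mentioned above is not part of this repository. As a rough sketch of what a closed-loop deployment could look like, the stubs below (`get_observation`, `send_action`, and the `policy.get_action` interface) are hypothetical placeholders for your robot's I/O layer:

```python
import time
import numpy as np

def get_observation():
    # Hypothetical stub: return dual camera frames plus robot state,
    # shaped like the observation sketch in "Model Details".
    return {
        "video.front": np.zeros((480, 640, 3), dtype=np.uint8),
        "video.wrist": np.zeros((480, 640, 3), dtype=np.uint8),
        "state.joints": np.zeros(6, dtype=np.float32),
    }

def send_action(action):
    # Hypothetical stub: forward the command to the robot controller.
    print("would send action with shape", np.asarray(action).shape)

def control_loop(policy, steps=100, hz=10):
    """Observe, infer, act at a fixed control rate."""
    for _ in range(steps):
        t0 = time.time()
        action = policy.get_action(get_observation())  # assumed interface
        send_action(action)
        time.sleep(max(0.0, 1.0 / hz - (time.time() - t0)))
```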

## Training Configuration

- **Base Model**: GR00T N1.5-3B
- **Dataset**: SO101 Wave 300K Dual Camera
- **Training Framework**: Custom robotics training pipeline
- **Batch Size**: Sized for dual camera inputs
- **Optimization**: AdamW with custom learning rate scheduling
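
The actual hyperparameters are not published in this card. As a generic illustration of AdamW with a warmup-then-cosine schedule over 300K steps, the values below (learning rate, warmup length, weight decay) are placeholders only:

```python
import torch

net = torch.nn.Linear(8, 8)  # stand-in for the actual policy network

# Placeholder hyperparameters -- not the model's real training settings.
optimizer = torch.optim.AdamW(net.parameters(), lr=1e-4, weight_decay=0.01)
warmup = torch.optim.lr_scheduler.LinearLR(
    optimizer, start_factor=0.01, total_iters=2_000)
cosine = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=298_000)  # decay over the remaining steps to 300K
scheduler = torch.optim.lr_scheduler.SequentialLR(
    optimizer, schedulers=[warmup, cosine], milestones=[2_000])

for step in range(300):  # forward/backward pass would go here
    optimizer.step()
    scheduler.step()
```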

## Model Files

The repository contains:

- **SafeTensors model files**:
  - `model-00001-of-00002.safetensors` (4.7GB)
  - `model-00002-of-00002.safetensors` (2.4GB)
- **Configuration files**:
  - `config.json`
  - `model.safetensors.index.json`
- **Training checkpoints**:
  - `checkpoint-150000/` (16GB)
  - `checkpoint-300000/` (16GB)
- **Training metadata**:
  - `trainer_state.json`
  - `training_args.bin`
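
Because the two training checkpoints add roughly 32GB, it is often enough to download just the final model. A sketch using `huggingface_hub.snapshot_download` with its `ignore_patterns` parameter:

```python
from huggingface_hub import snapshot_download

# Fetch the final model files only, skipping the ~16GB checkpoint folders.
local_dir = snapshot_download(
    repo_id="cagataydev/gr00t-wave",
    ignore_patterns=["checkpoint-*"],
)
print("model files at:", local_dir)
```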

## Evaluation

The model was evaluated on robotic manipulation rollouts with the following setup:

- **Evaluation Steps**: 150 per checkpoint
- **Trajectory Count**: 5 trajectories per evaluation
- **Data Configuration**: SO100 dual camera setup
- **Metrics**: Success rate, manipulation accuracy, and task completion
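
The evaluation harness itself is not included here; purely as an illustration of how a success rate over five rollouts per checkpoint might be aggregated (the outcomes below are made-up placeholders, not reported results):

```python
# Placeholder outcomes (True = task completed); NOT reported results.
results = {
    "checkpoint-150000": [True, False, True, True, False],
    "checkpoint-300000": [True, True, True, False, True],
}

for name, outcomes in results.items():
    rate = sum(outcomes) / len(outcomes)
    print(f"{name}: {rate:.0%} success over {len(outcomes)} trajectories")
```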

## Applications

This model is suitable for:

- **Robotic Manipulation**: Pick-and-place operations
- **Dual Camera Systems**: Tasks that benefit from two synchronized viewpoints
- **Manufacturing Automation**: Assembly and quality control
- **Research**: A foundation for robotics research and development

## Technical Specifications

- **Model Size**: ~7.1GB (SafeTensors format)
- **Total Repository Size**: ~40GB (including checkpoints)
- **Inference Requirements**: GPU with sufficient VRAM for transformer inference
- **Framework Compatibility**: Transformers, PyTorch
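
To keep GPU memory near the ~7.1GB checkpoint size rather than upcasting to float32, the standard `transformers` options `torch_dtype` and `device_map` can be used (loading still requires `trust_remote_code`, as above):

```python
import torch
from transformers import AutoModel

# Load weights in bfloat16 and place them on available GPU(s).
# device_map="auto" requires the `accelerate` package.
model = AutoModel.from_pretrained(
    "cagataydev/gr00t-wave",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
print(f"{sum(p.numel() for p in model.parameters()) / 1e9:.1f}B parameters")
```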

## Installation

```bash
# Install required dependencies
pip install transformers torch torchvision
pip install huggingface_hub

# Log in to Hugging Face (required if the repository is private)
huggingface-cli login
```
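
Authentication can also be done from Python via `huggingface_hub`'s `login()` helper:

```python
from huggingface_hub import login

# Prompts for an access token interactively;
# alternatively pass token="hf_..." directly.
login()
```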

## Limitations

- Requires a specialized robotics inference pipeline
- Optimized for specific dual camera configurations
- Performance may vary across robot platforms
- Needs adequate computational resources for real-time inference

## Model Card

This model card summarizes the GR00T Wave model's capabilities, limitations, and intended use cases. The model builds on the GR00T family of robotics foundation models and extends it with dual camera input.

## Ethical Considerations

This model is designed for robotics research and industrial applications. Users should ensure:

- Safe deployment in robotics systems
- Appropriate safety measures for physical robot control
- Compliance with relevant safety standards
- Responsible use in manufacturing and research environments

## Version History

- **v1.0**: Initial release, trained for 300K steps
- **Checkpoints**: Available at 150K and 300K training steps

## Support

For technical questions and implementation support, please refer to the model documentation and community resources.