# SAM‑TP Traversability Dataset
This repository contains pixel‑wise traversability masks paired with egocentric RGB images, prepared in a flat, filename‑aligned layout that is convenient for training SAM‑2 / SAM‑TP‑style segmentation models.
## Folder layout

```text
.
├─ images/        # RGB frames (.jpg/.png). Filenames are globally unique.
├─ annotations/   # Binary masks (.png/.jpg). Filenames match images 1-to-1.
└─ manifest.csv   # Provenance rows and any missing-pair notes.
```
Each `annotations/<FILENAME>` is the mask for the image with the matching filename in `images/`. Pair by filename *stem*: the extension may differ between image and mask (e.g. a `.jpg` image with a `.png` mask, as in the naming example below).
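A quick way to sanity-check the 1-to-1 alignment after downloading (a minimal sketch; `check_pairs` is a hypothetical helper, not part of the dataset tooling — it matches by stem since image and mask extensions can differ):

```python
from pathlib import Path

def check_pairs(root: str) -> tuple[set, set]:
    """Return (image stems missing a mask, mask stems without an image)."""
    root = Path(root)
    img_stems = {p.stem for p in (root / "images").iterdir() if p.is_file()}
    msk_stems = {p.stem for p in (root / "annotations").iterdir() if p.is_file()}
    return img_stems - msk_stems, msk_stems - img_stems
```

Both returned sets should be empty; anything else should also be reflected in `manifest.csv`'s missing-pair notes.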
## File naming

Filenames are made globally unique by concatenating the original subfolder path and the local stem with `__` separators, e.g.

```text
ride_68496_8ef98b_20240716023032_517__1.jpg   # image
ride_68496_8ef98b_20240716023032_517__1.png   # corresponding mask
```
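If you need to recover the original location of a frame, the flattening can be inverted by splitting on the `__` separator (a sketch; `original_parts` is a hypothetical helper, and it assumes original folder names never contain a double underscore themselves):

```python
def original_parts(filename: str) -> tuple[str, str]:
    """Split a flattened filename into (original subfolder path, local stem).

    Returns an empty folder for filenames that were never nested."""
    stem = filename.rsplit(".", 1)[0]          # drop the extension
    folder, _, local = stem.rpartition("__")   # last "__" separates the local stem
    return folder.replace("__", "/"), local    # earlier "__" were path separators
```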
## Mask format

- Single‑channel binary masks; foreground = traversable, background = non‑traversable.
- Stored as `.png` or `.jpg` depending on source. If your pipeline requires PNG, convert on the fly in your dataloader.
- Values are typically `{0, 255}`. You can binarize via `mask = (mask > 127).astype(np.uint8)`.
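Putting the points above together, a mask can be loaded and binarized like this (a minimal sketch; `load_binary_mask` is a hypothetical helper — the `> 127` threshold also absorbs any JPEG compression noise in `.jpg` masks):

```python
import numpy as np
from PIL import Image

def load_binary_mask(path) -> np.ndarray:
    """Load a mask as single-channel and binarize {0, 255} -> {0, 1}."""
    mask = np.asarray(Image.open(path).convert("L"))
    return (mask > 127).astype(np.uint8)
```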
## How to use

### A) Load with `datasets` (ImageFolder‑style)
```python
import io
from pathlib import Path

from datasets import Image as HFImage, load_dataset
from PIL import Image

REPO = "jamiewjm/sam-tp"

def load_folder(folder):
    ds = load_dataset(
        "imagefolder",
        data_files={"train": f"hf://datasets/{REPO}/{folder}/**"},
        split="train",
    )
    # decode=False yields {"path", "bytes"} records so filenames stay accessible
    return ds.cast_column("image", HFImage(decode=False))

def to_pil(rec, mode):
    # Depending on how the files were materialized, either field may be set
    if rec["bytes"] is not None:
        return Image.open(io.BytesIO(rec["bytes"])).convert(mode)
    return Image.open(rec["path"]).convert(mode)

ds_imgs = load_folder("images")
ds_msks = load_folder("annotations")

# Build a mask index by filename stem (image/mask extensions may differ)
mask_index = {Path(r["image"]["path"]).stem: r["image"] for r in ds_msks}

img_rec = ds_imgs[0]["image"]
msk_rec = mask_index[Path(img_rec["path"]).stem]

img = to_pil(img_rec, "RGB")
msk = to_pil(msk_rec, "L")
```
### B) Minimal PyTorch dataset
```python
from pathlib import Path

from PIL import Image
from torch.utils.data import Dataset

class TraversabilityDataset(Dataset):
    def __init__(self, root):
        root = Path(root)
        self.img_dir = root / "images"
        self.msk_dir = root / "annotations"
        self.items = sorted(p for p in self.img_dir.iterdir() if p.is_file())
        # Index masks by stem: image/mask extensions may differ (.jpg vs .png)
        self.masks = {p.stem: p for p in self.msk_dir.iterdir() if p.is_file()}

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        ip = self.items[idx]
        mp = self.masks[ip.stem]
        return Image.open(ip).convert("RGB"), Image.open(mp).convert("L")
```
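Because the dataset yields PIL pairs, a `DataLoader` needs a collate function that turns them into tensors. A minimal sketch (the `collate` helper is an assumption, not part of the dataset; it also assumes all frames in a batch share one resolution, so resize first if they do not):

```python
import numpy as np
import torch
from torch.utils.data import DataLoader

def collate(batch):
    """Stack (PIL image, PIL mask) pairs into float/long tensors."""
    imgs, msks = zip(*batch)
    # Images: HWC uint8 -> CHW float in [0, 1]
    x = torch.stack([
        torch.from_numpy(np.array(i)).permute(2, 0, 1).float() / 255.0
        for i in imgs
    ])
    # Masks: binarize {0, 255} -> {0, 1} class labels
    y = torch.stack([
        (torch.from_numpy(np.array(m)) > 127).long()
        for m in msks
    ])
    return x, y

# loader = DataLoader(TraversabilityDataset("."), batch_size=4,
#                     shuffle=True, collate_fn=collate)
```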
### C) Pre‑processing notes for SAM‑2/SAM‑TP training
- Resize/pad to your training resolution (commonly 1024×1024) with masks aligned.
- Normalize images per your backbone’s recipe.
- If your trainer expects COCO‑RLE masks, convert PNG → RLE in the dataloader stage.
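The resize/pad step can be sketched as follows (a hypothetical `resize_pad` helper, not an official SAM‑2 recipe; it anchors the content at the top-left, zero-pads to a square, and uses nearest-neighbor resampling for the mask so binary labels stay binary):

```python
from PIL import Image

def resize_pad(img: Image.Image, msk: Image.Image, size: int = 1024):
    """Resize the longer side to `size`, then zero-pad to size x size.

    Bilinear resampling for the image, nearest for the mask, so the
    two stay pixel-aligned and mask values remain in {0, 255}."""
    scale = size / max(img.size)
    new_w, new_h = round(img.width * scale), round(img.height * scale)
    img = img.resize((new_w, new_h), Image.BILINEAR)
    msk = msk.resize((new_w, new_h), Image.NEAREST)
    img_out = Image.new("RGB", (size, size))   # zero-padded canvas
    msk_out = Image.new("L", (size, size))     # padding counts as background
    img_out.paste(img, (0, 0))
    msk_out.paste(msk, (0, 0))
    return img_out, msk_out
```

If your loss should ignore the padded region, also return `(new_w, new_h)` and mask the loss outside it.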
## Provenance & splits

- The dataset was flattened from mirrored directory trees (`images` and `annotations`) with 1‑to‑1 filename alignment.
- If you create explicit `train`/`val`/`test` splits, please add a `split` column to a copy of `manifest.csv` and contribute it back.
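One reproducible way to generate such a split is to hash filenames, so the assignment is stable across machines and reruns (a sketch; `assign_split` and the 80/10/10 ratios are assumptions, not a recommended protocol):

```python
import hashlib

def assign_split(filename: str, val_frac: float = 0.1, test_frac: float = 0.1) -> str:
    """Deterministically map a filename to "train"/"val"/"test"."""
    # Map the MD5 digest to a float in [0, 1); stable across runs/machines
    h = int(hashlib.md5(filename.encode()).hexdigest(), 16) / 16**32
    if h < test_frac:
        return "test"
    if h < test_frac + val_frac:
        return "val"
    return "train"
```

Since consecutive frames from one ride are near-duplicates, hashing the ride prefix (the part of the filename before `__`) instead of the full filename avoids leaking frames from the same ride across splits.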
## License
Data: CC‑BY‑4.0 (Attribution). See LICENSE for details.
## Citation

If you use this dataset in academic or industrial research, please cite the accompanying paper describing the data collection and labeling protocol, which contains the SAM-TP traversability dataset and evaluation methodology:

**GeNIE: A Generalizable Navigation System for In-the-Wild Environments**
Available at: https://arxiv.org/abs/2506.17960
```bibtex
@article{wang2025genie,
  title   = {GeNIE: A Generalizable Navigation System for In-the-Wild Environments},
  author  = {Wang, Jiaming and others},
  journal = {arXiv preprint arXiv:2506.17960},
  year    = {2025},
  url     = {https://arxiv.org/abs/2506.17960}
}

@misc{sam_tp_dataset,
  title        = {SAM-TP Traversability Dataset},
  howpublished = {Hugging Face Datasets},
  year         = {2025},
  note         = {URL: https://huggingface.co/datasets/jamiewjm/sam-tp}
}
```