The Source of Ground-Truth Human Motion & Behavior Data
142,200 motion sequences structured and optimized for humanoid control research. Focused on locomotion, object manipulation, gestures, and everyday human behavior — the motions robots actually need to learn.
Available in SOMA, Unitree G1 (MuJoCo compatible), and Vicon skeleton formats. Full semantic metadata with up to 6 natural language descriptions per motion and hierarchical categorization. Open source for academic use and qualifying startups. Gated access on Hugging Face.
700+ hours of labeled and annotated human 3D animations. Performed by 170+ physically diverse performers — including professionals like stuntmen, soldiers, and dancers.
Every motion captured in 20+ styles, from emotions to physical conditions. Optical motion capture (Vicon) at 120 fps with submillimeter accuracy. Unified skeleton retargeting across all files. Available in BVH and FBX. 3 multiview videos per take. Rich metadata with up to 5 descriptions per motion.
When you license BONES data, you get full IP clearance. Enterprise-ready from day one.
Systematically designed taxonomy across 21 categories:
Basic, advanced, unusual, complex actions - from walking to parkour
Objects, manipulation, household, environments - grasping, opening, kitchen tasks
Gestures, pointing, consuming - body language, everyday behaviors
Dancing, sports, martial arts, stunts, combat - full-body dynamics
BONES RP1 data powered SONIC - a foundation model for humanoid whole-body control.
The only dataset where 3D motion, video, audio, face capture, and 3D scene reconstruction are synchronized frame-by-frame across the same performance. Not stitched from separate sources — captured together, in one take.
The depth to train. The precision to evaluate. Wherever your pipeline needs spatial awareness, physics verification, or human behavior understanding.
- Skeletal 3D motion
- Full hand articulation
- Face video capture + FACS
- Egocentric stereo vision
- 3D body scans & models
- Audio & voice
- 8× 4K multi-view video
- Temporal annotations (action-level)