Visuotactile Manipulation with Diffusion Policy

Project Information

Category: Robot Learning · Manipulation
Role: Graduate Researcher · Purdue
Date: 2026

Tech Stack

Diffusion Policy · Imitation Learning · GelSight Tactile · PyTorch · ROS · Sensor Fusion

CONTACT (CONtact-aware TACTile learning for robotic disassembly) asks a simple question with hard consequences: how much better can a robot manipulate when it can feel, not just see?

We extended the Stanford Diffusion Policy framework to ingest GelSight tactile readings alongside camera observations, building an end-to-end multi-modal pipeline for data collection, training, and on-robot deployment. The policy learns from demonstrations and conditions its action prediction on the fused visual–tactile state, which matters most when vision falls short: during occluded contact, slip, and fine insertion.
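To make the idea of conditioning on a fused visual–tactile state concrete, here is a minimal PyTorch sketch. It is an illustration under stated assumptions, not the project's actual architecture: the class and function names (`VisuoTactileEncoder`, `NoisePredictor`, `diffusion_loss`), the image sizes, the 7-dimensional action, the 8-step action horizon, and the small MLP denoiser are all hypothetical. It also assumes the tactile reading arrives as an RGB image, and it uses a plain DDPM-style noise-prediction loss with a linear beta schedule.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def make_cnn(feat_dim: int) -> nn.Sequential:
    """Small image encoder; the same layout is used for both modalities."""
    return nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
        nn.Conv2d(16, 32, kernel_size=3, stride=2), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(32, feat_dim),
    )


class VisuoTactileEncoder(nn.Module):
    """Hypothetical encoder: one branch per modality, fused by concatenation."""

    def __init__(self, feat_dim: int = 64):
        super().__init__()
        self.vision = make_cnn(feat_dim)
        self.tactile = make_cnn(feat_dim)  # assumes tactile frames are RGB images
        self.fuse = nn.Sequential(nn.Linear(2 * feat_dim, feat_dim), nn.ReLU())

    def forward(self, rgb: torch.Tensor, tactile: torch.Tensor) -> torch.Tensor:
        z = torch.cat([self.vision(rgb), self.tactile(tactile)], dim=-1)
        return self.fuse(z)


class NoisePredictor(nn.Module):
    """Hypothetical MLP denoiser conditioned on the fused state and the timestep."""

    def __init__(self, action_dim: int, horizon: int, cond_dim: int, hidden: int = 128):
        super().__init__()
        self.horizon, self.action_dim = horizon, action_dim
        in_dim = horizon * action_dim + cond_dim + 1  # +1 for the normalized timestep
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, horizon * action_dim),
        )

    def forward(self, noisy_actions, t_frac, cond):
        x = torch.cat([noisy_actions.flatten(1), cond, t_frac[:, None]], dim=-1)
        return self.net(x).view(-1, self.horizon, self.action_dim)


def diffusion_loss(encoder, predictor, rgb, tactile, actions, alphas_cumprod):
    """DDPM-style loss: noise the demonstrated actions, then predict that noise."""
    batch = actions.shape[0]
    num_steps = alphas_cumprod.shape[0]
    t = torch.randint(0, num_steps, (batch,))
    a_bar = alphas_cumprod[t].view(batch, 1, 1)
    noise = torch.randn_like(actions)
    noisy = a_bar.sqrt() * actions + (1.0 - a_bar).sqrt() * noise
    cond = encoder(rgb, tactile)  # fused visual-tactile conditioning vector
    pred = predictor(noisy, t.float() / num_steps, cond)
    return F.mse_loss(pred, noise)


# Usage on random tensors standing in for one batch of demonstrations.
torch.manual_seed(0)
encoder = VisuoTactileEncoder(feat_dim=64)
predictor = NoisePredictor(action_dim=7, horizon=8, cond_dim=64)
betas = torch.linspace(1e-4, 0.02, 100)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

rgb = torch.rand(4, 3, 96, 96)       # camera frames
tactile = torch.rand(4, 3, 64, 64)   # tactile frames
actions = torch.randn(4, 8, 7)       # demonstrated action sequences

loss = diffusion_loss(encoder, predictor, rgb, tactile, actions, alphas_cumprod)
loss.backward()
```

Because the tactile branch feeds the same conditioning vector as the camera branch, the denoising loss sends gradients to both encoders. A vision-only baseline can be approximated in this sketch by dropping the tactile branch and halving the input width of the fusion layer.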

We deployed the trained policies on physical hardware and evaluated them on tactile-rich disassembly tasks, where adding the tactile channel measurably improved success rates over a vision-only baseline. The work has been accepted to IEEE/RSJ IROS 2026, one of the flagship conferences in robotics.

Highlights

Accepted to IEEE/RSJ IROS 2026
Extended the Stanford Diffusion Policy framework to fuse GelSight tactile feedback with vision
Built a multi-modal training pipeline to study tactile-enhanced manipulation
Trained and deployed custom policies on a physical robot for contact-rich tasks
Demonstrated measurable success-rate gains over a vision-only baseline