Visuotactile Manipulation with Diffusion Policy

Project Information

  • Category: Robot Learning · Manipulation
  • Role: Graduate Researcher · Purdue
  • Date: 2026

Tech Stack

  • Diffusion Policy
  • Imitation Learning
  • GelSight Tactile
  • PyTorch
  • ROS
  • Sensor Fusion


CONTACT (CONtact-aware TACTile learning for robotic disassembly) asks a simple question with hard consequences: how much better can a robot manipulate when it can feel, not just see?

We extended the Stanford Diffusion Policy framework to ingest GelSight tactile readings alongside camera observations, building an end-to-end multi-modal pipeline for data collection, training, and on-robot deployment. The policy learns from demonstrations and conditions its action prediction on the fused visual–tactile state. That fused state matters most when vision falls short: during occluded contact, slip, and fine insertion.
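Fusing two sensor streams first requires pairing readings that were captured at roughly the same instant, since a camera and a tactile sensor rarely publish at identical times. The project description does not say how CONTACT synchronizes its streams, so the following is only a minimal sketch of one common approach, nearest-timestamp matching with a tolerance. The `Frame` container, the function names, and the 20 ms tolerance are all hypothetical and chosen for illustration.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import Any, List, Optional, Sequence, Tuple


@dataclass
class Frame:
    """One timestamped sensor reading (hypothetical container, not from the project)."""
    stamp: float  # capture time in seconds
    data: Any     # e.g. a camera image or a GelSight tactile image


def nearest_index(stamps: Sequence[float], stamp: float) -> Optional[int]:
    """Index of the entry in sorted `stamps` closest to `stamp`, or None if empty."""
    if not stamps:
        return None
    i = bisect_left(stamps, stamp)
    # Only the neighbors on either side of the insertion point can be closest.
    candidates = range(max(i - 1, 0), min(i + 1, len(stamps)))
    return min(candidates, key=lambda j: abs(stamps[j] - stamp))


def pair_streams(
    camera: Sequence[Frame],
    tactile: Sequence[Frame],
    tol: float = 0.02,  # assumed tolerance in seconds, for illustration only
) -> List[Tuple[Frame, Frame]]:
    """Pair each camera frame with the closest tactile frame within `tol` seconds.

    Both sequences must be sorted by ascending timestamp. Camera frames with
    no tactile reading inside the tolerance are dropped rather than paired
    with stale data.
    """
    tactile_stamps = [f.stamp for f in tactile]
    pairs = []
    for cam in camera:
        j = nearest_index(tactile_stamps, cam.stamp)
        if j is not None and abs(tactile_stamps[j] - cam.stamp) <= tol:
            pairs.append((cam, tactile[j]))
    return pairs


# Usage: the third camera frame has no tactile reading within 20 ms, so it is dropped.
camera = [Frame(0.00, "c0"), Frame(0.10, "c1"), Frame(0.20, "c2")]
tactile = [Frame(0.005, "t0"), Frame(0.06, "t1"), Frame(0.104, "t2"), Frame(0.30, "t3")]
paired = pair_streams(camera, tactile, tol=0.02)
print([(c.data, t.data) for c, t in paired])  # [('c0', 't0'), ('c1', 't2')]
```

Dropping unmatched frames, rather than reusing the last tactile reading, is a design choice in this sketch: it keeps each training sample's visual and tactile inputs consistent with each other, at the cost of discarding some data.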

Trained policies were deployed on physical hardware to validate the approach on tactile-rich disassembly tasks, where the tactile channel produced measurable success-rate improvements over a vision-only baseline. The work has been accepted to IEEE/RSJ IROS 2026, one of the flagship conferences in robotics.

Highlights

  • Accepted to IEEE/RSJ IROS 2026
  • Extended the Stanford Diffusion Policy framework to fuse GelSight tactile feedback with vision
  • Built a multi-modal training pipeline to study tactile-enhanced manipulation
  • Trained and deployed custom policies on a physical robot for contact-rich tasks
  • Demonstrated measurable success-rate gains over a vision-only baseline