About Me
I teach robots to perceive, reason, and act. As we move robots out of industrial “cages” and into people’s homes and daily lives, we must build generalist robotic models that can understand the world, reason in a human-centric way, and carry out meaningful real-world tasks.
Jiafei Duan is a fourth- and final-year PhD student in Robotics and AI at the Paul G. Allen School of Computer Science & Engineering, University of Washington, co-advised by Dieter Fox and Ranjay Krishna. His research centers on robot learning, embodied AI, and building large-scale robotics foundation models. His work has received Best Paper, Spotlight, and Oral recognitions at venues including ICLR, Ubiquitous Robots, and RSS, and has been featured in MIT Technology Review, GeekWire, VentureBeat, and Business Wire.
Jiafei is also a Graduate Student Researcher at the Allen Institute for AI (AI2) and has previously worked as a Research Scientist Intern at NVIDIA. He earned his B.Eng. in Electrical and Electronic Engineering with Highest Distinction from Nanyang Technological University (NTU), Singapore.
* [Announcement]: I am seeking motivated undergraduate and master’s students for research opportunities at UW or AI2 in the upcoming academic year. Sign up here.
* I am actively seeking faculty or postdoctoral positions in robotics foundation models and robot learning.
Publications

“MolmoAct: Action Reasoning Models that can Reason in Space”
ArXiv 2025
Jason Lee*, Jiafei Duan*, Haoquan Fang*, Yuquan Deng, Shuo Liu, Boyang Li, Bohan Fang, Jieyu Zhang, Yi Ru Wang, Sangho Lee, Winson Han, Wilbert Pumacay, Angelica Wu, Rose Hendrix, Karen Farley, Eli VanderBilt, Ali Farhadi, Dieter Fox, Ranjay Krishna

“FailSafe: Reasoning and Recovery from Failures in Vision-Language-Action Models”
ArXiv 2025
Zijun Lin, Jiafei Duan, Haoquan Fang, Dieter Fox, Ranjay Krishna, Cheston Tan, Bihan Wen

“RoboEval: Where Robotic Manipulation Meets Structured and Scalable Evaluation”
ArXiv 2025
Yi Ru Wang, Carter Ung, Grant Tannert, Jiafei Duan, Josephine Li, Amy Le, Rishabh Oswal, Markus Grotz, Wilbert Pumacay, Yuquan Deng, Ranjay Krishna, Dieter Fox, Siddhartha Srinivasa

“Point Arena: Probing Multimodal Grounding Through Language-Guided Pointing”
ArXiv 2025
Long Cheng*, Jiafei Duan*, Yi Ru Wang†, Haoquan Fang†, Boyang Li†, Yushan Huang, Elvis Wang, Ainaz Eftekhar, Jason Lee, Wentao Yuan, Rose Hendrix, Noah A. Smith, Fei Xia, Dieter Fox, Ranjay Krishna

“The One RING: a Robotic Indoor Navigation Generalist”
ArXiv 2025
Ainaz Eftekhar, Rose Hendrix, Luca Weihs, Jiafei Duan, Ege Caglar, Jordi Salvador, Alvaro Herrasti, Winson Han, Eli VanderBilt, Aniruddha Kembhavi, Ali Farhadi, Ranjay Krishna, Kiana Ehsani, Kuo-Hao Zeng

“From Mystery to Mastery: Failure Diagnosis for Improving Manipulation Policies”
ArXiv 2025
Som Sager, Jiafei Duan, Sreevishakh Vasudevan, Yifan Zhou, Heni Ben Amor, Dieter Fox, Ransalu Senanayake

“GraspMolmo: Generalizable Task-Oriented Grasping via Large-Scale Synthetic Data Generation”
CoRL 2025
Abhay Deshpande, Yuquan Deng, Arijit Ray, Jordi Salvador, Winson Han, Jiafei Duan, Kuo-Hao Zeng, Yuke Zhu, Ranjay Krishna, Rose Hendrix

“SAT: Spatial Aptitude Training for Multimodal Language Models”
COLM 2025
Arijit Ray, Jiafei Duan, Reuben Tan, Dina Bashkirova, Rose Hendrix, Kiana Ehsani, Aniruddha Kembhavi, Bryan A. Plummer, Ranjay Krishna*, Kuo-Hao Zeng*, Kate Saenko*

“SAM2Act: Integrating Visual Foundation Model with A Memory Architecture for Robotic Manipulation”
ICML 2025
RememberRL Workshop @ CoRL 2025, Best Paper
Haoquan Fang, Markus Grotz, Wilbert Pumacay, Yi Ru Wang, Dieter Fox*, Ranjay Krishna*, Jiafei Duan*

“AHA: A Vision-Language-Model for Detecting and Reasoning over Failures in Robotic Manipulation”
ICLR 2025
Jiafei Duan, Wilbert Pumacay, Nishanth Kumar, Yi Ru Wang, Shulin Tian, Wentao Yuan, Ranjay Krishna, Dieter Fox, Ajay Mandlekar*, Yijie Guo*

“Manipulate-Anything: Automating Real-World Robots using Vision-Language Models”
CoRL 2024
Jiafei Duan*, Wentao Yuan*, Wilbert Pumacay, Yi Ru Wang, Kiana Ehsani, Dieter Fox, Ranjay Krishna

“RoboPoint: A Vision-Language Model for Spatial Affordance Prediction for Robotics”
CoRL 2024
Wentao Yuan, Jiafei Duan, Valts Blukis, Wilbert Pumacay, Ranjay Krishna, Adithyavairavan Murali, Arsalan Mousavian, Dieter Fox

“Octopi: Object Property Reasoning with Large Tactile-Language Models”
RSS 2024, Oral
Samson Yu, Kelvin Lin, Anxing Xiao, Jiafei Duan, Harold Soh

“THE COLOSSEUM: A Benchmark for Evaluating Generalization for Robotic Manipulation”
RSS 2024, Oral
Wilbert Pumacay*, Ishika Singh*, Jiafei Duan*, Ranjay Krishna, Jesse Thomason, Dieter Fox

“Selective Visual Representations Improve Convergence and Generalization for Embodied-AI”
ICLR 2024, Spotlight
Ainaz Eftekhar*, Kuo-Hao Zeng*, Jiafei Duan, Ali Farhadi, Ani Kembhavi, Ranjay Krishna

“NEWTON: Are Large Language Models Capable of Physical Reasoning?”
EMNLP 2023
Yi Ru Wang, Jiafei Duan, Dieter Fox, Siddhartha Srinivasa

“AR2-D2: Training a Robot Without a Robot”
CoRL 2023
Jiafei Duan, Yi Ru Wang, Mohit Shridhar, Dieter Fox, Ranjay Krishna

“Good Time to Ask: A Learning Framework for Asking for Help in Embodied Visual Navigation”
Ubiquitous Robots 2023, Best Paper Award
Jenny Zhang, Samson Yu, Jiafei Duan, Cheston Tan


“A Survey on Machine Learning Approaches for Modelling Intuitive Physics”
IJCAI 2022, Oral
Jiafei Duan*, Arijit Dasgupta*, Jason Fischer, Cheston Tan

“PIP: Physical Interaction Prediction via Mental Simulation with Span Selection”
ECCV 2022
Jiafei Duan*, Samson Yu*, Soujanya Poria, Bihan Wen, Cheston Tan

“A Survey of Embodied AI: From Simulators to Research Tasks”
IEEE Transactions on Emerging Topics in Computational Intelligence
Jiafei Duan, Samson Yu, Tan Hui Li, Hongyuan Zhu, Cheston Tan

“ActioNet: An Interactive End-to-End Platform for Task-Based Data Collection and Augmentation in 3D Environment”
ICIP 2020
Jiafei Duan, Samson Yu, Tan Hui Li, Cheston Tan
Invited Talks
- Mila Robot Learning Seminar: Towards robotics foundation that can reason (REAL Lab)
- UT Dallas: Towards robotics foundation that can reason (Yu Xiang)
- UT Austin: Towards robotics foundation that can reason (Yuke Zhu)
- Workshop on Generalizable Priors for Robot Manipulation @ CoRL 2025: Grounded Reasoning from Vision-Language Models for Robotics Manipulation (Keynote speaker)
- Meta FAIR Robotics Group: Towards robotics foundation that can reason (Host: Homanga Bharadhwaj)
- Johns Hopkins University: Towards robotics foundation that can reason (Host: Tianmin Shu & Peter Kazanzides (LCSR))
- Georgia Tech: Towards robotics foundation that can reason (Host: GT Institute for Robotics and Intelligent Machines)
- Boston University: Towards robotics foundation that can reason
- TRI LBM Group: Towards robotics foundation that can reason (Host: Jose Barreiros)
- Cohere Lab: Towards robotics foundation that can reason (Host: Surya Guthikonda)
- David Hsu NUS Group: Towards robotics foundation that can reason (Host: Yiqing Xu)
- NTU EEE: Towards robotics foundation that can reason (Host: Wen Bihan)
- Stanford PAIR Group: Towards a unified multimodal large language model for robotics (Host: Wenlong Huang)
- Franka Robotics Headquarters: Towards a unified multimodal large language model for robotics (Host: Sven Parusel, VP of Franka)
- CMU RCHI Group: Grounded Embodied Intelligence: Grounding Reasoning from Multimodal Language Models into Robotics Manipulation (Host: Zackory Erickson)
- RoboPapers: SAM2Act: Integrating Visual Foundation Model with A Memory Architecture for Robotic Manipulation (Host: Michael Cho and Chris Paxton)
- Amazon Lab126: Towards Democratizing Robot Learning for All (Host: Yuyin Sun)
- The AI Talks: Towards Democratizing Robot Learning for All (Host: AI talks organizer)
- Allen School Colloquium: Democratizing Robot Learning for All (Host: GRAIL Lab)
- NUS CLeAR Lab: Benchmarking Robot Learning for Manipulation (Host: Harold Soh)
- AAAI Summer Symposium: AR2-D2: Training a Robot without a Robot (Host: Workshop organizer)
Academic & Workshop Service
- 3D Vision Language Models (VLMs) for Robotics Manipulation: Opportunities and Challenges @ CVPR 2025
- Generalization in Robotics Manipulation Workshop and Challenges @ CVPR 2025
- Mobile Manipulation: Emerging Opportunities & Contemporary Challenges @ RSS 2025
- SPACE in Vision, Language, and Embodied AI @ NeurIPS 2025
- Generalizable Priors for Robot Manipulation @ CoRL 2025
Reviewer for CVPR, ICCV, ECCV, NeurIPS, ICLR, ICML, ICRA, IROS, CogSci, the CVPR Workshop on 3D Vision and Robotics, RA-L, IEEE Transactions on Automation Science and Engineering, and Pattern Recognition.
Fun things I do besides robotics
Manipulation – I practice and perform magic professionally. [Performance]
Navigation – I love to travel and see the world. [Youtube Vlog]
