
Jiafei Duan
PhD Student

Contact me at

duanj1 [at] cs.washington.edu

CV/Resume

About Me

I teach robots to perceive, reason, and act. As we move robots out of industrial “cages” and into people’s homes and daily lives, we must build generalist robotic models that can understand the world, reason in a human-centric way, and carry out meaningful real-world tasks.

Jiafei Duan is a fourth and final-year PhD student in Robotics and AI at the Paul G. Allen School of Computer Science & Engineering, University of Washington, co-advised by Dieter Fox and Ranjay Krishna. His research centers on robot learning, embodied AI, and building large-scale robotics foundation models. His work has received Best Paper, Spotlight, and Oral recognitions at venues including ICLR, Ubiquitous Robots, and RSS, and has been featured in MIT Technology Review, GeekWire, VentureBeat, and Business Wire.
Jiafei is also a Graduate Student Researcher at the Allen Institute for AI (AI2) and has previously worked as a Research Scientist Intern at NVIDIA. He earned his B.Eng. in Electrical and Electronic Engineering with Highest Distinction from Nanyang Technological University (NTU), Singapore.

* [Announcement]: I am seeking motivated undergraduate and master’s students for research opportunities at UW or AI2 in the upcoming academic year. Sign up here

*I am actively seeking faculty or postdoctoral positions in robotics foundation models and robot learning.

Publications

“MolmoAct: Action Reasoning Models that can Reason in Space”

ArXiv 2025

Jason Lee*, Jiafei Duan*, Haoquan Fang*, Yuquan Deng, Shuo Liu, Boyang Li, Bohan Fang, Jieyu Zhang, Yi Ru Wang, Sangho Lee, Winson Han, Wilbert Pumacay, Angelica Wu, Rose Hendrix, Karen Farley, Eli VanderBilt, Ali Farhadi, Dieter Fox, Ranjay Krishna

|Blogpost|Paper|Code|Dataset|Media Article|Checkpoints|

“FailSafe: Reasoning and Recovery from Failures in Vision-Language-Action Models”

ArXiv 2025

Zijun Lin, Jiafei Duan, Haoquan Fang, Dieter Fox, Ranjay Krishna, Cheston Tan, Bihan Wen

|Paper|Code|Project Page|

“RoboEval: Where Robotic Manipulation Meets Structured and Scalable Evaluation”

ArXiv 2025

Yi Ru Wang, Carter Ung, Grant Tannert, Jiafei Duan, Josephine Li, Amy Le, Rishabh Oswal, Markus Grotz, Wilbert Pumacay, Yuquan Deng, Ranjay Krishna, Dieter Fox, Siddhartha Srinivasa

|Project Page|Paper|Code|

“PointArena: Probing Multimodal Grounding Through Language-Guided Pointing”

ArXiv 2025

Long Cheng, Jiafei Duan, Yi Ru Wang, Haoquan Fang, Boyang Li, Yushan Huang, Elvis Wang, Ainaz Eftekhar, Jason Lee, Wentao Yuan, Rose Hendrix, Noah A. Smith, Fei Xia, Dieter Fox, Ranjay Krishna

|Project Page|Paper|Code|Dataset|

“The One RING: a Robotic Indoor Navigation Generalist”

ArXiv 2025

Ainaz Eftekhar, Rose Hendrix, Luca Weihs, Jiafei Duan, Ege Caglar, Jordi Salvador, Alvaro Herrasti, Winson Han, Eli VanderBilt, Aniruddha Kembhavi, Ali Farhadi, Ranjay Krishna, Kiana Ehsani, Kuo-Hao Zeng

|Project Page|Paper|Code|

“From Mystery to Mastery: Failure Diagnosis for Improving Manipulation Policies”

ArXiv 2025

Som Sagar, Jiafei Duan, Sreevishakh Vasudevan, Yifan Zhou, Heni Ben Amor, Dieter Fox, Ransalu Senanayake

|Project Page|Paper|Code|

“GraspMolmo: Generalizable Task-Oriented Grasping via Large-Scale Synthetic Data Generation”

CoRL 2025

Abhay Deshpande, Yuquan Deng, Arijit Ray, Jordi Salvador, Winson Han, Jiafei Duan, Kuo-Hao Zeng, Yuke Zhu, Ranjay Krishna, Rose Hendrix

|Project Page|Paper|Code|Dataset|

“SAT: Spatial Aptitude Training for Multimodal Language Models”

COLM 2025

Arijit Ray, Jiafei Duan, Reuben Tan, Dina Bashkirova, Rose Hendrix, Kiana Ehsani, Aniruddha Kembhavi, Bryan A. Plummer, Ranjay Krishna*, Kuo-Hao Zeng*, Kate Saenko*

|Project Page|Paper|Dataset|

“SAM2Act: Integrating Visual Foundation Model with A Memory Architecture for Robotic Manipulation”

ICML 2025

RememberRL Workshop @ CoRL 2025, Best Paper

Haoquan Fang, Markus Grotz, Wilbert Pumacay, Yi Ru Wang, Dieter Fox*, Ranjay Krishna*, Jiafei Duan*

|Project Page|Paper|X Post|Code|

“AHA: A Vision-Language Model for Detecting and Reasoning over Failures in Robotic Manipulation”

ICLR 2025

Jiafei Duan, Wilbert Pumacay, Nishanth Kumar, Yi Ru Wang, Shulin Tian, Wentao Yuan, Ranjay Krishna, Dieter Fox, Ajay Mandlekar*, Yijie Guo*

|Project Page|Paper|X Post|Code|

“Manipulate-Anything: Automating Real-World Robots using Vision-Language Models”

CoRL 2024

Jiafei Duan*, Wentao Yuan*, Wilbert Pumacay, Yi Ru Wang, Kiana Ehsani, Dieter Fox, Ranjay Krishna

|Project Page|Paper|Code|

“RoboPoint: A Vision-Language Model for Spatial Affordance Prediction for Robotics”

CoRL 2024

Wentao Yuan, Jiafei Duan, Valts Blukis, Wilbert Pumacay, Ranjay Krishna, Adithyavairavan Murali, Arsalan Mousavian, Dieter Fox

|Project Page|Paper|Demo|Checkpoint|Code|

“EVE: Enabling Anyone to Train Robots using Augmented Reality”

UIST 2024

Jun Wang, Chun-Cheng Chang*, Jiafei Duan*, Dieter Fox, Ranjay Krishna

|Paper|Project|

“Octopi: Object Property Reasoning with Large Tactile-Language Models”

RSS 2024, Oral

Samson Yu, Kelvin Lin, Anxing Xiao, Jiafei Duan, Harold Soh

|Project Page|Code|Paper|

“THE COLOSSEUM: A Benchmark for Evaluating Generalization for Robotic Manipulation”

RSS 2024, Oral

Wilbert Pumacay*, Ishika Singh*, Jiafei Duan*, Ranjay Krishna, Jesse Thomason, Dieter Fox

|Paper|Project Page|Code|Real-world setup|

“Selective Visual Representations Improve Convergence and Generalization for Embodied-AI”

ICLR 2024, Spotlight

Ainaz Eftekhar*, Kuo-Hao Zeng*, Jiafei Duan, Ali Farhadi, Ani Kembhavi, Ranjay Krishna

|Paper|Project Page|Code|

“NEWTON: Are Language Models Capable of Physical Reasoning?”

EMNLP 2023

Yi Ru Wang, Jiafei Duan, Dieter Fox, Siddhartha Srinivasa

|Paper|Project Page|Code|Dataset|

“AR2-D2: Training a Robot Without a Robot”

CoRL 2023

Jiafei Duan, Yi Ru Wang, Mohit Shridhar, Dieter Fox, Ranjay Krishna

|Paper|Project Page|My Talk|Dieter Fox’s Talk|Code|

“Good Time to Ask: A Learning Framework for Asking for Help in Embodied Visual Navigation”

Ubiquitous Robots 2023, Best Paper Award

Jenny Zhang, Samson Yu, Jiafei Duan, Cheston Tan

|Paper|Project Page|Code|

“A Benchmark for Modeling Violation-of-Expectation in Physical Reasoning Across Event Categories”

CogSci 2023

Arijit Dasgupta, Jiafei Duan, Marcelo Ang, Yi Lin, Su-Hua Wang, Renée Baillargeon, Cheston Tan

|Paper|Code|Dataset|

“A Survey on Machine Learning Approaches for Modelling Intuitive Physics”

IJCAI 2022, Oral

Jiafei Duan*, Arijit Dasgupta*, Jason Fischer, Cheston Tan

|Paper|Project Page|Video|

“PIP: Physical Interaction Prediction via Mental Simulation with Span Selection”

ECCV 2022

Jiafei Duan*, Samson Yu*, Soujanya Poria, Bihan Wen, Cheston Tan

|Paper|Project Page|Code|

“A Survey of Embodied AI: From Simulators to Research Tasks.”

IEEE Transactions on Emerging Topics in Computational Intelligence

Jiafei Duan, Samson Yu, Tan Hui Li, Hongyuan Zhu, Cheston Tan

|Paper|CIS Journal Featured Publication|

“ActioNet: An Interactive End-to-End Platform for Task-Based Data Collection and Augmentation in 3D Environment.”

ICIP 2020

Jiafei Duan, Samson Yu, Tan Hui Li, Cheston Tan

|Paper|Code|Video|Project Page|

Invited Talks

  • Mila Robot Learning Seminar: Towards robotics foundation that can reason (REAL Lab)
  • UT Dallas: Towards robotics foundation that can reason (Yu Xiang)
  • UT Austin: Towards robotics foundation that can reason (Yuke Zhu)
  • Workshop on Generalizable Priors for Robot Manipulation @ CoRL 2025: Grounded Reasoning from Vision-Language Models for Robotics Manipulation (Keynote speaker)
  • Meta FAIR Robotics Group: Towards robotics foundation that can reason (Host: Homanga Bharadhwaj)
  • Johns Hopkins University: Towards robotics foundation that can reason (Host: Tianmin Shu & Peter Kazanzides (LCSR))
  • Georgia Tech: Towards robotics foundation that can reason (Host: GT Institute for Robotics and Intelligent Machines)
  • Boston University: Towards robotics foundation that can reason
  • TRI LBM Group: Towards robotics foundation that can reason (Host: Jose Barreiros)
  • Cohere Lab: Towards robotics foundation that can reason (Host: Surya Guthikonda)
  • David Hsu NUS Group: Towards robotics foundation that can reason (Host: Yiqing Xu)
  • NTU EEE: Towards robotics foundation that can reason (Host: Wen Bihan)
  • Stanford PAIR Group: Towards a unified multimodal large language model for robotics (Host: Wenlong Huang)
  • Franka Robotics Headquarters: Towards a unified multimodal large language model for robotics (Host: Sven Parusel (VP of Franka))
  • CMU RCHI Group: Grounded Embodied Intelligence: Grounding Reasoning from Multimodal Language Models into Robotics Manipulation (Host: Zackory Erickson)
  • RoboPapers: SAM2Act: Integrating Visual Foundation Model with A Memory Architecture for Robotic Manipulation (Host: Michael Cho and Chris Paxton)
  • Amazon Lab126: Towards Democratizing Robot Learning for All (Host: Yuyin Sun)
  • The AI Talks: Towards Democratizing Robot Learning for All (Host: AI talks organizer)
  • Allen School Colloquium: Democratizing Robot Learning for All (Host: GRAIL Lab)
  • NUS CLeAR Lab: Benchmarking Robot Learning for Manipulation (Host: Harold Soh)
  • AAAI Summer Symposium: AR2-D2: Training a Robot without a Robot (Host: Workshop organizer)

Academic & Workshop Service

Reviewer for CVPR, ICCV, ECCV, NeurIPS, ICLR, ICML, ICRA, IROS, CogSci, RA-L, IEEE Transactions on Automation Science and Engineering, Pattern Recognition, and the CVPR Workshop on 3D Vision and Robotics

Fun things I do besides robotics

Manipulation – I practice and perform magic professionally. [Performance]

Navigation – I love to travel and see the world. [Youtube Vlog]