Yan Ding

I am a Researcher at Shanghai AI Laboratory, working on AI-driven embodied agents and decision-making systems, directed by Professor Xuelong Li. I completed my PhD in Computer Science at the State University of New York (SUNY), Binghamton, in February 2024. I was supported by grants from the Ford Motor Company. I also received the Academic Excellence in Computer Science (PhD) award at Binghamton University. I received my M.S. in Computer Science in 2019 and my B.S. in Mechanical Engineering in 2016 from Chongqing University, China. During my master's program, I was supervised by Professor Chao Chen. My master's thesis, awarded the Outstanding Thesis of Chongqing City, is available via this link.

[Homepage (No VPN Needed)] [Google Scholar] [CV (Dec 2023)]

I am seeking interns for Embodied AI and Robotics! Feel free to contact me at yding25@binghamton.edu.

Research Direction

My current research interests include:

Spatial Intelligence for Robotics: Empowering Robots to Understand the Real World
Skill Learning for Robotics: Enabling Robots to Transform the Real World

with a particular emphasis on their applications in the context of mobile manipulators (MoMa).

Robotic Tool

My team has several robots, including five robotic arms and two mobile manipulators. We are therefore developing an open-source robotic tool called BestMan, which supports development both in simulation and on real robots. Its unified framework facilitates rapid development and helps researchers save significant time. (Note: BestMan is still under construction.)

This project encompasses various sub-projects (selected):

My YouTube channel, @yanding1760, features several videos centered around robots.

Interns / Graduate Students

Zhaxizhuoma (Internship Period: 2024.05 -- )
Master Student, Technische Universität Berlin

Ziniu Wu (Internship Period: 2024.06 -- )
PhD Student, University of Bristol

Yuhan Wu (Internship Period: 2024.07 -- )
Master Student, University of Chinese Academy of Sciences

Chuyue Guan (Internship Period: 2024.08 -- )
Master Student, Stanford University

Tianyu Wang (Internship Period: 2024.08 -- )
Master Student, Fudan University

Zhongjie Jia
PhD Student, Shanghai Jiao Tong University

Kehui Liu
PhD Student, Northwestern Polytechnical University

Pengyuan Wu
Pre-PhD Student, Zhejiang University

Publication

DKPROMPT: Domain Knowledge Prompting Vision-Language Models for Open-World Planning
Xiaohan Zhang, Zainab Altaweel, Yohei Hayamizu, Yan Ding, Saeid Amiri, Hao Yang, Andy Kaminski, Chad Esselink, Shiqi Zhang
Under Review
[Paper]

A Survey of Optimization-based Task and Motion Planning: From Classical To Learning Approaches
Zhigen Zhao, Shuo Chen, Yan Ding, Ziyi Zhou, Shiqi Zhang, Danfei Xu, Ye Zhao
IEEE/ASME Transactions on Mechatronics, 2024
[Paper]

MoMa-Pos: Where Should Mobile Manipulators Stand in Cluttered Environment Before Task Execution?
Beichen Shao, Yan Ding#*, Xingchen Wang, Xuefeng Xie, Fuqiang Gu, Jun Luo, Chao Chen#
Under Review
[Paper] [Project] [Video] [Code]
Mobile manipulators must determine feasible base positions before carrying out navigation-manipulation tasks. Real-world environments are often cluttered with furniture, obstacles, and dozens of other objects, which makes computing base positions efficiently a challenge. In this work, we introduce a framework named MoMa-Pos to address this issue.

Task and Motion Planning with Large Language Models for Object Rearrangement
Yan Ding, Xiaohan Zhang, Chris Paxton, Shiqi Zhang
International Conference on Intelligent Robots and Systems (IROS), 2023
[Paper] [Project] [Video] [Code]
LLM-GROP is a method that uses prompting to extract commonsense knowledge about object configurations from a large language model and instantiates them with a task and motion planner, allowing for successful and efficient multi-object rearrangement in various environments using a mobile manipulator.

ARDIE: AR, Dialogue, and Eye Gaze Policies for Human-Robot Collaboration
Chelsea Zou, Kishan Chandan, Yan Ding, Shiqi Zhang
ICRA Workshop on CoPerception: Collaborative Perception and Learning, 2023
[Paper]

Symbolic State Space Optimization for Long Horizon Mobile Manipulation Planning
Xiaohan Zhang, Yifeng Zhu, Yan Ding, Yuqian Jiang, Yuke Zhu, Peter Stone, Shiqi Zhang
International Conference on Intelligent Robots and Systems (IROS), 2023
[Paper]

Learning to Reason about Contextual Knowledge for Planning under Uncertainty
Cheng Cui, Saeid Amiri, Yan Ding, Xingyue Zhan, Shiqi Zhang
The Conference on Uncertainty in Artificial Intelligence (UAI), 2023
[Paper]

ORLA*: Mobile Manipulator-Based Object Rearrangement with Lazy A*
Kai Gao, Yan Ding, Shiqi Zhang, Jingjin Yu
Under Review
[Paper]
In this research, we propose ORLA*, which leverages delayed (lazy) evaluation in searching for a high-quality object pick-and-place sequence that considers both end-effector and mobile robot base travel.

Grounding Classical Task Planners via Vision-Language Models
Xiaohan Zhang, Yan Ding, Saeid Amiri, Hao Yang, Andy Kaminski, Chad Esselink, Shiqi Zhang
ICRA Workshop on Robot Execution Failures and Failure Management Strategies, 2023
[Paper]

Integrating Action Knowledge and LLMs for Task Planning and Situation Handling in Open Worlds
Yan Ding, Xiaohan Zhang, Saeid Amiri, Nieqing Cao, Hao Yang, Chad Esselink, Shiqi Zhang
Autonomous Robots (accepted)
[Paper] [Project] [Video] [Code]
The paper introduces a new algorithm (COWP) that uses task-oriented common sense extracted from Large Language Models to help robots handle unforeseen situations and complete complex tasks in an open world, with better success rates than previous algorithms.

Learning to Ground Objects for Robot Task and Motion Planning
Yan Ding, Xiaohan Zhang, Xingyue Zhan, Shiqi Zhang
IEEE Robotics and Automation Letters (RA-L), 2022
[Paper] [Project] [Code] [Presentation]
The paper presents a new robot planning algorithm, TMOC, which can handle complex real-world scenarios without prior knowledge of object properties by learning them through a physics engine, outperforming existing algorithms.

Visually Grounded Task and Motion Planning for Mobile Manipulation
Xiaohan Zhang, Yifeng Zhu, Yan Ding, Yuke Zhu, Peter Stone, Shiqi Zhang
International Conference on Robotics and Automation (ICRA), 2022
[Paper] [Project]

Task and Situation Structures for Case-Based Planning
Hao Yang, Tavan Eftekhar, Chad Esselink, Yan Ding, Shiqi Zhang
International Conference on Case-Based Reasoning (ICCBR), 2021
[Paper]

Task-Motion Planning for Safe and Efficient Urban Driving
Yan Ding, Xiaohan Zhang, Xingyue Zhan, Shiqi Zhang
International Conference on Intelligent Robots and Systems (IROS), 2020
[Paper] [Project] [Code] [Demo] [Presentation]
Autonomous vehicles need to balance efficiency and safety when planning tasks and motions, and the algorithm Task-Motion Planning for Urban Driving (TMPUD) enables communication between planners for optimal performance.

DAVT: an error-bounded vehicle trajectory data representation and compression framework
Chao Chen, Yan Ding, Suiming Guo, Yasha Wang
IEEE TVT, 2020
[PDF]
DAVT proposes a mobile edge computing solution for vehicle trajectory data compression that reduces data at the source and lowers communication and storage costs. It uses three compressors for the distance, acceleration, velocity, and time data parts, and outperforms other baselines according to the evaluation results.

VTracer: When online vehicle trajectory compression meets mobile edge computing
Chao Chen, Yan Ding, Zhu Wang, Junfeng Zhao, Bin Guo, Daqing Zhang
IEEE Systems Journal, 2019
[PDF]
This paper proposes an online trajectory compression framework that uses SD-Matching for GPS alignment and HCC for compression, and demonstrates its effectiveness and efficiency using real-world datasets in Beijing and a deployment in Chongqing.

TrajCompressor: An Online Map-matching-based Trajectory Compression Framework Leveraging Vehicle Heading Direction and Change
Chao Chen, Yan Ding, Xuefeng Xie, Shu Zhang, Zhu Wang, Liang Feng
IEEE TITS, 2019
[PDF]
This paper presents an online trajectory compression framework that addresses the storage, communication, and computation issues caused by massive and redundant vehicle trajectory data. It consists of two phases, online trajectory mapping and trajectory compression, which use the Spatial-Directional Matching and Heading Change Compression algorithms respectively. Evaluated with real-world datasets in Beijing and deployed in Chongqing, the algorithms show higher accuracy and efficiency than state-of-the-art algorithms.

Fuel Consumption Estimation of Potential Driving Paths by Leveraging Online Route APIs
Chao Chen, Yan Ding, Xuefeng Xie, Zhikai Yang
Green, Pervasive, and Cloud Computing: 13th International Conference (GPC), 2018
[PDF]
This paper proposes a fuel consumption model based on GPS trajectory and OBD-II data, which can estimate the fuel usage of driving paths and help drivers choose fuel-efficient routes to reduce greenhouse gas and pollutant emissions.

A three-stage online map-matching algorithm by fully using vehicle heading direction
Chao Chen, Yan Ding, Xuefeng Xie, Shu Zhang
Journal of Ambient Intelligence and Humanized Computing, 2018
[PDF]
The SD-Matching algorithm takes a three-stage approach to improve the accuracy and speed of online map-matching by incorporating vehicle heading direction data.

GreenPlanner: Planning personalized fuel-efficient driving routes using multi-sourced urban data
Yan Ding, Chao Chen, Shu Zhang, Bin Guo, Zhiwen Yu, Yasha Wang
IEEE PerCom, 2017
[PDF]
Greenhouse gas emissions from vehicles in modern cities are a significant problem. Recommending fuel-efficient routes to drivers through a personalized fuel consumption model can help alleviate this issue, as demonstrated by the successful implementation of GreenPlanner in Beijing, which achieved a mean fuel consumption error of less than 7% and an average fuel saving of 20% for suggested routes.
