Publications

You can find my full list of publications on my Google Scholar profile.

Selected Publications

	AutoMA: Automated Modular Attention enables Context-Rich Imitation Learning using Foundation Models
		Yifan Zhou, Xiao Liu, Quan Vuong, Heni Ben Amor
		Under Review, 2024
		website
		Introduces AutoMA, an imitation learning framework that uses (V)LLMs to generate rich contexts for robot training. Experiments show high data efficiency and high success rates.
	"Task Success" is not Enough: Investigating the Use of Video-Language Models as Behavior Critics for Catching Undesirable Agent Behaviors
		Lin Guan, Yifan Zhou, Denis Liu, Yantian Zha, Heni Ben Amor, Subbarao Kambhampati
		COLM, 2024
		website / arxiv / code
		When no sound verifier is available, can we use large vision and language models (VLMs), which are approximately omniscient, as scalable Behavior Critics to catch undesirable embodied agent behaviors in videos? To answer this, we first construct a benchmark that contains diverse cases of goal-reaching yet undesirable agent policies. Then, we comprehensively evaluate VLM critics to gain a deeper understanding of their strengths and failure modes.
	Diff-Control: A Stateful Diffusion-based Policy for Imitation Learning
		Xiao Liu, Yifan Zhou, Fabian Weigend, Shuhei Ikemoto, Heni Ben Amor
		IROS, 2024
		website / paper / video / code
		Introduces Diff-Control, a policy that incorporates ControlNet as a transition model. The transition model captures temporal transitions within the action space to ensure action consistency.
	Open X-Embodiment: Robotic Learning Datasets and RT-X Models
		Open X-Embodiment Collaboration
		ICRA, 2024
		website / arxiv / blog / code / data
		In this paper, we assemble a dataset from 22 different robots collected through a collaboration between 21 institutions, demonstrating 527 skills (160,266 tasks). We show that a high-capacity model trained on this data, which we call RT-X, exhibits positive transfer and improves the capabilities of multiple robots by leveraging experience from other platforms.
	Learning Modular Language-Conditioned Robot Policies through Attention
		Yifan Zhou, Shubham Sonawani, Mariano Phielipp, Heni Ben Amor, Simon Stepputtis
		Autonomous Robots, 2023
		website / paper / code
		Proposes an imitation learning method for language-conditioned robot policies that performs well on a variety of tasks. The learned policy transfers to new robots in a data-efficient manner while retaining high execution performance, and new behaviors can be added to an existing trained policy.
	α-MDF: An Attention-based Multimodal Differentiable Filter for Robot State Estimation
		Xiao Liu, Yifan Zhou, Fabian Weigend, Shuhei Ikemoto, Heni Ben Amor
		CoRL, 2023
		website / paper / video
		α-MDF is an attention-based multimodal differentiable filter framework that establishes a link between modern neural attention and Kalman filters for robot state estimation.
	Modularity through Attention: Efficient Training and Transfer of Language-Conditioned Policies for Robot Manipulation
		Yifan Zhou, Shubham Sonawani, Mariano Phielipp, Simon Stepputtis, Heni Ben Amor
		CoRL, 2022
		website / arxiv / video / code
		Proposes a sample-efficient approach for training language-conditioned manipulation policies that allows for rapid training and transfer across different types of robots. By introducing a novel method, namely Hierarchical Modularity, and adopting supervised attention across multiple sub-modules, we bridge the divide between modular and end-to-end learning and enable the reuse of functional building blocks.