Publications

You can find my full list of publications on my Google Scholar profile.

Selected Publications

AutoMA: Automated Modular Attention enables Context-Rich Imitation Learning using Foundation Models
Yifan Zhou, Xiao Liu, Quan Vuong, Heni Ben Amor
Under Review, 2024  
website

Introduces AutoMA, an imitation learning framework that uses (V)LLMs to generate rich contexts for robot training. Experiments show high data efficiency and high success rates.

"Task Success" is not Enough: Investigating the Use of Video-Language Models as Behavior Critics for Catching Undesirable Agent Behaviors
Lin Guan*, Yifan Zhou*, Denis Liu, Yantian Zha, Heni Ben Amor, Subbarao Kambhampati
COLM, 2024  
website / arxiv / code

When no sound verifier is available, can we use large vision and language models (VLMs), which are approximately omniscient, as scalable Behavior Critics to catch undesirable embodied agent behaviors in videos? To answer this, we first construct a benchmark that contains diverse cases of goal-reaching yet undesirable agent policies. Then, we comprehensively evaluate VLM critics to gain a deeper understanding of their strengths and failure modes.

Diff-Control: A Stateful Diffusion-based Policy for Imitation Learning
Xiao Liu, Yifan Zhou, Fabian Weigend, Shuhei Ikemoto, Heni Ben Amor
IROS, 2024  
website / paper / video / code

Introduces Diff-Control, a policy that incorporates ControlNet as a transition model. The transition model captures temporal transitions within the action space to ensure action consistency.

Open X-Embodiment: Robotic Learning Datasets and RT-X Models
Open X-Embodiment Collaboration
ICRA, 2024  
website / arxiv / blog / code / data

In this paper, we assemble a dataset from 22 different robots collected through a collaboration between 21 institutions, demonstrating 527 skills (160,266 tasks). We show that a high-capacity model trained on this data, which we call RT-X, exhibits positive transfer and improves the capabilities of multiple robots by leveraging experience from other platforms.

Learning Modular Language-Conditioned Robot Policies through Attention
Yifan Zhou, Shubham Sonawani, Mariano Phielipp, Heni Ben Amor, Simon Stepputtis
Autonomous Robots, 2023  
website / paper / code

We propose an imitation learning method for language-conditioned robot policies that demonstrates high performance on a variety of tasks. It transfers to new robots in a data-efficient manner while maintaining high execution performance, and it supports adding new behaviors to an existing trained policy.

α-MDF: An Attention-based Multimodal Differentiable Filter for Robot State Estimation
Xiao Liu, Yifan Zhou, Fabian Weigend, Shuhei Ikemoto, Heni Ben Amor
CoRL, 2023  
website / paper / video

α-MDF is an attention-based multimodal differentiable filter framework that establishes the link between modern neural attention and Kalman filters for robot state estimation.

Modularity through Attention: Efficient Training and Transfer of Language-Conditioned Policies for Robot Manipulation
Yifan Zhou, Shubham Sonawani, Mariano Phielipp, Simon Stepputtis, Heni Ben Amor
CoRL, 2022  
website / arxiv / video / code

Proposes a sample-efficient approach for training language-conditioned manipulation policies that allows for rapid training and transfer across different types of robots. By introducing a novel method, Hierarchical Modularity, and adopting supervised attention across multiple sub-modules, we bridge the divide between modular and end-to-end learning and enable the reuse of functional building blocks.