Publications
Selected Publications
Yifan Zhou, Xiao Liu, Quan Vuong, Heni Ben Amor
Under Review, 2024
website
Introduces AutoMA, an imitation learning framework that uses (V)LLMs to generate rich contexts for robot training. Experiments show high data efficiency and success rates.
Lin Guan*, Yifan Zhou*, Denis Liu, Yantian Zha, Heni Ben Amor, Subbarao Kambhampati
COLM, 2024
website / arxiv / code
When no sound verifier is available, can we use large vision and language models (VLMs), which are approximately omniscient, as scalable Behavior Critics to catch undesirable embodied agent behaviors in videos? To answer this, we first construct a benchmark that contains diverse cases of goal-reaching yet undesirable agent policies. Then, we comprehensively evaluate VLM critics to gain a deeper understanding of their strengths and failure modes.
Xiao Liu, Yifan Zhou, Fabian Weigend, Shuhei Ikemoto, Heni Ben Amor
IROS, 2024
website / paper / video / code
Introduces Diff-Control Policy, which incorporates ControlNet as a transition model that captures temporal transitions within the action space to ensure action consistency.
Open X-Embodiment Collaboration
ICRA, 2024
website / arxiv / blog / code / data
In this paper, we assemble a dataset from 22 different robots collected through a collaboration between 21 institutions, demonstrating 527 skills (160266 tasks). We show that a high-capacity model trained on this data, which we call RT-X, exhibits positive transfer and improves the capabilities of multiple robots by leveraging experience from other platforms.
Yifan Zhou, Shubham Sonawani, Mariano Phielipp, Heni Ben Amor, Simon Stepputtis
Autonomous Robots, 2023
website / paper / code
We propose an imitation learning method for language-conditioned robot policies. It performs well on a variety of tasks, transfers to new robots in a data-efficient manner while maintaining high execution performance, and supports adding new behaviors to an existing trained policy.
Xiao Liu, Yifan Zhou, Fabian Weigend, Shuhei Ikemoto, Heni Ben Amor
CoRL, 2023
website / paper / video
α-MDF is an attention-based multimodal differentiable filter framework that establishes the link between modern neural attention and Kalman filters for robot state estimation.
Yifan Zhou, Shubham Sonawani, Mariano Phielipp, Simon Stepputtis, Heni Ben Amor
CoRL, 2022
website / arxiv / video / code
Proposes a sample-efficient approach for training language-conditioned manipulation policies that allows for rapid training and transfer across different types of robots. By introducing a novel method, namely Hierarchical Modularity, and adopting supervised attention across multiple sub-modules, we bridge the divide between modular and end-to-end learning and enable the reuse of functional building blocks.