Supervisor(s): Jen Jen Chung, Brendan Tidd, Yifei Chen
Designing optimal reward functions for reinforcement learning (RL) is challenging, especially for complex tasks such as walking. Recently, large multi-modal models (LMMs) have gained attention for their remarkable capability to translate natural language into machine-level instructions. ViTLearn explores the use of LMMs to automatically generate reward functions for robotic tasks.
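As a rough illustration of the idea (not code from the ViTLearn project itself), a reward function that an LMM might emit from the instruction "walk forward at a steady pace without falling" could look like the following sketch. All state field names and weights here are hypothetical assumptions.

```python
# Hypothetical example of an LMM-generated reward for a walking task.
# The state fields (forward_velocity, torso_height, joint_torques) and
# the weighting constants are illustrative, not part of ViTLearn.
from dataclasses import dataclass, field
from typing import List


@dataclass
class RobotState:
    forward_velocity: float                 # m/s along the desired heading
    torso_height: float                     # m above the ground
    joint_torques: List[float] = field(default_factory=list)


def walking_reward(state: RobotState,
                   target_velocity: float = 1.0,
                   min_height: float = 0.4) -> float:
    """Reward forward progress; penalize falling and energy use."""
    # Track the commanded walking speed.
    velocity_term = -abs(state.forward_velocity - target_velocity)
    # Large penalty if the torso drops below a standing threshold.
    upright_term = 0.0 if state.torso_height >= min_height else -10.0
    # Small quadratic penalty on actuation effort.
    energy_term = -0.001 * sum(t * t for t in state.joint_torques)
    return velocity_term + upright_term + energy_term
```

The appeal of the LMM-based approach is that such shaping terms and their trade-offs, which are normally hand-tuned by an engineer, would instead be proposed directly from a natural-language task description.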