Unified Approaches for Multi-Task Vision-Language Interactions - You, Haoxuan
Vision-based Manipulation In-the-Wild - Chi, Cheng
Multimodal Representations for Video - Suris Coll-Vinent, Didac
High-level, part-based features for fine-grained visual categorization - Berg, Thomas