SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models Paper • 2502.09604 • Published 28 days ago • 33
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia Paper • 2503.07920 • Published 3 days ago • 89
Personalizing Reinforcement Learning from Human Feedback with Variational Preference Learning Paper • 2408.10075 • Published Aug 19, 2024
Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation Paper • 2502.14846 • Published 21 days ago • 13
Articulate-Anything: Automatic Modeling of Articulated Objects via a Vision-Language Foundation Model Paper • 2410.13882 • Published Oct 3, 2024
MiRAGeNews: Multimodal Realistic AI-Generated News Detection Paper • 2410.09045 • Published Oct 11, 2024 • 4
TESS 2: A Large-Scale Generalist Diffusion Language Model Paper • 2502.13917 • Published 22 days ago • 6
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding Paper • 2501.12380 • Published Jan 21, 2025 • 84