Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how large vision-language models (LVLMs) can leverage visual information for complex reasoning, planning, and generation.
Updated Oct 4, 2025
🌴 ARES is an open-source framework for adaptive multimodal reasoning. It features a two-stage pipeline (Adaptive Cold-Start and Entropy-Shaped Policy Optimization) designed to balance reasoning depth and efficiency.