LLaVA-OneVision: Easy Visual Task Transfer

Posted: 2025-08-21 01:18. Author: admin

Abstract: We present LLaVA-OneVision, a family of open large multimodal models (LMMs) developed by consolidating our insights into data, models, and visual representations in the LLaVA-NeXT blog series. Our experimental results demonstrate that LLaVA-OneVision is the first single model that can simultaneously push the performance boundaries of open LMMs in three important computer vision scenarios: single-image, multi-image, and video scenarios. Importantly, the design of LLaVA-OneVision allows strong transfer learning across different modalities/scenarios, yielding new emerging capabilities. In particular, strong video understanding and cross-scenario capabilities are demonstrated through task transfer from images to videos.

Submission Length: Long submission (more than 12 pages of main content)

Previous TMLR Submission Url: https://openreview.net/forum?id=zKv8qULV6n&referrer=%5BAuthor%20Console%5D(%2Fgroup%3Fid%3DTMLR%2FAuthors%23your-submissions)

Changes Since Last Submission: Dear Editor and Reviewers, We sincerely appreciate your time and consideration. We have updated the camera-ready version and would be grateful if you could review it to ensure it aligns with the template. In summary, during the revision phase we addressed the issues raised during the rebuttal and added two subsections to Appendix Section C, "Dataset Statistics" and "Training Resource Details." All supplementary experiments and ablation studies have been incorporated into Section E, covering "Transferability Evaluations," "Scaling Effects on Grids/Tokens," and "Different Training Strategy Design." We also included further insights in a dedicated "Discussion" subsection.

Supplementary Material: zip

Assigned Action Editor: ~Jianbo_Jiao2

Submission Number: 3432
