AI Paper Digest, 2026-05-26 (HuggingFace Daily Papers)
Data source: https://huggingface.co/papers Collected: 2026-05-26
📌 Featured
- VaaWIT: Visual-Aware Adaptation of Large Language Models for Multilingual Web Image Translation | arXiv — [Featured] Translating text embedded in Web images is crucial for improving content acce...
- ETCHR: Editing To Clarify and Harness Reasoning | arXiv — [Featured] Multimodal Large Language Models have advanced visual reasoning, yet a purely...
- Efficient Agentic Reasoning Through Self-Regulated Simulative Planning | arXiv — [Featured] How should an agent decide when and how to plan? A dominant approach builds a...
📋 Also Worth Noting
- See What I Mean: Aligning Vision and Language Representations for Video Fine-grained Object Understanding | arXiv — We present SWIM (See What I Mean), a novel training strategy that aligns visi...
- SciAtlas: A Large-Scale Knowledge Graph for Automated Scientific Research | arXiv — The exponential growth of global academic output has confronted researchers a...
- SkillOpt: Executive Strategy for Self-Evolving Agent Skills | arXiv — Agent skills today are hand-crafted, generated one-shot, or evolved through l...
- ParaVT: Taming the Tool Prior Paradox for Parallel Tool Use in Agentic Video Reinforcement Learning | arXiv — Training large multimodal models (LMMs) via reinforcement learning (RL) to na...
- Self-Improving CAD Generation Agents with Finite Element Analysis as Feedback | arXiv — Computer-aided design (CAD) is the backbone of modern industrial design, yet ...
- PhotoFlow: Agentic 3D Virtual Photography Missions | arXiv — Virtual photography asks an agent to enter a prepared 3D scene with no presel...
- AnyMo: Geometry-Aware Setup-Agnostic Modeling of Human Motion in the Wild | arXiv — As wearable and mobile devices become increasingly embedded in daily life, th...