
AI Paper Digest, May 26, 2026 (HuggingFace Daily Papers)

Source: https://huggingface.co/papers Collected: 2026-05-26

📌 Highlights

  1. VaaWIT: Visual-Aware Adaptation of Large Language Models for Multilingual Web Image Translation | arXiv [Highlight] Translating text embedded in Web images is crucial for improving content acce...
  2. ETCHR: Editing To Clarify and Harness Reasoning | arXiv [Highlight] Multimodal Large Language Models have advanced visual reasoning, yet a purely...
  3. Efficient Agentic Reasoning Through Self-Regulated Simulative Planning | arXiv [Highlight] How should an agent decide when and how to plan? A dominant approach builds a...

📋 Also Worth Noting

  1. See What I Mean: Aligning Vision and Language Representations for Video Fine-grained Object Understanding | arXiv — We present SWIM (See What I Mean), a novel training strategy that aligns visi...
  2. SciAtlas: A Large-Scale Knowledge Graph for Automated Scientific Research | arXiv — The exponential growth of global academic output has confronted researchers a...
  3. SkillOpt: Executive Strategy for Self-Evolving Agent Skills | arXiv — Agent skills today are hand-crafted, generated one-shot, or evolved through l...
  4. ParaVT: Taming the Tool Prior Paradox for Parallel Tool Use in Agentic Video Reinforcement Learning | arXiv — Training large multimodal models (LMMs) via reinforcement learning (RL) to na...
  5. Self-Improving CAD Generation Agents with Finite Element Analysis as Feedback | arXiv — Computer-aided design (CAD) is the backbone of modern industrial design, yet ...
  6. PhotoFlow: Agentic 3D Virtual Photography Missions | arXiv — Virtual photography asks an agent to enter a prepared 3D scene with no presel...
  7. AnyMo: Geometry-Aware Setup-Agnostic Modeling of Human Motion in the Wild | arXiv — As wearable and mobile devices become increasingly embedded in daily life, th...