AI Paper Digest, June 3, 2026 (HuggingFace Daily Papers)

Source: https://huggingface.co/papers Collected: 2026-06-03

📌 Featured

  1. Trust Region On-Policy Distillation | arXiv [Featured] On-Policy Distillation (OPD) is a fundamental technique for efficient post-training knowledge distillation. 💡 A new paradigm for policy distillation, directly relevant to lightweight deployment of agent models and to knowledge transfer.
  2. OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents | arXiv [Featured] Building capable visual web agents requires long-horizon reasoning, precise geometric understanding, and robust online adaptation. 💡 The first systematic breakdown of multi-turn online RL training for web agents; essential reading for research on long-horizon agent decision-making.
  3. SVI-Bench: A Dynamic Microworld for Strategic Video Intelligence | arXiv [Featured] True video intelligence demands more than recognizing what is visible: it requires understanding causality, intent, and potential outcomes. 💡 A video understanding benchmark that spans perception through strategy, offering a new paradigm for evaluating multimodal agents.

📋 Also Worth Noting

  1. MineExplorer: Evaluating Open-World Exploration of MLLM Agents in Minecraft | arXiv — Multimodal large language models (MLLMs) have shown strong capabilities in pe...
  2. Decentralized Instruction Tuning: Conflict-Aware Splitting and Weight Merging | arXiv — Instruction tuning aligns large language models, including multimodal ones, w...
  3. MindZero: Learning Online Mental Reasoning With Zero Annotations | arXiv — Effective real-world assistance requires AI agents with robust Theory of Mind...
  4. ACL-Verbatim: hallucination-free question answering for research | arXiv — Academic researchers need efficient and reliable methods for collecting high-...
  5. Agent Skills Should Go Beyond Text: The Case for Visual Skills | arXiv — Reusable skills are a key mechanism for extending agent capabilities, allowin...
  6. Ψ-Bench: Evaluating Persona-Sensitive Influencing in Persuasive Dialogues | arXiv — Personalization is a crucial capability of modern language agents. However, c...
  7. A Matter of TASTE: Improving Coverage and Difficulty of Agent Benchmarks | arXiv — As agent capabilities advance, existing benchmarks, such as τ^2-Bench, are be...