AI Paper Express, 2026-06-03 (HuggingFace Daily Papers)

Source: https://huggingface.co/papers Collected: 2026-06-03

📌 Highlights

Trust Region On-Policy Distillation | arXiv — [Highlight] On-Policy Distillation (OPD) is a fundamental technique for efficient post-training knowledge distillation. 💡 A new paradigm for policy distillation, directly relevant to lightweight deployment of agent models and to knowledge transfer.
OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents | arXiv — [Highlight] Building capable visual web agents requires long-horizon reasoning, precise geometric understanding, and robust online adaptation. 💡 The first systematic breakdown of multi-turn online RL training for web agents; essential reading for research on long-horizon agent decision-making.
SVI-Bench: A Dynamic Microworld for Strategic Video Intelligence | arXiv — [Highlight] True video intelligence demands more than recognizing what is visible: it requires understanding causality, intent, and potential outcomes. 💡 A video-understanding benchmark spanning perception to strategy, offering a new paradigm for evaluating multimodal agents.

MineExplorer: Evaluating Open-World Exploration of MLLM Agents in Minecraft | arXiv — Multimodal large language models (MLLMs) have shown strong capabilities in pe...
Decentralized Instruction Tuning: Conflict-Aware Splitting and Weight Merging | arXiv — Instruction tuning aligns large language models, including multimodal ones, w...
MindZero: Learning Online Mental Reasoning With Zero Annotations | arXiv — Effective real-world assistance requires AI agents with robust Theory of Mind...
ACL-Verbatim: hallucination-free question answering for research | arXiv — Academic researchers need efficient and reliable methods for collecting high-...
Agent Skills Should Go Beyond Text: The Case for Visual Skills | arXiv — Reusable skills are a key mechanism for extending agent capabilities, allowin...
Ψ-Bench: Evaluating Persona-Sensitive Influencing in Persuasive Dialogues | arXiv — Personalization is a crucial capability of modern language agents. However, c...
A Matter of TASTE: Improving Coverage and Difficulty of Agent Benchmarks | arXiv — As agent capabilities advance, existing benchmarks, such as τ^2-Bench, are be...