AI Paper Digest, June 3, 2026 (HuggingFace Daily Papers)
Source: https://huggingface.co/papers; collected on 2026-06-03
📌 Highlights
- Trust Region On-Policy Distillation | arXiv — [Highlight] On-Policy Distillation (OPD) is a fundamental technique for efficient post-training knowledge distillation. 💡 A new paradigm for policy distillation, directly relevant to lightweight deployment of agent models and to knowledge transfer.
- OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents | arXiv — [Highlight] Building capable visual web agents requires long-horizon reasoning, precise geometric understanding, and robust online adaptation. 💡 The first systematic breakdown of multi-turn online RL training for web agents; essential reading for research on long-horizon agent decision-making.
- SVI-Bench: A Dynamic Microworld for Strategic Video Intelligence | arXiv — [Highlight] True video intelligence demands more than recognizing what is visible: it requires understanding causality, intent, and potential outcomes. 💡 A video-understanding benchmark that spans perception through strategy, offering a new paradigm for evaluating multimodal agents.
📋 Also Worth Noting
- MineExplorer: Evaluating Open-World Exploration of MLLM Agents in Minecraft | arXiv — Multimodal large language models (MLLMs) have shown strong capabilities in pe...
- Decentralized Instruction Tuning: Conflict-Aware Splitting and Weight Merging | arXiv — Instruction tuning aligns large language models, including multimodal ones, w...
- MindZero: Learning Online Mental Reasoning With Zero Annotations | arXiv — Effective real-world assistance requires AI agents with robust Theory of Mind...
- ACL-Verbatim: hallucination-free question answering for research | arXiv — Academic researchers need efficient and reliable methods for collecting high-...
- Agent Skills Should Go Beyond Text: The Case for Visual Skills | arXiv — Reusable skills are a key mechanism for extending agent capabilities, allowin...
- Ψ-Bench: Evaluating Persona-Sensitive Influencing in Persuasive Dialogues | arXiv — Personalization is a crucial capability of modern language agents. However, c...
- A Matter of TASTE: Improving Coverage and Difficulty of Agent Benchmarks | arXiv — As agent capabilities advance, existing benchmarks, such as τ²-Bench, are be...