AI Paper Digest, June 4, 2026 (HuggingFace Daily Papers)

Source: https://huggingface.co/papers | Collected: 2026-06-04

📌 Highlights

  1. ClawHub Security Signals: When VirusTotal, Static Analysis, and SkillSpector Disagree | arXiv [Highlight] Agent skills extend AI agents with reusable instructions, tools, scripts, ref... 💡 A practical reference for agent skill security auditing that helps you understand how to assess the risk of third-party skill packages.
  2. GRAIL: Gradient-Reweighted Advantages for Reinforcement Learning with Verifiable Rewards | arXiv [Highlight] Reinforcement learning with verifiable rewards (e.g. GRPO) is now a common wa... 💡 A new optimization approach for GRPO-style reinforcement learning, with direct implications for model training and agent reward design.
  3. BraveGuard: From Open-World Threats to Safer Computer-Use Agents | arXiv [Highlight] Computer-use agents extend language models from text generation to sustained ... 💡 A security framework for computer-use agents, and a must-read threat-model reference for anyone building on-device agents.

📋 Also Worth Noting

  1. Mitigating Perceptual Judgment Bias in Multimodal LLM-as-a-Judge via Perceptual Perturbation and Reward Modeling | arXiv — Recent multimodal large language models have demonstrated strong reasoning ab...
  2. MemTrain: Self-Supervised Context Memory Training | arXiv — Memory is an indispensable capability for long-horizon LLM agents, enabling t...
  3. MMG2Skill: Can Agents Distill In-the-Wild Guides into Self-Evolving Skills? | arXiv — Abundant procedural knowledge on the Web holds great potential for helping ag...
  4. Streaming Communication in Multi-Agent Reasoning | arXiv — Multi-agent reasoning systems adopt a "generate-then-transfer" paradigm that ...
  5. AutoMedBench: Towards Medical AutoResearch with Agentic AI Models | arXiv — Autonomous agents are increasingly expected to support end-to-end medical-AI ...
  6. Adaptive Auto-Harness: Sustained Self-Improvement for Agentic System Deployment on Open-Ended Task Streams | arXiv — Auto-harness systems such as A-Evolve, GEPA, and Meta-Harness improve LLM age...
  7. A Local Perturbation Theory for Cross-Domain Interference and Recovery in Multi-Domain RL | arXiv — Reinforcement learning (RL) post-training improves large language models (LLM...