
AI Paper Digest, May 18, 2026 (HuggingFace Daily Papers)

Source: https://huggingface.co/papers Collected: 2026-05-18

📌 Highlights

  1. WildClawBench: A Benchmark for Real-World, Long-Horizon Agent Evaluation | arXiv [Highlight] Large language and vision-language models increasingly power agents that act ... 💡 Long-horizon agent behavior evaluation; key for AI application development
  2. MemLens: Benchmarking Multimodal Long-Term Memory in Large Vision-Language Models | arXiv [Highlight] Memory is essential for large vision-language models (LVLMs) to handle long, ... 💡 Multimodal long-term memory techniques; offers ideas for AI tool design
  3. PanoWorld: Towards Spatial Supersensing in 360° Panorama World | arXiv [Highlight] Multimodal large language models (MLLMs) still struggle with spatial unders... 💡 Spatial perception capability; highly valuable for on-device AI development

📋 Also Worth Noting

  1. Solvita: Enhancing Large Language Models for Competitive Programming via Agentic Evolution | arXiv — Large language models (LLMs) still struggle with the rigorous reasoning deman...
  2. RewardHarness: Self-Evolving Agentic Post-Training | arXiv — Evaluating instruction-guided image edits requires rewards that reflect subtl...
  3. STALE: Can LLM Agents Know When Their Memories Are No Longer Valid? | arXiv — Large Language Model (LLM) agents are increasingly expected to maintain coher...
  4. PAGER: Bridging the Semantic-Execution Gap in Point-Precise Geometric GUI Control | arXiv — Large vision-language models have significantly advanced GUI agents, enabling...
  5. Learning to Communicate Locally for Large-Scale Multi-Agent Pathfinding | arXiv — Multi-agent pathfinding (MAPF) is a widely used abstraction for multi-robot t...
  6. Nudging Beyond the Comfort Zone: Efficient Strategy-Guided Exploration for RLVR | arXiv — Reinforcement learning with verifiable rewards (RLVR) has emerged as a scalab...
  7. IntentVLA: Short-Horizon Intent Modeling for Aliased Robot Manipulation | arXiv — Robot imitation data are often multimodal: similar visual-language observatio...