
AI Paper Digest, May 14, 2026 (HuggingFace Daily Papers)

Source: https://huggingface.co/papers Collected: 2026-05-14

📌 Highlights

  1. Training Long-Context Vision-Language Models Effectively with Generalization Beyond 128K Context | arXiv [Highlight] Long-context modeling is becoming a core capability of modern large vision-la...
  2. LLM Agents Already Know When to Call Tools -- Even Without Reasoning | arXiv [Highlight] Tool-augmented LLM agents tend to call tools indiscriminately, even when the ...
  3. PAAC: Privacy-Aware Agentic Device-Cloud Collaboration | arXiv [Highlight] Large language model (LLM) agents face a structural tension: cloud agents pro...

📋 Other Notable Papers

  1. SleepWalk: A Three-Tier Benchmark for Stress-Testing Instruction-Guided Vision-Language Navigation | arXiv — Vision-Language Models (VLMs) have advanced rapidly in multimodal perception ...
  2. MAP: A Map-then-Act Paradigm for Long-Horizon Interactive Agent Reasoning | arXiv — Current interactive LLM agents rely on goal-conditioned stepwise planning, wh...
  3. Covering Human Action Space for Computer Use: Data Synthesis and Benchmark | arXiv — Computer-use agents (CUAs) automate on-screen work, as illustrated by GPT-5.4...
  4. ORBIT: Preserving Foundational Language Capabilities in GenRetrieval via Origin-Regulated Merging | arXiv — Despite the rapid advancements in large language model (LLM) development, fin...
  5. UniPath: Adaptive Coordination of Understanding and Generation for Unified Multimodal Reasoning | arXiv — Unified multimodal models (UMMs) aim to integrate understanding and generatio...
  6. Context Training with Active Information Seeking | arXiv — Most existing large language models (LLMs) are expensive to adapt after deplo...
  7. Agent-BRACE: Decoupling Beliefs from Actions in Long-Horizon Tasks via Verbalized State Uncertainty | arXiv — Large language models (LLMs) are increasingly deployed on long-horizon tasks ...