AI Paper Digest, 2026-05-12 (HuggingFace Daily Papers)

Source: https://huggingface.co/papers | Collected: 2026-05-12

📌 Highlights

HyperEyes: Dual-Grained Efficiency-Aware Reinforcement Learning for Parallel Multimodal Search Agents | arXiv — [Highlight] Existing multimodal search agents process target entities sequentially, issui... 💡 Parallel multimodal search for better efficiency - offers ideas for optimizing mobile AI search
Q-RAG: Long Context Multi-step Retrieval via Value-based Embedder Training | arXiv — [Highlight] Retrieval-Augmented Generation (RAG) methods enhance LLM performance by effic... 💡 Long-context multi-step retrieval that addresses the long-term memory problem in RAG - important for agent memory architectures
PrefixGuard: From LLM-Agent Traces to Online Failure-Warning Monitors | arXiv — [Highlight] Large language model (LLM) agents now execute long, tool-using tasks where fi... 💡 Failure-warning mechanism for agents that improves stability - valuable for industrial-grade agent deployment

AEM: Adaptive Entropy Modulation for Multi-Turn Agentic Reinforcement Learning | arXiv — Reinforcement learning (RL) has substantially improved the ability of large l...
BalCapRL: A Balanced Framework for RL-Based MLLM Image Captioning | arXiv — Image captioning is one of the most fundamental tasks in computer vision. Owi...
MLS-Bench: A Holistic and Rigorous Assessment of AI Systems on Building Better AI | arXiv — Modern AI progress has been driven by ML methods that are generalizable acros...
Anisotropic Modality Align | arXiv — Training multimodal large language models has long been limited by the scarci...
MDN: Parallelizing Stepwise Momentum for Delta Linear Attention | arXiv — Linear Attention (LA) offers a promising paradigm for scaling large language ...
Sparse Autoencoders as Plug-and-Play Firewalls for Adversarial Attack Detection in VLMs | arXiv — Vision-language models (VLMs) have advanced rapidly and are increasingly depl...
UniSD: Towards a Unified Self-Distillation Framework for Large Language Models | arXiv — Self-distillation (SD) offers a promising path for adapting large language mo...