ERASE: Error-Resilient Representation Learning on Graphs for Label Noise Tolerance Paper • 2312.08852 • Published Dec 13, 2023
InfoMosaic-Bench: Evaluating Multi-Source Information Seeking in Tool-Augmented Agents Paper • 2510.02271 • Published Oct 2, 2025 • 8
AgentIF-OneDay: A Task-level Instruction-Following Benchmark for General AI Agents in Daily Scenarios Paper • 2601.20613 • Published 25 days ago • 10
Toward Ultra-Long-Horizon Agentic Science: Cognitive Accumulation for Machine Learning Engineering Paper • 2601.10402 • Published Jan 15, 2026 • 37
AgentFold: Long-Horizon Web Agents with Proactive Context Management Paper • 2510.24699 • Published Oct 28, 2025 • 71