arxiv:2602.12670
Xiangyi Li
xdotli
AI & ML interests
None yet
Recent Activity
upvoted a paper 11 days ago
Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces
upvoted a paper 15 days ago
StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets?
submitted a paper 15 days ago
SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks