Xiangyi Li

xdotli

https://www.xiangyi.li

AI & ML interests

None yet

Recent Activity

upvoted a paper 11 days ago

Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces

upvoted a paper 15 days ago

StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets?

submitted a paper 15 days ago

SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks

Papers 1

arxiv:2602.12670

Models 0

None public yet

Datasets 4

xdotli/skillsbench-trajectories

Updated Jan 26 • 8.32k • 1

xdotli/xai-clash-eval

Updated Oct 13, 2024 • 16

xdotli/hn

Viewer • Updated Aug 30, 2024 • 642 • 4

xdotli/npr

Viewer • Updated Aug 15, 2024 • 1k • 4.25k