ChipNeMo: Domain-Adapted LLMs for Chip Design
Paper
• 2311.00176
• Published • 9
Language Models can be Logical Solvers
Paper
• 2311.06158
• Published • 20
JARVIS-1: Open-World Multi-task Agents with Memory-Augmented Multimodal Language Models
Paper
• 2311.05997
• Published • 37
Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs
Paper
• 2311.05657
• Published • 30
JaxMARL: Multi-Agent RL Environments in JAX
Paper
• 2311.10090
• Published • 8
ML-Bench: Large Language Models Leverage Open-source Libraries for Machine Learning Tasks
Paper
• 2311.09835
• Published • 11
Large Language Models for Mathematicians
Paper
• 2312.04556
• Published • 12
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model
Paper
• 2312.11370
• Published • 20
Boundary Attention: Learning to Find Faint Boundaries at Any Resolution
Paper
• 2401.00935
• Published • 18
Teaching Large Language Models to Reason with Reinforcement Learning
Paper
• 2403.04642
• Published • 48
LLM Agent Operating System
Paper
• 2403.16971
• Published • 73
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
Paper
• 2501.04519
• Published • 290
Evolving Deeper LLM Thinking
Paper
• 2501.09891
• Published • 115
AgentRxiv: Towards Collaborative Autonomous Research
Paper
• 2503.18102
• Published • 25
TTRL: Test-Time Reinforcement Learning
Paper
• 2504.16084
• Published • 122
Learning Adaptive Parallel Reasoning with Language Models
Paper
• 2504.15466
• Published • 44
LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities
Paper
• 2504.16078
• Published • 21
FlowReasoner: Reinforcing Query-Level Meta-Agents
Paper
• 2504.15257
• Published • 47
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning
Paper
• 2504.17192
• Published • 124
AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset
Paper
• 2504.16891
• Published • 26
Skywork R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning
Paper
• 2504.16656
• Published • 57
Flow-GRPO: Training Flow Matching Models via Online RL
Paper
• 2505.05470
• Published • 88
Measuring General Intelligence with Generated Games
Paper
• 2505.07215
• Published • 11
Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles
Paper
• 2505.19914
• Published • 46
LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming?
Paper
• 2506.11928
• Published • 25
Xolver: Multi-Agent Reasoning with Holistic Experience Learning Just Like an Olympiad Team
Paper
• 2506.14234
• Published • 41
ALE-Bench: A Benchmark for Long-Horizon Objective-Driven Algorithm Engineering
Paper
• 2506.09050
• Published • 6
ProtoReasoning: Prototypes as the Foundation for Generalizable Reasoning in LLMs
Paper
• 2506.15211
• Published • 39
SwarmAgentic: Towards Fully Automated Agentic System Generation via Swarm Intelligence
Paper
• 2506.15672
• Published • 15
NovelSeek: When Agent Becomes the Scientist -- Building Closed-Loop System from Hypothesis to Verification
Paper
• 2505.16938
• Published • 121
DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation
Paper
• 2506.20639
• Published • 31
Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities
Paper
• 2507.13158
• Published • 24
CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning
Paper
• 2507.14111
• Published • 25
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models
Paper
• 2508.09834
• Published • 53
Deep Think with Confidence
Paper
• 2508.15260
• Published • 90
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
Paper
• 2509.07980
• Published • 105
Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models
Paper
• 2509.06949
• Published • 56
Paper2Agent: Reimagining Research Papers As Interactive and Reliable AI Agents
Paper
• 2509.06917
• Published • 44
The Majority is not always right: RL training for solution aggregation
Paper
• 2509.06870
• Published • 15
SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks?
Paper
• 2509.16941
• Published • 21
UltraHorizon: Benchmarking Agent Capabilities in Ultra Long-Horizon Scenarios
Paper
• 2509.21766
• Published • 24
Less is More: Recursive Reasoning with Tiny Networks
Paper
• 2510.04871
• Published • 511
MLE-Smith: Scaling MLE Tasks with Automated Multi-Agent Pipeline
Paper
• 2510.07307
• Published • 6
Parallel Test-Time Scaling for Latent Reasoning Models
Paper
• 2510.07745
• Published • 7
LaSeR: Reinforcement Learning with Last-Token Self-Rewarding
Paper
• 2510.14943
• Published • 40
Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge
Paper
• 2601.08808
• Published • 39
Accelerating Scientific Research with Gemini: Case Studies and Common Techniques
Paper
• 2602.03837
• Published • 5