581

Joakim Lee

Reinforcement4All

AI & ML interests

None yet

Organizations

None yet

upvoted 6 papers 2 days ago

DiffCoT: Diffusion-styled Chain-of-Thought Reasoning in LLMs

Paper • 2601.03559 • Published 4 days ago • 7

VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control

Paper • 2601.05138 • Published 3 days ago • 12

AT^2PO: Agentic Turn-based Policy Optimization via Tree Search

Paper • 2601.04767 • Published 3 days ago • 22

RelayLLM: Efficient Reasoning via Collaborative Decoding

Paper • 2601.05167 • Published 3 days ago • 24

Learnable Multipliers: Freeing the Scale of Language Model Matrix Layers

Paper • 2601.04890 • Published 3 days ago • 35

GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Paper • 2601.05242 • Published 2 days ago • 125

upvoted 3 papers 3 days ago

Agentic Rubrics as Contextual Verifiers for SWE Agents

Paper • 2601.04171 • Published 3 days ago • 9

Benchmark^2: Systematic Evaluation of LLM Benchmarks

Paper • 2601.03986 • Published 4 days ago • 30

Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting

Paper • 2601.02151 • Published 6 days ago • 86

upvoted 10 papers 4 days ago

InfiniDepth: Arbitrary-Resolution and Fine-Grained Depth Estimation with Neural Implicit Fields

Paper • 2601.03252 • Published 4 days ago • 93

WebGym: Scaling Training Environments for Visual Web Agents with Realistic Tasks

Paper • 2601.02439 • Published 6 days ago • 15

AceFF: A State-of-the-Art Machine Learning Potential for Small Molecules

Paper • 2601.00581 • Published 9 days ago • 1

FFP-300K: Scaling First-Frame Propagation for Generalizable Video Editing

Paper • 2601.01720 • Published 6 days ago • 4

SOP: A Scalable Online Post-Training System for Vision-Language-Action Models

Paper • 2601.03044 • Published 5 days ago • 26

CogFlow: Bridging Perception and Reasoning through Knowledge Internalization for Visual Mathematical Problem Solving

Paper • 2601.01874 • Published 6 days ago • 17

MiMo-V2-Flash Technical Report

Paper • 2601.02780 • Published 5 days ago • 25

NitroGen: An Open Foundation Model for Generalist Gaming Agents

Paper • 2601.02427 • Published 7 days ago • 35

UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision

Paper • 2601.03193 • Published 5 days ago • 42

SciEvalKit: An Open-source Evaluation Toolkit for Scientific General Intelligence

Paper • 2512.22334 • Published 16 days ago • 34

upvoted a paper 5 days ago

Recursive Language Models

Paper • 2512.24601 • Published 11 days ago • 49