💡HF Papers Live 1: Reinforcement Learning - an AI-Insight Collection

updated Dec 3, 2025

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6, 2025 • 188

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

Paper • 2502.06781 • Published Feb 10, 2025 • 58

internlm/OREAL-7B

Text Generation • 8B • Updated Feb 24, 2025 • 68 • 20

internlm/OREAL-32B

Text Generation • 33B • Updated Feb 24, 2025 • 100 • 24

XiaomiMiMo/MiMo-VL-7B-RL

Image-Text-to-Text • 8B • Updated Jun 7, 2025 • 1.59k • 167