AI-Insight's Collections
💡HF Papers Live 1: Reinforcement Learning

updated Dec 3, 2025

  • Absolute Zero: Reinforced Self-play Reasoning with Zero Data

    Paper • 2505.03335 • Published May 6, 2025 • 188

  • Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

    Paper • 2502.06781 • Published Feb 10, 2025 • 58

  • internlm/OREAL-7B

    Text Generation • 8B • Updated Feb 24, 2025 • 68 • 20

  • internlm/OREAL-32B

    Text Generation • 33B • Updated Feb 24, 2025 • 100 • 24

  • XiaomiMiMo/MiMo-VL-7B-RL

    Image-Text-to-Text • 8B • Updated Jun 7, 2025 • 1.59k • 167