HanningZhang/Qwen2.5-Math-7B-raft-plusplus_cliphigher040_em-iter3 Text Generation • 8B • Updated Apr 25, 2025 • 1
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_cliphigher040_em-iter2 Text Generation • 8B • Updated Apr 25, 2025 • 1
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_cliphigher040_em-iter1 Text Generation • 8B • Updated Apr 25, 2025 • 1
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_cliphigher050_em-iter5 Text Generation • 8B • Updated Apr 24, 2025 • 1
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_cliphigher050_em-iter4 Text Generation • 8B • Updated Apr 24, 2025 • 2
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_cliphigher050_em-iter3 Text Generation • 8B • Updated Apr 24, 2025
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_cliphigher050_em-iter2 Text Generation • 8B • Updated Apr 23, 2025 • 1
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_cliphigher050_em-iter1 Text Generation • 8B • Updated Apr 23, 2025 • 1
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter12 Text Generation • 8B • Updated Apr 22, 2025 • 2
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter11 Text Generation • 8B • Updated Apr 22, 2025 • 3
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter10 Text Generation • 8B • Updated Apr 22, 2025 • 2
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter9 Text Generation • 8B • Updated Apr 22, 2025 • 2
HanningZhang/Qwen-7B-grpo-plusplus-nocliphigher-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter8 Text Generation • 8B • Updated Apr 20, 2025 • 1
HanningZhang/Qwen-7B-grpo-plusplus-nocliphigher-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter7 Text Generation • 8B • Updated Apr 20, 2025 • 1
HanningZhang/Qwen-7B-grpo-plusplus-nocliphigher-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter6 Text Generation • 8B • Updated Apr 20, 2025 • 1
HanningZhang/Qwen-7B-grpo-plusplus-nocliphigher-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter5 Text Generation • 8B • Updated Apr 20, 2025 • 1
HanningZhang/Qwen2.5-Math-7B-grpo-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter8 Text Generation • 8B • Updated Apr 19, 2025 • 2
HanningZhang/Qwen2.5-Math-7B-grpo-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter7 Text Generation • 8B • Updated Apr 19, 2025 • 2
HanningZhang/Qwen2.5-Math-7B-grpo-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter6 Text Generation • 8B • Updated Apr 19, 2025 • 2
HanningZhang/Qwen2.5-Math-7B-grpo-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter5 Text Generation • 8B • Updated Apr 19, 2025 • 2
HanningZhang/Qwen-7B-grpo-plusplus-nocliphigher-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter4 Text Generation • 8B • Updated Apr 19, 2025 • 1
HanningZhang/Qwen-7B-grpo-plusplus-nocliphigher-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter3 Text Generation • 8B • Updated Apr 19, 2025 • 1
HanningZhang/Qwen-7B-grpo-plusplus-nocliphigher-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter2 Text Generation • 8B • Updated Apr 19, 2025 • 1
HanningZhang/Qwen-7B-grpo-plusplus-nocliphigher-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter1 Text Generation • 8B • Updated Apr 19, 2025 • 1
HanningZhang/Qwen2.5-Math-7B-grpo-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter4 Text Generation • 8B • Updated Apr 18, 2025 • 2
HanningZhang/Qwen2.5-Math-7B-grpo-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter3 Text Generation • 8B • Updated Apr 18, 2025 • 3
HanningZhang/Qwen2.5-Math-7B-grpo-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter2 Text Generation • 8B • Updated Apr 18, 2025 • 2
HanningZhang/Qwen2.5-Math-7B-grpo-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter1 Text Generation • 8B • Updated Apr 18, 2025 • 2
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter8 Text Generation • 8B • Updated Apr 17, 2025 • 4
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter7 Text Generation • 8B • Updated Apr 17, 2025 • 1