Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces Paper • 2601.11868 • Published Jan 17 • 33
AlignRAG: An Adaptable Framework for Resolving Misalignments in Retrieval-Aware Reasoning of RAG Paper • 2504.14858 • Published Apr 21, 2025 • 4
Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning (2411.18203)
Critic-V has been accepted by CVPR2025! Bonus: VRI-160K has been uploaded now! di-zhang-fdu/R1-Vision-Reasoning-Instructions
ChemVLM has been accepted by AAAI2025! Seeing and Understanding: Bridging Vision with Chemical Knowledge Via ChemVLM (2408.07246)
Try having a chat with it 🤗. AI4Chem/ChemVLM-26B-1-2
Seeing and Understanding: Bridging Vision with Chemical Knowledge Via ChemVLM Paper • 2408.07246 • Published Aug 14, 2024 • 22
The first version of LLaMA-O1 has been uploaded to HF now! Here we come!
Supervised: SimpleBerry/LLaMA-O1-Supervised-1129
Base (Pretrain): SimpleBerry/LLaMA-O1-Base-1127
Supervised Finetune Dataset: SimpleBerry/OpenLongCoT-SFT
Pretraining Dataset: SimpleBerry/OpenLongCoT-Pretrain-1202
RLHF is on the way! View our GitHub repo: https://github.com/SimpleBerry/LLaMA-O1
Our ongoing related research:
Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B (2406.07394)
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning (2410.02884)
Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning (2411.18203)
@AdinaY @akhaliq @jwu323
------
GGUF: https://huggingface.co/Lyte/LLaMA-O1-Supervised-1129-Q4_K_M-GGUF
Online demo (CPU-only): SimpleBerry/LLaMA-O1-Supervised-1129-Demo