mlp - a huez Collection

huez's Collections

mlp

updated 6 days ago

DA-DPO: Cost-efficient Difficulty-aware Preference Optimization for Reducing MLLM Hallucinations

Paper • 2601.00623 • Published Jan 2
TeaRAG: A Token-Efficient Agentic Retrieval-Augmented Generation Framework

Paper • 2511.05385 • Published Nov 7, 2025
Pre-DPO: Improving Data Utilization in Direct Preference Optimization Using a Guiding Reference Model

Paper • 2504.15843 • Published Apr 22, 2025 • 16
VerIPO: Cultivating Long Reasoning in Video-LLMs via Verifier-Guided Iterative Policy Optimization

Paper • 2505.19000 • Published May 25, 2025 • 42
OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference

Paper • 2502.18411 • Published Feb 25, 2025 • 74