When Gradients Collide: Failure Modes of Multi-Objective Prompt Optimization for LLM Judges Paper • 2605.26046 • Published 16 days ago
PRECISE: Reducing the Bias of LLM Evaluations Using Prediction-Powered Ranking Estimation Paper • 2601.18777 • Published Jan 26
Benchmarking datasets for Anomaly-based Network Intrusion Detection: KDD CUP 99 alternatives Paper • 1811.05372 • Published Nov 13, 2018
SynthesizRR: Generating Diverse Datasets with Retrieval Augmentation Paper • 2405.10040 • Published May 16, 2024