BidirLM Collection BidirLM is a family of 5 frontier bidirectional encoders, including an omnimodal variant at 2.5B. • 8 items • Updated 18 days ago • 1
Article DenseOn with the LateOn: Open State-of-the-Art Single and Multi-Vector Models 12 days ago • 36
Embarrassingly Simple Self-Distillation Improves Code Generation Paper • 2604.01193 • Published Apr 1 • 47
TRACER: Trace-Based Adaptive Cost-Efficient Routing for LLM Classification Paper • 2604.14531 • Published 18 days ago • 7
Article How I contributed a new model to the Transformers library using Codex Mar 30 • 50
fiNERweb Collection A multilingual dataset for NER covering 91 languages and 25 scripts • 3 items • Updated Dec 16, 2025 • 3
Fine-tune ready versions of the LLMSQL benchmark Collection This collection contains the versions of the benchmark in fine-tune ready format • 2 items • Updated Mar 4 • 1
Beyond Language Modeling: An Exploration of Multimodal Pretraining Paper • 2603.03276 • Published Mar 3 • 103
The Million-Label NER: Breaking Scale Barriers with GLiNER bi-encoder Paper • 2602.18487 • Published Feb 11 • 6
Optimal Turkish Subword Strategies at Scale: Systematic Evaluation of Data, Vocabulary, Morphology Interplay Paper • 2602.06942 • Published Feb 6 • 3