Contexts are Never Long Enough: Structured Reasoning for Scalable Question Answering over Long Document Sets
Abstract
SLIDERS enables scalable document question answering by extracting information into a relational database and using structured reasoning via SQL instead of traditional chunk-based aggregation methods.
Real-world document question answering is challenging: analysts must synthesize evidence across multiple documents and across different parts of each document, yet any fixed LLM context window can be exceeded as document collections grow. A common workaround is to decompose documents into chunks and assemble answers from chunk-level outputs, but this introduces an aggregation bottleneck: as the number of chunks grows, systems must still combine and reason over an increasingly large body of extracted evidence. We present SLIDERS, a framework for question answering over long document collections through structured reasoning. SLIDERS extracts salient information into a relational database, enabling scalable reasoning over persistent structured state via SQL rather than concatenated text. To make this locally extracted representation globally coherent, SLIDERS introduces a data reconciliation stage that leverages provenance, extraction rationales, and metadata to detect and repair duplicated, inconsistent, and incomplete records. SLIDERS outperforms all baselines on three existing long-context benchmarks, even though all three fit within the context window of strong base LLMs, exceeding GPT-4.1 by 6.6 points on average. It also improves over the next best baseline by ~19 and ~32 points on two new benchmarks of 3.9M and 36M tokens, respectively.
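The pipeline the abstract describes can be sketched minimally with SQLite: extracted facts become rows with provenance and rationales, a reconciliation pass collapses duplicated records, and cross-document questions are answered with SQL over the persistent table rather than over concatenated text. This is an illustrative sketch, not the paper's implementation; the schema, field names, and hard-coded rows (standing in for LLM extraction output) are all assumptions.

```python
import sqlite3

# Facts extracted per chunk are stored with provenance (doc_id) and an
# extraction rationale, which the abstract says are used for reconciliation.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE facts (
        doc_id    TEXT,   -- provenance: source document of the fact
        entity    TEXT,   -- subject of the extracted fact
        attribute TEXT,   -- attribute name
        value     TEXT,   -- attribute value
        rationale TEXT    -- extraction rationale, kept for auditing/repair
    )
""")

# Hypothetical rows standing in for chunk-level LLM extraction output.
rows = [
    ("doc1", "AcmeCorp", "revenue_2023", "12000000", "stated in 10-K filing"),
    ("doc2", "AcmeCorp", "revenue_2023", "12000000", "repeated in press release"),
    ("doc3", "BetaInc",  "revenue_2023", "8000000",  "stated in annual report"),
]
conn.executemany("INSERT INTO facts VALUES (?, ?, ?, ?, ?)", rows)

# Reconciliation sketch: collapse duplicated records (same entity,
# attribute, and value seen in different documents), keeping source counts.
dedup = conn.execute("""
    SELECT entity, attribute, value, COUNT(DISTINCT doc_id) AS sources
    FROM facts
    GROUP BY entity, attribute, value
""").fetchall()

# Structured reasoning: a cross-document aggregate expressed as SQL,
# which scales with the database rather than with the context window.
total = conn.execute("""
    SELECT SUM(CAST(value AS INTEGER))
    FROM (SELECT DISTINCT entity, attribute, value
          FROM facts
          WHERE attribute = 'revenue_2023')
""").fetchone()[0]
print(total)  # 20000000
```

The key property being illustrated is that the aggregation step runs inside the database, so answer assembly does not require re-reading all extracted evidence through an LLM context.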
Community
Automated message from the Librarian Bot: the following similar papers were recommended by the Semantic Scholar API.
- SPD-RAG: Sub-Agent Per Document Retrieval-Augmented Generation (2026)
- DocSage: An Information Structuring Agent for Multi-Doc Multi-Entity Question Answering (2026)
- Navigating Large-Scale Document Collections: MuDABench for Multi-Document Analytical QA (2026)
- Halo: Domain-Aware Query Optimization for Long-Context Question Answering (2026)
- SAGE: Selective Attention-Guided Extraction for Token-Efficient Document Indexing (2026)
- Document-Level Numerical Reasoning across Single and Multiple Tables in Financial Reports (2026)
- MDER-DR: Multi-Hop Question Answering with Entity-Centric Summaries (2026)