# 📄 Research Draft

AI-powered academic abstract generation, 100% local and private.
Research Draft is a lightweight tool that generates high-quality research paper abstracts from uploaded PDFs. It runs entirely on your local machine using a small instruction-tuned language model served through Ollama, with a clean Gradio web interface.
Built as a B.Tech / Data Science final-year project.
## ✨ Features
| Feature | Student | Researcher |
|---|---|---|
| Upload PDF | ✅ | ✅ |
| Generate abstract | ✅ | ✅ |
| Copy abstract | ✅ | ✅ |
| View generation history | ❌ | ✅ |
| Export latest result (.txt) | ❌ | ✅ |
| Export full history (.txt) | ❌ | ✅ |
| Clear history | ❌ | ✅ |
## 🏗️ Architecture
```
┌──────────────┐     ┌─────────────────┐     ┌──────────────────┐
│  Gradio UI   │────▶│    abstract_    │────▶│   pdf_utils.py   │
│   (app.py)   │     │   service.py    │     │  (extract/clean  │
└──────────────┘     │                 │     │       PDF)       │
                     │                 │     └──────────────────┘
                     │                 │     ┌──────────────────┐     ┌─────────────┐
                     │                 │────▶│  llm_client.py   │────▶│   Ollama    │
                     │                 │     │  (Ollama API)    │     │   server    │
                     │                 │     └──────────────────┘     │   (local)   │
                     │                 │     ┌──────────────────┐     │  /api/chat  │
                     │                 │────▶│     history_     │     └─────────────┘
                     └─────────────────┘     │    manager.py    │
                                             │   (JSON store)   │
                                             └──────────────────┘
```
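The data flow above (UI → service → PDF/LLM/history) can be illustrated with a thin orchestration layer. This is a hypothetical sketch, not the project's actual `abstract_service.py`: the function and parameter names are assumptions, and the collaborators are injected so the flow can be exercised without a real PDF or a running Ollama server.

```python
# Hypothetical sketch of the orchestration layer (abstract_service.py).
# The real module's names may differ; dependencies are passed in as
# callables so the pipeline order is explicit and easy to test.

def generate_abstract_from_pdf(pdf_path, extract_text, call_llm, save_history):
    """PDF -> cleaned text -> LLM abstract -> history entry."""
    text = extract_text(pdf_path)      # pdf_utils: extract + clean the PDF text
    abstract = call_llm(text)          # llm_client: query the local Ollama server
    save_history({"source": pdf_path, "abstract": abstract})  # history_manager
    return abstract


if __name__ == "__main__":
    # Stand-ins for the real modules, just to show the call order.
    result = generate_abstract_from_pdf(
        "paper.pdf",
        extract_text=lambda path: "cleaned paper text",
        call_llm=lambda text: "Generated abstract.",
        save_history=lambda entry: None,
    )
    print(result)
```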
## 📁 Project Structure

```
research-draft/
├── app.py                 # Gradio Blocks UI (entry point)
├── pdf_utils.py           # PDF text extraction and cleaning
├── llm_client.py          # Ollama API client
├── history_manager.py     # JSON-based history persistence
├── abstract_service.py    # Orchestration (PDF → LLM → history)
├── requirements.txt       # Python dependencies
├── sample_modelfile.txt   # Ollama Modelfile template
├── data/
│   └── history.json       # Persistent generation history
└── README.md              # This file
```
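As a rough illustration of the extraction-and-cleaning step, the sketch below pairs PyMuPDF's `fitz.open()` / `page.get_text()` API with a simple whitespace-collapse-and-truncate cleaner. The function names and cleaning rules are assumptions, not the project's actual `pdf_utils.py`; only the `MAX_TEXT_CHARS = 12000` default comes from the configuration table.

```python
import re

MAX_TEXT_CHARS = 12000  # default from the configuration table


def clean_text(raw: str, limit: int = MAX_TEXT_CHARS) -> str:
    """Collapse runs of whitespace and truncate so the prompt fits a small model."""
    text = re.sub(r"\s+", " ", raw).strip()
    return text[:limit]


def extract_pdf_text(path: str) -> str:
    """Extract text from every page with PyMuPDF, then clean it."""
    import fitz  # PyMuPDF; imported lazily so clean_text works without it

    with fitz.open(path) as doc:
        raw = "\n".join(page.get_text() for page in doc)
    return clean_text(raw)
```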
## 🚀 Setup Instructions

### Prerequisites

- Python 3.10+
- Ollama installed and running ([Install Ollama](https://ollama.com/download))
- A GGUF model file (e.g., LFM2.5-1.2B-Instruct, Qwen2.5-1.5B-Instruct, or Phi-3-mini)

### Step 1: Clone or download the project

```bash
git clone https://huggingface.co/Arunvarma2565/research-draft
cd research-draft
```

### Step 2: Install Python dependencies

```bash
pip install -r requirements.txt
```

Or install manually:

```bash
pip install gradio PyMuPDF requests
```
### Step 3: Set up the Ollama model

1. Download a GGUF model (e.g., from Hugging Face) and place the `.gguf` file in the project directory, or note its path.
2. Edit `sample_modelfile.txt` and update the `FROM` line to point at your `.gguf` file:

   ```
   FROM /path/to/your/model.gguf
   ```

3. Create the model in Ollama:

   ```bash
   ollama create researchdraft -f sample_modelfile.txt
   ```

4. Verify it works:

   ```bash
   ollama list                        # should show "researchdraft"
   ollama run researchdraft "Hello"   # quick sanity check
   ```
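For reference, a minimal Modelfile in the spirit of `sample_modelfile.txt` might look like the fragment below. Only the `FROM` line is required; the parameter values and system prompt here are illustrative assumptions, not the template's actual contents.

```
# Path to your downloaded GGUF weights (required)
FROM /path/to/your/model.gguf

# Conservative sampling and a context window large enough for paper excerpts
PARAMETER temperature 0.3
PARAMETER num_ctx 4096

# Steer the model toward the abstract-writing task
SYSTEM You are an assistant that writes concise, well-structured academic abstracts.
```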
### Step 4: Start the Ollama server

If Ollama is not already running:

```bash
ollama serve
```

Leave this terminal open.
### Step 5: Launch Research Draft

In a new terminal:

```bash
cd research-draft
python app.py
```

Open your browser at `http://localhost:7860`.
## 📖 How to Use

1. Select your role (Student or Researcher) from the dropdown.
2. Upload a PDF of a research paper.
3. Click **Generate Abstract**.
4. The generated abstract appears on the right. Use the copy button to grab it.
5. (Researcher only) Use the tools below the abstract to view history, export results, or clear history.
## ⚙️ Configuration

| Setting | Location | Default |
|---|---|---|
| Ollama URL | `llm_client.py` → `OLLAMA_BASE_URL` | `http://localhost:11434` |
| Model name | `llm_client.py` → `MODEL_NAME` | `researchdraft` |
| Temperature | `llm_client.py` → `generate_abstract()` | `0.3` |
| Max text chars | `pdf_utils.py` → `MAX_TEXT_CHARS` | `12000` |
| History file | `history_manager.py` → `HISTORY_FILE` | `data/history.json` |
| Server port | `app.py` → `demo.launch()` | `7860` |
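These settings come together in a single call to Ollama's documented `/api/chat` endpoint. Below is an illustrative sketch of what `llm_client.py` might do; the request/response shape follows Ollama's chat API, but the prompt wording and helper names are assumptions, not the project's actual code.

```python
OLLAMA_BASE_URL = "http://localhost:11434"  # defaults from the table above
MODEL_NAME = "researchdraft"


def build_chat_payload(paper_text: str, temperature: float = 0.3) -> dict:
    """Build the JSON body for Ollama's /api/chat endpoint."""
    return {
        "model": MODEL_NAME,
        "stream": False,  # return one complete response instead of tokens
        "options": {"temperature": temperature},
        "messages": [
            {
                "role": "user",
                "content": "Write a concise academic abstract for this paper:\n\n"
                           + paper_text,
            }
        ],
    }


def generate_abstract(paper_text: str) -> str:
    """POST to the local Ollama server and return the generated abstract."""
    import requests  # imported lazily so payload building needs no dependencies

    resp = requests.post(
        f"{OLLAMA_BASE_URL}/api/chat",
        json=build_chat_payload(paper_text),
        timeout=300,  # small models on CPU can be slow
    )
    resp.raise_for_status()
    return resp.json()["message"]["content"]
```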
## 🧩 Tech Stack
| Component | Library / Tool |
|---|---|
| UI | Gradio (Blocks API) |
| PDF parsing | PyMuPDF (fitz) |
| LLM runtime | Ollama (local) |
| HTTP client | requests |
| History storage | JSON file |
| Language | Python 3.10+ |
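The JSON-file history store listed above needs nothing beyond the standard library. This is an illustrative sketch rather than the project's actual `history_manager.py`: the record fields are assumptions, and only the `data/history.json` default matches the configuration table.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

HISTORY_FILE = Path("data/history.json")  # default from the configuration table


def load_history(path: Path = HISTORY_FILE) -> list:
    """Return the stored history, or an empty list if none exists yet."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return []


def append_entry(abstract: str, source: str, path: Path = HISTORY_FILE) -> list:
    """Append one generation record and persist the whole list to disk."""
    history = load_history(path)
    history.append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "source": source,
        "abstract": abstract,
    })
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(history, indent=2), encoding="utf-8")
    return history
```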
## 📌 Sample Models That Work Well
| Model | Size | Notes |
|---|---|---|
| LFM2.5-1.2B-Instruct | ~1.2 B | Lightweight, good for CPU |
| Qwen2.5-1.5B-Instruct | ~1.5 B | Strong instruction following |
| Phi-3-mini-4k-instruct | ~3.8 B | Higher quality, needs more RAM |
| Llama-3.2-3B-Instruct | ~3.2 B | Good balance of speed and quality |
All models should be in GGUF format (Q4_K_M or Q5_K_M quantisation recommended).
## 🔮 Future Improvements
- Multi-PDF batch processing: upload several papers and generate abstracts in bulk.
- Abstract comparison: compare generated vs. original abstract side-by-side.
- Keyword extraction: automatically extract key terms from the paper.
- Citation-aware chunking: smarter text splitting that preserves section boundaries.
- SQLite backend: replace JSON history with SQLite for better querying.
- User authentication: simple login to separate Student/Researcher sessions.
- PDF preview: render the first page of the uploaded PDF in the UI.
- Streaming output: show the abstract being generated token by token.
- Fine-tuned model: fine-tune a small model on abstract-generation pairs for better quality.
- Evaluation metrics: add ROUGE / BERTScore comparison against original abstracts.
## 📄 License
This project is for educational purposes (B.Tech final-year project). Use it freely for learning and research.
## 🙏 Acknowledgements

- Ollama: local LLM serving
- Gradio: web UI framework
- PyMuPDF: PDF text extraction
- Hugging Face: model hub and community