# 📄 Research Draft

AI-powered academic abstract generation, 100% local and private.
Research Draft is a lightweight tool that generates high-quality research paper abstracts from uploaded PDFs. It runs entirely on your local machine using a small instruction-tuned language model served through Ollama, with a clean Gradio web interface.
Built as a B.Tech / Data Science final-year project.
## ✨ Features
| Feature | Student | Researcher |
|---|---|---|
| Upload PDF | ✅ | ✅ |
| Generate abstract | ✅ | ✅ |
| Copy abstract | ✅ | ✅ |
| View generation history | ❌ | ✅ |
| Export latest result (.txt) | ❌ | ✅ |
| Export full history (.txt) | ❌ | ✅ |
| Clear history | ❌ | ✅ |
## 🏗️ Architecture
```
┌──────────────┐     ┌─────────────────┐     ┌──────────────────┐
│  Gradio UI   │────▶│    abstract_    │────▶│   pdf_utils.py   │
│   (app.py)   │     │   service.py    │     │  (extract/clean  │
└──────────────┘     │                 │     │       PDF)       │
                     │                 │     └──────────────────┘
                     │                 │     ┌──────────────────┐     ┌─────────────┐
                     │                 │────▶│  llm_client.py   │────▶│   Ollama    │
                     │                 │     │  (Ollama API)    │     │   server    │
                     │                 │     └──────────────────┘     │   (local)   │
                     │                 │     ┌──────────────────┐     │  /api/chat  │
                     │                 │────▶│     history_     │     └─────────────┘
                     └─────────────────┘     │    manager.py    │
                                             │   (JSON store)   │
                                             └──────────────────┘
```
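The data flow above (UI → service → PDF/LLM/history) can be illustrated with a thin orchestration layer. This is a hypothetical sketch, not the project's actual `abstract_service.py`: the function and parameter names are assumptions, and the collaborators are injected so the flow can be exercised without a real PDF or a running Ollama server.

```python
# Hypothetical sketch of the orchestration layer (abstract_service.py).
# The real module's names may differ; dependencies are passed in as
# callables so the pipeline order is explicit and easy to test.

def generate_abstract_from_pdf(pdf_path, extract_text, call_llm, save_history):
    """PDF -> cleaned text -> LLM abstract -> history entry."""
    text = extract_text(pdf_path)      # pdf_utils: extract + clean the PDF text
    abstract = call_llm(text)          # llm_client: query the local Ollama server
    save_history({"source": pdf_path, "abstract": abstract})  # history_manager
    return abstract


if __name__ == "__main__":
    # Stand-ins for the real modules, just to show the call order.
    result = generate_abstract_from_pdf(
        "paper.pdf",
        extract_text=lambda path: "cleaned paper text",
        call_llm=lambda text: "Generated abstract.",
        save_history=lambda entry: None,
    )
    print(result)
```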
## 📁 Project Structure

```
research-draft/
├── app.py                 # Gradio Blocks UI (entry point)
├── pdf_utils.py           # PDF text extraction and cleaning
├── llm_client.py          # Ollama API client
├── history_manager.py     # JSON-based history persistence
├── abstract_service.py    # Orchestration (PDF → LLM → history)
├── requirements.txt       # Python dependencies
├── sample_modelfile.txt   # Ollama Modelfile template
├── data/
│   └── history.json       # Persistent generation history
└── README.md              # This file
```
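As a rough illustration of the extraction-and-cleaning step, the sketch below pairs PyMuPDF's `fitz.open()` / `page.get_text()` API with a simple whitespace-collapse-and-truncate cleaner. The function names and cleaning rules are assumptions, not the project's actual `pdf_utils.py`; only the `MAX_TEXT_CHARS = 12000` default comes from the configuration table.

```python
import re

MAX_TEXT_CHARS = 12000  # default from the configuration table


def clean_text(raw: str, limit: int = MAX_TEXT_CHARS) -> str:
    """Collapse runs of whitespace and truncate so the prompt fits a small model."""
    text = re.sub(r"\s+", " ", raw).strip()
    return text[:limit]


def extract_pdf_text(path: str) -> str:
    """Extract text from every page with PyMuPDF, then clean it."""
    import fitz  # PyMuPDF; imported lazily so clean_text works without it

    with fitz.open(path) as doc:
        raw = "\n".join(page.get_text() for page in doc)
    return clean_text(raw)
```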
## 🚀 Setup Instructions

### Prerequisites

- Python 3.10+
- Ollama installed and running ([Install Ollama](https://ollama.com/download))
- A GGUF model file (e.g., LFM2.5-1.2B-Instruct, Qwen2.5-1.5B-Instruct, or Phi-3-mini)

### Step 1: Clone or download the project

```bash
git clone https://huggingface.co/Arunvarma2565/research-draft
cd research-draft
```

### Step 2: Install Python dependencies

```bash
pip install -r requirements.txt
```

Or install manually:

```bash
pip install gradio PyMuPDF requests
```
### Step 3: Set up the Ollama model

1. Download a GGUF model (e.g., from Hugging Face) and place the `.gguf` file in the project directory, or note its path.
2. Edit `sample_modelfile.txt` and update the `FROM` line to point at your `.gguf` file:

   ```
   FROM /path/to/your/model.gguf
   ```

3. Create the model in Ollama:

   ```bash
   ollama create researchdraft -f sample_modelfile.txt
   ```

4. Verify it works:

   ```bash
   ollama list                        # should show "researchdraft"
   ollama run researchdraft "Hello"   # quick sanity check
   ```
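For reference, a minimal Modelfile in the spirit of `sample_modelfile.txt` might look like the fragment below. Only the `FROM` line is required; the parameter values and system prompt here are illustrative assumptions, not the template's actual contents.

```
# Path to your downloaded GGUF weights (required)
FROM /path/to/your/model.gguf

# Conservative sampling and a context window large enough for paper excerpts
PARAMETER temperature 0.3
PARAMETER num_ctx 4096

# Steer the model toward the abstract-writing task
SYSTEM You are an assistant that writes concise, well-structured academic abstracts.
```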
### Step 4: Start the Ollama server

If Ollama is not already running:

```bash
ollama serve
```

Leave this terminal open.
### Step 5: Launch Research Draft

In a new terminal:

```bash
cd research-draft
python app.py
```

Open your browser at `http://localhost:7860`.
## 📖 How to Use

1. Select your role (Student or Researcher) from the dropdown.
2. Upload a PDF of a research paper.
3. Click **Generate Abstract**.
4. The generated abstract appears on the right. Use the copy button to grab it.
5. (Researcher only) Use the tools below the abstract to view history, export results, or clear history.
## ⚙️ Configuration

| Setting | Location | Default |
|---|---|---|
| Ollama URL | `llm_client.py` → `OLLAMA_BASE_URL` | `http://localhost:11434` |
| Model name | `llm_client.py` → `MODEL_NAME` | `researchdraft` |
| Temperature | `llm_client.py` → `generate_abstract()` | `0.3` |
| Max text chars | `pdf_utils.py` → `MAX_TEXT_CHARS` | `12000` |
| History file | `history_manager.py` → `HISTORY_FILE` | `data/history.json` |
| Server port | `app.py` → `demo.launch()` | `7860` |
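These settings come together in a single call to Ollama's documented `/api/chat` endpoint. Below is an illustrative sketch of what `llm_client.py` might do; the request/response shape follows Ollama's chat API, but the prompt wording and helper names are assumptions, not the project's actual code.

```python
OLLAMA_BASE_URL = "http://localhost:11434"  # defaults from the table above
MODEL_NAME = "researchdraft"


def build_chat_payload(paper_text: str, temperature: float = 0.3) -> dict:
    """Build the JSON body for Ollama's /api/chat endpoint."""
    return {
        "model": MODEL_NAME,
        "stream": False,  # return one complete response instead of tokens
        "options": {"temperature": temperature},
        "messages": [
            {
                "role": "user",
                "content": "Write a concise academic abstract for this paper:\n\n"
                           + paper_text,
            }
        ],
    }


def generate_abstract(paper_text: str) -> str:
    """POST to the local Ollama server and return the generated abstract."""
    import requests  # imported lazily so payload building needs no dependencies

    resp = requests.post(
        f"{OLLAMA_BASE_URL}/api/chat",
        json=build_chat_payload(paper_text),
        timeout=300,  # small models on CPU can be slow
    )
    resp.raise_for_status()
    return resp.json()["message"]["content"]
```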
## 🧩 Tech Stack
| Component | Library / Tool |
|---|---|
| UI | Gradio (Blocks API) |
| PDF parsing | PyMuPDF (fitz) |
| LLM runtime | Ollama (local) |
| HTTP client | requests |
| History storage | JSON file |
| Language | Python 3.10+ |
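The JSON-file history store listed above needs nothing beyond the standard library. This is an illustrative sketch rather than the project's actual `history_manager.py`: the record fields are assumptions, and only the `data/history.json` default matches the configuration table.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

HISTORY_FILE = Path("data/history.json")  # default from the configuration table


def load_history(path: Path = HISTORY_FILE) -> list:
    """Return the stored history, or an empty list if none exists yet."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return []


def append_entry(abstract: str, source: str, path: Path = HISTORY_FILE) -> list:
    """Append one generation record and persist the whole list to disk."""
    history = load_history(path)
    history.append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "source": source,
        "abstract": abstract,
    })
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(history, indent=2), encoding="utf-8")
    return history
```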
## 📌 Sample Models That Work Well
| Model | Size | Notes |
|---|---|---|
| LFM2.5-1.2B-Instruct | ~1.2 B | Lightweight, good for CPU |
| Qwen2.5-1.5B-Instruct | ~1.5 B | Strong instruction following |
| Phi-3-mini-4k-instruct | ~3.8 B | Higher quality, needs more RAM |
| Llama-3.2-3B-Instruct | ~3.2 B | Good balance of speed and quality |
All models should be in GGUF format (Q4_K_M or Q5_K_M quantisation recommended).
## 🔮 Future Improvements
- Multi-PDF batch processing: upload several papers and generate abstracts in bulk.
- Abstract comparison: compare generated vs. original abstract side-by-side.
- Keyword extraction: automatically extract key terms from the paper.
- Citation-aware chunking: smarter text splitting that preserves section boundaries.
- SQLite backend: replace JSON history with SQLite for better querying.
- User authentication: simple login to separate Student/Researcher sessions.
- PDF preview: render the first page of the uploaded PDF in the UI.
- Streaming output: show the abstract being generated token by token.
- Fine-tuned model: fine-tune a small model on abstract-generation pairs for better quality.
- Evaluation metrics: add ROUGE / BERTScore comparison against original abstracts.
## 📄 License
This project is for educational purposes (B.Tech final-year project). Use it freely for learning and research.
## 🙏 Acknowledgements

- Ollama: local LLM serving
- Gradio: web UI framework
- PyMuPDF: PDF text extraction
- Hugging Face: model hub and community