
📄 Research Draft

AI-powered academic abstract generation – 100% local and private.

Research Draft is a lightweight tool that generates high-quality research paper abstracts from uploaded PDFs. It runs entirely on your local machine using a small instruction-tuned language model served through Ollama, with a clean Gradio web interface.

Built as a B.Tech / Data Science final-year project.


✨ Features

| Feature                      | Student | Researcher |
|------------------------------|---------|------------|
| Upload PDF                   | ✅      | ✅         |
| Generate abstract            | ✅      | ✅         |
| Copy abstract                | ✅      | ✅         |
| View generation history      | –       | ✅         |
| Export latest result (.txt)  | –       | ✅         |
| Export full history (.txt)   | –       | ✅         |
| Clear history                | –       | ✅         |
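The role gating in the table above amounts to a simple mapping from role to allowed features. A minimal sketch, assuming a hypothetical `ROLE_FEATURES` structure (the real app.py may wire this differently through Gradio component visibility):

```python
# Features available to each role; Researcher is a superset of Student.
# Names here are illustrative stand-ins, not the actual app.py identifiers.
ROLE_FEATURES = {
    "Student": {"upload_pdf", "generate_abstract", "copy_abstract"},
    "Researcher": {
        "upload_pdf", "generate_abstract", "copy_abstract",
        "view_history", "export_latest", "export_history", "clear_history",
    },
}

def is_allowed(role: str, feature: str) -> bool:
    """Return True if the given role may use the given feature."""
    return feature in ROLE_FEATURES.get(role, set())
```

In the UI this kind of check would decide which history/export controls are shown after the role dropdown changes.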

๐Ÿ—๏ธ Architecture

┌─────────────┐     ┌──────────────┐     ┌────────────────┐     ┌───────────┐
│  Gradio UI  │────▶│ abstract_    │────▶│  pdf_utils.py  │     │  Ollama   │
│  (app.py)   │     │ service.py   │     │  (extract/     │     │  Server   │
│             │◀────│              │     │   clean PDF)   │     │ (local)   │
└─────────────┘     │              │     └────────────────┘     │           │
                    │              │────▶┌────────────────┐     │           │
                    │              │     │  llm_client.py │────▶│ /api/chat │
                    │              │◀────│  (Ollama API)  │◀────│           │
                    │              │     └────────────────┘     └───────────┘
                    │              │────▶┌────────────────┐
                    └──────────────┘     │  history_      │
                                         │  manager.py    │
                                         │  (JSON store)  │
                                         └────────────────┘
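The flow in the diagram can be sketched as one orchestration function. This is a minimal outline with injected callables standing in for the real module APIs – all names here are illustrative, not the actual abstract_service.py code:

```python
# Hypothetical orchestration mirroring the architecture diagram:
# PDF -> text extraction -> LLM -> history. The three callables are
# stand-ins for pdf_utils, llm_client, and history_manager functions.

def generate_abstract_from_pdf(pdf_path, extract_text, call_llm, save_entry):
    """Run the full pipeline using injected callables for each stage."""
    text = extract_text(pdf_path)     # pdf_utils.py stage
    abstract = call_llm(text)         # llm_client.py stage
    save_entry(pdf_path, abstract)    # history_manager.py stage
    return abstract
```

Keeping the stages behind plain callables like this makes the pipeline easy to test without a running Ollama server.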

📂 Project Structure

research-draft/
├── app.py                  # Gradio Blocks UI – entry point
├── pdf_utils.py            # PDF text extraction and cleaning
├── llm_client.py           # Ollama API client
├── history_manager.py      # JSON-based history persistence
├── abstract_service.py     # Orchestration (PDF → LLM → history)
├── requirements.txt        # Python dependencies
├── sample_modelfile.txt    # Ollama Modelfile template
├── data/
│   └── history.json        # Persistent generation history
└── README.md               # This file
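Before the extracted text reaches the model, pdf_utils.py needs a cleaning pass. A minimal sketch of what that pass might look like – the function name is hypothetical, though the 12,000-character cap matches the project's default `MAX_TEXT_CHARS` setting:

```python
import re

MAX_TEXT_CHARS = 12_000  # cap on how much extracted text is sent to the LLM

def clean_pdf_text(raw: str, max_chars: int = MAX_TEXT_CHARS) -> str:
    """Normalise whitespace in extracted PDF text and truncate it."""
    text = re.sub(r"-\n(?=\w)", "", raw)      # join words hyphenated at line breaks
    text = re.sub(r"\s+", " ", text).strip()  # collapse runs of whitespace
    return text[:max_chars]
```

Truncating keeps the prompt inside the small model's context window; anything past the cap is simply dropped.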

🚀 Setup Instructions

Prerequisites

  • Python 3.10+
  • Ollama installed and running – Install Ollama
  • A GGUF model file (e.g., LFM2.5-1.2B-Instruct, Qwen2.5-1.5B-Instruct, or Phi-3-mini)

Step 1 – Clone or download the project

git clone https://huggingface.co/Arunvarma2565/research-draft
cd research-draft

Step 2 – Install Python dependencies

pip install -r requirements.txt

Or install manually:

pip install gradio PyMuPDF requests

Step 3 – Set up the Ollama model

  1. Download a GGUF model (e.g., from Hugging Face). Place the .gguf file in the project directory or note its path.

  2. Edit sample_modelfile.txt โ€” update the FROM line to point at your .gguf file:

    FROM /path/to/your/model.gguf
    
  3. Create the model in Ollama:

    ollama create researchdraft -f sample_modelfile.txt
    
  4. Verify it works:

    ollama list                         # should show "researchdraft"
    ollama run researchdraft "Hello"    # quick sanity check
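For reference, a filled-in sample_modelfile.txt might look roughly like this – the path is a placeholder, and the parameter values and system prompt shown here are illustrative, not necessarily the shipped template:

```text
FROM /path/to/your/model.gguf

PARAMETER temperature 0.3
PARAMETER num_ctx 4096

SYSTEM You are an assistant that writes concise academic abstracts from research paper text.
```

`FROM`, `PARAMETER`, and `SYSTEM` are standard Ollama Modelfile directives; the system prompt baked in here is what makes the bare `researchdraft` model answer in abstract-writing mode.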
    

Step 4 – Start the Ollama server

If Ollama is not already running:

ollama serve

Leave this terminal open.
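If you want to verify the server is reachable before launching the UI, a small stdlib-only check works – `/api/tags` is Ollama's model-listing endpoint, though this helper itself is illustrative and not part of the project:

```python
import urllib.error
import urllib.request

def ollama_is_running(base_url: str = "http://localhost:11434") -> bool:
    """Return True if an Ollama server answers on base_url."""
    try:
        # /api/tags lists installed models and responds 200 when the
        # server is up; any connection error means it is not running.
        with urllib.request.urlopen(base_url + "/api/tags", timeout=3) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

Running `ollama_is_running()` should return True once `ollama serve` is up.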

Step 5 – Launch Research Draft

In a new terminal:

cd research-draft
python app.py

Open your browser at http://localhost:7860.


🎓 How to Use

  1. Select your role (Student or Researcher) from the dropdown.
  2. Upload a PDF of a research paper.
  3. Click Generate Abstract.
  4. The generated abstract appears on the right. Use the copy button to grab it.
  5. (Researcher only) Use the tools below to view history, export results, or clear history.

โš™๏ธ Configuration

| Setting        | Location                              | Default                |
|----------------|---------------------------------------|------------------------|
| Ollama URL     | llm_client.py → OLLAMA_BASE_URL       | http://localhost:11434 |
| Model name     | llm_client.py → MODEL_NAME            | researchdraft          |
| Temperature    | llm_client.py → generate_abstract()   | 0.3                    |
| Max text chars | pdf_utils.py → MAX_TEXT_CHARS         | 12,000                 |
| History file   | history_manager.py → HISTORY_FILE     | data/history.json      |
| Server port    | app.py → demo.launch()                | 7860                   |
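The request that llm_client.py sends follows Ollama's /api/chat schema (model, messages, stream, options). A hedged sketch of how the payload might be assembled using the defaults above – the prompt wording and function name are illustrative:

```python
import json

def build_chat_payload(paper_text: str,
                       model: str = "researchdraft",
                       temperature: float = 0.3) -> str:
    """Build the JSON body for a POST to Ollama's /api/chat endpoint."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system",
             "content": "Write a concise academic abstract for the paper below."},
            {"role": "user", "content": paper_text},
        ],
        "stream": False,                        # return one complete response
        "options": {"temperature": temperature},
    }
    return json.dumps(payload)
```

The low default temperature (0.3) keeps the abstract close to the source text rather than inviting creative paraphrase.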

🧩 Tech Stack

| Component       | Library / Tool      |
|-----------------|---------------------|
| UI              | Gradio (Blocks API) |
| PDF parsing     | PyMuPDF (fitz)      |
| LLM runtime     | Ollama (local)      |
| HTTP client     | requests            |
| History storage | JSON file           |
| Language        | Python 3.10+        |

๐Ÿ“ Sample Models That Work Well

| Model                  | Size   | Notes                             |
|------------------------|--------|-----------------------------------|
| LFM2.5-1.2B-Instruct   | ~1.2 B | Lightweight, good for CPU         |
| Qwen2.5-1.5B-Instruct  | ~1.5 B | Strong instruction following      |
| Phi-3-mini-4k-instruct | ~3.8 B | Higher quality, needs more RAM    |
| Llama-3.2-3B-Instruct  | ~3.2 B | Good balance of speed and quality |

All models should be in GGUF format (Q4_K_M or Q5_K_M quantisation recommended).


🔮 Future Improvements

  • Multi-PDF batch processing – upload several papers and generate abstracts in bulk.
  • Abstract comparison – compare generated vs. original abstract side-by-side.
  • Keyword extraction – automatically extract key terms from the paper.
  • Citation-aware chunking – smarter text splitting that preserves section boundaries.
  • SQLite backend – replace JSON history with SQLite for better querying.
  • User authentication – simple login to separate Student/Researcher sessions.
  • PDF preview – render the first page of the uploaded PDF in the UI.
  • Streaming output – show the abstract being generated token by token.
  • Fine-tuned model – fine-tune a small model on abstract-generation pairs for better quality.
  • Evaluation metrics – add ROUGE / BERTScore comparison against original abstracts.

📄 License

This project is for educational purposes (B.Tech final-year project). Use it freely for learning and research.


๐Ÿ™ Acknowledgements
