
📚 Antigravity Notebook

A NotebookLM clone powered by Apple's CLaRa-7B-Instruct for infinite context reasoning

Antigravity Notebook enables you to create "Notebooks" where you can upload multiple disparate sources (PDFs, URLs, Text) and have an AI reason across all of them simultaneously using CLaRa's latent compression technology.

🌟 Key Features

The "Infinite Context" Strategy

  • 16x Compression: CLaRa compresses text into latent representations, reducing context usage by ~16x
  • Whole-Notebook Reasoning: When all sources fit in context (32k tokens), the AI reads EVERYTHING
  • Smart Retrieval: For larger notebooks, intelligently selects the most relevant sources
  • Multi-Modal Ingestion: Support for PDFs, URLs, and plain text

NotebookLM-Style Interface

  • Notebook Organization: Group related sources into project notebooks
  • Source Management: Easy upload, URL scraping, and text input
  • Memory Usage Meter: Visual gauge showing context utilization
  • Citation Tracking: See which sources were used for each response

πŸ—οΈ Architecture

┌─────────────────────────────────────────────────────────┐
│                    Streamlit UI                         │
│  (NotebookLM-style interface with sidebar + chat)       │
└────────────────────┬────────────────────────────────────┘
                     │
                     ↓
┌─────────────────────────────────────────────────────────┐
│                   FastAPI Backend                       │
│                                                         │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐   │
│  │  Notebooks   │  │   Sources    │  │     Chat     │   │
│  │   Router     │  │   Router     │  │   Router     │   │
│  └──────────────┘  └──────────────┘  └──────────────┘   │
└────────────────────┬────────────────────────────────────┘
                     │
        ┌────────────┴─────────────┬──────────────┐
        ↓                          ↓              ↓
┌───────────────┐      ┌──────────────────┐   ┌──────────────┐
│   CLaRa-7B    │      │  ContextManager  │   │   Storage    │
│  (Compress &  │      │  (Whole-Context  │   │   Service    │
│   Generate)   │      │    Strategy)     │   │  (Tensors)   │
└───────┬───────┘      └─────────┬────────┘   └──────┬───────┘
        ↓                        ↓                   ↓
┌─────────────────────────────────────────────────────────┐
│                     PostgreSQL                          │
│  (Notebooks → Sources → LatentTensors → ChatMessages)   │
└─────────────────────────────────────────────────────────┘

🚀 Quick Start

Prerequisites

  • Python 3.9+
  • Docker & Docker Compose (for PostgreSQL)
  • CUDA-capable GPU (recommended, 16GB+ VRAM for CLaRa-7B)

Installation

  1. Clone the repository
     git clone <your-repo-url>
     cd antigravity-notebook
  2. Install dependencies
     pip install -r requirements.txt
  3. Set up the environment
     cp .env.example .env
     # Edit .env with your configuration
  4. Start PostgreSQL
     docker-compose up -d
  5. Initialize the database
     python -m backend.database
  6. Start the backend
     python -m backend.main
  7. Start the frontend (in a new terminal)
     streamlit run frontend/app_notebook.py
  8. Open your browser (Streamlit typically serves the UI at http://localhost:8501)

📖 Usage

Creating a Notebook

  1. Open the Streamlit UI
  2. Click "Create New Notebook" in the sidebar
  3. Enter a name and description
  4. Click "Create Notebook"

Adding Sources

Upload PDF:

  1. Select your notebook
  2. Go to "Add Source" → "PDF" tab
  3. Upload your PDF file
  4. Wait for processing (CLaRa compression)

Add URL:

  1. Select your notebook
  2. Go to "Add Source" → "URL" tab
  3. Paste the URL
  4. Optionally add a custom title
  5. Click "Add URL"

Add Text:

  1. Select your notebook
  2. Go to "Add Source" → "Text" tab
  3. Enter a title and paste your text
  4. Click "Add Text"

Querying Your Notebook

  1. Select a notebook with sources
  2. Type your question in the chat input
  3. The AI will reason across ALL your sources
  4. View the response and see which sources were cited

🧠 How It Works

Latent Compression

When you add a source:

  1. Text is extracted (PDF/URL/Text)
  2. Split into 2048-token chunks
  3. Each chunk is compressed by CLaRa into a latent tensor (~128 tokens)
  4. Latent tensors are saved to disk
  5. Metadata is stored in PostgreSQL
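
The steps above can be sketched roughly as follows. This is an illustrative outline, not the actual backend code: `chunk_tokens`, `compress`, and `save_tensor` are assumed names, and whitespace splitting stands in for a real tokenizer.

```python
def chunk_tokens(tokens, chunk_size=2048):
    """Split a token list into fixed-size segments (the last may be shorter)."""
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]

def ingest_source(text, compress, save_tensor):
    """Chunk extracted text, compress each chunk to a latent tensor, persist it.

    compress(chunk)      -> latent tensor (~2048 tokens in, ~128 latent tokens out)
    save_tensor(t, idx)  -> path of the tensor saved to disk
    """
    tokens = text.split()  # stand-in for the real tokenizer
    paths = []
    for idx, chunk in enumerate(chunk_tokens(tokens)):
        latent = compress(chunk)
        paths.append(save_tensor(latent, idx))
    return paths
```
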

Context Management

When you query a notebook:

  1. ContextManager fetches ALL latent tensors for the notebook
  2. Calculates total token count
  3. If ≤ 32k tokens: Stacks ALL tensors → Whole-Notebook Reasoning
  4. If > 32k tokens: Ranks tensors by relevance, selects top-N → Selective Retrieval
  5. Generates response using CLaRa with the selected context
  6. Returns answer with source citations
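
The selection logic can be sketched like this (assumed names and a greedy ranking; the real ContextManager may differ):

```python
MAX_CONTEXT_TOKENS = 32_768  # latent-space budget from the configuration

def select_tensors(tensors, relevance, budget=MAX_CONTEXT_TOKENS):
    """tensors: list of (tensor, token_count) pairs.
    relevance: scores a tensor against the query (higher = more relevant)."""
    total = sum(count for _, count in tensors)
    if total <= budget:
        return tensors  # everything fits: whole-notebook reasoning
    # Selective retrieval: greedily take the highest-ranked tensors that fit.
    ranked = sorted(tensors, key=lambda pair: relevance(pair[0]), reverse=True)
    selected, used = [], 0
    for tensor, count in ranked:
        if used + count <= budget:
            selected.append((tensor, count))
            used += count
    return selected
```
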

πŸ› οΈ API Endpoints

Notebooks

  • POST /notebooks/ - Create notebook
  • GET /notebooks/ - List notebooks
  • GET /notebooks/{id} - Get notebook details
  • GET /notebooks/{id}/stats - Get context usage stats
  • PATCH /notebooks/{id} - Update notebook
  • DELETE /notebooks/{id} - Delete notebook

Sources

  • POST /sources/notebooks/{id}/sources/upload - Upload PDF
  • POST /sources/notebooks/{id}/sources/url - Add URL
  • POST /sources/notebooks/{id}/sources/text - Add text
  • GET /sources/notebooks/{id}/sources - List sources
  • DELETE /sources/{id} - Delete source

Chat

  • POST /chat/notebooks/{id}/chat - Query notebook
  • GET /chat/notebooks/{id}/messages - Get chat history
  • DELETE /chat/notebooks/{id}/messages - Clear chat history

📊 Database Schema

notebooks
├── id (UUID)
├── name
├── description
├── created_at
└── updated_at

sources
├── id (UUID)
├── notebook_id (FK)
├── source_type (pdf|url|text)
├── filename
├── url
├── content_hash
└── metadata (JSONB)

latent_tensors
├── id (UUID)
├── source_id (FK)
├── tensor_path
├── segment_index
├── token_count
└── metadata (JSONB)

chat_messages
├── id (UUID)
├── notebook_id (FK)
├── role (user|assistant)
├── content
└── sources_used (JSONB)
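
Declared with SQLAlchemy, the first two tables might look roughly like this (a sketch inferred from the listing above; the actual models, and the timestamp columns omitted here, may differ):

```python
import uuid
from sqlalchemy import Column, ForeignKey, String, Text
from sqlalchemy.dialects.postgresql import JSONB, UUID
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Notebook(Base):
    __tablename__ = "notebooks"
    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    name = Column(String, nullable=False)
    description = Column(Text)

class Source(Base):
    __tablename__ = "sources"
    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    notebook_id = Column(UUID(as_uuid=True), ForeignKey("notebooks.id"))
    source_type = Column(String)      # pdf | url | text
    content_hash = Column(String)
    # Attribute is `meta` because `metadata` is reserved on declarative models;
    # the database column is still named "metadata".
    meta = Column("metadata", JSONB)
```

The remaining tables (latent_tensors, chat_messages) follow the same pattern.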

βš™οΈ Configuration

Edit .env to configure:

# Database
POSTGRES_USER=antigravity
POSTGRES_PASSWORD=antigravity123
POSTGRES_DB=antigravity_db

# CLaRa Model
MODEL_NAME=apple/CLaRa-7B-Instruct
DEVICE=cuda  # or cpu
MAX_CONTEXT_TOKENS=32768
COMPRESSION_RATIO=16

# Storage
LATENT_TENSOR_DIR=./data/latent_tensors

# API
API_PORT=8000
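
The backend presumably reads these variables from the environment along these lines (a sketch using `os.environ` with the defaults shown above; the actual settings loader may differ):

```python
import os

def get_settings():
    """Read the .env-backed configuration, falling back to the defaults above."""
    return {
        "model_name": os.environ.get("MODEL_NAME", "apple/CLaRa-7B-Instruct"),
        "device": os.environ.get("DEVICE", "cuda"),
        "max_context_tokens": int(os.environ.get("MAX_CONTEXT_TOKENS", "32768")),
        "compression_ratio": int(os.environ.get("COMPRESSION_RATIO", "16")),
        "latent_tensor_dir": os.environ.get("LATENT_TENSOR_DIR", "./data/latent_tensors"),
        "api_port": int(os.environ.get("API_PORT", "8000")),
    }
```
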

🎯 Performance

  • Ingestion: ~30s for 50-page PDF
  • Query Response: ~10s for full notebook
  • Capacity: 10-20 average-sized books per notebook

🔬 Technical Details

Why CLaRa?

CLaRa (Compressing Long-range Attention) uses latent compression to represent text in a much smaller space, enabling:

  • 16x compression ratio
  • Preservation of semantic information
  • Cross-document reasoning

Context Budget

  • Standard: 32,768 tokens (latent space)
  • Equivalent to: ~500k original text tokens (with 16x compression)
  • Example: Can fit 10-20 full books simultaneously

🤝 Contributing

Contributions welcome! Please open an issue or PR.

πŸ“ License

MIT License - see LICENSE file

πŸ™ Acknowledgments

  • Apple for CLaRa-7B-Instruct
  • Google for NotebookLM inspiration
  • HuggingFace for model hosting

Built with ❤️ by the Antigravity Team