RAG Document Chat System v2

What it is

A Python-based retrieval-augmented generation (RAG) application that ingests PDF and TXT documents, processes them into three layers of granularity (basic chunks, smart summaries, and paragraph contexts), and stores them in ChromaDB. It provides a FastAPI backend, a Streamlit web interface, and a Next.js UI for querying documents with OpenAI LLMs, and returns answers with SOC2-compliant citation trails and relevancy scores.

Features

Multi-layer document processing with basic chunks, smart summaries, and paragraph contexts
SOC2-compliant raw source citations with percentage relevancy scoring
Intelligent search hierarchy automatically selecting the best strategy per query
Modular architecture: 9 focused modules replace the earlier monolithic design
Multiple interfaces: Streamlit web app, FastAPI REST API, and Next.js modern UI
CLI tool for direct search, asking questions, and system management
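
The multi-layer processing listed above can be pictured with a minimal sketch. It is not the project's DocumentProcessor: the names Layers, chunk_words, naive_summary, and build_layers are hypothetical, the chunk size and overlap are arbitrary, and the "smart summary" is replaced by a first-sentence stand-in where the real system would presumably call an LLM.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Layers:
    chunks: List[str]      # fixed-size word windows for fine-grained retrieval
    summaries: List[str]   # one short summary per paragraph
    paragraphs: List[str]  # whole paragraphs kept as surrounding context


def split_paragraphs(text: str) -> List[str]:
    """Split on blank lines and drop empty pieces."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def chunk_words(text: str, size: int = 50, overlap: int = 10) -> List[str]:
    """Slide a window of `size` words, stepping by `size - overlap`."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def naive_summary(paragraph: str) -> str:
    """Assumed stand-in for an LLM summary: keep only the first sentence."""
    first = paragraph.split(". ")[0]
    return first if first.endswith(".") else first + "."


def build_layers(text: str) -> Layers:
    """Produce all three layers from one document's extracted text."""
    paragraphs = split_paragraphs(text)
    return Layers(
        chunks=chunk_words(text),
        summaries=[naive_summary(p) for p in paragraphs],
        paragraphs=paragraphs,
    )
```

Each layer would then be embedded and stored separately, so that a query can be matched against summaries, paragraphs, or raw chunks depending on how much detail it needs.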

Quickstart

git clone https://github.com/davidbmar/rag-document-chat-ver2.git
cd rag-document-chat-ver2
chmod +x setup.sh start.sh
./setup.sh
nano .env
./start.sh

Architecture

flowchart TD
    User[User] -->|Query| UI[Web Interface / CLI]
    UI -->|HTTP Request| API[FastAPI Server]
    API -->|Orchestrate| RAG[RAGSystem Core]
    RAG -->|Search| VectorDB[(ChromaDB)]
    RAG -->|Retrieve Docs| Storage[S3 / Local FS]
    RAG -->|Generate Answer| LLM[OpenAI API]
    RAG -->|Return Response| API
    API -->|JSON Response| UI
    subgraph Processing
        DocProc[DocumentProcessor] -->|Extract Text| Parser[PDF/TXT Parser]
        DocProc -->|Chunk & Summarize| Indexer[Indexing Engine]
        Indexer -->|Store Vectors| VectorDB
    end
    API -->|Upload| DocProc

How it's built

Built with Python 3.9+ using a modular architecture split into core, processing, search, and API modules. It uses Pydantic for data validation, ChromaDB for vector storage, and supports S3 for document persistence. The system employs hierarchical indexing strategies to balance conceptual search with detailed retrieval.
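
As a rough illustration of how a hierarchical index might route a query to one layer, the sketch below uses a small strategy table. The routing heuristic, the layer names ("chunks", "summaries", "paragraphs"), and the SearchRouter class are all assumptions made for this example; the project's actual selection logic is not described here.

```python
from typing import Callable, Dict, List

# A search strategy takes the query text and returns matching passages.
SearchFn = Callable[[str], List[str]]


def choose_layer(query: str) -> str:
    """Toy heuristic (assumed): pick the index layer that suits the query."""
    q = query.lower().strip()
    if '"' in q:
        return "chunks"  # quoted phrase: look for exact detail
    if q.startswith(("summarize", "overview")):
        return "summaries"  # broad, conceptual request
    return "paragraphs"  # default: paragraph-level context


class SearchRouter:
    """Dispatches each query to the strategy registered for its layer."""

    def __init__(self, strategies: Dict[str, SearchFn]) -> None:
        self._strategies = strategies

    def search(self, query: str) -> List[str]:
        layer = choose_layer(query)
        return self._strategies[layer](query)
```

Keeping the strategies in a dictionary means a new layer can be added by registering one more function, without touching the router itself.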

How it runs

sequenceDiagram
    participant U as User
    participant F as FastAPI
    participant R as RAGSystem
    participant C as ChromaDB
    participant O as OpenAI
    
    U->>F: POST /ask {question}
    F->>R: search_and_answer(question)
    R->>C: similarity_search(query, top_k)
    C-->>R: List[DocumentChunks]
    R->>R: Filter & Score Relevancy
    R->>O: ChatCompletion(context + question)
    O-->>R: Generated Answer
    R-->>F: ChatResponse(answer, sources)
    F-->>U: JSON Response
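
The request flow in the diagram can be sketched in plain Python, with the vector store and the LLM passed in as callables so the example runs without ChromaDB or an OpenAI key. Only the names search_and_answer and ChatResponse come from the diagram. The Source type, the to_percent formula (which assumes cosine distances in the range 0 to 2), the 60% threshold, and the prompt wording are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Source:
    text: str
    relevancy_pct: float


@dataclass
class ChatResponse:
    answer: str
    sources: List[Source] = field(default_factory=list)


def to_percent(distance: float) -> float:
    """Map a cosine distance in [0, 2] to a 0-100 score (assumed formula)."""
    clamped = min(max(distance, 0.0), 2.0)
    return round((1.0 - clamped / 2.0) * 100.0, 1)


def search_and_answer(
    question: str,
    retrieve: Callable[[str, int], List[Tuple[str, float]]],
    generate: Callable[[str], str],
    top_k: int = 5,
    min_pct: float = 60.0,
) -> ChatResponse:
    # 1. Similarity search: returns (text, distance) pairs.
    hits = retrieve(question, top_k)
    # 2. Score, filter, and rank the hits by relevancy.
    sources = [Source(text, to_percent(dist)) for text, dist in hits]
    sources = [s for s in sources if s.relevancy_pct >= min_pct]
    sources.sort(key=lambda s: s.relevancy_pct, reverse=True)
    if not sources:
        return ChatResponse("No sufficiently relevant passages were found.")
    # 3. Build a numbered context so the answer can cite its sources.
    context = "\n\n".join(f"[{i + 1}] {s.text}" for i, s in enumerate(sources))
    prompt = f"Answer using only the context.\n\n{context}\n\nQuestion: {question}"
    # 4. Generate the answer and return it with the supporting sources.
    return ChatResponse(generate(prompt), sources)
```

Returning the scored sources alongside the answer is what makes a citation trail possible: the caller can show exactly which passages were given to the model and how relevant each one was judged to be.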

How to apply & reuse

Deploy on Linux/Ubuntu servers or EC2 instances. Configure the system through a .env file containing an OpenAI API key and, optionally, AWS credentials. Use Docker Compose to run the ChromaDB dependency. Suitable for enterprise document Q&A where audit trails and source transparency are required.
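
A minimal sketch of .env-based configuration, using only the standard library. The variable names OPENAI_API_KEY, AWS_ACCESS_KEY_ID, and AWS_SECRET_ACCESS_KEY follow common convention and are an assumption here, as are the Settings class and both helper functions; the keys actually read by src/core/config.py may differ.

```python
from dataclasses import dataclass
from typing import Dict, Mapping, Optional


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    aws_access_key_id: Optional[str] = None
    aws_secret_access_key: Optional[str] = None

    @property
    def s3_enabled(self) -> bool:
        """S3 persistence is usable only when both AWS values are set."""
        return bool(self.aws_access_key_id and self.aws_secret_access_key)


def load_settings(env: Mapping[str, str]) -> Settings:
    """Require the OpenAI key; treat AWS credentials as optional."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise ValueError("OPENAI_API_KEY is required")
    return Settings(
        openai_api_key=key,
        aws_access_key_id=env.get("AWS_ACCESS_KEY_ID") or None,
        aws_secret_access_key=env.get("AWS_SECRET_ACCESS_KEY") or None,
    )
```

Because load_settings accepts any mapping, it can be fed either the parsed contents of a .env file or os.environ directly. Failing fast on a missing key surfaces configuration mistakes at startup rather than on the first query.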

At a glance

Capabilities: Hierarchical document indexing, Paragraph-aware semantic search, Automated source citation, Relevancy scoring, Multi-format document ingestion, RESTful API exposure, Interactive web chat, Command-line interface

Components: src/core/config.py, src/core/models.py, src/search/rag_system.py, src/processing/document_processor.py, app_refactored.py, streamlit_app.py, src/ui/modern

Tech: Python 3.9+, FastAPI, Streamlit, Next.js, ChromaDB, OpenAI API, Pydantic, Docker

Depends on: OpenAI API Key, Docker & Docker Compose, Python 3.9+, Node.js (for Modern UI), AWS Credentials (optional)

Integrates with: OpenAI GPT Models, ChromaDB Vector Store, AWS S3, Streamlit, Next.js Frontend

Patterns: Retrieval-Augmented Generation, Modular Monolith, Dependency Injection, Strategy Pattern (Search), Repository Pattern (Storage)

Reuse tags: rag, document-chat, vector-search, llm-application, enterprise-search, python-backend

⚠ Needs attention

unmerged_branch: dependabot/npm_and_yarn/src/ui/modern/npm_and_yarn-c6636f2d36 is 1 commit ahead of the default branch
unmerged_branch: dependabot/pip/pip-f7c9f9c38b is 1 commit ahead of the default branch
open_pr: PR #2: Bump the npm_and_yarn group across 1 directory with 7 updates
open_pr: PR #1: Bump the pip group across 1 directory with 6 updates