RAG Document Chat System v2

A modular, production-ready Retrieval-Augmented Generation system with hierarchical document processing and paragraph-aware search.

https://github.com/davidbmar/rag-document-chat-ver2  ·  public  ·  shipped

What it is

A Python-based RAG application that ingests PDF and TXT documents, processes them into three layers of granularity (basic chunks, smart summaries, and paragraph contexts), and stores them in ChromaDB. It provides a FastAPI backend, a Streamlit web interface, and a modern Next.js UI for querying documents with OpenAI LLMs. Answers carry SOC2-compliant citation trails and relevancy scores.

Quickstart

git clone https://github.com/davidbmar/rag-document-chat-ver2.git
cd rag-document-chat-ver2
chmod +x setup.sh start.sh
./setup.sh
nano .env
./start.sh

Architecture

flowchart TD
    User[User] -->|Query| UI[Web Interface / CLI]
    UI -->|HTTP Request| API[FastAPI Server]
    API -->|Orchestrate| RAG[RAGSystem Core]
    RAG -->|Search| VectorDB[(ChromaDB)]
    RAG -->|Retrieve Docs| Storage[S3 / Local FS]
    RAG -->|Generate Answer| LLM[OpenAI API]
    RAG -->|Return Response| API
    API -->|JSON Response| UI
    subgraph Processing
        DocProc[DocumentProcessor] -->|Extract Text| Parser[PDF/TXT Parser]
        DocProc -->|Chunk & Summarize| Indexer[Indexing Engine]
        Indexer -->|Store Vectors| VectorDB
    end
    API -->|Upload| DocProc

How it's built

Built with Python 3.9+ using a modular architecture split into core, processing, search, and API modules. It uses Pydantic for data validation, ChromaDB for vector storage, and supports S3 for document persistence. The system employs hierarchical indexing strategies to balance conceptual search with detailed retrieval.
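
The hierarchical indexing described above can be illustrated with a minimal sketch. It is not the project's code: `IndexEntry`, `build_layers`, and the helper names are hypothetical, standard-library dataclasses stand in for the Pydantic models, and the "summary" layer uses the first sentence as a placeholder where the real system would presumably call an LLM. The point is only to show how one paragraph can yield entries at three granularities that share a paragraph ID.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IndexEntry:
    """One record destined for the vector store (hypothetical shape)."""
    layer: str         # "paragraph", "summary", or "chunk"
    paragraph_id: int  # index of the source paragraph, shared across layers
    text: str


def split_paragraphs(text: str) -> List[str]:
    """Split on blank lines and drop empty paragraphs."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def chunk_words(paragraph: str, size: int) -> List[str]:
    """Cut a paragraph into consecutive windows of at most `size` words."""
    words = paragraph.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def placeholder_summary(paragraph: str) -> str:
    """Stand-in for an LLM summary: keep only the first sentence."""
    first, _, _ = paragraph.partition(". ")
    return first.rstrip(".") + "."


def build_layers(text: str, chunk_size: int = 8) -> List[IndexEntry]:
    """Emit paragraph, summary, and chunk entries for every paragraph."""
    entries: List[IndexEntry] = []
    for pid, para in enumerate(split_paragraphs(text)):
        entries.append(IndexEntry("paragraph", pid, para))
        entries.append(IndexEntry("summary", pid, placeholder_summary(para)))
        for chunk in chunk_words(para, chunk_size):
            entries.append(IndexEntry("chunk", pid, chunk))
    return entries


SAMPLE = (
    "ChromaDB stores the vectors. It runs in Docker next to the API server.\n\n"
    "The API accepts uploads. Each upload is parsed, chunked, and indexed."
)
ENTRIES = build_layers(SAMPLE, chunk_size=5)
```

Because every entry keeps its `paragraph_id`, a hit on a short chunk or summary can be expanded back to the full paragraph at answer time, which is one plausible way to combine conceptual search with detailed retrieval.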

How it runs

sequenceDiagram
    participant U as User
    participant F as FastAPI
    participant R as RAGSystem
    participant C as ChromaDB
    participant O as OpenAI
    
    U->>F: POST /ask {question}
    F->>R: search_and_answer(question)
    R->>C: similarity_search(query, top_k)
    C-->>R: List[DocumentChunks]
    R->>R: Filter & Score Relevancy
    R->>O: ChatCompletion(context + question)
    O-->>R: Generated Answer
    R-->>F: ChatResponse(answer, sources)
    F-->>U: JSON Response
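
The request flow in the diagram can be sketched in plain Python. The names `search_and_answer` and `ChatResponse` come from the diagram, but everything else here is an assumption made for illustration: the `Chunk` type, the cosine scoring, the `min_score` threshold, the prompt layout, and the injected `llm` callable that stands in for the OpenAI call. Toy two-dimensional vectors replace real embeddings so the example runs without ChromaDB or network access.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Chunk:
    source: str              # file name used for the citation trail
    text: str
    vector: Sequence[float]  # precomputed embedding (toy values here)


@dataclass
class ChatResponse:
    answer: str
    sources: List[Tuple[str, float]]  # (source, relevancy score)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_and_answer(
    question: str,
    query_vector: Sequence[float],
    chunks: List[Chunk],
    llm: Callable[[str], str],
    top_k: int = 3,
    min_score: float = 0.5,
) -> ChatResponse:
    # 1. Similarity search: score every chunk and sort best-first.
    scored = sorted(
        ((cosine(query_vector, c.vector), c) for c in chunks),
        key=lambda pair: pair[0],
        reverse=True,
    )
    # 2. Filter and score relevancy: keep the top_k hits above the threshold.
    kept = [(s, c) for s, c in scored[:top_k] if s >= min_score]
    # 3. Build the prompt from the surviving context and call the model.
    context = "\n".join(f"[{c.source}] {c.text}" for _, c in kept)
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    # 4. Return the answer together with its sources and scores.
    return ChatResponse(llm(prompt), [(c.source, round(s, 3)) for s, c in kept])


def fake_llm(prompt: str) -> str:
    """Stub model: reports how many bracketed context chunks it received."""
    return f"(stub) saw {prompt.count('[')} context chunks"


CHUNKS = [
    Chunk("a.pdf", "ChromaDB stores vectors.", [1.0, 0.0]),
    Chunk("b.txt", "Streamlit renders the UI.", [0.0, 1.0]),
    Chunk("c.pdf", "Vectors are indexed per paragraph.", [0.8, 0.6]),
]
RESPONSE = search_and_answer("Where are vectors stored?", [1.0, 0.0], CHUNKS, fake_llm)
```

Passing the model as a callable keeps the retrieval and filtering steps testable offline; in a real deployment that argument would wrap the chat completion request.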

How to apply & reuse

Deploy on Linux/Ubuntu servers or EC2 instances. Configure via a .env file containing an OpenAI API key and, optionally, AWS credentials. Docker Compose runs the ChromaDB dependency. Suitable for enterprise document Q&A where audit trails and source transparency are required.
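
As a rough illustration of the .env-based configuration, the sketch below parses simple KEY=VALUE lines and validates them. The variable names are assumptions rather than the project's documented settings: `OPENAI_API_KEY` follows the OpenAI SDK's usual convention, and `S3_BUCKET` is a hypothetical name for the optional S3 target. The parser handles only plain assignments, comments, and surrounding quotes.

```python
import os
from dataclasses import dataclass
from typing import Dict, Mapping, Optional


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    s3_bucket: Optional[str]  # None means documents stay on the local filesystem


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; skip blank lines and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("\"'")
    return values


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Validate the required key and normalize the optional bucket name."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise ValueError("OPENAI_API_KEY is required")
    bucket = env.get("S3_BUCKET", "").strip() or None  # hypothetical variable
    return Settings(openai_api_key=key, s3_bucket=bucket)


EXAMPLE_ENV = parse_dotenv('# local settings\nOPENAI_API_KEY="sk-test"\n\nS3_BUCKET=\n')
SETTINGS = load_settings(EXAMPLE_ENV)
```

Failing fast on a missing API key at startup gives a clearer error than a rejected request later, and treating an empty bucket as "local storage" keeps the AWS settings optional.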

At a glance

Capabilities: Hierarchical document indexing, Paragraph-aware semantic search, Automated source citation, Relevancy scoring, Multi-format document ingestion, RESTful API exposure, Interactive web chat, Command-line interface
Components: src/core/config.py, src/core/models.py, src/search/rag_system.py, src/processing/document_processor.py, app_refactored.py, streamlit_app.py, src/ui/modern
Tech: Python 3.9+, FastAPI, Streamlit, Next.js, ChromaDB, OpenAI API, Pydantic, Docker
Depends on: OpenAI API Key, Docker & Docker Compose, Python 3.9+, Node.js (for Modern UI), AWS Credentials (optional)
Integrates with: OpenAI GPT Models, ChromaDB Vector Store, AWS S3, Streamlit, Next.js Frontend
Patterns: Retrieval-Augmented Generation, Modular Monolith, Dependency Injection, Strategy Pattern (Search), Repository Pattern (Storage)
Reuse tags: rag, document-chat, vector-search, llm-application, enterprise-search, python-backend
