A modular, production-ready Retrieval-Augmented Generation system with hierarchical document processing and paragraph-aware search.
https://github.com/davidbmar/rag-document-chat-ver2 · public · shipped
A Python-based RAG application that ingests PDF/TXT documents, processes them into multiple layers of granularity (basic chunks, smart summaries, paragraph contexts), and stores them in ChromaDB. It provides a FastAPI backend, a Streamlit web interface, and a modern Next.js UI for querying documents with OpenAI LLMs, and it returns SOC2-compliant citation trails and relevancy scores with each answer.
git clone https://github.com/davidbmar/rag-document-chat-ver2.git
cd rag-document-chat-ver2
chmod +x setup.sh start.sh
./setup.sh
nano .env
./start.sh
flowchart TD
User[User] -->|Query| UI[Web Interface / CLI]
UI -->|HTTP Request| API[FastAPI Server]
API -->|Orchestrate| RAG[RAGSystem Core]
RAG -->|Search| VectorDB[(ChromaDB)]
RAG -->|Retrieve Docs| Storage[S3 / Local FS]
RAG -->|Generate Answer| LLM[OpenAI API]
RAG -->|Return Response| API
API -->|JSON Response| UI
subgraph Processing
DocProc[DocumentProcessor] -->|Extract Text| Parser[PDF/TXT Parser]
DocProc -->|Chunk & Summarize| Indexer[Indexing Engine]
Indexer -->|Store Vectors| VectorDB
end
API -->|Upload| DocProc
Built with Python 3.9+ using a modular architecture split into core, processing, search, and API modules. It uses Pydantic for data validation, ChromaDB for vector storage, and supports S3 for document persistence. The system employs hierarchical indexing strategies to balance conceptual search with detailed retrieval.
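To make the hierarchical indexing idea concrete, here is a minimal sketch of how one document could be split into two of the layers mentioned above: overlapping basic chunks and paragraph contexts. It is an illustration under stated assumptions, not the project's code: `IndexedUnit`, `split_chunks`, `split_paragraphs`, and `build_layers` are hypothetical names, plain dataclasses stand in for the Pydantic models, and the summary layer is left out because it needs an LLM call.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IndexedUnit:
    """One record destined for the vector store (hypothetical shape)."""
    doc_id: str
    layer: str      # "chunk" or "paragraph"
    position: int   # order of the unit within its layer
    text: str


def split_paragraphs(text: str) -> List[str]:
    """Split on blank lines and drop empty paragraphs."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def split_chunks(text: str, size: int = 200, overlap: int = 40) -> List[str]:
    """Fixed-size character windows; neighbours share `overlap` characters."""
    if size <= 0 or overlap < 0 or overlap >= size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the window already reached the end of the text
        start += step
    return chunks


def build_layers(doc_id: str, text: str, size: int = 200, overlap: int = 40) -> List[IndexedUnit]:
    """Produce both layers so a search can match fine chunks or whole paragraphs."""
    units = [
        IndexedUnit(doc_id, "chunk", i, c)
        for i, c in enumerate(split_chunks(text, size, overlap))
    ]
    units += [
        IndexedUnit(doc_id, "paragraph", i, p)
        for i, p in enumerate(split_paragraphs(text))
    ]
    return units


if __name__ == "__main__":
    sample = "RAG retrieves passages first.\n\nThen the model answers from them."
    for unit in build_layers("demo.txt", sample, size=30, overlap=5):
        print(unit.layer, unit.position, repr(unit.text))
```

Keeping the layer name on every record is one way to let a query target a single granularity or merge results across layers; the real system may organize this differently, for example with separate ChromaDB collections.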
sequenceDiagram
participant U as User
participant F as FastAPI
participant R as RAGSystem
participant C as ChromaDB
participant O as OpenAI
U->>F: POST /ask {question}
F->>R: search_and_answer(question)
R->>C: similarity_search(query, top_k)
C-->>R: List[DocumentChunks]
R->>R: Filter & Score Relevancy
R->>O: ChatCompletion(context + question)
O-->>R: Generated Answer
R-->>F: ChatResponse(answer, sources)
F-->>U: JSON Response
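The request flow in the diagram can be sketched in a few lines of Python. The names `search_and_answer` and `ChatResponse` come from the diagram, but the signature, the `(source_id, text, score)` hit format, the `min_score` threshold, and the prompt wording are assumptions made for illustration. The vector search and the LLM call are passed in as plain callables so the sketch runs without ChromaDB or an OpenAI key.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Assumed hit format: (source_id, text, score), where a higher score means more relevant.
Hit = Tuple[str, str, float]


@dataclass
class ChatResponse:
    answer: str
    sources: List[str]


def search_and_answer(
    question: str,
    search: Callable[[str, int], List[Hit]],
    generate: Callable[[str], str],
    top_k: int = 5,
    min_score: float = 0.3,
) -> ChatResponse:
    """Retrieve candidates, keep the relevant ones, then ask the model."""
    hits = search(question, top_k)
    # Filter and score relevancy: drop weak hits, strongest first.
    kept = sorted(
        (h for h in hits if h[2] >= min_score),
        key=lambda h: h[2],
        reverse=True,
    )
    if not kept:
        return ChatResponse(answer="No relevant passages found.", sources=[])
    # Tag each passage with its source id so the answer can cite it.
    context = "\n\n".join(f"[{sid}] {text}" for sid, text, _ in kept)
    prompt = (
        "Answer using only the context below.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
    return ChatResponse(answer=generate(prompt), sources=[sid for sid, _, _ in kept])


if __name__ == "__main__":
    def demo_search(query: str, k: int) -> List[Hit]:
        # Stand-in for a vector store query.
        return [("guide.pdf#p2", "Uploads accept PDF and TXT files.", 0.82)][:k]

    def demo_generate(prompt: str) -> str:
        # Stand-in for a chat completion call.
        return "PDF and TXT files are accepted."

    print(search_and_answer("Which formats can I upload?", demo_search, demo_generate))
```

Returning the source ids alongside the answer is what makes a citation trail possible. Note that some vector stores report distances, where lower is better, so a real relevancy filter may need to invert or rescale the score first.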
Deploy on Linux/Ubuntu servers or EC2 instances. Configure via .env file with OpenAI API keys and optional AWS credentials. Use Docker Compose for ChromaDB dependency management. Suitable for enterprise document Q&A where audit trails and source transparency are required.
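Before running the start script, it can help to confirm that the .env file contains the required key. The sketch below parses .env-style text with the standard library only. The variable names `OPENAI_API_KEY`, `AWS_ACCESS_KEY_ID`, and `AWS_SECRET_ACCESS_KEY` are the conventional ones for those services and are assumed here; the repository's own .env template is the authority on what it actually reads.

```python
from typing import Dict, List

# Assumed variable names; check the project's .env template for the real ones.
REQUIRED = ["OPENAI_API_KEY"]
OPTIONAL = ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"]


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Remove surrounding whitespace and one layer of matching-style quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_required(values: Dict[str, str]) -> List[str]:
    """Return required keys that are absent or empty."""
    return [key for key in REQUIRED if not values.get(key)]


def unset_optional(values: Dict[str, str]) -> List[str]:
    """Return optional keys that are absent or empty."""
    return [key for key in OPTIONAL if not values.get(key)]


if __name__ == "__main__":
    sample = "# local settings\nOPENAI_API_KEY=sk-example\nAWS_ACCESS_KEY_ID=\n"
    env = parse_env(sample)
    print("missing required:", missing_required(env))
    print("optional not set:", unset_optional(env))
```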