Persona RAG Engine · davidbmar.com

What it is

The Persona RAG Engine is a factory-style architecture for building independent analytical environments. Each domain (for example, a political campaign or a historical investigation) has its own persona cards, its own evidence collections in ChromaDB, and its own retrieval logic. Vector stores and session contexts are partitioned per domain, so evidence from one domain is never retrieved in another. New domains can be created through the UI or by adding a directory, and external intelligence pipelines can push fresh evidence into a domain on a scheduled (daily) sync.

Features

Zero data bleed between domains via isolated ChromaDB collections
Persona-weighted retrieval that boosts evidence according to each persona's signal affinities
Two-pass generation separating logical reasoning from stylistic voice
Dynamic domain creation via UI factory or directory-based configuration
Integration with external intelligence briefings for daily evidence syncs
WebSocket-supported streaming chat with real-time domain switching
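The persona-weighted retrieval feature can be pictured as a re-ranking step applied to the chunks a vector search returns. The following is a minimal sketch under stated assumptions: the `Chunk` type, the `rerank` function, the signal tags, and the multiplicative boost formula are all hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    similarity: float          # base vector similarity, higher is better
    signals: frozenset[str]    # hypothetical tags attached at ingest time


def rerank(chunks: list[Chunk], affinity: dict[str, float], top_k: int = 3) -> list[Chunk]:
    """Order chunks by similarity scaled up by the persona's affinity for their signals."""

    def score(chunk: Chunk) -> float:
        # Signals the persona has no affinity for contribute no boost.
        boost = sum(affinity.get(signal, 0.0) for signal in chunk.signals)
        return chunk.similarity * (1.0 + boost)

    return sorted(chunks, key=score, reverse=True)[:top_k]
```

With an affinity map such as `{"finance": 0.5}`, a finance-tagged chunk with similarity 0.70 scores 1.05 and outranks an untagged-for-this-persona chunk at 0.80, which is the kind of reordering the feature describes.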

Quickstart

git clone https://github.com/davidbmar/tool-RAG-for-split-personalities.git
cd tool-RAG-for-split-personalities
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
export ANTHROPIC_API_KEY=sk-ant-...
python scripts/serve.py

Architecture

flowchart TD
    User[User Browser] -->|HTTP/WebSocket| API[FastAPI Server]
    API --> DomainMgr[Domain Manager]
    DomainMgr -->|Select Context| Chroma[ChromaDB]
    Chroma -->|Per-Domain Collections| Evidence[Evidence Corpus]
    Chroma -->|Per-Persona Collections| Personas[Persona Cards]
    API -->|Retrieve & Rank| Retriever[Retrieval Engine]
    Retriever -->|Weighted Results| Generator[LLM Generator]
    Generator -->|Pass 1: Reasoning| ReasonFrame[Reasoning Frame]
    ReasonFrame -->|Pass 2: Styling| FinalResponse[Styled Response]
    FinalResponse -->|Stream| API
    IBT[Intelligence Briefing Toolkit] -->|Daily Sync Script| Evidence
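The daily sync edge in the diagram implies that the same briefing items may arrive more than once, so an ingest step benefits from being idempotent. The sketch below shows one way to do that with content-derived IDs. It is an assumption-laden illustration: `evidence_id`, `sync`, and the plain `dict` standing in for a vector store collection are hypothetical, and the real sync script may work differently.

```python
import hashlib


def evidence_id(domain: str, text: str) -> str:
    """Derive a stable ID from the domain and the normalized evidence text."""
    digest = hashlib.sha256(f"{domain}\n{text.strip()}".encode("utf-8")).hexdigest()
    return f"{domain}:{digest[:16]}"


def sync(store: dict[str, str], domain: str, briefing_items: list[str]) -> int:
    """Add unseen items to the store and return how many were added.

    The store is a stand-in for a per-domain collection keyed by document ID.
    """
    added = 0
    for item in briefing_items:
        key = evidence_id(domain, item)
        if key not in store:
            store[key] = item.strip()
            added += 1
    return added
```

Because the domain is part of the hashed input and of the key, identical text synced into two domains yields two separate entries, which keeps the sync consistent with the isolation goal.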

How it's built

The engine runs on Python 3.12+ and uses FastAPI for the backend server, with WebSocket connections for streaming chat responses. ChromaDB provides vector storage, with collections organized by domain and by persona. Retrieval uses persona-weighted ranking: each persona's signal affinities boost the evidence chunks that match them. Generation runs in two passes: the first builds a reasoning frame from the retrieved evidence, and the second restyles that frame in the persona's voice. The frontend is a lightweight web interface with a domain switcher and an optional password gate based on client-side SHA-256 hashing.
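The two-pass generation described above can be sketched as two sequential model calls, where the second call sees only the first call's output plus the persona's voice. This is a minimal illustration, not the project's implementation: `two_pass_answer`, the prompt wording, and the `LLM` callable type are assumptions, and the model client is injected so the sketch runs without network access.

```python
from typing import Callable

# Hypothetical interface: takes a prompt string, returns the completion text.
LLM = Callable[[str], str]


def two_pass_answer(question: str, evidence: list[str], persona_voice: str, llm: LLM) -> str:
    """Run a reasoning pass over the evidence, then a styling pass over the result."""
    evidence_block = "\n".join(f"[{i}] {item}" for i, item in enumerate(evidence, 1))

    # Pass 1: reasoning only. The persona's voice is deliberately left out.
    reasoning_prompt = (
        "Answer the question using only the numbered evidence. "
        "Cite evidence numbers and state the reasoning plainly.\n\n"
        f"Evidence:\n{evidence_block}\n\nQuestion: {question}"
    )
    reasoning = llm(reasoning_prompt)

    # Pass 2: styling only. The claims from pass 1 must be preserved.
    style_prompt = (
        "Rewrite the analysis below in this voice without adding or removing claims.\n\n"
        f"Voice: {persona_voice}\n\nAnalysis:\n{reasoning}"
    )
    return llm(style_prompt)
```

Keeping the voice out of the first prompt is the point of the split: the reasoning can be inspected or logged before any stylistic rewriting touches it.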

How it runs

sequenceDiagram
    participant U as User
    participant F as FastAPI Server
    participant D as Domain Manager
    participant C as ChromaDB
    participant L as LLM Generator
    U->>F: Select Domain & Persona
    F->>D: Load Domain Context
    D->>C: Fetch Persona Card & Evidence Collection
    C-->>D: Return Config & Vector Store Ref
    U->>F: Send Question
    F->>D: Initiate Query
    D->>C: Retrieve Evidence (Persona-Weighted)
    C-->>D: Ranked Evidence Chunks
    D->>L: Pass 1: Generate Reasoning Frame
    L-->>D: Raw Logical Analysis
    D->>L: Pass 2: Apply Persona Style
    L-->>D: Styled Response
    D-->>F: Stream Response
    F-->>U: Display Answer

How to apply & reuse

Use this engine when you need several distinct analytical lenses on different topics without cross-contamination. It suits political war rooms, historical research teams, and literary criticism platforms, where each voice (persona) must stay within strict evidentiary boundaries. To add a new analytical workspace, add a domain directory containing its JSON configuration and evidence files.
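Directory-based domain creation could look roughly like the loader below, which treats each subdirectory holding a config file as one domain. The layout is an assumption made for illustration: the file name `domain.json`, the `evidence/` subdirectory, and the function `discover_domains` are hypothetical and may not match the repository.

```python
import json
from pathlib import Path


def discover_domains(root: Path) -> dict[str, dict]:
    """Map each domain directory name to its parsed config and evidence file names.

    Assumed layout: <root>/<domain>/domain.json and <root>/<domain>/evidence/*
    """
    domains: dict[str, dict] = {}
    for config_path in sorted(root.glob("*/domain.json")):
        domain_dir = config_path.parent
        config = json.loads(config_path.read_text(encoding="utf-8"))
        evidence_dir = domain_dir / "evidence"
        evidence_files = (
            sorted(p.name for p in evidence_dir.iterdir() if p.is_file())
            if evidence_dir.is_dir()
            else []
        )
        domains[domain_dir.name] = {"config": config, "evidence_files": evidence_files}
    return domains
```

Directories without a config file are skipped, so unrelated folders under the root do not become domains by accident.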

At a glance

Capabilities: Multi-domain isolation, Persona-based retrieval weighting, Two-pass LLM generation, Dynamic domain ingestion, External evidence syncing, Streaming WebSocket chat

Components: FastAPI Backend, ChromaDB Vector Store, Domain Factory UI, Persona Card Schema, Evidence Sync Scripts, Client-side Auth Gate

Tech: Python 3.12+, FastAPI, ChromaDB, Anthropic Claude API, WebSockets, JavaScript (SHA-256)

Depends on: anthropic, chromadb, fastapi, uvicorn, websockets

Integrates with: Intelligence Briefing Toolkit, Cron (for scheduled syncs), Custom Evidence Pipelines

Patterns: Factory Pattern, Strategy Pattern (Personas), Repository Pattern (Evidence), Two-Step Generation, Domain-Driven Design

Reuse tags: RAG, Multi-Agent, Vector Search, Domain Isolation, Political Analysis, Historical Research

⚠ Needs attention

unmerged_branch: agentA-analytics-export is 1 commit ahead of the default branch
unmerged_branch: agentA-batch-roundtable is 2 commits ahead of the default branch
unmerged_branch: agentA-cli-installer is 1 commit ahead of the default branch
unmerged_branch: agentA-collaboration is 1 commit ahead of the default branch
unmerged_branch: agentA-deploy-command is 2 commits ahead of the default branch
unmerged_branch: agentA-openapi-docs is 1 commit ahead of the default branch
unmerged_branch: agentA-production-hardening is 1 commit ahead of the default branch
unmerged_branch: agentA-whitelabel is 1 commit ahead of the default branch
unmerged_branch: agentA-zero-config is 1 commit ahead of the default branch
unmerged_branch: agentB-analytics-readme is 2 commits ahead of the default branch
unmerged_branch: agentB-bug-sweep is 1 commit ahead of the default branch
unmerged_branch: agentB-domain-versioning is 1 commit ahead of the default branch
unmerged_branch: agentB-embed-widget is 1 commit ahead of the default branch
unmerged_branch: agentB-factory-validation is 1 commit ahead of the default branch
unmerged_branch: agentB-persona-editor is 1 commit ahead of the default branch
unmerged_branch: agentB-typing-indicators is 1 commit ahead of the default branch
unmerged_branch: agentC-ux-fixes is 1 commit ahead of the default branch