Architecture

Inherent is built on a three-layer architecture that separates authoritative data from searchable memory and records all access in a third, audit layer. Keeping authoritative data apart from the search index lets Inherent provide both accuracy and semantic search without compromising either.

Three Layers

┌─────────────────────────────────────────────────────────┐
│                      Your Application                    │
│              (search, retrieve, upload via API)           │
└────────────────────────┬────────────────────────────────┘
                         │
                    Public API
                         │
       ┌─────────────────┼─────────────────┐
       ▼                 ▼                 ▼
┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│  Truth Layer │  │ Memory Layer │  │  Audit Layer │
│ (PostgreSQL) │  │  (Weaviate)  │  │   (Logs)     │
└──────────────┘  └──────────────┘  └──────────────┘

Truth Layer (PostgreSQL)

PostgreSQL is the single source of truth. It stores:

Documents — metadata, file references, processing status
Chunks — the parsed and split text passages from each document
Ingestion records — processing history and lineage

Only the ingestion pipeline writes to PostgreSQL. The API reads from it to hydrate search results with authoritative metadata. If the memory layer were wiped entirely, it could be rebuilt from the truth layer.
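The three record types above could be modeled roughly as follows. This is a minimal sketch, not Inherent's actual schema: every class name, field name, and example value is an assumption chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


# Hypothetical shapes for the truth-layer records; the real schema is not
# shown in this document.
@dataclass
class Document:
    id: str
    workspace_id: str
    file_ref: str                  # pointer to the stored file
    status: str = "pending"        # becomes "processed" once ingestion completes
    metadata: dict = field(default_factory=dict)


@dataclass
class Chunk:
    id: str
    document_id: str
    position: int                  # order of the passage within the document
    text: str


@dataclass
class IngestionRecord:
    document_id: str
    step: str                      # "parse", "chunk", "embed", or "index"
    finished_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


doc = Document(id="doc-1", workspace_id="ws-a", file_ref="uploads/doc-1.pdf")
chunks = [
    Chunk(id=f"{doc.id}-{i}", document_id=doc.id, position=i, text=text)
    for i, text in enumerate(["First passage.", "Second passage."])
]
```

Keeping the chunk text in the truth layer, rather than only in the search index, is what allows the memory layer to be regenerated later.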

Memory Layer (Weaviate)

Weaviate stores vector embeddings that power semantic search. It holds:

Chunk embeddings — vector representations of every text passage
BM25 index — keyword index for hybrid search

Only the ingestion pipeline writes to Weaviate. The API queries it when you call the search endpoint. The memory layer is derived from the truth layer and can be reconstructed at any time.
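To illustrate why a derived index can be reconstructed, the sketch below rebuilds an in-memory "memory layer" from an in-memory "truth layer". The dictionaries stand in for PostgreSQL and Weaviate, and `toy_embed` is a hypothetical placeholder for a real embedding model; none of these names come from Inherent itself.

```python
import hashlib


def toy_embed(text: str, dims: int = 4) -> list[float]:
    """Stand-in for a real embedding model: deterministic, but not semantic."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255 for b in digest[:dims]]


def rebuild_memory(truth_chunks: dict[str, str]) -> dict[str, list[float]]:
    """Recompute every embedding from the authoritative chunk text."""
    return {chunk_id: toy_embed(text) for chunk_id, text in truth_chunks.items()}


# Truth layer: chunk id -> authoritative text (a dict standing in for PostgreSQL).
truth = {
    "c1": "Invoices are due in 30 days.",
    "c2": "Refunds take 5 business days.",
}

memory = rebuild_memory(truth)   # initial index
memory.clear()                   # simulate losing the memory layer
memory = rebuild_memory(truth)   # reconstruct it from the truth layer alone
```

Because every vector is a pure function of text held in the truth layer, wiping the index loses nothing that cannot be recomputed.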

Audit Layer

Every API call, search query, and document retrieval is logged for compliance and debugging. This gives you a complete trail of what was accessed, when, and by which API key.
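An audit entry needs to capture at least the three facts named above: what was accessed, when, and by which API key. The following sketch shows one possible shape as a JSON line. The field names and the `audit_entry` helper are assumptions; Inherent's actual log format is not described here.

```python
import json
from datetime import datetime, timezone


def audit_entry(api_key_id: str, action: str, resource: str) -> str:
    """Serialize one audit event as a JSON line (all field names are hypothetical)."""
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # when
        "api_key_id": api_key_id,                             # by which key
        "action": action,                                     # kind of access
        "resource": resource,                                 # what was accessed
    }
    return json.dumps(event, sort_keys=True)


line = audit_entry("key_123", "search", "query:refund policy")
record = json.loads(line)
```

The sketch records an identifier for the key rather than the key's secret value, so the log itself does not become a credential store.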

Ingestion Pipeline

When you upload a document, it flows through an asynchronous pipeline:

Upload             S3 Storage          Message Queue         Ingestion Service
  │                    │                     │                      │
  │  POST /documents   │                     │                      │
  ├───────────────────►│  store file         │                      │
  │                    ├────────────────────►│  publish event       │
  │  { status: pending }                     ├─────────────────────►│
  │◄─────────────────────────────────────────┤                      │
  │                                          │                      │
  │                                          │      ┌───────────────┤
  │                                          │      │  1. Parse     │
  │                                          │      │  2. Chunk     │
  │                                          │      │  3. Embed     │
  │                                          │      │  4. Index     │
  │                                          │      └───────────────┤
  │                                          │                      │
  │                                          │         Write to     │
  │                                          │    PostgreSQL + Weaviate

Parse — Extract raw text from the uploaded file (PDF, DOCX, etc.)
Chunk — Split the text into passages optimized for retrieval
Embed — Generate vector embeddings for each chunk
Index — Store chunks in PostgreSQL (truth) and embeddings in Weaviate (memory)
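The four steps can be sketched end to end with in-memory stand-ins. This is an illustration under stated assumptions, not the ingestion service's code: the parser only handles UTF-8 text, chunking uses fixed-size word windows, the embedding is a deterministic toy, and two dictionaries take the place of PostgreSQL and Weaviate.

```python
import hashlib


def parse(raw: bytes) -> str:
    """Step 1: extract text. Real parsers handle PDF, DOCX, etc.; this assumes UTF-8."""
    return raw.decode("utf-8")


def chunk(text: str, max_words: int = 5) -> list[str]:
    """Step 2: split into passages. Fixed-size word windows are an assumption."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(passage: str, dims: int = 4) -> list[float]:
    """Step 3: toy deterministic vector standing in for a real embedding model."""
    digest = hashlib.sha256(passage.encode("utf-8")).digest()
    return [b / 255 for b in digest[:dims]]


def ingest(doc_id: str, raw: bytes, truth: dict, memory: dict) -> None:
    """Step 4: write chunks to the truth store and vectors to the memory store."""
    for i, passage in enumerate(chunk(parse(raw))):
        chunk_id = f"{doc_id}:{i}"
        truth[chunk_id] = passage          # stands in for PostgreSQL
        memory[chunk_id] = embed(passage)  # stands in for Weaviate


truth, memory = {}, {}
ingest("doc-1", b"one two three four five six seven", truth, memory)
```

Both stores share the same chunk ids, which is what later lets search results found in the memory layer be matched back to authoritative records.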

The API returns immediately after the upload with status: "pending". Processing happens in the background. When it completes, the document status changes to processed and its content becomes searchable.

Note: Ingestion is always asynchronous. Uploads never block your application, but newly uploaded content is not searchable until the document's status changes to processed, so check the status before searching for it.
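One way to check status is a polling loop with a timeout. The helper below is a sketch: `wait_until_processed` and its parameters are hypothetical, and `get_status` is any callable that returns the document's current status string. In a real client it would call the API's document status endpoint, whose path is not given in this document.

```python
import time
from typing import Callable


def wait_until_processed(
    get_status: Callable[[], str],
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
) -> bool:
    """Poll until the document reports "processed" or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if get_status() == "processed":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Fake status source for demonstration: pending twice, then processed.
_responses = iter(["pending", "pending", "processed"])
ready = wait_until_processed(lambda: next(_responses), timeout_s=5.0, interval_s=0.01)
```

Returning `False` on timeout, rather than looping forever, lets the caller decide whether to retry, alert, or search without the new document.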

Multi-Tenant Workspace Isolation

Every organization operates within a workspace. Workspaces provide hard isolation:

Each workspace has its own document collection in Weaviate
Database queries are scoped to the workspace's tenant
API keys are bound to a single workspace
There is no way for one workspace to access another workspace's data

Workspace A                    Workspace B
┌────────────────────┐         ┌────────────────────┐
│  Documents (PG)    │         │  Documents (PG)    │
│  Embeddings (WV)   │         │  Embeddings (WV)   │
│  API Keys          │         │  API Keys          │
└────────────────────┘         └────────────────────┘
        ▲                              ▲
        │ scoped by tenant             │ scoped by tenant
        │                              │
    API Key A                      API Key B

Search Flow

When you call the search endpoint, the query flows through the memory layer and then the truth layer:

Your Query
    │
    ▼
┌──────────────────────────────────┐
│  1. Embed query text             │  (generate vector)
│  2. Search Weaviate              │  (BM25 + vector similarity)
│  3. Rank and score results       │  (hybrid scoring)
│  4. Hydrate from PostgreSQL      │  (add metadata, document info)
│  5. Return ranked results        │
└──────────────────────────────────┘

Weaviate finds the semantically similar chunks. PostgreSQL provides the authoritative metadata (document name, upload date, custom metadata). The API merges both into the response you receive.
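The search steps can be sketched with toy data. Several simplifications are assumptions rather than Inherent's or Weaviate's behavior: the keyword score is a plain term-overlap fraction instead of BM25, the hybrid score is a weighted sum controlled by a hypothetical `alpha` parameter, and the query vector is written by hand instead of produced by an embedding model.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query: str, text: str) -> float:
    """Fraction of query terms present in the text (a crude stand-in for BM25)."""
    terms = query.lower().split()
    words = set(text.lower().split())
    return sum(t in words for t in terms) / len(terms) if terms else 0.0


def search(query, query_vec, memory, truth, alpha=0.5, limit=2):
    """Steps 2-5: score in the memory store, rank, then hydrate from the truth store."""
    scored = []
    for chunk_id, (text, vec) in memory.items():
        score = alpha * cosine(query_vec, vec) + (1 - alpha) * keyword_score(query, text)
        scored.append((score, chunk_id))
    scored.sort(reverse=True)
    return [{"chunk_id": cid, "score": round(s, 3), **truth[cid]} for s, cid in scored[:limit]]


# Memory store: chunk id -> (indexed text, vector). Vectors are hand-written toys.
memory = {
    "c1": ("refunds take five days", [1.0, 0.0]),
    "c2": ("invoices are due monthly", [0.0, 1.0]),
}
# Truth store: chunk id -> authoritative metadata.
truth = {
    "c1": {"document_name": "policy.pdf"},
    "c2": {"document_name": "billing.pdf"},
}

# Step 1 (embedding the query) is faked with a hand-written vector.
results = search("refunds", [0.9, 0.1], memory, truth)
```

The chunk from `policy.pdf` ranks first because it matches on both the keyword and the vector. The metadata in each result comes from the truth store, not from the index that produced the ranking.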

Why This Architecture

Accuracy. PostgreSQL is the single source of truth. Embeddings can drift or be regenerated, but the authoritative text and metadata stay in PostgreSQL and are unaffected by changes to the index.

Rebuildability. If the Weaviate index is corrupted or needs to be upgraded, it can be rebuilt entirely from PostgreSQL. No data is lost.

Search quality. Weaviate's hybrid search (BM25 + vectors) handles both keyword-exact queries and natural language questions, without requiring you to choose between the two.

Isolation. Workspace-level tenancy means your data is never mixed with another organization's data, not at the application level and not at the storage level.

Async processing. The message queue between upload and ingestion means your application is never blocked waiting for document processing. Upload, get an ID, and poll for completion.
