Quickstart

Upload a document, search it, and use the results with your LLM -- all in about 5 minutes.

Base URL: https://api.inherent.sh

Auth: Every request requires an X-API-Key header.

Step 1: Get Your API Key

1. Sign in to the Inherent Dashboard.
2. Go to Settings > API Keys.
3. Click Create API Key and copy the key.

Tip: API keys start with ink_. Store yours in an environment variable:

export INHERENT_API_KEY="ink_your_api_key"
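
In Python, the exported variable can be read at startup rather than pasting the key into source files. This is a minimal sketch: the helper names load_api_key and auth_headers are illustrative and not part of any Inherent SDK, and the prefix check relies only on the ink_ convention mentioned above.

```python
import os


def load_api_key(env_var: str = "INHERENT_API_KEY") -> str:
    """Read the API key from the environment and sanity-check its prefix."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set")
    if not key.startswith("ink_"):
        # Only the documented "ink_" prefix is checked; nothing else is assumed.
        raise RuntimeError(f"{env_var} does not start with 'ink_'")
    return key


def auth_headers(api_key: str) -> dict:
    """Build the X-API-Key header that every request requires."""
    return {"X-API-Key": api_key}
```

The Python snippets in the following steps could then use API_KEY = load_api_key() in place of the literal placeholder.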

Step 2: Upload a Document

Upload a file to your knowledge base. Inherent handles parsing, chunking, and embedding automatically.

cURL
Python
JavaScript

curl -X POST https://api.inherent.sh/v1/documents \
  -H "X-API-Key: $INHERENT_API_KEY" \
  -F "file=@./my-document.pdf"

import requests

API_KEY = "ink_your_api_key"
BASE_URL = "https://api.inherent.sh/v1"

with open("my-document.pdf", "rb") as f:
    resp = requests.post(
        f"{BASE_URL}/documents",
        headers={"X-API-Key": API_KEY},
        files={"file": ("my-document.pdf", f, "application/pdf")},
    )

print(resp.json())

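import { readFile } from 'node:fs/promises';

const API_KEY = 'ink_your_api_key';
const BASE_URL = 'https://api.inherent.sh/v1';

// Node's built-in fetch (Node 18+) expects a Blob in FormData, not a read stream.
const buffer = await readFile('./my-document.pdf');
const formData = new FormData();
formData.append(
  'file',
  new Blob([buffer], { type: 'application/pdf' }),
  'my-document.pdf',
);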

const resp = await fetch(`${BASE_URL}/documents`, {
  method: 'POST',
  headers: { 'X-API-Key': API_KEY },
  body: formData,
});

console.log(await resp.json());

Response:

{
  "document_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "name": "my-document.pdf",
  "workspace_id": "ws_abc123",
  "mime_type": "application/pdf",
  "size_bytes": 245760,
  "status": "pending",
  "message": "Document uploaded successfully. Processing will begin shortly."
}

Async Processing

Documents are processed asynchronously. The upload returns immediately with status: "pending". Processing typically takes a few seconds -- check the status before searching.

Step 3: Check Document Status

Poll the documents endpoint until your document shows status: "processed".

cURL
Python
JavaScript

curl https://api.inherent.sh/v1/documents \
  -H "X-API-Key: $INHERENT_API_KEY"

resp = requests.get(
    f"{BASE_URL}/documents",
    headers={"X-API-Key": API_KEY},
)

for doc in resp.json()["documents"]:
    print(f"{doc['name']} -- {doc['status']}")

const resp = await fetch(`${BASE_URL}/documents`, {
  headers: { 'X-API-Key': API_KEY },
});

const data = await resp.json();
data.documents.forEach(doc => {
  console.log(`${doc.name} -- ${doc.status}`);
});

Response:

{
  "documents": [
    {
      "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "name": "my-document.pdf",
      "workspace_id": "ws_abc123",
      "source_type": "upload",
      "mime_type": "application/pdf",
      "size_bytes": 245760,
      "chunk_count": 42,
      "status": "processed",
      "created_at": "2026-04-03T10:30:00Z",
      "updated_at": "2026-04-03T10:30:05Z",
      "metadata": null
    }
  ],
  "total": 1,
  "page": 1,
  "page_size": 20
}

When status is "processed", the document is searchable.
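
The snippets above fetch the list once. A polling loop around that call might look like the sketch below. The function name wait_until_processed, the timeout, and the interval are illustrative choices. Only the "pending" and "processed" statuses appear in this guide, so any other status is simply treated as "keep waiting", and only the first page of the list is inspected.

```python
import time
from typing import Callable


def wait_until_processed(
    list_documents: Callable[[], dict],
    document_id: str,
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
) -> dict:
    """Poll until the document with `document_id` reports status "processed".

    `list_documents` is any zero-argument callable returning the parsed JSON
    of GET /v1/documents; it is injected so the loop can run without a network.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        for doc in list_documents().get("documents", []):
            if doc["id"] == document_id and doc["status"] == "processed":
                return doc
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{document_id} not processed after {timeout_s}s")
        time.sleep(interval_s)


# Example wiring with requests (API_KEY and BASE_URL as defined in Step 2):
# doc = wait_until_processed(
#     lambda: requests.get(f"{BASE_URL}/documents",
#                          headers={"X-API-Key": API_KEY}).json(),
#     document_id="a1b2c3d4-e5f6-7890-abcd-ef1234567890",
# )
```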

Step 4: Search Your Knowledge Base

Run a semantic search query to retrieve relevant chunks.

cURL
Python
JavaScript

curl -X POST https://api.inherent.sh/v1/search \
  -H "X-API-Key: $INHERENT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What does this document cover?",
    "limit": 5
  }'

resp = requests.post(
    f"{BASE_URL}/search",
    headers={"X-API-Key": API_KEY},
    json={"query": "What does this document cover?", "limit": 5},
)

for result in resp.json()["results"]:
    print(f"[{result['score']:.3f}] {result['content'][:120]}...")

const resp = await fetch(`${BASE_URL}/search`, {
  method: 'POST',
  headers: {
    'X-API-Key': API_KEY,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    query: 'What does this document cover?',
    limit: 5,
  }),
});

const data = await resp.json();
data.results.forEach(r => {
  console.log(`[${r.score.toFixed(3)}] ${r.content.slice(0, 120)}...`);
});

Response:

{
  "results": [
    {
      "chunk_id": "chunk_xyz789",
      "document_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "document_name": "my-document.pdf",
      "content": "This document provides an overview of the system architecture, including the ingestion pipeline, search infrastructure, and audit logging.",
      "score": 0.94,
      "metadata": null
    },
    {
      "chunk_id": "chunk_abc456",
      "document_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "document_name": "my-document.pdf",
      "content": "The platform supports PDF, Markdown, DOCX, and plain text formats. Documents are automatically chunked and indexed for semantic retrieval.",
      "score": 0.87,
      "metadata": null
    }
  ],
  "query": "What does this document cover?",
  "total_results": 2,
  "processing_time_ms": 38.7
}
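
Before the results go into a prompt, it can help to drop weak matches and cap the amount of text. The sketch below works on the results array shown above. The function name build_context, the min_score threshold, and the character budget are all illustrative assumptions; this guide does not define the score scale, so pick a threshold by inspecting your own results.

```python
def build_context(results: list, min_score: float = 0.0, max_chars: int = 4000) -> str:
    """Join chunk contents, best score first, skipping weak matches and
    stopping before a rough character budget is exceeded."""
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    parts = []
    used = 0
    for r in ranked:
        if r["score"] < min_score:
            break  # the remaining results score lower still
        text = f"[{r['document_name']}] {r['content']}"
        if used + len(text) > max_chars:
            break  # budget is approximate: separators are not counted
        parts.append(text)
        used += len(text)
    return "\n\n".join(parts)
```

Prefixing each chunk with document_name keeps the source visible to the model, which is useful if you later ask it to cite where an answer came from.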

Step 5: Use with Your LLM

Feed the retrieved context into your LLM to get grounded answers.

OpenAI
Anthropic

import openai
import requests

API_KEY = "ink_your_api_key"
BASE_URL = "https://api.inherent.sh/v1"

def ask(question: str) -> str:
    # 1. Retrieve context from Inherent
    search_resp = requests.post(
        f"{BASE_URL}/search",
        headers={"X-API-Key": API_KEY},
        json={"query": question, "limit": 5},
    )
    chunks = search_resp.json()["results"]
    context = "\n\n".join(c["content"] for c in chunks)

    # 2. Send to OpenAI with context
    completion = openai.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": (
                    "Answer the user's question using only the provided context. "
                    "If the context doesn't contain the answer, say so.\n\n"
                    f"Context:\n{context}"
                ),
            },
            {"role": "user", "content": question},
        ],
    )
    return completion.choices[0].message.content

answer = ask("What file formats are supported?")
print(answer)

import anthropic
import requests

API_KEY = "ink_your_api_key"
BASE_URL = "https://api.inherent.sh/v1"

def ask(question: str) -> str:
    # 1. Retrieve context from Inherent
    search_resp = requests.post(
        f"{BASE_URL}/search",
        headers={"X-API-Key": API_KEY},
        json={"query": question, "limit": 5},
    )
    chunks = search_resp.json()["results"]
    context = "\n\n".join(c["content"] for c in chunks)

    # 2. Send to Claude with context
    client = anthropic.Anthropic()
    message = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        system=(
            "Answer the user's question using only the provided context. "
            "If the context doesn't contain the answer, say so.\n\n"
            f"Context:\n{context}"
        ),
        messages=[{"role": "user", "content": question}],
    )
    return message.content[0].text

answer = ask("What file formats are supported?")
print(answer)

Tip: For production, add error handling and consider caching frequent queries. The processing_time_ms field in search responses helps you monitor latency.
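
One possible shape for that advice is sketched below, using an in-memory LRU cache from the standard library. The names SearchError and make_cached_search are hypothetical, and the cache has no expiry, so it suits corpora that change rarely. The actual HTTP call is passed in as send so the wrapper stays independent of any HTTP client.

```python
import functools
from typing import Callable


class SearchError(RuntimeError):
    """Raised when a search request fails or returns an unexpected body."""


def make_cached_search(send: Callable[[str, int], dict], maxsize: int = 256):
    """Wrap `send(query, limit)` with basic validation and an LRU cache.

    `send` should perform POST /v1/search and return the parsed JSON body.
    """

    @functools.lru_cache(maxsize=maxsize)
    def cached(query: str, limit: int) -> dict:
        try:
            body = send(query, limit)
        except Exception as exc:  # network failure, HTTP error, invalid JSON
            raise SearchError(f"search failed: {exc}") from exc
        if not isinstance(body, dict) or "results" not in body:
            raise SearchError("response has no 'results' field")
        return body  # failures raise instead of returning, so they are not cached

    def search(query: str, limit: int = 5) -> dict:
        # Normalize whitespace so trivially different queries share a cache entry.
        return cached(query.strip(), limit)

    return search


# Example `send` using requests (API_KEY and BASE_URL as defined above):
# def send(query, limit):
#     resp = requests.post(f"{BASE_URL}/search", headers={"X-API-Key": API_KEY},
#                          json={"query": query, "limit": limit}, timeout=10)
#     resp.raise_for_status()
#     return resp.json()
```

Cached responses are shared objects, so treat them as read-only.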

Search with context window

If you want the surrounding chunks for each match (better for RAG prompts), set include_context: true:

curl -X POST https://api.inherent.sh/v1/search \
  -H "X-API-Key: $INHERENT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "how do I rotate an API key",
    "limit": 5,
    "include_context": true,
    "context_window": 2
  }'

Each result now includes context_before and context_after arrays. With context_window: 2, that means up to 2 chunks on each side of the match (context_window accepts a maximum of 5). The response's total_tokens field sums across every chunk returned, so you can plan your LLM prompt budget without counting tokens yourself.
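
To turn a result with context into a single passage, the arrays can be stitched around the match. This is a sketch under stated assumptions: this guide does not show the shape of the context entries, so the helper accepts either plain strings or objects with a content field, and it assumes both arrays are already in reading order. The names stitch and fits_budget, and the reserved_tokens margin, are illustrative.

```python
def _text(item) -> str:
    # Assumption: a context entry is either a string or an object with "content".
    return item if isinstance(item, str) else item["content"]


def stitch(result: dict) -> str:
    """Rebuild the passage around a match: before + match + after."""
    before = [_text(c) for c in result.get("context_before", [])]
    after = [_text(c) for c in result.get("context_after", [])]
    return "\n".join(before + [result["content"]] + after)


def fits_budget(response: dict, max_prompt_tokens: int, reserved_tokens: int = 500) -> bool:
    """Compare total_tokens with the prompt budget, leaving an arbitrary
    margin (reserved_tokens) for the question and instructions."""
    return response["total_tokens"] + reserved_tokens <= max_prompt_tokens
```

If fits_budget returns False, lowering limit or context_window in the request is a simple way to shrink the response.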

Local development

Swap https://api.inherent.sh for https://dev-api.inherent.sh when testing against the development environment.
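
If you switch environments often, the host can be chosen from configuration instead of edited by hand. In this sketch, the environment variable name INHERENT_ENV and the labels "production" and "development" are hypothetical; only the two hosts come from this guide, and the /v1 path is assumed to be the same on both.

```python
import os
from typing import Optional

BASE_URLS = {
    "production": "https://api.inherent.sh",
    "development": "https://dev-api.inherent.sh",
}


def base_url(env: Optional[str] = None) -> str:
    """Return the versioned base URL for the given environment label.

    Falls back to the (hypothetical) INHERENT_ENV variable, then to production.
    """
    env = env or os.environ.get("INHERENT_ENV", "production")
    if env not in BASE_URLS:
        raise ValueError(f"unknown environment: {env!r}")
    return f"{BASE_URLS[env]}/v1"
```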

Hybrid search for exact matches

Pure semantic search sometimes misses literal strings (error codes, code snippets, IDs). Switch to hybrid mode to fuse BM25 + vector similarity:

curl -X POST https://api.inherent.sh/v1/search \
  -H "X-API-Key: $INHERENT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "ErrRateLimitExceeded",
    "limit": 5,
    "search_mode": "hybrid",
    "alpha": 0.7
  }'

search_mode: "semantic" (default, vector), "hybrid" (BM25 + vector with alpha weighting), "keyword" (BM25 only)
alpha: 0.0 (pure keyword) to 1.0 (vector-heavy). Default 0.7 works well for mixed prose + code corpora.

The response echoes back search_mode so you can confirm routing.
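
The same options can be set from Python by building the JSON body before posting it. The sketch below validates the values listed above on the client side. The function names are illustrative, and sending alpha only in hybrid mode is an assumption, since this guide does not say how the API treats alpha in the other modes.

```python
SEARCH_MODES = ("semantic", "hybrid", "keyword")


def build_search_payload(
    query: str,
    limit: int = 5,
    search_mode: str = "semantic",
    alpha: float = 0.7,
) -> dict:
    """Build the JSON body for POST /v1/search with basic validation."""
    if search_mode not in SEARCH_MODES:
        raise ValueError(f"search_mode must be one of {SEARCH_MODES}")
    payload = {"query": query, "limit": limit, "search_mode": search_mode}
    if search_mode == "hybrid":
        if not 0.0 <= alpha <= 1.0:
            raise ValueError("alpha must be between 0.0 and 1.0")
        payload["alpha"] = alpha  # assumption: only meaningful in hybrid mode
    return payload


def mode_confirmed(response: dict, expected: str) -> bool:
    """Check that the response echoes the search_mode that was requested."""
    return response.get("search_mode") == expected


# Example with requests (API_KEY and BASE_URL as defined in Step 2):
# body = build_search_payload("ErrRateLimitExceeded", search_mode="hybrid")
# data = requests.post(f"{BASE_URL}/search", headers={"X-API-Key": API_KEY},
#                      json=body).json()
# assert mode_confirmed(data, "hybrid")
```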

Next Steps

Authentication Guide -- API key management and security best practices
Uploading Documents -- Supported file types, metadata, and processing details
Searching Your Knowledge Base -- Search filters, scoring, and tuning
Retrieving Context -- Chunks, full context, and LLM prompt building
API Reference -- Complete endpoint documentation
