
Quickstart

Upload a document, search it, and feed the results to your LLM, all in about five minutes.

Base URL: https://api.inherent.sh

Auth: Every request requires an X-API-Key header.

Step 1: Get Your API Key

  1. Sign in to the Inherent Dashboard.
  2. Go to Settings > API Keys.
  3. Click Create API Key and copy the key.

Tip: API keys start with ink_. Store yours in an environment variable:

export INHERENT_API_KEY="ink_your_api_key"
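
In Python, the same variable can be read at startup so the key never appears in source code. This is a minimal sketch; the helper name auth_headers is illustrative and not part of any Inherent SDK.

```python
import os


def auth_headers(env=os.environ) -> dict:
    """Build the X-API-Key header from the INHERENT_API_KEY environment variable."""
    api_key = env.get("INHERENT_API_KEY", "")
    # Keys start with "ink_" (see the tip above), so fail fast on anything else.
    if not api_key.startswith("ink_"):
        raise RuntimeError("INHERENT_API_KEY is missing or does not start with 'ink_'")
    return {"X-API-Key": api_key}
```

Passing the environment as a parameter keeps the check easy to exercise with a plain dict.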

Step 2: Upload a Document

Upload a file to your knowledge base. Inherent handles parsing, chunking, and embedding automatically.

curl -X POST https://api.inherent.sh/v1/documents \
  -H "X-API-Key: $INHERENT_API_KEY" \
  -F "file=@./my-document.pdf"

Response:

{
  "document_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "name": "my-document.pdf",
  "workspace_id": "ws_abc123",
  "mime_type": "application/pdf",
  "size_bytes": 245760,
  "status": "pending",
  "message": "Document uploaded successfully. Processing will begin shortly."
}

Async Processing

Documents are processed asynchronously. The upload returns immediately with status: "pending". Processing typically takes a few seconds, so check the document's status before searching.
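
The same upload can be sketched in Python. The function below mirrors the curl call (multipart field "file", X-API-Key header); the name upload_document and the injectable post parameter are assumptions made for illustration, and the requests library is assumed to be installed when no stub is passed.

```python
import mimetypes

BASE_URL = "https://api.inherent.sh/v1"


def upload_document(name: str, data: bytes, api_key: str, post=None) -> dict:
    """Upload file bytes as the multipart field "file" and return the parsed JSON."""
    if post is None:
        # Imported lazily so the function can be exercised with a stub transport.
        import requests
        post = requests.post
    # Guess the MIME type from the file name; fall back to a generic binary type.
    mime = mimetypes.guess_type(name)[0] or "application/octet-stream"
    resp = post(
        f"{BASE_URL}/documents",
        headers={"X-API-Key": api_key},
        files={"file": (name, data, mime)},
        timeout=30,
    )
    resp.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
    return resp.json()
```

Given the response shape shown above, the returned dict should carry document_id and an initial status of "pending".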

Step 3: Check Document Status

Poll the documents endpoint until your document shows status: "processed".

curl https://api.inherent.sh/v1/documents \
  -H "X-API-Key: $INHERENT_API_KEY"

Response:

{
  "documents": [
    {
      "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "name": "my-document.pdf",
      "workspace_id": "ws_abc123",
      "source_type": "upload",
      "mime_type": "application/pdf",
      "size_bytes": 245760,
      "chunk_count": 42,
      "status": "processed",
      "created_at": "2026-04-03T10:30:00Z",
      "updated_at": "2026-04-03T10:30:05Z",
      "metadata": null
    }
  ],
  "total": 1,
  "page": 1,
  "page_size": 20
}

When status is "processed", the document is searchable.
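
A polling loop along these lines avoids searching too early. It is a sketch under stated assumptions: the function names, the 60-second timeout, and the 1-second interval are illustrative; only the first page of the listing is checked; and since this guide shows only the "pending" and "processed" statuses, any other status is simply treated as "not ready yet".

```python
import time

BASE_URL = "https://api.inherent.sh/v1"


def find_status(listing: dict, document_id: str):
    """Return the status of document_id in a GET /v1/documents response, or None."""
    for doc in listing.get("documents", []):
        if doc.get("id") == document_id:
            return doc.get("status")
    return None


def wait_until_processed(document_id, fetch_listing, timeout_s=60.0, interval_s=1.0,
                         sleep=time.sleep, clock=time.monotonic) -> bool:
    """Poll until the document is "processed"; return False if the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        if find_status(fetch_listing(), document_id) == "processed":
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


def fetch_listing_http(api_key: str) -> dict:
    """Fetch the first page of documents (assumes the requests library is installed)."""
    import requests  # lazy import keeps the polling logic usable offline
    resp = requests.get(f"{BASE_URL}/documents",
                        headers={"X-API-Key": api_key}, timeout=10)
    resp.raise_for_status()
    return resp.json()
```

A typical call would be wait_until_processed(doc_id, lambda: fetch_listing_http(api_key)), followed by a search only when it returns True.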

Step 4: Search Your Knowledge Base

Run a semantic search query to retrieve relevant chunks.

curl -X POST https://api.inherent.sh/v1/search \
  -H "X-API-Key: $INHERENT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What does this document cover?",
    "limit": 5
  }'

Response:

{
  "results": [
    {
      "chunk_id": "chunk_xyz789",
      "document_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "document_name": "my-document.pdf",
      "content": "This document provides an overview of the system architecture, including the ingestion pipeline, search infrastructure, and audit logging.",
      "score": 0.94,
      "metadata": null
    },
    {
      "chunk_id": "chunk_abc456",
      "document_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "document_name": "my-document.pdf",
      "content": "The platform supports PDF, Markdown, DOCX, and plain text formats. Documents are automatically chunked and indexed for semantic retrieval.",
      "score": 0.87,
      "metadata": null
    }
  ],
  "query": "What does this document cover?",
  "total_results": 2,
  "processing_time_ms": 38.7
}
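
Once parsed, a response of this shape is easy to post-process. The sketch below drops weak matches and prints a one-line summary per chunk. It assumes that a higher score means a more relevant chunk (consistent with the ordering in the sample above); the 0.5 cutoff and both function names are illustrative choices, not documented defaults.

```python
def select_chunks(response: dict, min_score: float = 0.5) -> list:
    """Keep results whose score is at least min_score, best first."""
    hits = [r for r in response.get("results", [])
            if r.get("score", 0.0) >= min_score]
    return sorted(hits, key=lambda r: r.get("score", 0.0), reverse=True)


def format_sources(chunks: list) -> str:
    """Render one line per chunk: score, document name, and a 60-character preview."""
    return "\n".join(
        f'{c["score"]:.2f}  {c["document_name"]}  {c["content"][:60]}'
        for c in chunks
    )
```

Calling format_sources(select_chunks(response)) gives a quick way to eyeball what would be handed to the LLM.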

Step 5: Use with Your LLM

Feed the retrieved context into your LLM to get grounded answers.

import os

import openai
import requests

# Read the Inherent key from the environment variable set in Step 1.
# The OpenAI SDK reads its own key from OPENAI_API_KEY.
API_KEY = os.environ["INHERENT_API_KEY"]
BASE_URL = "https://api.inherent.sh/v1"

def ask(question: str) -> str:
    # 1. Retrieve context from Inherent
    search_resp = requests.post(
        f"{BASE_URL}/search",
        headers={"X-API-Key": API_KEY},
        json={"query": question, "limit": 5},
        timeout=30,
    )
    search_resp.raise_for_status()  # surface auth or request errors early
    chunks = search_resp.json()["results"]
    context = "\n\n".join(c["content"] for c in chunks)

    # 2. Send to OpenAI with context
    completion = openai.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": (
                    "Answer the user's question using only the provided context. "
                    "If the context doesn't contain the answer, say so.\n\n"
                    f"Context:\n{context}"
                ),
            },
            {"role": "user", "content": question},
        ],
    )
    return completion.choices[0].message.content

answer = ask("What file formats are supported?")
print(answer)

Tip: For production, add error handling and consider caching frequent queries. The processing_time_ms field in search responses helps you monitor latency.
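
One way to act on that advice is to wrap the search call in a retry helper and an in-memory cache. This is a sketch, not a recommended production design: with_retries and make_cached_search are hypothetical names, the retry catches every exception (real code would usually retry only timeouts and 5xx responses), and lru_cache never expires entries.

```python
from functools import lru_cache
import time


def with_retries(call, attempts=3, base_delay_s=0.5, sleep=time.sleep):
    """Run call(); on an exception retry with exponential backoff, then re-raise."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay_s * 2 ** attempt)


def make_cached_search(search_fn, maxsize=256):
    """Wrap search_fn(query, limit) so repeated identical queries skip the network."""
    @lru_cache(maxsize=maxsize)
    def cached(query: str, limit: int = 5):
        return with_retries(lambda: search_fn(query, limit))
    return cached
```

Because cached results go stale when new documents are uploaded, call cached.cache_clear() after each upload, and treat the returned dict as read-only since every caller with the same query shares it.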

Search with context window

If you want the surrounding chunks for each match (better for RAG prompts), set include_context: true:

curl -X POST https://api.inherent.sh/v1/search \
  -H "X-API-Key: $INHERENT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "how do I rotate an API key",
    "limit": 5,
    "include_context": true,
    "context_window": 2
  }'

Each result now includes context_before and context_after arrays, each holding up to context_window chunks from that side of the match (2 in this request; the maximum is 5). The response's total_tokens field sums the tokens across every chunk returned, so you can plan your LLM prompt budget without counting tokens yourself.
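
The sketch below shows one way to turn such a response into prompt context. It rests on assumptions this guide does not confirm: the elements of context_before and context_after are taken to be either plain strings or objects with a "content" field, and falling back to the matched chunks alone when total_tokens exceeds the budget is just one possible policy. The function names are illustrative.

```python
def _text(chunk) -> str:
    # Assumption: neighbours are plain strings or objects with a "content" field.
    return chunk if isinstance(chunk, str) else chunk.get("content", "")


def passage(result: dict) -> str:
    """Join context_before, the match, and context_after in reading order."""
    parts = [_text(c) for c in result.get("context_before", [])]
    parts.append(result["content"])
    parts += [_text(c) for c in result.get("context_after", [])]
    return "\n".join(p for p in parts if p)


def build_context(response: dict, max_tokens: int) -> str:
    """Use full passages if total_tokens fits the budget, else the matches alone."""
    results = response.get("results", [])
    if response.get("total_tokens", 0) <= max_tokens:
        return "\n\n".join(passage(r) for r in results)
    return "\n\n".join(r["content"] for r in results)
```

The tokenizer behind total_tokens is not specified here, so it is safest to treat the figure as an estimate relative to your own model's limits.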

Local development

Swap https://api.inherent.sh for https://dev-api.inherent.sh when testing against the development environment.

Hybrid search for exact matches

Pure semantic search sometimes misses literal strings (error codes, code snippets, IDs). Switch to hybrid mode to fuse BM25 + vector similarity:

curl -X POST https://api.inherent.sh/v1/search \
  -H "X-API-Key: $INHERENT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "ErrRateLimitExceeded",
    "limit": 5,
    "search_mode": "hybrid",
    "alpha": 0.7
  }'

  • search_mode: "semantic" (default, vector), "hybrid" (BM25 + vector with alpha weighting), "keyword" (BM25 only)
  • alpha: 0.0 (pure keyword) to 1.0 (vector-heavy). Default 0.7 works well for mixed prose + code corpora.

The response echoes back search_mode so you can confirm routing.
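
A small client-side helper can validate these parameters before the request is sent and check the echoed mode afterwards. This is a sketch: the function names are hypothetical, and sending alpha only in hybrid mode is an assumption based on the parameter descriptions above.

```python
SEARCH_MODES = ("semantic", "hybrid", "keyword")


def build_search_payload(query: str, search_mode: str = "semantic",
                         alpha: float = 0.7, limit: int = 5) -> dict:
    """Validate search_mode and alpha, then return the JSON body for POST /v1/search."""
    if search_mode not in SEARCH_MODES:
        raise ValueError(f"search_mode must be one of {SEARCH_MODES}")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0")
    payload = {"query": query, "limit": limit, "search_mode": search_mode}
    if search_mode == "hybrid":
        # alpha is described only for hybrid weighting, so send it only in that mode.
        payload["alpha"] = alpha
    return payload


def routed_as_requested(response: dict, payload: dict) -> bool:
    """Compare the search_mode echoed in the response with the one requested."""
    return response.get("search_mode") == payload["search_mode"]
```

For example, build_search_payload("ErrRateLimitExceeded", "hybrid") reproduces the request body shown above.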

Next Steps