
Searching Your Knowledge Base

Inherent provides semantic search across every document in your workspace. Rather than relying on exact keyword matching alone, it understands the meaning of your query and returns the most relevant passages.

Note: Searching requires an API key with search permission. See Authentication for details.

How Search Works

  1. Your query text is converted into a vector embedding.
  2. Weaviate performs a hybrid search combining BM25 keyword matching with vector similarity.
  3. Results are scored, ranked, and returned with the matching chunk content.

This means a query like "revenue growth in Q1" will find passages about "first quarter earnings increase" even if those exact words never appear.
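The hybrid step can be pictured as a weighted blend of the two scores. This is a deliberate simplification for intuition, not Weaviate's exact fusion formula:

```python
def hybrid_score(bm25_score: float, vector_score: float, alpha: float = 0.5) -> float:
    """Blend keyword (BM25) and vector relevance for one chunk.

    alpha = 1.0 would be pure vector search, alpha = 0.0 pure BM25.
    Illustrative only -- the actual fusion algorithm may differ.
    """
    return alpha * vector_score + (1 - alpha) * bm25_score

# A chunk with little keyword overlap but strong semantic similarity
# still scores competitively under the blend:
blended = hybrid_score(bm25_score=0.1, vector_score=0.9)
```

This is why "revenue growth in Q1" can surface a passage about "first quarter earnings increase": the vector component carries the match even when the keyword component contributes little.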

Search Request

Send a POST request to /api/v1/search with a JSON body.

curl -X POST https://api.inherent.systems/api/v1/search \
  -H "X-API-Key: $INHERENT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What was the revenue growth in Q1?",
    "limit": 5,
    "min_score": 0.3
  }'
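The same request can be issued from Python. A minimal sketch using only the standard library; the `build_payload` and `search` helper names are this example's own, not part of any official client:

```python
import json
import os
import urllib.request

API_URL = "https://api.inherent.systems/api/v1/search"

def build_payload(query: str, limit: int = 10, min_score: float = 0.0) -> dict:
    """Assemble a search request body, enforcing the documented query limit."""
    if len(query) > 1000:
        raise ValueError("query exceeds the 1,000-character maximum")
    return {"query": query, "limit": limit, "min_score": min_score}

def search(payload: dict) -> dict:
    """POST the payload to the search endpoint and return the parsed JSON."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-API-Key": os.environ["INHERENT_API_KEY"],
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

For example, `search(build_payload("What was the revenue growth in Q1?", limit=5, min_score=0.3))` mirrors the curl call above.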

Request Parameters

Parameter      Type       Required  Default  Description
query          string     Yes       -        The search query. Maximum 1,000 characters.
limit          integer    No        10       Number of results to return. Range: 1–100.
min_score      float      No        0.0      Minimum relevance score. Range: 0.0–1.0. Results below this threshold are excluded.
document_ids   string[]   No        -        Restrict search to specific documents by ID.

Response

{
  "results": [
    {
      "chunk_id": "chk_x1y2z3",
      "document_id": "doc_abc123",
      "document_name": "q1-2026-revenue-report.pdf",
      "content": "Revenue grew 23% year-over-year in Q1 2026, driven primarily by expansion in the enterprise segment...",
      "score": 0.87,
      "metadata": {
        "department": "finance",
        "quarter": "Q1-2026"
      }
    },
    {
      "chunk_id": "chk_a4b5c6",
      "document_id": "doc_def456",
      "document_name": "board-deck-march-2026.pdf",
      "content": "First quarter results exceeded projections by 8%, with revenue reaching $4.2M...",
      "score": 0.72,
      "metadata": {}
    }
  ]
}
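Results arrive ranked by score, so consuming them client-side is straightforward. A minimal sketch (the `top_chunks` helper is hypothetical, not part of the API):

```python
def top_chunks(response: dict, max_chunks: int = 3) -> list:
    """Return the content of the highest-scoring chunks from a search response.

    The API already returns results ranked by score; the sort here is
    just a safeguard against relying on ordering implicitly.
    """
    ranked = sorted(response["results"], key=lambda r: r["score"], reverse=True)
    return [r["content"] for r in ranked[:max_chunks]]

# Abbreviated version of the response shown above:
response = {
    "results": [
        {"content": "Revenue grew 23% year-over-year in Q1 2026...", "score": 0.87},
        {"content": "First quarter results exceeded projections by 8%...", "score": 0.72},
    ]
}
best = top_chunks(response, max_chunks=1)
```

Joining the returned strings with a separator is a common way to build the context block for an LLM prompt.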

Tuning Search Results

Precision vs. Recall

  • Higher min_score (e.g., 0.7): Fewer results, but each one is highly relevant. Good for RAG pipelines where you need precise context.
  • Lower min_score (e.g., 0.1) or omitted: More results, including tangentially related passages. Good for exploratory search.
  • Smaller limit: Return only the top matches. Useful when feeding results into an LLM with a limited context window.
  • Larger limit: Cast a wider net. Useful when you need comprehensive coverage of a topic.
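The effect of `min_score` on the precision/recall trade-off is easy to see with the scores from the response above (client-side re-filtering shown here only for illustration; the API applies the threshold server-side):

```python
def filter_by_score(results: list, min_score: float) -> list:
    """Keep only results at or above the threshold, as the API does server-side."""
    return [r for r in results if r["score"] >= min_score]

results = [{"score": 0.87}, {"score": 0.72}, {"score": 0.31}, {"score": 0.12}]

precise = filter_by_score(results, 0.7)   # high threshold: 2 strong matches
broad = filter_by_score(results, 0.1)     # low threshold: all 4 results
```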

Filtering by Document

Use document_ids to restrict search to specific documents. This is useful when you know which documents are relevant and want to avoid noise from unrelated content.

curl -X POST https://api.inherent.systems/api/v1/search \
  -H "X-API-Key: $INHERENT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "quarterly revenue",
    "document_ids": ["doc_abc123", "doc_def456"],
    "limit": 10
  }'

Best Practices

  • Write natural language queries. "What was the revenue in Q1?" works better than "revenue Q1" because the embedding model captures more meaning from complete sentences.
  • Be specific. "How did the enterprise segment perform in Q1 2026?" returns more targeted results than "enterprise performance."
  • Start with a moderate min_score. A threshold of 0.3 is a reasonable starting point. Adjust up or down based on result quality.
  • Handle empty results gracefully. If results is empty, the query did not match any chunks above your min_score threshold. Try lowering the threshold or rephrasing the query.
  • Combine search with context retrieval. Use search to find the most relevant chunks, then use the context retrieval endpoints to get the full document context for your LLM.
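The "handle empty results gracefully" practice can be folded into a small retry wrapper. A sketch under the assumption that `do_search` is any callable `(query, min_score) -> list of results`, standing in for the actual API call:

```python
def search_with_fallback(do_search, query: str, min_score: float = 0.3,
                         floor: float = 0.0) -> list:
    """Search once at the preferred threshold; if nothing clears it,
    retry once at a relaxed floor rather than returning nothing."""
    results = do_search(query, min_score)
    if not results and min_score > floor:
        results = do_search(query, floor)
    return results
```

If the relaxed retry still returns nothing, rephrasing the query (see the best practices above) is the next step.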