Retrieval

Retrieval is how you search your knowledge base and get relevant context for your AI applications.

Basic Search

Perform a semantic search:

curl -X POST https://api.inherent.systems/v1/search \
  -H "Authorization: Bearer $INHERENT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "How do I authenticate API requests?",
    "limit": 5
  }'

Response:

{
  "chunks": [
    {
      "id": "chunk_abc123",
      "content": "Include your API key in the Authorization header...",
      "score": 0.94,
      "document_id": "doc_xyz789",
      "metadata": {
        "title": "Authentication Guide"
      }
    }
  ],
  "query_id": "qry_def456"
}

Search Parameters

Parameter	Type	Default	Description
`query`	string	required	Your search query
`limit`	integer	10	Max results (1-100)
`threshold`	float	0.0	Min similarity score (0-1)
`filter`	object	null	Metadata filters
`include_metadata`	boolean	true	Include metadata in results
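
As a rough illustration of the table above, the sketch below assembles a request body and checks the documented ranges on the client before anything is sent. The helper name `build_search_payload` and the `metadata_filter` argument are hypothetical and not part of any official client; how the API itself responds to out-of-range values is not covered here.

```python
from typing import Any, Optional


def build_search_payload(
    query: str,
    limit: int = 10,
    threshold: float = 0.0,
    metadata_filter: Optional[dict] = None,
    include_metadata: bool = True,
) -> dict[str, Any]:
    """Assemble a search request body using the documented defaults."""
    if not query:
        raise ValueError("query is required")
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")

    payload: dict[str, Any] = {
        "query": query,
        "limit": limit,
        "threshold": threshold,
        "include_metadata": include_metadata,
    }
    # Only send a filter when one was given; the documented default is null.
    if metadata_filter is not None:
        payload["filter"] = metadata_filter
    return payload
```

The returned dictionary can be passed as the JSON body of a POST to `/v1/search`, for example `build_search_payload("How do I authenticate API requests?", limit=5)`.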

Filtering by Metadata

Filter search results by metadata values:

{
  "query": "authentication",
  "filter": {
    "category": "documentation",
    "version": "2.0"
  }
}

Filter Operators

{
  "query": "API endpoints",
  "filter": {
    "category": {"$eq": "api"},
    "version": {"$gte": "2.0"},
    "tags": {"$in": ["api", "reference"]},
    "deprecated": {"$ne": true}
  }
}

Operator	Description
`$eq`	Equals
`$ne`	Not equals
`$gt`, `$gte`	Greater than (or equal)
`$lt`, `$lte`	Less than (or equal)
`$in`	Value in array
`$nin`	Value not in array
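
To make the operator semantics concrete, here is a small local matcher that evaluates a filter against one metadata dictionary. It is only a sketch of one plausible reading of the table, not the server's implementation. Three assumptions are built in: a bare value such as `"category": "documentation"` is treated as shorthand for `$eq`, a list-valued field such as `tags` satisfies `$in` when any element matches, and a missing field satisfies only `$ne` and `$nin`.

```python
import operator
from typing import Any

# Comparison operators from the table, mapped to Python equivalents.
_COMPARATORS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}


def _as_list(value: Any) -> list:
    return value if isinstance(value, list) else [value]


def matches(metadata: dict, filter_spec: dict) -> bool:
    """Return True if metadata satisfies every condition in filter_spec."""
    for field, condition in filter_spec.items():
        value = metadata.get(field)
        # Assumption: a bare value is shorthand for {"$eq": value}.
        if not isinstance(condition, dict):
            condition = {"$eq": condition}
        for op, expected in condition.items():
            if op == "$in":
                ok = any(v in expected for v in _as_list(value))
            elif op == "$nin":
                ok = all(v not in expected for v in _as_list(value))
            elif op in _COMPARATORS:
                if value is None:
                    # Assumption: a missing field only satisfies $ne.
                    ok = op == "$ne"
                else:
                    ok = _COMPARATORS[op](value, expected)
            else:
                raise ValueError(f"unknown operator: {op}")
            if not ok:
                return False
    return True
```

Note that this sketch compares strings the way Python does, so `"10.0"` sorts before `"2.0"`. How the API orders version strings under `$gte` is not stated here, so storing versions as numbers may be the safer choice if range filters matter.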

Hybrid Search

Combine semantic search with keyword matching:

{
  "query": "OAuth2 authentication flow",
  "hybrid": {
    "enabled": true,
    "alpha": 0.7
  }
}

The `alpha` parameter controls the balance:

1.0 = Pure semantic search
0.0 = Pure keyword search
0.7 = 70% semantic, 30% keyword (recommended)
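
One way to picture the weighting is as a linear blend of two scores. The function below is a sketch under that assumption; the exact fusion method the service uses (for example, whether scores are normalized first or ranks are fused instead) is not specified here, and `blend_scores` is a hypothetical name.

```python
def blend_scores(semantic: float, keyword: float, alpha: float = 0.7) -> float:
    """Weight a semantic score against a keyword score by alpha."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    # alpha = 1.0 keeps only the semantic score, alpha = 0.0 only the keyword score.
    return alpha * semantic + (1.0 - alpha) * keyword
```

With a semantic score of 0.9 and a keyword score of 0.5, the default `alpha` of 0.7 gives 0.7 × 0.9 + 0.3 × 0.5 = 0.78 under this reading.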

Getting Full Documents

Retrieve a complete document:

curl https://api.inherent.systems/v1/documents/doc_abc123 \
  -H "Authorization: Bearer $INHERENT_API_KEY"

Get all chunks from a document:

curl https://api.inherent.systems/v1/documents/doc_abc123/chunks \
  -H "Authorization: Bearer $INHERENT_API_KEY"

Context Window Management

When building prompts for LLMs, keep the retrieved context within a token budget so it fits the model's context window:

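import os

import requests
import tiktoken

# Base URL and API key as used in the curl examples above
base_url = "https://api.inherent.systems/v1"
api_key = os.environ["INHERENT_API_KEY"]

def get_context(query, max_tokens=4000):
    """Return as many retrieved chunks as fit within max_tokens."""
    response = requests.post(
        f"{base_url}/search",
        headers={"Authorization": f"Bearer {api_key}"},
        json={"query": query, "limit": 20},
        timeout=30,
    )
    response.raise_for_status()  # Fail loudly on HTTP errors

    chunks = response.json()["chunks"]
    context = []
    total_tokens = 0
    # cl100k_base is one tokenizer; pick the encoding that matches your model
    encoder = tiktoken.get_encoding("cl100k_base")

    for chunk in chunks:
        chunk_tokens = len(encoder.encode(chunk["content"]))
        # Stop at the first chunk that would exceed the budget
        if total_tokens + chunk_tokens > max_tokens:
            break
        context.append(chunk["content"])
        total_tokens += chunk_tokens

    return "\n\n".join(context)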

Query Logging

All queries are logged for audit and debugging:

curl https://api.inherent.systems/v1/queries \
  -H "Authorization: Bearer $INHERENT_API_KEY"

Each query log includes:

Query text
Results returned
Latency
Timestamp
User/API key used

Best Practices

Be specific - "How do I authenticate with OAuth2?" beats "authentication"
Use filters - Narrow results with metadata filters
Set thresholds - Use threshold: 0.7 to filter low-quality matches
Limit results - Don't retrieve more than you need
Cache wisely - Cache frequent queries at the application level
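
For the caching advice, a minimal in-memory sketch might look like the following. The `QueryCache` class and its `search_fn` argument are hypothetical: `search_fn` stands for whatever function in your application sends the payload to the search endpoint. The five-minute default TTL is an arbitrary choice, not a recommendation from the API.

```python
import json
import time
from typing import Any, Callable


class QueryCache:
    """Tiny in-memory TTL cache keyed on the full request payload."""

    def __init__(self, search_fn: Callable[[dict], Any], ttl_seconds: float = 300.0):
        self._search_fn = search_fn
        self._ttl = ttl_seconds
        self._entries: dict[str, tuple[float, Any]] = {}

    def search(self, payload: dict) -> Any:
        # sort_keys makes logically equal payloads share one cache key.
        key = json.dumps(payload, sort_keys=True)
        now = time.monotonic()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        # Cache miss or expired entry: call through and store the result.
        result = self._search_fn(payload)
        self._entries[key] = (now, result)
        return result
```

Keying on the whole payload rather than the query text alone means the same query with different filters or limits is cached separately. Entries are never evicted in this sketch, so a long-running service would want a size bound as well.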
