Uploading Documents

Upload files to your knowledge base and let the ingestion pipeline handle parsing, chunking, embedding, and indexing automatically.

Note: Uploading documents requires an API key with write permission. See Authentication for details.

How It Works

  1. You send a POST request with your file to /api/v1/documents.
  2. The API stores the file, creates a document record with status pending, and returns immediately.
  3. The ingestion pipeline processes the file asynchronously: parse the raw content, chunk it into passages, embed each chunk into a vector, and index everything into the search layer.
  4. When processing completes, the document status changes to processed.

Upload → S3 → Message Queue → Parse → Chunk → Embed → Index (PostgreSQL + Weaviate)

Supported File Types

Format        Extension  MIME Type
PDF           .pdf       application/pdf
Word (DOCX)   .docx      application/vnd.openxmlformats-officedocument.wordprocessingml.document
Excel (XLSX)  .xlsx      application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
Plain Text    .txt       text/plain
Markdown      .md        text/markdown
CSV           .csv       text/csv
JSON          .json      application/json

Maximum file size: 50 MB
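
A client-side check can catch unsupported types and oversized files before the request is sent. The following is a minimal sketch: the function name check_upload is hypothetical, it assumes the limit means 50 * 1024 * 1024 bytes (the docs do not say whether MB is binary or decimal), and it assumes the file extension is a reasonable proxy for what the server validates.

```python
from pathlib import Path

# Extension -> MIME type, taken from the supported file types listed above.
SUPPORTED_TYPES = {
    ".pdf": "application/pdf",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    ".txt": "text/plain",
    ".md": "text/markdown",
    ".csv": "text/csv",
    ".json": "application/json",
}

# Assumption: "50 MB" is interpreted as 50 * 1024 * 1024 bytes.
MAX_SIZE_BYTES = 50 * 1024 * 1024


def check_upload(name: str, size_bytes: int) -> str:
    """Return the MIME type for a file, or raise ValueError if it would likely be rejected."""
    ext = Path(name).suffix.lower()
    if ext not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported file type: {ext or '(no extension)'}")
    if size_bytes > MAX_SIZE_BYTES:
        raise ValueError(f"file is {size_bytes} bytes; limit is {MAX_SIZE_BYTES}")
    return SUPPORTED_TYPES[ext]
```

For example, check_upload("quarterly-report.pdf", 2048576) returns "application/pdf", while a .png file raises ValueError.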

Upload a Document

Send a multipart/form-data request with your file in the file field.

curl -X POST https://api.inherent.systems/api/v1/documents \
  -H "X-API-Key: $INHERENT_API_KEY" \
  -F "file=@./quarterly-report.pdf"

Response

{
  "id": "doc_abc123",
  "name": "quarterly-report.pdf",
  "status": "pending",
  "content_type": "application/pdf",
  "size_bytes": 2048576,
  "metadata": {},
  "created_at": "2026-04-03T10:30:00Z"
}
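
The same upload can be sketched in Python using only the standard library. The endpoint, the file form field, and the X-API-Key header come from the curl example above; the helper names build_multipart and upload_document are illustrative and not part of any official client.

```python
import json
import os
import urllib.request
import uuid
from typing import Optional, Tuple

API_URL = "https://api.inherent.systems/api/v1/documents"


def build_multipart(file_name: str, file_bytes: bytes, content_type: str,
                    boundary: Optional[str] = None) -> Tuple[bytes, str]:
    """Encode one file as a multipart/form-data body under the `file` field.

    Returns (body, Content-Type header value).
    Assumes file_name contains no double quotes or line breaks.
    """
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + file_bytes + tail, f"multipart/form-data; boundary={boundary}"


def upload_document(path: str, content_type: str, api_key: str) -> dict:
    """POST the file and return the parsed JSON document record."""
    with open(path, "rb") as f:
        body, header = build_multipart(os.path.basename(path), f.read(), content_type)
    req = urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={"X-API-Key": api_key, "Content-Type": header},
    )
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

A call such as upload_document("./quarterly-report.pdf", "application/pdf", os.environ["INHERENT_API_KEY"]) would be expected to return a record like the one shown above, with status set to pending.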

Adding Metadata

Pass an optional metadata field as a JSON string to attach key-value pairs to the document. Metadata is returned in search results and can help you organize documents.

curl -X POST https://api.inherent.systems/api/v1/documents \
  -H "X-API-Key: $INHERENT_API_KEY" \
  -F "file=@./quarterly-report.pdf" \
  -F 'metadata={"department": "finance", "quarter": "Q1-2026"}'
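
Because the metadata field is a JSON string rather than a nested form structure, it is easy to send a malformed value by hand. A small helper can serialize and sanity-check it first. This is a sketch: encode_metadata is a hypothetical name, and the string-key check is a client-side precaution, not a documented server rule.

```python
import json


def encode_metadata(metadata: dict) -> str:
    """Serialize metadata to the JSON string sent in the `metadata` form field."""
    if not isinstance(metadata, dict):
        raise TypeError("metadata must be a dict of key-value pairs")
    for key in metadata:
        if not isinstance(key, str):
            raise TypeError(f"metadata key {key!r} is not a string")
    # json.dumps raises TypeError for values that JSON cannot represent.
    return json.dumps(metadata)


form_value = encode_metadata({"department": "finance", "quarter": "Q1-2026"})
print(form_value)  # {"department": "finance", "quarter": "Q1-2026"}
```

The printed string matches the value passed to -F 'metadata=...' in the curl example.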

Checking Document Status

After uploading, poll GET /api/v1/documents and check your document's status field until it reaches processed or failed.

curl https://api.inherent.systems/api/v1/documents \
  -H "X-API-Key: $INHERENT_API_KEY"

Response

{
  "documents": [
    {
      "id": "doc_abc123",
      "name": "quarterly-report.pdf",
      "status": "processed",
      "chunk_count": 47,
      "content_type": "application/pdf",
      "size_bytes": 2048576,
      "metadata": {"department": "finance", "quarter": "Q1-2026"},
      "created_at": "2026-04-03T10:30:00Z",
      "processed_at": "2026-04-03T10:30:45Z"
    }
  ]
}
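
Since the endpoint returns a list, the caller has to pick out the document it uploaded. A minimal sketch of that lookup follows; find_status is a hypothetical helper, and it assumes the document appears in the documents array of a single response (the page does not say whether the list is paginated).

```python
import json
from typing import Optional


def find_status(list_response: dict, document_id: str) -> Optional[str]:
    """Return the status of `document_id` in a list response, or None if it is absent."""
    for doc in list_response.get("documents", []):
        if doc.get("id") == document_id:
            return doc.get("status")
    return None


# Sample body trimmed from the response shown above.
sample = json.loads(
    '{"documents": [{"id": "doc_abc123", "status": "processed", "chunk_count": 47}]}'
)
print(find_status(sample, "doc_abc123"))   # processed
print(find_status(sample, "doc_missing"))  # None
```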

Document Statuses

Status      Meaning
pending     Queued for processing
processing  Ingestion pipeline is actively working on it
processed   Ready to search and retrieve
failed      Processing failed; check the dashboard for details

Tip: A typical PDF processes in under 60 seconds. If your document stays in pending for more than a few minutes, check the dashboard for errors.
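
The statuses above suggest a simple polling loop: keep checking until the document reaches processed or failed, and give up after a few minutes. The sketch below assumes a caller-supplied get_status function that returns the current status string; the function name wait_until_done, the 300-second timeout, and the 5-second interval are illustrative choices, not documented values.

```python
import time
from typing import Callable, Optional

# The two statuses after which no further change is expected.
TERMINAL_STATUSES = {"processed", "failed"}


def wait_until_done(
    get_status: Callable[[], Optional[str]],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll `get_status` until it returns processed or failed, or raise TimeoutError."""
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout_s} s; check the dashboard")
        sleep(interval_s)
```

The sleep and clock parameters exist only so the loop can be exercised in tests without real delays; in normal use the defaults are enough.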

Best Practices

  • Use descriptive file names. The file name is stored as the document name and appears in search results. q1-2026-revenue-report.pdf is more useful than document.pdf.
  • Stick to supported formats. Unsupported file types are rejected at upload time with a 400 error.
  • Handle large files. Files close to the 50 MB limit may take longer to process. Consider splitting very large documents into logical sections before uploading.
  • Attach metadata. Adding metadata like department, source, or date makes it easier to filter and organize your knowledge base later.
  • Wait for processed before searching. Documents in pending or processing status are not yet available for search or chunk retrieval.
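
For the advice on splitting very large documents, one possible approach for Markdown sources is to cut at top-level headings so each part stays a logical section. This is only a sketch: split_markdown_sections is a hypothetical helper, it treats level-1 and level-2 headings as section boundaries by assumption, and it does not account for lines starting with "# " inside code blocks.

```python
import re
from typing import List


def split_markdown_sections(text: str) -> List[str]:
    """Split Markdown text into sections, each starting at a level-1 or level-2 heading."""
    # Split just before every line that begins with "# " or "## ".
    # The lookahead keeps the heading at the start of its section.
    parts = re.split(r"(?m)^(?=#{1,2} )", text)
    # Drop empty or whitespace-only pieces (for example, before a leading heading).
    return [p for p in parts if p.strip()]
```

Each returned section can then be written to its own file with a descriptive name and uploaded separately.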