API Reference

All request and response bodies are JSON. The base URL defaults to http://localhost:3000.

Namespaces

Every data operation is scoped to a namespace. Namespace names may contain only lowercase letters, digits, and hyphens, and must be no longer than 64 characters. Each namespace maps to an isolated S3 prefix, s3://bucket/namespace/.

Valid: my-project, embeddings-v2, prod-search.
Invalid: My_Project (uppercase and underscores), a-very-long-name-that-exceeds-the-sixty-four-character-limit-imposed-by-firn (longer than 64 characters).
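The naming rules above can be checked on the client before a request is sent. The following is a minimal sketch, not the server's implementation: the function name is hypothetical, and it assumes that an empty name is rejected and that hyphens are allowed in any position, neither of which this reference states.

```python
import re

# Assumed pattern: lowercase letters, digits, and hyphens, 1 to 64 characters.
_NAMESPACE_RE = re.compile(r"[a-z0-9-]{1,64}")


def is_valid_namespace(name: str) -> bool:
    """Return True if `name` satisfies the documented naming rules."""
    return _NAMESPACE_RE.fullmatch(name) is not None
```

With this sketch, is_valid_namespace("embeddings-v2") is True and is_valid_namespace("My_Project") is False.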

Endpoints

GET /health Liveness check

Returns 200 OK with body ok. Use this for load balancer health checks and container readiness probes.

Example

curl http://localhost:3000/health

Response

ok

GET /metrics Prometheus metrics

Returns all Prometheus metrics in text exposition format (text/plain; version=0.0.4). See the monitoring guide for the full metric list and PromQL examples.
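For quick checks without a Prometheus server, a counter can be read straight from the exposition text. This is a rough sketch rather than a complete parser for the format: the helper name and the namespace label in the sample are assumptions, and only the metric name firnflow_cache_misses_total comes from this reference.

```python
def read_metric_total(exposition: str, name: str) -> float:
    """Sum every sample of metric `name` across all label sets."""
    total = 0.0
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "{" in line:
            # Sample with labels: name{label="value"} 12
            metric = line[: line.index("{")]
            rest = line[line.rindex("}") + 1:]
        else:
            # Sample without labels: name 12
            metric, _, rest = line.partition(" ")
        if metric == name:
            total += float(rest.split()[0])
    return total


# Hypothetical sample output; the label name is an assumption.
SAMPLE = """\
# HELP firnflow_cache_misses_total Cache misses
# TYPE firnflow_cache_misses_total counter
firnflow_cache_misses_total{namespace="demo"} 2
firnflow_cache_misses_total{namespace="other"} 5
"""

misses = read_metric_total(SAMPLE, "firnflow_cache_misses_total")
```

In practice the exposition string would be the body of a GET /metrics response.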

Example

curl http://localhost:3000/metrics

POST /ns/{namespace}/upsert Insert or update vectors and text

Appends rows to the namespace's Lance table. The vector dimension is inferred from the first upsert and enforced on subsequent calls. After a successful write, all cached query results for this namespace are invalidated.

Request body

Field          Type       Required  Description
rows           array      Yes       List of rows to insert
rows[].id      u64        Yes       Unique identifier for the row
rows[].vector  float32[]  Yes       Dense vector (dimension must match the namespace)
rows[].text    string     No        Text payload for full-text search

Response (200)

Field     Type     Description
upserted  integer  Number of rows accepted

Example

curl -X POST http://localhost:3000/ns/demo/upsert \
  -H 'Content-Type: application/json' \
  -d '{
    "rows": [
      {"id": 1, "vector": [1.0, 0.0, 0.0, 0.0], "text": "hello world"},
      {"id": 2, "vector": [0.0, 1.0, 0.0, 0.0]}
    ]
  }'

Response

{"upserted": 2}
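The dimension rule (inferred from the first upsert, enforced afterwards) can be modelled in a few lines. This is a toy sketch under stated assumptions: the class and method names are hypothetical, state is kept in memory only, and the whole batch is rejected when any row mismatches, which this reference does not confirm.

```python
class DimensionTracker:
    """Toy model of per-namespace vector dimension inference."""

    def __init__(self) -> None:
        self._dims: dict[str, int] = {}  # namespace -> vector dimension

    def check(self, namespace: str, rows: list[dict]) -> int:
        """Return the number of accepted rows, or raise ValueError.

        A ValueError here corresponds to the documented 400 response.
        """
        expected = self._dims.get(namespace)
        for row in rows:
            dim = len(row["vector"])
            if expected is None:
                expected = dim  # the first vector seen fixes the dimension
            elif dim != expected:
                raise ValueError(
                    f"dimension mismatch: expected {expected}, got {dim}"
                )
        if expected is not None:
            self._dims[namespace] = expected  # record only after all rows pass
        return len(rows)
```

After a first call with the two 4-dimensional rows from the example, a later call for the same namespace with a 2-dimensional vector raises ValueError.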

Errors

  • 400 - invalid namespace name or vector dimension mismatch
  • 500 - S3 write failure (the client receives a generic error; full details are logged server-side)

POST /ns/{namespace}/query Vector, full-text, or hybrid search

Queries the namespace through the cache-aside path. On a cache hit, the result is returned from RAM or NVMe with zero S3 access. On a miss, the query runs against S3 via LanceDB and the result is cached.
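The cache-aside flow described above can be sketched as follows. This is an illustration only: the class is hypothetical, it uses a single in-memory dictionary in place of the RAM and NVMe tiers, and the backend callable stands in for the S3/LanceDB query.

```python
import json
from typing import Callable


class QueryCache:
    """Minimal cache-aside wrapper around a slow query backend."""

    def __init__(self, backend: Callable[[str, dict], list]) -> None:
        self._backend = backend  # stand-in for the S3/LanceDB query path
        self._entries: dict[tuple[str, str], list] = {}
        self.hits = 0
        self.misses = 0

    def query(self, namespace: str, params: dict) -> list:
        # Canonical JSON so that key order does not change the cache key.
        key = (namespace, json.dumps(params, sort_keys=True))
        if key in self._entries:
            self.hits += 1
            return self._entries[key]  # hit: the backend is not touched
        self.misses += 1
        result = self._backend(namespace, params)  # miss: run the real query
        self._entries[key] = result
        return result

    def invalidate(self, namespace: str) -> None:
        """Drop every cached result for one namespace (e.g. after a write)."""
        self._entries = {
            k: v for k, v in self._entries.items() if k[0] != namespace
        }
```

Repeating an identical query increments hits without calling the backend; calling invalidate for the namespace forces the next identical query back to the backend.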

Query modes

The query mode is determined by which fields are present:

Mode    vector  text    Description
Vector  set     absent  Nearest-neighbour search (L2 distance)
FTS     absent  set     BM25 full-text search (requires FTS index)
Hybrid  set     set     Both, fused via Reciprocal Rank Fusion (RRF)
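For readers unfamiliar with Reciprocal Rank Fusion, the sketch below shows the general technique: each result earns 1 / (c + rank) from every ranked list it appears in, and the sums are sorted in descending order. The constant c = 60 is the value commonly used in the literature; the constant and tie-breaking the server applies are not stated in this reference, so treat both as assumptions.

```python
def rrf_fuse(rankings: list[list[int]], c: int = 60) -> list[tuple[int, float]]:
    """Fuse ranked lists of row ids into one list of (id, score) pairs."""
    scores: dict[int, float] = {}
    for ranking in rankings:
        # Ranks are 1-based: the top result of each list contributes 1 / (c + 1).
        for rank, row_id in enumerate(ranking, start=1):
            scores[row_id] = scores.get(row_id, 0.0) + 1.0 / (c + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


vector_hits = [1, 2, 3]  # ids ordered by vector distance
text_hits = [3, 1, 4]    # ids ordered by BM25 score
fused = rrf_fuse([vector_hits, text_hits])
```

Rows 1 and 3 appear in both lists, so they outrank rows 2 and 4, which appear in only one.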

Request body

Field    Type       Required  Description
vector   float32[]  No*       Query vector for nearest-neighbour search
k        integer    Yes       Number of results to return
nprobes  integer    No        IVF partitions to probe (default 20). Higher values improve recall at the cost of latency.
text     string     No*       Text query for full-text or hybrid search

* At least one of vector or text must be present.
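The mode selection rule can be expressed as a small function. This is a sketch of the documented behaviour, not the server's code; the function name is hypothetical, and treating a JSON null the same as an absent field is an assumption.

```python
def query_mode(body: dict) -> str:
    """Return "vector", "fts", or "hybrid" for a query request body."""
    has_vector = body.get("vector") is not None  # null treated as absent (assumption)
    has_text = body.get("text") is not None
    if has_vector and has_text:
        return "hybrid"
    if has_vector:
        return "vector"
    if has_text:
        return "fts"
    # Neither field present: the documented 400 "empty query" case.
    raise ValueError("empty query: at least one of vector or text must be present")
```

For example, query_mode({"text": "search terms", "k": 10}) returns "fts".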

Response (200)

Field             Type       Description
query_id          string     Deterministic hash of the query parameters (the cache key)
results           array      Ordered list of matching rows
results[].id      u64        Row identifier
results[].score   float32    Distance (vector), BM25 score (FTS), or relevance score (hybrid)
results[].vector  float32[]  The stored vector
results[].text    string?    The stored text (null if none)
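To illustrate what "deterministic hash of the query parameters" means in practice, the sketch below derives a stable identifier from canonical JSON. The hash function (SHA-256), the inclusion of the namespace, and the serialisation are all assumptions; the server's actual scheme is not documented here, so values from this sketch will not match real query_id values.

```python
import hashlib
import json


def query_id(namespace: str, params: dict) -> str:
    """Hypothetical cache key: SHA-256 over a canonical JSON encoding."""
    canonical = json.dumps(
        {"namespace": namespace, "params": params},
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no whitespace variation
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

The useful property is that two requests with the same parameters map to the same key regardless of field order, while any change to k, nprobes, vector, or text yields a different key.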

Examples

Vector search:

curl -X POST http://localhost:3000/ns/demo/query \
  -H 'Content-Type: application/json' \
  -d '{"vector": [1.0, 0.0, 0.0, 0.0], "k": 5}'

Full-text search:

curl -X POST http://localhost:3000/ns/demo/query \
  -H 'Content-Type: application/json' \
  -d '{"text": "search query terms", "k": 10}'

Hybrid search:

curl -X POST http://localhost:3000/ns/demo/query \
  -H 'Content-Type: application/json' \
  -d '{
    "vector": [1.0, 0.0, 0.0, 0.0],
    "text": "search terms",
    "k": 10,
    "nprobes": 40
  }'

DELETE /ns/{namespace} Delete a namespace and all its data

Removes every S3 object under the namespace's prefix and evicts all cached query results. This is irreversible.

Response (200)

Field            Type     Description
objects_deleted  integer  Number of S3 objects removed

Example

curl -X DELETE http://localhost:3000/ns/demo

Response

{"objects_deleted": 12}

POST /ns/{namespace}/warmup Pre-warm the cache (async)

Accepts a list of queries and runs them in a background task to populate the cache. Returns 202 Accepted immediately. Useful for warming the cache after a deployment or before expected traffic.

Request body

Field    Type            Required  Description
queries  QueryRequest[]  Yes       List of query objects (same schema as /query)

Response (202)

Field   Type     Description
queued  integer  Number of queries submitted for background execution

Example

curl -X POST http://localhost:3000/ns/demo/warmup \
  -H 'Content-Type: application/json' \
  -d '{
    "queries": [
      {"vector": [1.0, 0.0, 0.0, 0.0], "k": 5},
      {"vector": [0.0, 1.0, 0.0, 0.0], "k": 5}
    ]
  }'

Response

{"queued": 2}

Monitoring warmup progress
Watch the firnflow_cache_misses_total metric to track how many warmup queries have completed. Failures inside the background task are logged server-side but do not affect the HTTP response.

POST /ns/{namespace}/index Build an ANN vector index (async)

Builds an IVF_PQ (Inverted File with Product Quantisation) index on the namespace's vector column. Returns 202 Accepted immediately; the build runs in the background. An index substantially reduces cold query latency (a 25x speedup on AWS S3).

Request body

Field            Type    Required  Default          Description
kind             string  Yes       -                Index type. Only "ivf_pq" is supported.
num_partitions   u32     No        sqrt(row_count)  Number of IVF partitions
num_sub_vectors  u32     No        dim / 16         Number of PQ sub-vectors (must divide dimension evenly)
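The defaults in the table can be worked out ahead of time to see what an index build will use. The sketch below is an approximation under stated assumptions: the function name is hypothetical, and rounding down for both sqrt(row_count) and dim / 16 is assumed because this reference does not specify the rounding.

```python
import math


def ivf_pq_defaults(row_count: int, dim: int) -> dict:
    """Compute the documented default parameters for an ivf_pq index."""
    num_partitions = max(1, math.isqrt(row_count))  # floor(sqrt(row_count)), assumed
    num_sub_vectors = max(1, dim // 16)             # floor(dim / 16), assumed
    if dim % num_sub_vectors != 0:
        # The table requires num_sub_vectors to divide the dimension evenly.
        raise ValueError(
            f"default num_sub_vectors={num_sub_vectors} does not divide dim={dim}; "
            "pass an explicit value"
        )
    return {
        "kind": "ivf_pq",
        "num_partitions": num_partitions,
        "num_sub_vectors": num_sub_vectors,
    }


params = ivf_pq_defaults(row_count=100_000, dim=1536)
```

For 100k vectors at dim=1536 this gives 316 partitions and 96 sub-vectors. A dimension such as 100 has no evenly dividing default under these assumptions (100 // 16 = 6), so an explicit num_sub_vectors would be needed.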

Response (202)

{"status": "index build queued"}

Example

curl -X POST http://localhost:3000/ns/demo/index \
  -H 'Content-Type: application/json' \
  -d '{"kind": "ivf_pq"}'

Index build time
Index builds can take minutes for large datasets (147 s for 100k vectors at dim=1536 on MinIO). Monitor firnflow_index_build_duration_seconds to track completion. Queries against the namespace continue to work during the build, falling back to a linear scan.

POST /ns/{namespace}/fts-index Build a BM25 full-text search index (async)

Builds a BM25 full-text search index on the namespace's text column. This index is required before FTS or hybrid queries will return results. Returns 202 Accepted.

Request body

No request body required.

Response (202)

{"status": "fts index build queued"}

Example

curl -X POST http://localhost:3000/ns/demo/fts-index

Prerequisite
At least one row must have a non-null text field before building an FTS index.

POST /ns/{namespace}/compact Compact data files (async)

Merges small Lance data fragments into fewer, larger files to reduce S3 round-trips on cold queries. Returns 202 Accepted. Compaction also invalidates the cache for this namespace, since file offsets change after compaction.

Request body

No request body required.

Response (202)

{"status": "compaction queued"}

Example

curl -X POST http://localhost:3000/ns/demo/compact

When to compact
Compact after many small upsert batches have created fragment sprawl. The server logs fragments_removed and fragments_added when the compaction completes.

Error responses

All errors return a JSON body with an error field.

Status  Cause                                                                            Example
400     Invalid namespace name, dimension mismatch, empty query, unsupported index kind  {"error": "invalid namespace: must be lowercase alphanumeric and hyphens, max 64 chars"}
500     S3 connectivity, cache failure, or internal error                                {"error": "internal error"}

On 500 errors, the full error details are logged server-side via tracing::error! but scrubbed from the client response to prevent leaking internal state.
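A client can turn the documented error body into an exception with a few lines of code. The following is a minimal sketch: ApiError and raise_for_error are hypothetical names, and treating 5xx responses as retryable is a client-side policy assumed here, not something this reference prescribes.

```python
import json


class ApiError(Exception):
    """Error response from the API, carrying the status and message."""

    def __init__(self, status: int, message: str) -> None:
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message

    @property
    def retryable(self) -> bool:
        # Assumed policy: server-side failures may succeed on retry,
        # client errors (bad namespace, dimension mismatch) will not.
        return self.status >= 500


def raise_for_error(status: int, body: str) -> None:
    """Raise ApiError for 4xx/5xx responses; return None otherwise."""
    if status < 400:
        return
    try:
        message = json.loads(body)["error"]
    except (ValueError, KeyError, TypeError):
        message = body  # body was not the documented JSON shape
    raise ApiError(status, message)
```

Because 500 responses are scrubbed to {"error": "internal error"}, the message is only useful for 400 responses; for 500 responses the server logs are the place to look.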