API Reference
All request and response bodies are JSON. The base URL defaults to http://localhost:3000.
Namespaces
Every data operation is scoped to a namespace. Namespace names may contain only lowercase letters, digits, and hyphens, and must be no longer than 64 characters. Each namespace maps to an isolated S3 prefix, s3://bucket/namespace/.
Valid: my-project, embeddings-v2, prod-search.
Invalid: My_Project (uppercase and underscores), a-very-long-name-that-exceeds-the-sixty-four-character-limit-imposed-by-firn (longer than 64 characters).
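The naming rule above can be checked client-side before a request is sent. The following is a minimal sketch, assuming the rule is exactly "lowercase letters, digits, and hyphens, 1 to 64 characters"; the helper name `is_valid_namespace` is hypothetical, and whether the server also rejects leading or trailing hyphens is not stated here.

```python
import re

# Assumed pattern: lowercase letters, digits, and hyphens, 1 to 64 characters.
# The server may apply stricter rules (for example on leading hyphens).
_NAMESPACE_RE = re.compile(r"[a-z0-9-]{1,64}")


def is_valid_namespace(name: str) -> bool:
    """Return True if `name` satisfies the documented namespace rule."""
    return _NAMESPACE_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["my-project", "embeddings-v2", "My_Project", "x" * 65]:
        print(candidate[:24], is_valid_namespace(candidate))
```

A check like this only saves a round-trip; the server's 400 response remains the authority on what is accepted.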
Endpoints
GET /health
Returns 200 OK with body ok. Use this for load balancer health checks and container readiness probes.
Example
curl http://localhost:3000/health
Response
ok
GET /metrics
Returns all Prometheus metrics in text exposition format (text/plain; version=0.0.4). See the monitoring guide for the full metric list and PromQL examples.
Example
curl http://localhost:3000/metrics
POST /ns/{namespace}/upsert
Appends rows to the namespace's Lance table. The vector dimension is inferred from the first upsert and enforced on subsequent calls. After a successful write, all cached query results for this namespace are invalidated.
Request body
| Field | Type | Required | Description |
|---|---|---|---|
| rows | array | Yes | List of rows to insert |
| rows[].id | u64 | Yes | Unique identifier for the row |
| rows[].vector | float32[] | Yes | Dense vector (dimension must match the namespace) |
| rows[].text | string | No | Text payload for full-text search |
Response (200)
| Field | Type | Description |
|---|---|---|
| upserted | integer | Number of rows accepted |
Example
curl -X POST http://localhost:3000/ns/demo/upsert \
-H 'Content-Type: application/json' \
-d '{
"rows": [
{"id": 1, "vector": [1.0, 0.0, 0.0, 0.0], "text": "hello world"},
{"id": 2, "vector": [0.0, 1.0, 0.0, 0.0]}
]
}'
{"upserted": 2}
Errors
400 - invalid namespace name or vector dimension mismatch
500 - S3 write failure (client gets generic error; full details logged server-side)
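Because a dimension mismatch is rejected with a 400, it can help to validate a batch before sending it. The sketch below builds an upsert body from the documented fields only. The helper name `build_upsert_body` and its `expected_dim` parameter are hypothetical, and refusing an empty batch is a local choice, not documented server behaviour.

```python
import json


def build_upsert_body(rows, expected_dim=None):
    """Validate rows against the documented schema and return a JSON string.

    If `expected_dim` is None, the first row's vector length is used,
    mirroring the "inferred from the first upsert" rule.
    """
    if not rows:
        # Local choice: this sketch refuses to send an empty batch.
        raise ValueError("rows must not be empty")
    dim = expected_dim if expected_dim is not None else len(rows[0]["vector"])
    cleaned = []
    for row in rows:
        row_id = row["id"]
        if isinstance(row_id, bool) or not isinstance(row_id, int) or not 0 <= row_id < 2**64:
            raise ValueError(f"id {row_id!r} does not fit in a u64")
        if len(row["vector"]) != dim:
            raise ValueError(
                f"row {row_id}: expected {dim} dimensions, got {len(row['vector'])}"
            )
        item = {"id": row_id, "vector": [float(x) for x in row["vector"]]}
        if row.get("text") is not None:
            item["text"] = row["text"]  # text is optional, so it is omitted when absent
        cleaned.append(item)
    return json.dumps({"rows": cleaned})
```

The returned string can be passed as the request body of the upsert call with a Content-Type of application/json.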
POST /ns/{namespace}/query
Queries the namespace through the cache-aside path. On a cache hit, the result is returned from RAM or NVMe with zero S3 access. On a miss, the query runs against S3 via LanceDB and the result is cached.
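The cache-aside flow can be illustrated with a toy model. This is a sketch under stated assumptions, not the server's implementation: the class `TieredCache`, the promotion of NVMe hits into RAM, and the `(namespace, query_id)` key shape are all hypothetical, and plain dictionaries stand in for the RAM and NVMe tiers.

```python
from typing import Callable, Dict, Tuple

Key = Tuple[str, str]  # (namespace, query_id); an assumed key shape


class TieredCache:
    """Toy cache-aside lookup: RAM first, then NVMe, then the backend."""

    def __init__(self, backend: Callable[[Key], object]):
        self.ram: Dict[Key, object] = {}    # stands in for the in-memory tier
        self.nvme: Dict[Key, object] = {}   # stands in for the on-disk tier
        self.backend = backend              # stands in for the S3/LanceDB query

    def get(self, key: Key) -> Tuple[object, str]:
        """Return (result, source) where source is "ram", "nvme", or "miss"."""
        if key in self.ram:
            return self.ram[key], "ram"
        if key in self.nvme:
            value = self.nvme[key]
            self.ram[key] = value  # assumed policy: promote to the faster tier
            return value, "nvme"
        value = self.backend(key)  # only a miss touches the backend
        self.ram[key] = value
        self.nvme[key] = value
        return value, "miss"

    def invalidate(self, namespace: str) -> None:
        """Drop every cached result for one namespace, as a write would."""
        for tier in (self.ram, self.nvme):
            for key in [k for k in tier if k[0] == namespace]:
                del tier[key]
```

In this model a second identical query never reaches the backend until `invalidate` is called for its namespace, which matches the behaviour described for upserts.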
Query modes
The query mode is determined by which fields are present:
| Mode | vector | text | Description |
|---|---|---|---|
| Vector | set | absent | Nearest-neighbour search (L2 distance) |
| FTS | absent | set | BM25 full-text search (requires FTS index) |
| Hybrid | set | set | Both, fused via Reciprocal Rank Fusion (RRF) |
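The mode table translates directly into a small dispatch function. This sketch assumes that a field set to null counts as absent; how the server treats an empty string or empty list is not specified here. The function name `query_mode` is hypothetical.

```python
def query_mode(body: dict) -> str:
    """Return "vector", "fts", or "hybrid" based on which fields are present."""
    has_vector = body.get("vector") is not None  # assumption: null means absent
    has_text = body.get("text") is not None
    if has_vector and has_text:
        return "hybrid"
    if has_vector:
        return "vector"
    if has_text:
        return "fts"
    # Neither field is present: this is the "empty query" case.
    raise ValueError("empty query: at least one of vector or text must be present")
```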
Request body
| Field | Type | Required | Description |
|---|---|---|---|
| vector | float32[] | No* | Query vector for nearest-neighbour search |
| k | integer | Yes | Number of results to return |
| nprobes | integer | No | IVF partitions to probe (default 20). Higher values trade latency for recall. |
| text | string | No* | Text query for full-text or hybrid search |
* At least one of vector or text must be present.
Response (200)
| Field | Type | Description |
|---|---|---|
| query_id | string | Deterministic hash of the query parameters (the cache key) |
| results | array | Ordered list of matching rows |
| results[].id | u64 | Row identifier |
| results[].score | float32 | Distance (vector), BM25 score (FTS), or relevance score (hybrid) |
| results[].vector | float32[] | The stored vector |
| results[].text | string? | The stored text (null if none) |
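The query_id field is described as a deterministic hash of the query parameters. The hashing scheme itself is not specified in this reference, so the sketch below only illustrates the idea: SHA-256 over canonical JSON is an assumption, and the function name `sketch_query_id` is hypothetical. Values it produces should not be expected to match the server's.

```python
import hashlib
import json


def sketch_query_id(namespace: str, query: dict) -> str:
    """Derive a stable key from a namespace and query parameters.

    Assumption: canonical JSON (sorted keys, no extra whitespace) hashed with
    SHA-256. The server's real scheme may differ.
    """
    canonical = json.dumps(
        {"namespace": namespace, "query": query},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

The point of canonicalising first is that two requests with the same parameters in a different field order map to the same cache entry.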
Examples
Vector search:
curl -X POST http://localhost:3000/ns/demo/query \
-H 'Content-Type: application/json' \
-d '{"vector": [1.0, 0.0, 0.0, 0.0], "k": 5}'
Full-text search:
curl -X POST http://localhost:3000/ns/demo/query \
-H 'Content-Type: application/json' \
-d '{"text": "search query terms", "k": 10}'
Hybrid search:
curl -X POST http://localhost:3000/ns/demo/query \
-H 'Content-Type: application/json' \
-d '{
"vector": [1.0, 0.0, 0.0, 0.0],
"text": "search terms",
"k": 10,
"nprobes": 40
}'
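Hybrid mode fuses the vector and full-text rankings with Reciprocal Rank Fusion. As a rough illustration of the technique, each row scores the sum of 1 / (c + rank) over the lists it appears in. The constant c = 60 is the value commonly used in the RRF literature and is an assumption here, as is the tie-break by row id; the function name `rrf_fuse` is hypothetical.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=10, rrf_k=60):
    """Fuse ranked lists of row ids with Reciprocal Rank Fusion.

    `rankings` is a list of lists, each ordered best-first.
    Returns up to `k` (row_id, score) pairs, highest score first.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, row_id in enumerate(ranking, start=1):
            scores[row_id] += 1.0 / (rrf_k + rank)
    # Sort by descending score; ties fall back to ascending id (assumed rule).
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:k]


if __name__ == "__main__":
    vector_hits = [1, 2, 3]  # ids ordered by distance
    text_hits = [3, 1, 4]    # ids ordered by BM25 score
    print(rrf_fuse([vector_hits, text_hits]))
```

Because RRF uses only ranks, it does not need the L2 distances and BM25 scores to be on comparable scales.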
DELETE /ns/{namespace}
Removes every S3 object under the namespace's prefix and evicts all cached query results. This is irreversible.
Response (200)
| Field | Type | Description |
|---|---|---|
| objects_deleted | integer | Number of S3 objects removed |
Example
curl -X DELETE http://localhost:3000/ns/demo
{"objects_deleted": 12}
POST /ns/{namespace}/warmup
Accepts a list of queries and runs them in a background task to populate the cache. Returns 202 Accepted immediately. Useful for warming the cache after a deployment or before expected traffic.
Request body
| Field | Type | Required | Description |
|---|---|---|---|
| queries | QueryRequest[] | Yes | List of query objects (same schema as /query) |
Response (202)
| Field | Type | Description |
|---|---|---|
| queued | integer | Number of queries submitted for background execution |
Example
curl -X POST http://localhost:3000/ns/demo/warmup \
-H 'Content-Type: application/json' \
-d '{
"queries": [
{"vector": [1.0, 0.0, 0.0, 0.0], "k": 5},
{"vector": [0.0, 1.0, 0.0, 0.0], "k": 5}
]
}'
{"queued": 2}
Monitor the firnflow_cache_misses_total metric to track how many warmup queries have completed. Failures inside the background task are logged server-side but do not affect the HTTP response.
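Since warmup returns before the queries run, progress has to be read from the metrics output. The sketch below sums every sample of one counter in Prometheus text exposition output; comparing the value before and after a warmup gives a rough count of queries that ran as misses. The function name `read_counter` is hypothetical, and the `namespace` label in the usage example is invented for illustration, since the metric's actual labels are not listed here.

```python
def read_counter(exposition_text: str, metric_name: str) -> float:
    """Sum all samples of `metric_name` in Prometheus text exposition output."""
    total = 0.0
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # A sample is `name{labels} value [timestamp]` or `name value [timestamp]`.
        name_part, _, rest = line.partition("{")
        if rest:
            name = name_part
            _, _, value_part = rest.rpartition("}")
        else:
            name, _, value_part = line.partition(" ")
        if name.strip() != metric_name:
            continue
        total += float(value_part.split()[0])
    return total


if __name__ == "__main__":
    # Hypothetical scrape output; the label name is an assumption.
    sample = (
        "# TYPE firnflow_cache_misses_total counter\n"
        'firnflow_cache_misses_total{namespace="demo"} 3\n'
        'firnflow_cache_misses_total{namespace="other"} 2\n'
    )
    print(read_counter(sample, "firnflow_cache_misses_total"))
```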
POST /ns/{namespace}/index
Builds an IVF_PQ (Inverted File with Product Quantisation) index on the namespace's vector column. Returns 202 Accepted and builds in the background. Building an index dramatically reduces cold query latency (25x speedup on AWS S3).
Request body
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| kind | string | Yes | - | Index type. Only "ivf_pq" is supported. |
| num_partitions | u32 | No | sqrt(row_count) | Number of IVF partitions |
| num_sub_vectors | u32 | No | dim / 16 | Number of PQ sub-vectors (must divide dimension evenly) |
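The defaults in the table can be worked through for a concrete case. This sketch computes them for a given row count and dimension and checks the divisibility constraint. Rounding down and clamping to at least 1 are assumptions, because the reference does not say how fractional results are handled; the function name `default_index_params` is hypothetical.

```python
import math


def default_index_params(row_count: int, dim: int) -> dict:
    """Compute the documented defaults for an ivf_pq index request body."""
    # sqrt(row_count), rounded down (assumed) and never below 1 (assumed).
    num_partitions = max(1, math.isqrt(row_count))
    # dim / 16, rounded down (assumed) and never below 1 (assumed).
    num_sub_vectors = max(1, dim // 16)
    if dim % num_sub_vectors != 0:
        raise ValueError(
            f"num_sub_vectors={num_sub_vectors} does not divide dimension {dim} evenly"
        )
    return {
        "kind": "ivf_pq",
        "num_partitions": num_partitions,
        "num_sub_vectors": num_sub_vectors,
    }
```

For 10,000 rows of 128-dimensional vectors this yields 100 partitions and 8 sub-vectors. A dimension such as 100 fails the check under these rounding assumptions, in which case num_sub_vectors would need to be set explicitly.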
Response (202)
{"status": "index build queued"}
Example
curl -X POST http://localhost:3000/ns/demo/index \
-H 'Content-Type: application/json' \
-d '{"kind": "ivf_pq"}'
Monitor firnflow_index_build_duration_seconds to track completion. Queries against the namespace continue to work during the build using a linear scan.
POST /ns/{namespace}/fts-index
Builds a BM25 full-text search index on the namespace's text column. Required before FTS or hybrid queries will return results. Returns 202 Accepted.
Request body
No request body required.
Response (202)
{"status": "fts index build queued"}
Example
curl -X POST http://localhost:3000/ns/demo/fts-index
Upsert rows that include a text field before building an FTS index.
POST /ns/{namespace}/compact
Merges small Lance data fragments into fewer, larger files to reduce S3 round-trips on cold queries. Returns 202 Accepted. Also invalidates the cache for this namespace, since file offsets change after compaction.
Request body
No request body required.
Response (202)
{"status": "compaction queued"}
Example
curl -X POST http://localhost:3000/ns/demo/compact
The server reports fragments_removed and fragments_added when the compaction completes.
Error responses
All errors return a JSON body with an error field.
| Status | Cause | Example |
|---|---|---|
| 400 | Invalid namespace name, dimension mismatch, empty query, unsupported index kind | {"error": "invalid namespace: must be lowercase alphanumeric and hyphens, max 64 chars"} |
| 500 | S3 connectivity, cache failure, or internal error | {"error": "internal error"} |
On 500 errors, the full error details are logged server-side via tracing::error! but scrubbed from the client response to prevent leaking internal state.
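A client can rely on the error field when turning a failed response into a log line or exception message. The following is a minimal sketch that handles only the two statuses documented above; the function name `describe_error` and the wording of the hints are hypothetical.

```python
import json


def describe_error(status: int, body_text: str) -> str:
    """Turn an error response into a one-line message."""
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError:
        payload = None
    # The documented shape is {"error": "..."}; fall back to the raw body otherwise.
    if isinstance(payload, dict) and isinstance(payload.get("error"), str):
        detail = payload["error"]
    else:
        detail = body_text.strip() or "<empty body>"
    if status == 400:
        hint = "fix the request before retrying"
    elif status == 500:
        hint = "details are only in the server logs"
    else:
        hint = "status not covered by this reference"
    return f"HTTP {status}: {detail} ({hint})"
```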