Recall API
Semantic search endpoint. Find the most relevant memories for a given query.
Recall
POST /v1/recall
Semantic search across your memories. Returns the most relevant pieces of knowledge for a given query, ranked by relevance.
Request body
{
  "query": "how does authentication work?",
  "limit": 5,
  "hint": "debugging a token refresh issue",
  "hub_id": "hub_abc123",
  "project_context": {
    "repo": "github.com/org/app",
    "branch": "fix/auth-refresh"
  }
}

| Field | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | Search query (semantic) |
| limit | number | No | Max results (default: 5, max: 20) |
| hint | string | No | Additional context for better ranking |
| hub_id | string | No | Scope to a specific hub |
| project_context | object | No | Repo/project metadata for context-aware ranking |
| threshold | number | No | Minimum relevance score (0–1, default: 0.3) |
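The field constraints in the table can be checked on the client before a request is sent. The following is a minimal Python sketch, not an official client. The base URL, the bearer-token `Authorization` header, the lower bound of 1 for `limit`, and the non-empty check on `query` are assumptions; the source documents only the path, the field names, and the stated maxima and defaults. The names `build_recall_body` and `recall` are hypothetical.

```python
import json
import urllib.request

# Assumption: placeholder host. The source documents only the path /v1/recall.
BASE_URL = "https://api.example.com"


def build_recall_body(query, limit=None, threshold=None, hint=None,
                      hub_id=None, project_context=None):
    """Validate the documented fields and return the JSON-ready request body.

    Optional fields left as None are omitted, so the server applies its own
    defaults (limit 5, threshold 0.3 according to the table above).
    """
    if not isinstance(query, str) or not query.strip():
        # Assumption: an empty query is treated as invalid.
        raise ValueError("query is required and must be a non-empty string")
    if limit is not None and not 1 <= limit <= 20:
        # The upper bound of 20 is documented; the lower bound of 1 is assumed.
        raise ValueError("limit must be between 1 and 20")
    if threshold is not None and not 0 <= threshold <= 1:
        raise ValueError("threshold must be between 0 and 1")
    candidates = {
        "query": query,
        "limit": limit,
        "threshold": threshold,
        "hint": hint,
        "hub_id": hub_id,
        "project_context": project_context,
    }
    return {key: value for key, value in candidates.items() if value is not None}


def recall(api_key, query, **options):
    """POST the body to /v1/recall and return the parsed JSON response."""
    body = build_recall_body(query, **options)
    request = urllib.request.Request(
        f"{BASE_URL}/v1/recall",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; the source does not describe auth.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping validation in `build_recall_body` separate from the network call means out-of-range values such as `limit=21` fail locally with a `ValueError` instead of costing a round trip.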
Response
{
  "data": {
    "results": [
      {
        "memory_id": "mem_a1b2c3d4",
        "title": "Auth System Design",
        "content": "The auth system uses JWT with short-lived access tokens (1h) and refresh tokens (30d)...",
        "summary": "JWT-based authentication with token rotation",
        "category": "core",
        "relevance": 0.94,
        "chunk_index": 2,
        "created_at": "2025-01-15T10:30:00Z"
      },
      {
        "memory_id": "mem_e5f6g7h8",
        "title": "Login Flow Documentation",
        "content": "Users authenticate via POST /api/auth/login...",
        "summary": "Login endpoint and token generation flow",
        "category": "reference",
        "relevance": 0.87,
        "chunk_index": 0,
        "created_at": "2025-01-10T09:15:00Z"
      }
    ],
    "query_embedding_ms": 48,
    "search_ms": 22,
    "rerank_ms": 195
  }
}

Response fields
| Field | Description |
|---|---|
| memory_id | ID of the parent memory |
| title | Memory title |
| content | The relevant chunk content |
| summary | AI-generated summary |
| category | Memory category |
| relevance | Relevance score (0–1, higher is better) |
| chunk_index | Which chunk of the parent memory matched |
| query_embedding_ms | Time to embed the query |
| search_ms | Time for vector search |
| rerank_ms | Time for reranking |
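Because each result describes a matching chunk rather than a whole memory, a client may want typed records and one entry per parent memory. The sketch below assumes the response shape shown in the example above. `RecallResult`, `parse_results`, and `best_chunk_per_memory` are hypothetical names, and the source does not say whether several chunks of one memory can appear in the same result list, so the deduplication step is a precaution rather than a documented requirement.

```python
from dataclasses import dataclass, fields


@dataclass
class RecallResult:
    """One entry of data.results, mirroring the fields in the example response."""
    memory_id: str
    title: str
    content: str
    summary: str
    category: str
    relevance: float
    chunk_index: int
    created_at: str


def parse_results(response):
    """Convert data.results into RecallResult records, ignoring unknown keys."""
    known = {f.name for f in fields(RecallResult)}
    return [
        RecallResult(**{key: value for key, value in item.items() if key in known})
        for item in response["data"]["results"]
    ]


def best_chunk_per_memory(results):
    """Keep the highest-relevance chunk per memory_id, sorted best first."""
    best = {}
    for result in results:
        current = best.get(result.memory_id)
        if current is None or result.relevance > current.relevance:
            best[result.memory_id] = result
    return sorted(best.values(), key=lambda r: r.relevance, reverse=True)
```

Filtering to known keys keeps the parser from breaking if the API later adds fields to each result; a missing documented field still raises a `TypeError`, which surfaces schema drift early.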
Performance
Typical latencies:
| Step | Time |
|---|---|
| Query embedding | ~50ms |
| Vector search | ~20ms |
| Reranking | ~200ms |
| Total (p95) | < 500ms |
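The three timing fields returned in each response can be summed and compared against the p95 figure above. This is a small sketch assuming the response shape from the example; `timing_breakdown` is a hypothetical helper. The sum covers only the three reported steps, and the source does not state whether the 500 ms total also includes network or other overhead.

```python
# From the "Total (p95)" row of the latency table.
P95_BUDGET_MS = 500

TIMING_FIELDS = ("query_embedding_ms", "search_ms", "rerank_ms")


def timing_breakdown(response):
    """Sum the reported step timings and flag the slowest step."""
    data = response["data"]
    steps = {name: data[name] for name in TIMING_FIELDS}
    total = sum(steps.values())
    return {
        "total_ms": total,
        "slowest_step": max(steps, key=steps.get),
        "within_budget": total < P95_BUDGET_MS,
    }
```

For the example response (48, 22, and 195 ms) this gives a total of 265 ms, with `rerank_ms` as the slowest step.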