Recall API

Semantic search endpoint. Find the most relevant memories for a given query.

Recall

POST /v1/recall

Semantic search across your memories. Returns the most relevant pieces of knowledge for a given query, ranked by relevance.

Request body

{
  "query": "how does authentication work?",
  "limit": 5,
  "hint": "debugging a token refresh issue",
  "hub_id": "hub_abc123",
  "project_context": {
    "repo": "github.com/org/app",
    "branch": "fix/auth-refresh"
  }
}

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| query | string | Yes | Search query (semantic) |
| limit | number | No | Max results (default: 5, max: 20) |
| hint | string | No | Additional context for better ranking |
| hub_id | string | No | Scope to a specific hub |
| project_context | object | No | Repo/project metadata for context-aware ranking |
| threshold | number | No | Minimum relevance score (0–1, default: 0.3) |
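
The fields above can be assembled and sent as in the following minimal sketch. It assumes a placeholder base URL (`https://api.example.com`) and bearer-token authentication, neither of which is stated on this page, and it assumes a lower bound of 1 for `limit`. The helper names `build_recall_payload` and `recall` are hypothetical.

```python
import json
import urllib.request

# Assumption: placeholder host. This page only documents the path POST /v1/recall.
BASE_URL = "https://api.example.com"


def build_recall_payload(query, limit=5, threshold=0.3, hint=None,
                         hub_id=None, project_context=None):
    """Build a request body, checking the documented ranges client-side."""
    if not query:
        raise ValueError("query is required")
    # Documented maximum is 20; the minimum of 1 is an assumption.
    if not 1 <= limit <= 20:
        raise ValueError("limit must be between 1 and 20")
    if not 0 <= threshold <= 1:
        raise ValueError("threshold must be between 0 and 1")
    payload = {"query": query, "limit": limit, "threshold": threshold}
    # Optional fields are left out entirely when not provided.
    for key, value in (("hint", hint), ("hub_id", hub_id),
                       ("project_context", project_context)):
        if value is not None:
            payload[key] = value
    return payload


def recall(payload, api_key):
    """Send the payload to POST /v1/recall and return the decoded JSON."""
    request = urllib.request.Request(
        BASE_URL + "/v1/recall",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; check your account's auth docs.
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call such as `recall(build_recall_payload("how does authentication work?", hub_id="hub_abc123"), api_key)` would then send a body equivalent to the example request, with `limit` and `threshold` set explicitly to their documented defaults.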

Response

{
  "data": {
    "results": [
      {
        "memory_id": "mem_a1b2c3d4",
        "title": "Auth System Design",
        "content": "The auth system uses JWT with short-lived access tokens (1h) and refresh tokens (30d)...",
        "summary": "JWT-based authentication with token rotation",
        "category": "core",
        "relevance": 0.94,
        "chunk_index": 2,
        "created_at": "2025-01-15T10:30:00Z"
      },
      {
        "memory_id": "mem_e5f6g7h8",
        "title": "Login Flow Documentation",
        "content": "Users authenticate via POST /api/auth/login...",
        "summary": "Login endpoint and token generation flow",
        "category": "reference",
        "relevance": 0.87,
        "chunk_index": 0,
        "created_at": "2025-01-10T09:15:00Z"
      }
    ],
    "query_embedding_ms": 48,
    "search_ms": 22,
    "rerank_ms": 195
  }
}

Response fields

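| Field | Description |
| --- | --- |
| memory_id | ID of the parent memory |
| title | Memory title |
| content | The relevant chunk content |
| summary | AI-generated summary |
| category | Memory category |
| relevance | Relevance score (0–1, higher is better) |
| chunk_index | Which chunk of the parent memory matched |
| created_at | Creation timestamp (ISO 8601, UTC) |
| query_embedding_ms | Time to embed the query, in milliseconds |
| search_ms | Time for vector search, in milliseconds |
| rerank_ms | Time for reranking, in milliseconds |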
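
Because each result is a chunk of a parent memory, several results may share one `memory_id`. The sketch below shows one way a client might group chunks by memory and apply an extra relevance cutoff. The helper `group_by_memory` is hypothetical, and the sample data is abbreviated from the example response above.

```python
from collections import defaultdict


def group_by_memory(response, min_relevance=0.0):
    """Group matching chunks by parent memory, best chunk first."""
    grouped = defaultdict(list)
    for result in response["data"]["results"]:
        if result["relevance"] >= min_relevance:
            grouped[result["memory_id"]].append(result)
    # Within each memory, order chunks by descending relevance.
    for chunks in grouped.values():
        chunks.sort(key=lambda r: r["relevance"], reverse=True)
    return dict(grouped)


# Abbreviated copy of the example response (only the fields used here).
sample = {
    "data": {
        "results": [
            {"memory_id": "mem_a1b2c3d4", "title": "Auth System Design",
             "relevance": 0.94, "chunk_index": 2},
            {"memory_id": "mem_e5f6g7h8", "title": "Login Flow Documentation",
             "relevance": 0.87, "chunk_index": 0},
        ]
    }
}

strong_matches = group_by_memory(sample, min_relevance=0.9)
print(list(strong_matches))  # only mem_a1b2c3d4 passes the 0.9 cutoff
```

Filtering client-side like this is only useful for a cutoff stricter than the `threshold` sent with the request, since results below `threshold` are not returned in the first place.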

Performance

Typical latencies:

| Step | Time |
| --- | --- |
| Query embedding | ~50 ms |
| Vector search | ~20 ms |
| Reranking | ~200 ms |
| Total (p95) | < 500 ms |
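
The timing fields returned with each response can be compared against these figures. The following is a minimal sketch: the helper `timing_breakdown` is hypothetical, and it assumes the three reported steps are the only ones worth summing. Network round-trip time is not included in these fields, so the sum is a lower bound on what a client observes.

```python
def timing_breakdown(response, budget_ms=500):
    """Sum the per-step timings and compare them to a latency budget."""
    data = response["data"]
    steps = {
        "query_embedding_ms": data["query_embedding_ms"],
        "search_ms": data["search_ms"],
        "rerank_ms": data["rerank_ms"],
    }
    total = sum(steps.values())
    return {
        "steps": steps,
        "total_ms": total,
        # The step that contributed the most time.
        "slowest_step": max(steps, key=steps.get),
        "within_budget": total < budget_ms,
    }


# Timing values copied from the example response.
example = {"data": {"query_embedding_ms": 48, "search_ms": 22, "rerank_ms": 195}}
report = timing_breakdown(example)
print(report["total_ms"], report["slowest_step"])  # 265 rerank_ms
```

For the example response, reranking accounts for most of the reported time, which matches the typical latencies in the table.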