Recall API

Semantic search endpoint. Finds the memories most relevant to a query and returns chunks from them, already reranked unless no_rerank is set.

Recall

POST /v1/recall

Request body

{
  "query": "how does authentication work?",
  "limit": 10,
  "hub_ids": ["hub_abc123"],
  "tags": ["auth"],
  "topic_id": "top_xyz",
  "kind": "semantic",
  "no_rerank": false,
  "include_archived": false,
  "project_context": {
    "repo": "github.com/org/app",
    "branch": "fix/auth-refresh"
  }
}

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| query | string | yes | Natural-language search query. |
| limit | number | | Max memories returned. |
| hub_ids | string[] | | Narrow to specific hubs. Hub IDs only; slugs are not accepted here. Omit for all. |
| kind | string | | Restrict to episodic, semantic, procedural, or rationale. |
| tags | string[] | | Declared on the request struct but not applied by the recall handler today; treat as a no-op. |
| topic_id | string | | Restrict to memories in this topic. |
| no_rerank | boolean | | Skip the reranker stage (faster, lower quality). |
| include_archived | boolean | | Declared on the request struct but not honored today; the recall pipeline unconditionally excludes archived and processing memories. |
| source | string | | Hint about where this recall came from (CLI, hook, MCP…). |
| working_dir | string | | Optional working directory hint. |
| project_context | object | | { repo, project, branch } for relevance boosting. |
| created_after | string (RFC3339) | | Temporal lower bound. |
| created_before | string (RFC3339) | | Temporal upper bound. |
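As a rough illustration of how the request fields fit together, the sketch below builds a POST /v1/recall request with Python's standard library. The base URL and the Bearer authorization header are assumptions made for the example (this page does not specify either), and `build_recall_request` and `rfc3339` are hypothetical helper names. It leaves out tags and include_archived because the table describes both as no-ops today.

```python
import json
import urllib.request
from datetime import datetime, timezone

# Assumption: placeholder host; substitute your deployment's base URL.
BASE_URL = "https://api.example.com"


def rfc3339(dt: datetime) -> str:
    """Format a timezone-aware datetime as an RFC3339 UTC timestamp."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_recall_request(query, *, api_key, limit=None, hub_ids=None,
                         kind=None, topic_id=None, no_rerank=None,
                         project_context=None, created_after=None,
                         created_before=None):
    """Build (but do not send) a POST /v1/recall request."""
    optional = {
        "limit": limit,
        "hub_ids": hub_ids,  # hub IDs only, not slugs
        "kind": kind,
        "topic_id": topic_id,
        "no_rerank": no_rerank,
        "project_context": project_context,
        "created_after": rfc3339(created_after) if created_after else None,
        "created_before": rfc3339(created_before) if created_before else None,
    }
    # Only "query" is required; unset optional fields are omitted entirely.
    body = {"query": query}
    body.update({k: v for k, v in optional.items() if v is not None})
    return urllib.request.Request(
        f"{BASE_URL}/v1/recall",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: Bearer-token auth; check your deployment's scheme.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` to send it; building and sending are kept separate here so the payload can be inspected first.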

There is no threshold parameter. The docs previously claimed one; the server doesn't take a relevance-score floor.
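Since the server does not accept a relevance floor, one option is to apply it on the client after the response arrives. The sketch below assumes each entry carries the relevance_score field shown in the response example; `filter_by_score` is a hypothetical helper, not part of any SDK.

```python
def filter_by_score(memories, min_score):
    """Keep only recalled entries whose relevance_score is at least min_score.

    Entries without a relevance_score are treated as scoring 0.0 and are
    therefore dropped for any positive threshold.
    """
    return [m for m in memories if m.get("relevance_score", 0.0) >= min_score]
```

Because filtering happens after limit is applied server-side, a strict threshold can return fewer entries than limit.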

Response

{
  "data": {
    "memories": [
      {
        "id": "mem_a1b2c3d4",
        "title": "Auth system design",
        "summary": "JWT-based auth with token rotation",
        "chunk_content": "The auth system uses JWT with short-lived access tokens...",
        "heading_chain": "Auth > Tokens > Access",
        "relevance_score": 0.94,
        "kind": "semantic",
        "stability": "evolving",
        "source": "cli",
        "age": "3 days ago",
        "created_at": "2026-04-18T10:30:00Z",
        "hub_id": "hub_abc123",
        "hub_name": "Platform Team",
        "topic_id": "top_xyz",
        "topic_name": "Auth & identity"
      }
    ],
    "query_metadata": {
      "intent": "explanation",
      "kinds_searched": ["semantic", "rationale"],
      "total_candidates": 47,
      "reranked": true,
      "latency_ms": 284
    }
  }
}

Response fields

Each entry in memories:

| Field | Description |
| --- | --- |
| id | Memory ID (parent, not chunk). |
| title | Memory title. |
| summary | Server-generated summary. |
| chunk_content | The matched chunk's content (not the full memory). |
| heading_chain | Heading path within the source document. |
| relevance_score | 0–1, higher is better. |
| kind | episodic / semantic / procedural / rationale. |
| stability | volatile / evolving / stable. |
| source | Origin of the parent memory. |
| age | Human-readable age ("3 days ago"). |
| hub_id, hub_name | Parent hub (present when the search spanned multiple hubs). |
| topic_id, topic_name | Topic assignment if any. |
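Because id identifies the parent memory rather than the matched chunk, a client may want to collect entries under their parent. The sketch below assumes that several entries can share one id when more than one chunk of a memory matches; this page does not confirm that, so treat the grouping as a defensive convenience. `group_chunks_by_memory` is a hypothetical helper.

```python
def group_chunks_by_memory(memories):
    """Group recalled entries by parent memory id, preserving response order."""
    grouped = {}
    for entry in memories:
        # dict keeps insertion order, so memories stay in ranked order
        grouped.setdefault(entry["id"], []).append(entry)
    return grouped
```

Each value is the list of entries for one memory, so `chunk_content` and `heading_chain` can be shown together under a single title.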

query_metadata:

| Field | Description |
| --- | --- |
| intent | Classifier's guess at query intent. |
| kinds_searched | Which memory kinds were queried. |
| total_candidates | Number of chunks considered before final ranking. |
| reranked | Whether rerank ran (false if no_rerank or no model). |
| latency_ms | End-to-end server time. |
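The reranked flag can be false for two reasons: the caller set no_rerank, or no rerank model was available. A client that cares about result quality can tell the two apart by comparing the flag with what it requested. The helper below is a hypothetical sketch that assumes the response envelope shown in the example above.

```python
def rerank_skipped_unexpectedly(request_body, response):
    """Return True when rerank did not run even though it was not disabled.

    Per the query_metadata table, that combination points to the server
    having no rerank model available.
    """
    meta = response["data"]["query_metadata"]
    requested_skip = bool(request_body.get("no_rerank", False))
    return (not meta["reranked"]) and (not requested_skip)
```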

GET /v1/memories/{id}/related

Returns memories related to a specific memory — used by the web app's memory detail "Related" section. Same RecalledMemory[] shape as the recall endpoint.
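A minimal sketch of building the related-memories request follows. As with any client code for this API, the base URL and Bearer header are assumptions, and `build_related_request` is a hypothetical name. The response envelope is not documented on this page, so the sketch stops at constructing the request.

```python
import urllib.parse
import urllib.request

# Assumption: placeholder host; substitute your deployment's base URL.
BASE_URL = "https://api.example.com"


def build_related_request(memory_id, *, api_key):
    """Build (but do not send) a GET /v1/memories/{id}/related request."""
    # Percent-encode the id so unusual characters cannot alter the path.
    path_id = urllib.parse.quote(memory_id, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/v1/memories/{path_id}/related",
        # Assumption: Bearer-token auth; check your deployment's scheme.
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )
```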