Persistent Vector Memory
6 MCP tools

Agents that remember. Qdrant vector search and Ollama embeddings give every session access to accumulated knowledge — procedures that worked, mistakes to avoid, and context that would otherwise be lost at session end.

The 6 MCP Tools

Purpose-built tools for every type of knowledge operation.

memory_recall
Semantic search across all stored memories. Returns results ranked by vector similarity and supports metadata filtering.
memory_store
Persist new knowledge with type classification, tags, and provenance metadata. Deduplication prevents redundant entries.
rag_search
Search across indexed documents for retrieval-augmented generation (RAG). Combines vector similarity with document-level context.
episode
Record session episodes — what happened, what was decided, and why. Temporal context for future sessions.
learning
Capture verified insights: what works, what fails, and under what conditions. The system's growing knowledge base.
procedure / trajectory
Store multi-step procedures and successful execution trajectories. Replayable workflows for recurring tasks.
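
As an illustration of the deduplication step in memory_store, here is a minimal sketch that rejects exact duplicates by hashing normalized content. The names (MemoryRecord, InMemoryStore, fingerprint) and the hashing approach are assumptions made for this example; the actual server may instead compare embeddings to catch near-duplicates.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class MemoryRecord:
    content: str
    memory_type: str                          # e.g. "learning" or "procedure" (hypothetical labels)
    tags: list[str] = field(default_factory=list)
    source: str = "unknown"                   # provenance: where the knowledge came from


def fingerprint(text: str) -> str:
    """Hash case- and whitespace-normalized text so trivial variants collide."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class InMemoryStore:
    """Stand-in for the real backend; keeps records keyed by fingerprint."""

    def __init__(self) -> None:
        self.records: dict[str, MemoryRecord] = {}

    def store(self, record: MemoryRecord) -> bool:
        """Return True if stored, False if an identical memory already exists."""
        key = fingerprint(record.content)
        if key in self.records:
            return False
        self.records[key] = record
        return True
```

Storing "Run migrations before deploy" and then "run   migrations before DEPLOY" would keep only the first record, since both normalize to the same fingerprint.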

How It Works

From agent observation to compounding organizational knowledge.

1. Agent Learns
   Observation, decision, or outcome identified as worth preserving

2. memory_store
   Knowledge classified by type, tagged, and sent to the MCP server

3. Qdrant Indexes
   nomic-embed-text generates vectors; Qdrant indexes for similarity search

4. Future Recall
   New sessions invoke memory_recall; relevant knowledge surfaces automatically

5. Knowledge Compounds
   Each session adds to the knowledge base; organizational intelligence grows
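
The five steps above can be sketched end to end. This is a toy model under stated assumptions: toy_embed is a hypothetical bag-of-words stand-in for nomic-embed-text, and ToyIndex is a plain list standing in for Qdrant. Neither reflects the real APIs, and word overlap does not capture meaning the way a real embedding model does.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: word counts as a sparse vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyIndex:
    """Plain list standing in for a vector database."""

    def __init__(self) -> None:
        self.points: list[tuple[Counter, dict]] = []

    def memory_store(self, content: str, memory_type: str, tags: list[str]) -> None:
        # Steps 2-3: classify, tag, embed, and index the memory.
        payload = {"content": content, "type": memory_type, "tags": tags}
        self.points.append((toy_embed(content), payload))

    def memory_recall(self, query: str, limit: int = 3) -> list[dict]:
        # Step 4: embed the query and return the closest stored memories.
        q = toy_embed(query)
        ranked = sorted(self.points, key=lambda p: cosine(q, p[0]), reverse=True)
        return [payload for _, payload in ranked[:limit]]


# Session 1: the agent learns something and stores it (steps 1-3).
index = ToyIndex()
index.memory_store("deploy fails unless migrations run first", "learning", ["deploy"])
index.memory_store("team prefers squash merges", "learning", ["git"])

# A later session recalls it (step 4); every store call grows the base (step 5).
top = index.memory_recall("why does deploy fail", limit=1)
print(top[0]["content"])
```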

Persistent Vector Memory Architecture

Dual-Path Architecture

Real-time operations and batch maintenance run on separate paths.

Hot Path — MCP Real-Time

Interactive Operations

  • memory_recall and memory_store during active sessions
  • Sub-second vector search via Qdrant
  • Ollama embedding generation on demand
  • Deduplication checks before every store operation
  • SessionStart auto-recall for context restoration

Cold Path — Batch Maintenance

Background Operations

  • Scheduled deduplication and consolidation
  • Memory promotion from scratch (provisional) to permanent status
  • Embedding re-indexing when models update
  • Stale memory detection and archival
  • Cross-collection knowledge graph updates
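
A batch maintenance pass along these lines might look like the following sketch. The field names (tier, recall_count, last_recalled), the thresholds, and the promotion rule are all assumptions for illustration; the criteria the system actually uses are not specified here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Memory:
    content: str
    tier: str                 # "scratch", "permanent", or "archived" (hypothetical values)
    recall_count: int         # how often the memory has been recalled
    last_recalled: datetime


def maintenance_pass(memories, now, promote_after=3, stale_after=timedelta(days=90)):
    """Archive memories unused for too long; promote frequently recalled scratch memories."""
    for m in memories:
        if now - m.last_recalled > stale_after:
            m.tier = "archived"
        elif m.tier == "scratch" and m.recall_count >= promote_after:
            m.tier = "permanent"
    return memories
```

Because the pass only reads and updates stored records, it can run on a schedule without touching the request path that serves active sessions.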

Key Capabilities

Structured knowledge that compounds over time.

6 MCP Tools

Purpose-built tools for every knowledge operation: recall, store, RAG search, episodes, learnings, and procedures/trajectories. Each tool has a specific schema and validation — no generic key-value dumping.
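
To illustrate what per-tool schemas buy over a generic key-value store, here is a sketch of a validator. The required fields shown for each tool are invented for the example and are not the server's real schemas.

```python
# Hypothetical required fields per tool; the real schemas are not given in this document.
SCHEMAS = {
    "memory_store": {"content": str, "memory_type": str, "tags": list},
    "episode": {"summary": str, "decisions": list},
    "learning": {"insight": str, "conditions": str, "verified": bool},
}


def validate(tool: str, payload: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the payload is valid."""
    schema = SCHEMAS.get(tool)
    if schema is None:
        return [f"unknown tool: {tool}"]
    errors = []
    # Every declared field must be present and of the declared type.
    for name, expected in schema.items():
        if name not in payload:
            errors.append(f"missing field: {name}")
        elif not isinstance(payload[name], expected):
            errors.append(f"{name} must be {expected.__name__}")
    # Undeclared fields are rejected rather than silently stored.
    for name in payload:
        if name not in schema:
            errors.append(f"unexpected field: {name}")
    return errors
```

Rejecting unknown fields is the design choice that separates this from a key-value dump: a payload either matches the tool's schema or is refused with a specific reason.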

Semantic Search

Qdrant vector database with nomic-embed-text embeddings enables similarity search that understands meaning, not just keywords. A query about "database migration failures" surfaces memories about schema rollbacks even if the words in the query never appeared in those memories.
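
The ranking behind this kind of search is typically cosine similarity between embedding vectors. The sketch below uses hand-written three-dimensional vectors in place of real nomic-embed-text output, and the recall function and payload fields are hypothetical. It only illustrates ranking by similarity combined with a metadata filter.

```python
import math


def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


# Made-up vectors: the first two point in a similar direction, the third does not.
MEMORIES = [
    {"vector": [0.9, 0.1, 0.0], "text": "schema rollback after bad release", "type": "learning"},
    {"vector": [0.8, 0.3, 0.1], "text": "steps to revert a table change", "type": "procedure"},
    {"vector": [0.0, 0.2, 0.9], "text": "team prefers tabs over spaces", "type": "learning"},
]


def recall(query_vector, memory_type=None, limit=2):
    """Rank memories by cosine similarity, optionally restricted to one type."""
    candidates = [m for m in MEMORIES if memory_type is None or m["type"] == memory_type]
    ranked = sorted(candidates, key=lambda m: cosine(query_vector, m["vector"]), reverse=True)
    return [m["text"] for m in ranked[:limit]]


# Pretend this is the embedding of "database migration failures".
query = [1.0, 0.2, 0.0]
print(recall(query))
print(recall(query, memory_type="learning"))
```

Neither of the two closest memories shares a word with the query text; they rank highest only because their vectors point in a similar direction.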

Hot/Cold Path

The MCP server handles real-time operations during sessions — recall, store, search. Batch maintenance runs asynchronously: deduplication, consolidation, re-indexing, and stale memory archival happen without blocking active work.

Cross-Session Transfer

Knowledge persists across sessions and accumulates over time. An agent that solves a tricky deployment issue in Session 1 recalls that solution in Session 47 when a similar problem appears — without being told to look for it.

Memory Types

Five structured types — episodes, learnings, procedures, trajectories, and RAG documents — each with distinct schemas. Episodes capture what happened. Learnings capture what was verified. Procedures capture how to do it again.

Organizational Intelligence

Individual agent memories compound into institutional knowledge. Patterns emerge across sessions: which approaches work for which problems, which configurations cause issues, which reviews catch which classes of bugs.

Why It Matters

Without persistent memory, every AI session starts from zero. The agent has no memory of yesterday's debugging session, last week's architecture decision, or the deployment procedure that took three attempts to get right. Context windows are finite, and when they end, everything learned evaporates.

Persistent vector memory changes the economics of AI assistance. Instead of re-explaining context at the start of every session, agents auto-recall relevant knowledge. Instead of rediscovering solutions to problems that were already solved, agents surface past procedures. Instead of repeating mistakes, agents recall learnings about what did not work and why.

The compounding effect is the real value. After 100 sessions, the memory system contains a rich, searchable knowledge base of how your specific codebase works, what your team's preferences are, which approaches succeed in your environment, and which fail. That is not generic AI capability — that is organizational intelligence that grows with every interaction.