Jarvis agents don’t start from scratch on every task. The RAG (Retrieval-Augmented Generation) memory system gives agents access to context from past conversations, stored documents, and structured knowledge — automatically, without any configuration on your part. The memory system has three layers, each optimized for a different kind of recall.

MCP tools

Use jarvis.memory.search and jarvis.memory.store to interact with memory directly.

Agent overview

See how Paperclip and Hermes use memory during multi-step tasks.

Memory layers

Qdrant is the vector store. It stores documents and conversation chunks as high-dimensional embeddings, enabling semantic search — finding content by meaning rather than exact keyword match.

What it stores:
  • Conversation history (chunked and embedded)
  • Documents you’ve added (code files, notes, runbooks)
  • Tool outputs that agents flag as worth remembering
When it’s used: Qdrant is queried any time an agent needs to recall something that was discussed before or find relevant background information. If you ask “what did we decide about the monitoring stack last week?”, the agent searches Qdrant for semantically similar content.

Query Qdrant directly:
curl -X POST https://your-jarvis-host/api/memory/search \
  -H "Content-Type: application/json" \
  -d '{
    "layer": "qdrant",
    "query": "monitoring stack configuration decisions",
    "top_k": 5
  }'
Response:
{
  "results": [
    {
      "score": 0.91,
      "text": "Decided to use Prometheus + Grafana on ai-max. Retention set to 30 days.",
      "source": "conversation",
      "timestamp": "2025-03-15T14:22:00Z"
    }
  ]
}
Store a document in Qdrant:
curl -X POST https://your-jarvis-host/api/memory/store \
  -H "Content-Type: application/json" \
  -d '{
    "layer": "qdrant",
    "content": "The Prometheus scrape interval is set to 15s for all mesh nodes.",
    "metadata": { "topic": "monitoring", "source": "runbook" }
  }'
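For scripting, the two curl calls above can be wrapped in a small Python client. This is a sketch, not an official SDK: only the endpoint paths and request bodies come from the examples above, and `JARVIS_HOST` is a placeholder for your own host.

```python
import json
from urllib import request

JARVIS_HOST = "https://your-jarvis-host"  # placeholder, matches the curl examples


def build_search_payload(query, layer="qdrant", top_k=5):
    """Mirror the /api/memory/search request body from the example above."""
    return {"layer": layer, "query": query, "top_k": top_k}


def build_store_payload(content, metadata=None, layer="qdrant"):
    """Mirror the /api/memory/store request body from the example above."""
    payload = {"layer": layer, "content": content}
    if metadata:
        payload["metadata"] = metadata
    return payload


def post(path, payload):
    """Send a JSON POST to the memory API and return the parsed response."""
    req = request.Request(
        JARVIS_HOST + path,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)


# Example (requires a reachable Jarvis host):
# hits = post("/api/memory/search",
#             build_search_payload("monitoring stack configuration decisions"))
```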

How agents use memory automatically

You don’t need to manage memory explicitly. Agents follow this pattern on every task:
  1. Session start — the agent loads relevant Mem0 facts and recent conversation history from Qdrant
  2. During the task — the agent queries Qdrant and Neo4j when it needs background context, past decisions, or relationship data
  3. After the task — important outputs, decisions, and new facts are written back to the appropriate memory layer
Memory retrieval is transparent. If the agent uses a memory result, you’ll see it referenced in the agent’s reasoning output.
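The three-phase pattern above can be sketched in Python. Everything here is an illustrative stand-in: the class and method names are invented, in-memory lists stand in for the real Mem0/Qdrant/Neo4j backends, and a naive keyword match substitutes for semantic search.

```python
class MemorySession:
    """Illustrative stand-in for the per-task memory lifecycle."""

    def __init__(self, facts, history):
        self.facts = facts        # stand-in for Mem0 standing facts
        self.history = history    # stand-in for Qdrant conversation chunks
        self.writeback = []       # outputs queued for storage after the task

    def start(self):
        # 1. Session start: load standing facts and recent history.
        return {"facts": self.facts, "history": self.history[-3:]}

    def recall(self, query):
        # 2. During the task: keyword lookup stands in for semantic search.
        return [h for h in self.history if query.lower() in h.lower()]

    def finish(self, decision):
        # 3. After the task: write important outputs back to memory.
        self.writeback.append(decision)
        self.history.append(decision)
```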

Memory persistence across conversations

All three layers persist indefinitely. When you start a new conversation, your agents pick up where they left off:
  • Mem0 carries forward your preferences and standing instructions
  • Qdrant makes past conversations searchable by meaning
  • Neo4j retains the knowledge graph built from previous tasks
Memory is stored locally on your infrastructure. Nothing is sent to external services unless you configure an integration that explicitly does so.

Query memory with the MCP tool

From inside an agent session, you can ask the agent to search memory directly:
“Search memory for everything we discussed about the Prometheus setup.”
The agent invokes jarvis.memory.search across all layers and returns a synthesized summary of what it finds. You can also ask it to store something explicitly:
“Remember that the scrape interval on ai-max should never be changed without a maintenance window.”
The agent writes this to Mem0 with appropriate tags so future sessions can retrieve it.
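Under the hood, a store request like the one above maps to a jarvis.memory.store tool invocation. The payload below is an assumption for illustration; only the tool name and the layer names come from this page, and the exact argument schema may differ in your deployment.

```json
{
  "tool": "jarvis.memory.store",
  "arguments": {
    "layer": "mem0",
    "content": "Never change the scrape interval on ai-max without a maintenance window.",
    "metadata": { "tags": ["monitoring", "ai-max", "standing-instruction"] }
  }
}
```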