Brain Mesh
Five nodes providing distributed GPU and CPU compute across your homelab.
Model serving
Local models served via Ollama, routed and load-balanced through LiteLLM.
Agent layer
Paperclip and Hermes orchestrate tasks across specialized agent roles.
Automation
57+ n8n workflows handle infrastructure, monitoring, and deployment.
Memory
Qdrant, Neo4j, and Mem0 persist context across conversations and workflows.
Brain Mesh
The Brain Mesh is the compute foundation of Jarvis, a five-node cluster that distributes AI workloads across dedicated hardware. Each node plays a specific role based on its resources:

| Node | Role |
|---|---|
| ai-max | Primary GPU node — runs the most demanding inference workloads |
| ai-mini-x1 | Secondary GPU node — handles concurrent model requests |
| jarvis-brain | Orchestration node — routes requests, runs the agent layer |
| dell-micro | Utility node — background tasks, monitoring, and automation |
| synologynas | Storage node — model weights, vector indexes, and persistent data |
Model serving
Jarvis serves models through two components that work together:

Ollama runs on your GPU nodes and manages the model lifecycle: downloading weights, loading models into VRAM, and handling inference requests. Each model runs as an isolated process on the node best suited to its size.

LiteLLM sits in front of Ollama as a unified API gateway. It normalizes requests across models and providers, handles load balancing between nodes, and exposes a single endpoint for agents and integrations to call. You interact with models through LiteLLM; you don’t need to target individual Ollama instances directly.

Browse models
See every model in your fleet and which nodes they run on.
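Because LiteLLM exposes an OpenAI-compatible API, any agent or script can talk to the whole fleet through one endpoint. The sketch below shows what such a call might look like; the gateway URL, port, model name, and API key are placeholders for illustration, not values from your deployment.

```python
import json
import urllib.request

# Hypothetical gateway address; substitute your LiteLLM endpoint.
LITELLM_URL = "http://jarvis-brain:4000/v1/chat/completions"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat completion payload for LiteLLM."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def ask(model: str, prompt: str, api_key: str = "sk-local") -> str:
    """Send the request through LiteLLM and return the reply text."""
    payload = json.dumps(build_chat_request(model, prompt)).encode()
    req = urllib.request.Request(
        LITELLM_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Whichever node actually serves the model, the caller only ever sees this one endpoint; LiteLLM handles routing and load balancing behind it.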
Agents
The agent layer is where Jarvis takes action. Two multi-agent systems handle task delegation:

Paperclip is the primary orchestrator. It breaks down complex tasks into subtasks and delegates them to specialized roles: CEO (strategy), CTO (technical decisions), Engineer (execution), and Writer (output). Agents communicate through a shared message bus on the jarvis-brain node.
Hermes handles external communication and coordination — translating agent outputs into structured responses and routing them to the right destination (API call, n8n webhook, or direct reply).
Agents call models through LiteLLM, use tools via MCP, and read from and write to the memory layer to maintain context across sessions.
Paperclip
Multi-agent orchestrator with specialized roles.
Hermes
Communication and output routing agent.
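The delegation pattern above can be sketched in a few lines. This is a hypothetical illustration of the role split, not the actual Paperclip API; the real orchestrator runs on the jarvis-brain message bus, and the function and brief texts here are invented for clarity.

```python
# Illustrative sketch of Paperclip-style delegation.
ROLES = ("CEO", "CTO", "Engineer", "Writer")

def delegate(task: str) -> list[tuple[str, str]]:
    """Split a task into role-tagged subtasks, one per specialist."""
    briefs = {
        "CEO": f"Set the goal and success criteria for: {task}",
        "CTO": f"Choose the technical approach for: {task}",
        "Engineer": f"Carry out the implementation of: {task}",
        "Writer": f"Summarize the outcome of: {task}",
    }
    return [(role, briefs[role]) for role in ROLES]
```

Each subtask would then be sent over the message bus to the matching agent, with Hermes formatting and routing the final Writer output.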
Automation
n8n provides the workflow automation layer. You can trigger workflows from agents, webhooks, schedules, or infrastructure events. Workflows handle tasks that don’t need interactive reasoning: deployments, health checks, notifications, and data pipelines.

Agents can invoke n8n workflows as tools via MCP, and n8n workflows can call back into the agent layer via the Jarvis API. This creates a closed loop between reactive automation and deliberate agent reasoning.

n8n workflows
Browse the 57+ workflows bundled with Jarvis.
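Triggering a workflow from outside n8n is a single HTTP POST to its webhook URL, which you can copy from the workflow's webhook trigger node. The host, port, and webhook path below are hypothetical; only the general shape of the call is what matters.

```python
import json
import urllib.request

# Hypothetical webhook path; each n8n workflow exposes its own URL.
N8N_WEBHOOK = "http://dell-micro:5678/webhook/deploy-check"

def build_event(source: str, action: str) -> dict:
    """Minimal event envelope for the workflow's webhook node to parse."""
    return {"source": source, "action": action}

def trigger_workflow(url: str, event: dict) -> int:
    """POST an event payload to an n8n webhook; return the HTTP status."""
    req = urllib.request.Request(
        url,
        data=json.dumps(event).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

An agent invoking the workflow via MCP ends up doing the equivalent of `trigger_workflow(N8N_WEBHOOK, build_event("agent", "deploy"))`.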
Memory
Jarvis uses three complementary memory systems so agents retain context across conversations and workflows:

Qdrant is a vector database that stores semantic embeddings. When an agent needs to recall past conversations or search a document corpus, it queries Qdrant for the most relevant chunks.

Neo4j is a graph database that stores relationships between entities: people, tasks, systems, and decisions. Agents use Neo4j when reasoning requires understanding how things are connected, not just what they are.

Mem0 is a persistent memory layer that sits above both databases. It decides what to remember, when to retrieve it, and how to present it to the agent as context. You don’t need to manage embeddings or graph queries directly; Mem0 handles that automatically.

Memory integrations
Configure Qdrant, Neo4j, and Mem0 for your environment.
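Although Mem0 normally issues memory queries for you, it can help to see what a raw recall looks like underneath. The sketch below builds a request for Qdrant's REST search endpoint (`POST /collections/<name>/points/search`); the host, collection name, and vector are placeholders for illustration.

```python
# Hypothetical Qdrant host on the storage node.
QDRANT_URL = "http://synologynas:6333"

def build_search(vector: list[float], limit: int = 5) -> dict:
    """Payload for a nearest-neighbor search over stored embeddings."""
    return {"vector": vector, "limit": limit, "with_payload": True}

def search_url(collection: str) -> str:
    """REST endpoint for searching a named Qdrant collection."""
    return f"{QDRANT_URL}/collections/{collection}/points/search"
```

In practice the query vector comes from embedding the agent's current question, and the returned payloads are the remembered chunks that Mem0 folds back into the agent's context.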