Jarvis is built in five layers, each responsible for a distinct part of the platform. Understanding how these layers connect helps you reason about where tasks run, how data flows, and where to look when something needs attention.

Brain Mesh

Five nodes providing distributed GPU and CPU compute across your homelab.

Model serving

Local models served via Ollama, routed and load-balanced through LiteLLM.

Agent layer

Paperclip and Hermes orchestrate tasks across specialized agent roles.

Automation

57+ n8n workflows handle infrastructure, monitoring, and deployment.

Memory

Qdrant, Neo4j, and Mem0 persist context across conversations and workflows.

Brain Mesh

The Brain Mesh is the compute foundation of Jarvis — a five-node cluster that distributes AI workloads across dedicated hardware. Each node plays a specific role based on its resources:
| Node | Role |
| --- | --- |
| ai-max | Primary GPU node — runs the most demanding inference workloads |
| ai-mini-x1 | Secondary GPU node — handles concurrent model requests |
| jarvis-brain | Orchestration node — routes requests, runs the agent layer |
| dell-micro | Utility node — background tasks, monitoring, and automation |
| synologynas | Storage node — model weights, vector indexes, and persistent data |
All nodes communicate over your local network. No traffic leaves your infrastructure during inference.
See Nodes for hardware specs and how to check node health.
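As a rough illustration of the role split above, here is a toy node registry. The node names come from the table; the selection logic is an assumption for illustration, not how Jarvis actually schedules work.

```python
# Toy registry mirroring the Brain Mesh roles (illustrative only).
NODES = {
    "ai-max":       {"role": "gpu-primary",   "gpu": True},
    "ai-mini-x1":   {"role": "gpu-secondary", "gpu": True},
    "jarvis-brain": {"role": "orchestration", "gpu": False},
    "dell-micro":   {"role": "utility",       "gpu": False},
    "synologynas":  {"role": "storage",       "gpu": False},
}

def pick_node(needs_gpu: bool, prefer_primary: bool = True) -> str:
    """Pick a node for a workload (hypothetical logic, not Jarvis's scheduler)."""
    if needs_gpu:
        # Heavy inference goes to the primary GPU node; overflow to the secondary.
        return "ai-max" if prefer_primary else "ai-mini-x1"
    # CPU-only work lands on the orchestration node in this sketch.
    return "jarvis-brain"

print(pick_node(needs_gpu=True))  # routes heavy inference to the primary GPU node
```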

Model serving

Jarvis serves models through two components that work together:

Ollama runs on your GPU nodes and manages the model lifecycle — downloading weights, loading models into VRAM, and handling inference requests. Each model runs as an isolated process on the node best suited to its size.

LiteLLM sits in front of Ollama as a unified API gateway. It normalizes requests across models and providers, handles load balancing between nodes, and exposes a single endpoint for agents and integrations to call.

You interact with models through LiteLLM — you don’t need to target individual Ollama instances directly.
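A minimal sketch of what a call looks like: the LiteLLM proxy speaks the OpenAI chat-completions format, so any OpenAI-style client can target it. The host, port, and model name below are placeholder assumptions for illustration, not values Jarvis ships with.

```python
import json

# Assumed gateway endpoint; substitute your LiteLLM host and port.
LITELLM_URL = "http://jarvis-brain:4000/v1/chat/completions"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat payload that LiteLLM routes to an Ollama node."""
    return {
        # You name the model; LiteLLM picks the node. You never target
        # an individual Ollama instance directly.
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_chat_request("llama3", "Summarize today's node health report.")
body = json.dumps(payload)  # POST this body to LITELLM_URL
```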

Browse models

See every model in your fleet and which nodes they run on.

Agents

The agent layer is where Jarvis takes action. Two multi-agent systems handle task delegation:

Paperclip is the primary orchestrator. It breaks down complex tasks into subtasks and delegates them to specialized roles — CEO (strategy), CTO (technical decisions), Engineer (execution), and Writer (output). Agents communicate through a shared message bus on the jarvis-brain node.

Hermes handles external communication and coordination — translating agent outputs into structured responses and routing them to the right destination (API call, n8n webhook, or direct reply).

Agents call models through LiteLLM, use tools via MCP, and read from and write to the memory layer to maintain context across sessions.
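The delegation pattern can be sketched as follows. The four role names come from the text above; the message format and fan-out logic are illustrative assumptions, not Paperclip's actual internals.

```python
# Illustrative fan-out: one subtask per specialized role, formatted as
# messages for a shared bus. The focus strings echo the roles in the text.
FOCUS = {
    "CEO": "strategy",
    "CTO": "technical decisions",
    "Engineer": "execution",
    "Writer": "output",
}

def delegate(task: str) -> list[dict]:
    """Break a task into per-role subtasks (hypothetical message shape)."""
    return [
        {"role": role, "task": task, "focus": focus}
        for role, focus in FOCUS.items()
    ]

messages = delegate("Ship the new monitoring dashboard")
```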

Paperclip

Multi-agent orchestrator with specialized roles.

Hermes

Communication and output routing agent.

Automation

n8n provides the workflow automation layer. You can trigger workflows from agents, webhooks, schedules, or infrastructure events. Workflows handle tasks that don’t need interactive reasoning — deployments, health checks, notifications, and data pipelines. Agents can invoke n8n workflows as tools via MCP, and n8n workflows can call back into the agent layer via the Jarvis API. This creates a closed loop between reactive automation and deliberate agent reasoning.
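Triggering a workflow from outside n8n is a plain HTTP POST to its webhook: n8n exposes production webhooks under `/webhook/<path>` once a workflow with a Webhook trigger is activated. The host, port, and webhook path below are assumptions for illustration.

```python
import json
import urllib.request

# Assumed n8n host; substitute your own.
N8N_BASE = "http://dell-micro:5678"

def build_trigger(path: str, payload: dict) -> urllib.request.Request:
    """Build a POST request for an n8n production webhook (hypothetical path)."""
    return urllib.request.Request(
        f"{N8N_BASE}/webhook/{path}",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Hypothetical workflow: a deploy health check.
req = build_trigger("deploy-check", {"service": "litellm", "action": "health"})
# urllib.request.urlopen(req) would fire the workflow.
```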

n8n workflows

Browse the 57+ workflows bundled with Jarvis.

Memory

Jarvis uses three complementary memory systems so agents retain context across conversations and workflows:

Qdrant is a vector database that stores semantic embeddings. When an agent needs to recall past conversations or search a document corpus, it queries Qdrant for the most relevant chunks.

Neo4j is a graph database that stores relationships between entities — people, tasks, systems, and decisions. Agents use Neo4j when reasoning requires understanding how things are connected, not just what they are.

Mem0 is a persistent memory layer that sits above both databases. It handles the decision of what to remember, when to retrieve it, and how to present it to the agent as context. You don’t need to manage embeddings or graph queries directly — Mem0 handles that automatically.
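Conceptually, the division of labor looks like this toy stand-in: keyword overlap in place of Qdrant's embedding search, a tuple list in place of Neo4j's graph, and a wrapper that decides what to store and recall, as Mem0 does. None of this is the real Mem0, Qdrant, or Neo4j API.

```python
class ToyMemory:
    """Toy illustration of the three-layer memory split described above."""

    def __init__(self):
        self.facts: list[str] = []                   # stands in for Qdrant vectors
        self.edges: list[tuple[str, str, str]] = []  # stands in for Neo4j relations

    def remember(self, fact: str) -> None:
        self.facts.append(fact)

    def relate(self, a: str, rel: str, b: str) -> None:
        self.edges.append((a, rel, b))

    def recall(self, query: str) -> list[str]:
        # Crude keyword overlap in place of embedding similarity search.
        terms = set(query.lower().split())
        return [f for f in self.facts if terms & set(f.lower().split())]

mem = ToyMemory()
mem.remember("ai-max serves the largest models")
mem.relate("ai-max", "RUNS", "Ollama")
hits = mem.recall("which node serves large models")
```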

Memory integrations

Configure Qdrant, Neo4j, and Mem0 for your environment.