All Jarvis APIs share a common base URL, authentication model, and error format. Whether you’re calling model inference, agent orchestration, or MCP tools, the same conventions apply.

Base URL

https://your-jarvis-host
Replace your-jarvis-host with the hostname or IP of your Jarvis node. All endpoints are served over HTTPS.
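One way to keep the host out of your code is to read it from the environment at startup. A minimal sketch, where JARVIS_HOST is an assumed variable name rather than a Jarvis convention:

```python
import os

# Assumed variable name: read the Jarvis hostname from the environment
# and fall back to the documentation placeholder.
JARVIS_HOST = os.environ.get("JARVIS_HOST", "your-jarvis-host")

# All endpoints are served over HTTPS.
BASE_URL = f"https://{JARVIS_HOST}"
```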

Authentication

Every request must include an API key in the Authorization header.
Authorization (string, required): your Jarvis API key, prefixed with Bearer. Example: Bearer jrv_yourkey123.
Keep your API key out of version control. Use environment variables or a secrets manager to inject it at runtime.
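For example, a small helper can read the key at runtime and build the header; JARVIS_API_KEY is an assumed variable name, not part of any Jarvis SDK:

```python
import os

def auth_headers() -> dict:
    """Build the required Authorization header from an environment
    variable so the key never lands in source control.
    JARVIS_API_KEY is an assumed name, not a Jarvis convention."""
    key = os.environ["JARVIS_API_KEY"]  # raises KeyError if unset
    return {"Authorization": f"Bearer {key}"}
```

Failing loudly when the variable is missing is deliberate: a request sent without the header will be rejected with a 401 anyway.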

OpenAI compatibility

The Jarvis model API is OpenAI-compatible via LiteLLM. Any client that works with the OpenAI API — including the official Python and Node.js SDKs, LangChain, and LlamaIndex — works with Jarvis. Point the client at your Jarvis host and supply your Jarvis API key.
curl https://your-jarvis-host/chat/completions \
  -H "Authorization: Bearer jrv_yourkey123" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ollama/llama3",
    "messages": [
      { "role": "user", "content": "Summarize today'\''s system alerts." }
    ]
  }'
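For clients without an SDK, the same call can be assembled with Python's standard library. A sketch using the placeholder host and key from the curl example above (the send itself is omitted so the snippet stays self-contained):

```python
import json
import urllib.request

# Same request as the curl example: an OpenAI-style chat completion.
payload = {
    "model": "ollama/llama3",
    "messages": [
        {"role": "user", "content": "Summarize today's system alerts."}
    ],
}

req = urllib.request.Request(
    "https://your-jarvis-host/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer jrv_yourkey123",
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(req) would send it; response.read() returns
# the JSON completion body.
```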

Rate limiting

Jarvis does not enforce hard rate limits by default, but available throughput depends on your hardware and how many models are loaded.
If you’re running batch workloads, use streaming responses ("stream": true) to avoid long blocking requests and reduce memory pressure on the inference nodes.
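With streaming enabled, chunks typically arrive as OpenAI-style server-sent events: lines prefixed with "data: " and a final "data: [DONE]" sentinel. That wire format is assumed here because LiteLLM mirrors the OpenAI API; a minimal parser sketch:

```python
import json

def parse_sse_chunks(lines):
    """Yield message deltas from OpenAI-style "data: ..." event lines.
    The [DONE] sentinel marks the end of the stream."""
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip keep-alives and blank separators
        body = line[len("data: "):]
        if body == "[DONE]":
            break
        chunk = json.loads(body)
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            yield delta

# Two synthetic chunks followed by the sentinel:
events = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print("".join(parse_sse_chunks(events)))  # prints "Hello"
```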
Best practices:
  • Reuse HTTP connections — avoid opening a new connection per request.
  • Use the model that fits your task. Smaller models like ollama/mistral respond faster for simple tasks.
  • Monitor GPU utilization via the monitoring guide to spot saturation early.
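The first practice, connection reuse, can be sketched with Python's stdlib http.client: one HTTPSConnection keeps its socket (and TLS session) open across calls instead of paying a new handshake per request. The host is the placeholder from the Base URL section:

```python
import http.client

# One connection, many requests: HTTPSConnection keeps the underlying
# socket open between request/response cycles.
conn = http.client.HTTPSConnection("your-jarvis-host")

# conn.request("POST", "/chat/completions", body=..., headers=...)
# followed by conn.getresponse() can now be repeated over the same
# socket; call conn.close() once the batch is done.
```

Higher-level clients get the same effect from a pooled session object (for example, requests.Session) rather than a fresh client per call.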

Error responses

All errors return a JSON body with a consistent structure.
{
  "error": {
    "code": "unauthorized",
    "message": "Invalid or missing API key.",
    "status": 401
  }
}
error (object): the top-level error container.
error.code (string): a machine-readable error identifier, e.g. unauthorized.
error.message (string): a human-readable description of what went wrong.
error.status (integer): the HTTP status code of the response.
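Because the envelope is consistent, clients can surface any failure the same way. An illustrative sketch (the helper name is ours, not part of any Jarvis SDK):

```python
import json

def raise_for_jarvis_error(body: str) -> None:
    """Parse the consistent error envelope and raise if one is present.
    Illustrative helper, not part of any Jarvis SDK."""
    error = json.loads(body).get("error")
    if error:
        raise RuntimeError(
            f'{error["status"]} {error["code"]}: {error["message"]}'
        )

# The 401 body shown above:
body = (
    '{"error": {"code": "unauthorized", '
    '"message": "Invalid or missing API key.", "status": 401}}'
)
try:
    raise_for_jarvis_error(body)
except RuntimeError as e:
    print(e)  # prints "401 unauthorized: Invalid or missing API key."
```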

Next steps