All Jarvis APIs share a common base URL, authentication model, and error format. Whether you’re calling model inference, agent orchestration, or MCP tools, the same conventions apply.

Base URL

https://your-jarvis-host
Replace your-jarvis-host with the hostname or IP of your Jarvis node. All endpoints are served over HTTPS.
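One way to keep the host out of your code is to read it from the environment at startup. A minimal sketch, where JARVIS_HOST is an assumed variable name rather than a Jarvis convention:

```python
import os

# Assumed variable name: read the Jarvis hostname from the environment
# and fall back to the documentation placeholder.
JARVIS_HOST = os.environ.get("JARVIS_HOST", "your-jarvis-host")

# All endpoints are served over HTTPS.
BASE_URL = f"https://{JARVIS_HOST}"
```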

Authentication

Every request must include an API key in the Authorization header.
Authorization (string, required): your Jarvis API key, prefixed with Bearer. Example: Bearer jrv_yourkey123.
Keep your API key out of version control. Use environment variables or a secrets manager to inject it at runtime.
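For example, a small helper can read the key at runtime and build the header; JARVIS_API_KEY is an assumed variable name, not part of any Jarvis SDK:

```python
import os

def auth_headers() -> dict:
    """Build the required Authorization header from an environment
    variable so the key never lands in source control.
    JARVIS_API_KEY is an assumed name, not a Jarvis convention."""
    key = os.environ["JARVIS_API_KEY"]  # raises KeyError if unset
    return {"Authorization": f"Bearer {key}"}
```

Failing loudly when the variable is missing is deliberate: a request sent without the header will be rejected with a 401 anyway.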

OpenAI compatibility

The Jarvis model API is OpenAI-compatible via LiteLLM. Any client that works with the OpenAI API — including the official Python and Node.js SDKs, LangChain, and LlamaIndex — works with Jarvis. Point the client at your Jarvis host and supply your Jarvis API key.
curl https://your-jarvis-host/chat/completions \
  -H "Authorization: Bearer jrv_yourkey123" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ollama/llama3",
    "messages": [
      { "role": "user", "content": "Summarize today'\''s system alerts." }
    ]
  }'
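For clients without an SDK, the same call can be assembled with Python's standard library. A sketch using the placeholder host and key from the curl example above (the send itself is omitted so the snippet stays self-contained):

```python
import json
import urllib.request

# Same request as the curl example: an OpenAI-style chat completion.
payload = {
    "model": "ollama/llama3",
    "messages": [
        {"role": "user", "content": "Summarize today's system alerts."}
    ],
}

req = urllib.request.Request(
    "https://your-jarvis-host/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer jrv_yourkey123",
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(req) would send it; response.read() returns
# the JSON completion body.
```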

Rate limiting

Jarvis does not enforce hard rate limits by default, but available throughput depends on your hardware and how many models are loaded.
If you’re running batch workloads, use streaming responses ("stream": true) to avoid long blocking requests and reduce memory pressure on the inference nodes.
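With streaming enabled, chunks typically arrive as OpenAI-style server-sent events: lines prefixed with "data: " and a final "data: [DONE]" sentinel. That wire format is assumed here because LiteLLM mirrors the OpenAI API; a minimal parser sketch:

```python
import json

def parse_sse_chunks(lines):
    """Yield message deltas from OpenAI-style "data: ..." event lines.
    The [DONE] sentinel marks the end of the stream."""
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip keep-alives and blank separators
        body = line[len("data: "):]
        if body == "[DONE]":
            break
        chunk = json.loads(body)
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            yield delta

# Two synthetic chunks followed by the sentinel:
events = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print("".join(parse_sse_chunks(events)))  # prints "Hello"
```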
Best practices:
  • Reuse HTTP connections — avoid opening a new connection per request.
  • Use the model that fits your task. Smaller models like ollama/mistral respond faster for simple tasks.
  • Monitor GPU utilization via the monitoring guide to spot saturation early.
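The first practice, connection reuse, can be sketched with Python's stdlib http.client: one HTTPSConnection keeps its socket (and TLS session) open across calls instead of paying a new handshake per request. The host is the placeholder from the Base URL section:

```python
import http.client

# One connection, many requests: HTTPSConnection keeps the underlying
# socket open between request/response cycles.
conn = http.client.HTTPSConnection("your-jarvis-host")

# conn.request("POST", "/chat/completions", body=..., headers=...)
# followed by conn.getresponse() can now be repeated over the same
# socket; call conn.close() once the batch is done.
```

Higher-level clients get the same effect from a pooled session object (for example, requests.Session) rather than a fresh client per call.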

Error responses

All errors return a JSON body with a consistent structure.
{
  "error": {
    "code": "unauthorized",
    "message": "Invalid or missing API key.",
    "status": 401
  }
}
error (object): the top-level error container.
error.code (string): a machine-readable error identifier, e.g. unauthorized.
error.message (string): a human-readable description of what went wrong.
error.status (integer): the HTTP status code of the response.
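Because the envelope is consistent, clients can surface any failure the same way. An illustrative sketch (the helper name is ours, not part of any Jarvis SDK):

```python
import json

def raise_for_jarvis_error(body: str) -> None:
    """Parse the consistent error envelope and raise if one is present.
    Illustrative helper, not part of any Jarvis SDK."""
    error = json.loads(body).get("error")
    if error:
        raise RuntimeError(
            f'{error["status"]} {error["code"]}: {error["message"]}'
        )

# The 401 body shown above:
body = (
    '{"error": {"code": "unauthorized", '
    '"message": "Invalid or missing API key.", "status": 401}}'
)
try:
    raise_for_jarvis_error(body)
except RuntimeError as e:
    print(e)  # prints "401 unauthorized: Invalid or missing API key."
```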

Next steps