Jarvis's API is OpenAI-compatible: to migrate an existing OpenAI client, point it at your instance's `base_url` and API key; no other code changes are needed.
All requests require an Authorization header. See the API overview for authentication details.
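As a quick sketch of attaching that header (the base URL `https://jarvis.example.com/v1` and the `Bearer` token scheme are assumptions here, not confirmed by this page; see the API overview for the real details):

```python
import urllib.request

BASE_URL = "https://jarvis.example.com/v1"  # hypothetical; use your instance's URL
API_KEY = "your-api-key"                    # issued per the API overview

# Every request carries the token in the Authorization header.
# The Bearer scheme is an assumption; check the API overview.
req = urllib.request.Request(
    f"{BASE_URL}/models",
    headers={"Authorization": f"Bearer {API_KEY}"},
)
# urllib.request.urlopen(req) would perform the call (omitted here).
```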
## Model naming
Jarvis runs models via Ollama, and model names follow the `provider/model` convention:
| Model name | Description |
|---|---|
| `ollama/llama3` | Meta Llama 3 (8B or 70B depending on your node) |
| `ollama/mistral` | Mistral 7B — fast, general-purpose |
| `ollama/deepseek-coder` | DeepSeek Coder — optimized for code tasks |
| `ollama/phi3` | Microsoft Phi-3 — efficient, low-memory |
| `ollama/gemma2` | Google Gemma 2 |
Call `GET /models` to see the exact list available on your instance.
## GET /models
List all models currently available for inference.

### Response

| Field | Description |
|---|---|
| `object` | Always `"list"`. |
| `data` | An array of model objects. |
### Example
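The original curl snippet did not survive extraction. As a hedged Python sketch (hypothetical base URL, bearer auth assumed, response shape assumed to be the OpenAI-style list object described above):

```python
import json
import urllib.request

BASE_URL = "https://jarvis.example.com/v1"  # hypothetical; use your instance's URL
API_KEY = "your-api-key"

def list_models() -> list[str]:
    """Return the model names available on this instance."""
    req = urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {API_KEY}"},  # Bearer scheme assumed
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Assumed OpenAI-style shape: {"object": "list", "data": [{"id": ...}, ...]}
    return [m["id"] for m in body["data"]]

# Parsing a sample payload of that assumed shape, without a live call:
sample = '{"object": "list", "data": [{"id": "ollama/llama3"}, {"id": "ollama/mistral"}]}'
ids = [m["id"] for m in json.loads(sample)["data"]]
```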
## POST /chat/completions
Send a multi-turn conversation to a model and receive a completion. This endpoint is fully OpenAI-compatible.

### Request

| Parameter | Description |
|---|---|
| `model` | The model to use (e.g., `ollama/llama3`). Use `GET /models` to see available options. |
| `messages` | An array of message objects representing the conversation history. |
| `stream` | If `true`, the response streams as server-sent events. Defaults to `false`. |
| `temperature` | Sampling temperature between 0 and 2. Higher values produce more varied output. Defaults to 1. |
| `max_tokens` | Maximum number of tokens to generate. Defaults to model-dependent limits. |
### Response

| Field | Description |
|---|---|
| `id` | Unique identifier for this completion. |
| `object` | Always `"chat.completion"`. |
| `model` | The model that generated the response. |
| `choices` | An array of completion choices (usually one). |
| `usage` | Token usage for the request. |
### Example
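The worked example was lost from this page. The following Python sketch builds a non-streaming request with the parameters described above (the base URL is hypothetical and bearer auth is assumed; parameter names follow the OpenAI schema, per the endpoint's stated compatibility):

```python
import json
import urllib.request

BASE_URL = "https://jarvis.example.com/v1"  # hypothetical; use your instance's URL
API_KEY = "your-api-key"

payload = {
    "model": "ollama/llama3",
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Explain server-sent events in one sentence."},
    ],
    "temperature": 0.7,   # 0-2; higher values produce more varied output
    "stream": False,      # set True to receive server-sent events
}

req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_KEY}",  # Bearer scheme assumed
        "Content-Type": "application/json",
    },
    method="POST",
)
# Sending (omitted here) would yield an OpenAI-style completion:
# with urllib.request.urlopen(req) as resp:
#     reply = json.load(resp)["choices"][0]["message"]["content"]
```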
## POST /completions
Generate a text completion from a raw prompt string (non-chat format).

### Request

| Parameter | Description |
|---|---|
| `model` | The model to use (e.g., `ollama/mistral`). |
| `prompt` | The input text to complete. |
| `max_tokens` | Maximum tokens to generate. |
| `temperature` | Sampling temperature. Defaults to 1. |
| `stream` | Stream the response as server-sent events. Defaults to `false`. |

### Response
| Field | Description |
|---|---|
| `id` | Unique identifier for this completion. |
| `object` | Always `"text_completion"`. |
| `choices` | An array of completion choices. |
### Example
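The original curl example is missing. A hedged Python sketch of the same request (hypothetical base URL, bearer auth assumed, field names per the parameters above):

```python
import json
import urllib.request

BASE_URL = "https://jarvis.example.com/v1"  # hypothetical; use your instance's URL
API_KEY = "your-api-key"

payload = {
    "model": "ollama/mistral",
    "prompt": "Write a haiku about autumn:",
    "max_tokens": 50,
    "temperature": 1,  # the documented default
}

req = urllib.request.Request(
    f"{BASE_URL}/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_KEY}",  # Bearer scheme assumed
        "Content-Type": "application/json",
    },
    method="POST",
)
# Sending (omitted here) would return the generated text in choices:
# with urllib.request.urlopen(req) as resp:
#     text = json.load(resp)["choices"][0]["text"]
```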
For most tasks, prefer `POST /chat/completions`. The chat format gives the model more context about conversation roles and produces better results with instruction-tuned models.