Chat Completions

POST https://api.studiolm.dev/v1/chat/completions

Quick example

import studiolm
client = studiolm.Client(api_key="sk-...")

response = client.chat.completions.create(
    model="gemma-3-12b-it-qat",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
)
print(response["choices"][0]["message"]["content"])

curl https://api.studiolm.dev/v1/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma-3-12b-it-qat",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'

Parameters

Parameter          Type         Default   Description
model              string       required  Model ID (see GET /v1/models)
messages           array        required  Conversation history
temperature        number       0.7       Randomness (0–2)
max_tokens         integer      1000      Max tokens to generate
stream             boolean      false     Stream tokens via SSE
top_p              number       1.0       Nucleus sampling
frequency_penalty  number       0.0       Penalise repeated tokens
presence_penalty   number       0.0       Penalise used tokens
web_search         string/bool  false     "auto", "force", "images", or false
response_format    string                 "json" to enable JSON mode
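
The constraints in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the studiolm SDK: `build_payload` is a hypothetical helper, and it only enforces the two limits the table states explicitly (the 0–2 temperature range and the allowed `web_search` values). Everything else is passed through unchanged.

```python
from typing import Any

# Allowed string modes for web_search, taken from the parameter table.
WEB_SEARCH_MODES = ("auto", "force", "images")


def build_payload(model: str, messages: list, **options: Any) -> dict:
    """Hypothetical helper: assemble a request body and reject values
    that the parameter table rules out."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages is required")

    temperature = options.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    web_search = options.get("web_search", False)
    if not (web_search is False or web_search in WEB_SEARCH_MODES):
        raise ValueError('web_search must be "auto", "force", "images", or False')

    # Options that were not supplied are left out so the server defaults apply.
    return {"model": model, "messages": messages, **options}


payload = build_payload(
    "gemma-3-12b-it-qat",
    [{"role": "user", "content": "What is the capital of France?"}],
    temperature=0.2,
    max_tokens=200,
)
print(sorted(payload))
```

Leaving unspecified options out of the body, rather than filling in the defaults client-side, keeps the request in step with whatever defaults the server applies.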

Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1677858242,
  "model": "gemma-3-12b-it-qat",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "Paris."},
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 18,
    "completion_tokens": 4,
    "total_tokens": 22
  }
}
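
As an illustration of reading this structure, the sketch below parses the sample body shown above with the standard `json` module and pulls out the reply text, the finish reason, and the token counts. `summarize` is a hypothetical helper name; the field names come straight from the sample response.

```python
import json

# The sample response body from this section, as it would arrive over HTTP.
raw = """
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1677858242,
  "model": "gemma-3-12b-it-qat",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "Paris."},
    "finish_reason": "stop"
  }],
  "usage": {"prompt_tokens": 18, "completion_tokens": 4, "total_tokens": 22}
}
"""


def summarize(response: dict) -> tuple:
    """Hypothetical helper: return (content, finish_reason, total_tokens)
    for the first choice in a chat completion response."""
    choice = response["choices"][0]
    return (
        choice["message"]["content"],
        choice["finish_reason"],
        response["usage"]["total_tokens"],
    )


content, finish_reason, total_tokens = summarize(json.loads(raw))
print(content, finish_reason, total_tokens)
```

In the sample, `total_tokens` equals `prompt_tokens` plus `completion_tokens` (18 + 4 = 22).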

Streaming

Set stream: true to receive tokens as Server-Sent Events.

for chunk in client.chat.completions.create(
    model="gemma-3-12b-it-qat",
    messages=[{"role": "user", "content": "Tell me a joke"}],
    stream=True,
):
    print(chunk["choices"][0].get("delta", {}).get("content", ""), end="", flush=True)

curl https://api.studiolm.dev/v1/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d '{"model":"gemma-3-12b-it-qat","messages":[{"role":"user","content":"Tell me a joke"}],"stream":true}'

Each SSE event carries one JSON chunk; the stream ends with a data: [DONE] line:

data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"Why"},"index":0}]}
data: [DONE]
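
When consuming the stream without the SDK (for example from the curl call above), the chunk format can be decoded by hand. This is a minimal sketch under the assumption that every event is a single `data:` line as shown; `iter_deltas` is a hypothetical helper, and the second sample chunk is invented for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text fragments from a sequence of SSE lines,
    stopping at the [DONE] sentinel."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and any other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# First line is the documented sample; the second chunk is hypothetical.
sample = [
    'data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"Why"},"index":0}]}',
    "",
    'data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":" did"},"index":0}]}',
    "",
    "data: [DONE]",
]

text = "".join(iter_deltas(sample))
print(text)
```

Treating a missing `delta` or `content` as empty mirrors the `.get(...)` chain in the SDK example above.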

Web search

response = client.chat.completions.create(
    model="gemma-3-27b-it-qat",
    messages=[{"role": "user", "content": "Latest AI news?"}],
    web_search="auto",
)

Mode      Behaviour
"auto"    Searches when the model decides it's needed
"force"   Always performs a web search
"images"  Image search; returns inline image results
false     No web search (default)

JSON mode

response = client.chat.completions.create(
    model="gemma-3-12b-it-qat",
    messages=[{"role": "user", "content": "List 3 languages as JSON."}],
    response_format="json",
)

See JSON Response Format for schema support.
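
A JSON-mode reply presumably still arrives as a string in the message content, so it needs to be decoded before use. The sketch below assumes exactly that; `parse_json_content` is a hypothetical helper, and the mock response is invented for illustration rather than captured from the API.

```python
import json
from typing import Any


def parse_json_content(response: dict) -> Any:
    """Hypothetical helper: decode the first choice's content as JSON,
    raising ValueError if the model returned something unparseable."""
    content = response["choices"][0]["message"]["content"]
    try:
        return json.loads(content)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model returned invalid JSON: {exc}") from exc


# Invented response in the documented shape, standing in for a real API call.
mock_response = {
    "choices": [{
        "index": 0,
        "message": {
            "role": "assistant",
            "content": '{"languages": ["Python", "Go", "Rust"]}',
        },
        "finish_reason": "stop",
    }]
}

data = parse_json_content(mock_response)
print(data["languages"])
```

Decoding defensively is a design choice: it surfaces a clear error if a reply is cut off (for instance by `max_tokens`) before the JSON is complete.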