Chat Completions

POST https://api.studiolm.dev/v1/chat/completions

Quick example

Python SDK:

import studiolm

# Replace "sk-..." with your own API key.
client = studiolm.Client(api_key="sk-...")

response = client.chat.completions.create(
    model="gemma-3-12b-it-qat",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
)
# The reply text is in the first choice's message.
print(response["choices"][0]["message"]["content"])

cURL:

curl https://api.studiolm.dev/v1/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma-3-12b-it-qat",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
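
Where the SDK is not available, the cURL call above can be mirrored with Python's standard library. The following is a minimal sketch, not part of the SDK: `build_chat_request` is a hypothetical helper name, and it assumes only the endpoint URL and the two headers shown in the cURL example. It builds the request without sending it.

```python
import json
import urllib.request

API_URL = "https://api.studiolm.dev/v1/chat/completions"


def build_chat_request(api_key, model, messages):
    """Build (but do not send) a POST request matching the cURL example.

    Hypothetical helper for illustration; not part of the studiolm SDK.
    """
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request(
    "sk-...",
    "gemma-3-12b-it-qat",
    [{"role": "user", "content": "What is the capital of France?"}],
)
```

Passing `request` to `urllib.request.urlopen` would send it; the response body can then be decoded with `json.loads`.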

Parameters

Parameter	Type	Default	Description
`model`	string	required	Model ID (see `GET /v1/models`)
`messages`	array	required	Conversation history
`temperature`	number	`0.7`	Sampling temperature (0–2); higher values give more random output
`max_tokens`	integer	`1000`	Maximum number of tokens to generate
`stream`	boolean	`false`	Stream tokens as Server-Sent Events (SSE)
`top_p`	number	`1.0`	Nucleus sampling
`frequency_penalty`	number	`0.0`	Penalise tokens by how often they have already appeared
`presence_penalty`	number	`0.0`	Penalise tokens that have already appeared at least once
`web_search`	string/bool	`false`	`"auto"`, `"force"`, `"images"`, or `false`
`response_format`	string	—	`"json"` to enable JSON mode
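
The constraints in the table can be checked on the client before a request is sent. The sketch below is illustrative only: `build_payload` and `WEB_SEARCH_MODES` are hypothetical names, and the checks cover just the two constraints the table states explicitly (the 0–2 temperature range and the allowed `web_search` values). Whether the server rejects other values in the same way is an assumption.

```python
# Allowed string modes for `web_search`, taken from the parameter table.
WEB_SEARCH_MODES = ("auto", "force", "images")


def build_payload(model, messages, **options):
    """Assemble a request body, checking constraints listed in the table.

    Hypothetical helper for illustration; not part of the studiolm SDK.
    """
    temperature = options.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    web_search = options.get("web_search", False)
    if web_search is not False and web_search not in WEB_SEARCH_MODES:
        raise ValueError(f"unsupported web_search mode: {web_search!r}")

    # Options that are not passed are left out, so the server defaults apply.
    return {"model": model, "messages": messages, **options}


payload = build_payload(
    "gemma-3-12b-it-qat",
    [{"role": "user", "content": "Hello"}],
    temperature=0.2,
    web_search="auto",
)
```

Leaving unset options out of the payload, rather than filling in the documented defaults, keeps the client correct if the defaults change.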

Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1677858242,
  "model": "gemma-3-12b-it-qat",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "Paris."},
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 18,
    "completion_tokens": 4,
    "total_tokens": 22
  }
}
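
As a small illustration of reading this structure, the sketch below parses the sample response above and pulls out the reply text, the finish reason, and the token counts. The helper name `first_message_text` is hypothetical. The check that `total_tokens` equals the sum of the other two counts holds for this sample; that it holds for every response is an assumption.

```python
import json

# The sample response from above, as raw JSON text.
raw = """
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1677858242,
  "model": "gemma-3-12b-it-qat",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "Paris."},
    "finish_reason": "stop"
  }],
  "usage": {"prompt_tokens": 18, "completion_tokens": 4, "total_tokens": 22}
}
"""


def first_message_text(response):
    """Return the assistant's text from the first choice."""
    return response["choices"][0]["message"]["content"]


response = json.loads(raw)
text = first_message_text(response)
finish_reason = response["choices"][0]["finish_reason"]
usage = response["usage"]
# In this sample, prompt and completion tokens add up to the total.
tokens_add_up = (
    usage["prompt_tokens"] + usage["completion_tokens"] == usage["total_tokens"]
)
```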

Streaming

Set `stream` to `true` (`stream=True` in the Python SDK) to receive tokens incrementally as Server-Sent Events (SSE).

Python SDK:

# Reuses the `client` created in the quick example.
for chunk in client.chat.completions.create(
    model="gemma-3-12b-it-qat",
    messages=[{"role": "user", "content": "Tell me a joke"}],
    stream=True,
):
    # Print each content fragment as it arrives; chunks without content print nothing.
    print(chunk["choices"][0].get("delta", {}).get("content", ""), end="", flush=True)

cURL:

curl https://api.studiolm.dev/v1/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d '{"model":"gemma-3-12b-it-qat","messages":[{"role":"user","content":"Tell me a joke"}],"stream":true}'

Each SSE event carries one JSON chunk on a `data:` line, and the stream ends with a `data: [DONE]` sentinel:

data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"Why"},"index":0}]}
data: [DONE]
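
When consuming the raw stream without the SDK, the `data:` lines have to be decoded by hand. The sketch below shows one way to do that for the chunk shape above: it skips anything that is not a `data:` line, stops at `[DONE]`, and joins the `delta.content` fragments. `iter_sse_content` is a hypothetical helper, and the second chunk in `sample` is made-up data added so that there is something to concatenate.

```python
import json


def iter_sse_content(lines):
    """Yield content fragments from `data:` lines until the [DONE] sentinel.

    Hypothetical helper for illustration; not part of the studiolm SDK.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end of stream
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            fragment = choice.get("delta", {}).get("content")
            if fragment:
                yield fragment


# First chunk is from the example above; the second is invented for illustration.
sample = [
    'data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"Why"},"index":0}]}',
    "",
    'data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":" did"},"index":0}]}',
    "",
    "data: [DONE]",
]
text = "".join(iter_sse_content(sample))
```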

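Web search

Set `web_search` to control whether a request performs a web search. The supported modes are listed in the table after the example.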

response = client.chat.completions.create(
    model="gemma-3-27b-it-qat",
    messages=[{"role": "user", "content": "Latest AI news?"}],
    web_search="auto",
)

Mode	Behaviour
`"auto"`	Searches when the model decides it's needed
`"force"`	Always performs a web search
`"images"`	Image search; returns inline image results
`false`	No web search (default)

JSON mode

Set `response_format` to `"json"` to enable JSON mode.

response = client.chat.completions.create(
    model="gemma-3-12b-it-qat",
    messages=[{"role": "user", "content": "List 3 languages as JSON."}],
    response_format="json",
)

See JSON Response Format for schema support.
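
With JSON mode enabled, the message content is still delivered as a string, so the caller decodes it. The sketch below assumes the response has the same shape as the sample in the Response section; `parse_json_content` is a hypothetical helper and `sample_response` is invented data, not real model output.

```python
import json


def parse_json_content(response):
    """Decode the first choice's content as JSON, or return None if invalid.

    Hypothetical helper for illustration; not part of the studiolm SDK.
    """
    content = response["choices"][0]["message"]["content"]
    try:
        return json.loads(content)
    except json.JSONDecodeError:
        return None


# Invented response for illustration; real output will differ.
sample_response = {
    "choices": [{
        "index": 0,
        "message": {
            "role": "assistant",
            "content": '{"languages": ["Python", "Go", "Rust"]}',
        },
        "finish_reason": "stop",
    }]
}
parsed = parse_json_content(sample_response)
```

Returning `None` on a decode failure lets the caller decide whether to retry or fall back, instead of handling an exception at every call site.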