JSON Response Format
Force the model to respond with valid JSON by setting response_format: "json".
Modes
| Mode | How to use |
|---|---|
| Simple object | Ask for JSON in the prompt, set response_format: "json" |
| Array / multiple items | Ask for a JSON array in the prompt |
| Custom schema | Include your schema in the system or user message |
Simple JSON
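A minimal sketch of the simple case, assuming the same `client` object used in the other examples on this page. The helper name `ask_for_json` is hypothetical, and the location of the reply text (`response["choices"][0]["message"]["content"]`) is an assumption that mirrors the dict-style access in the streaming example; adjust it if your client returns a different shape.

```python
import json


def ask_for_json(client, prompt, model="gemma-3-12b-it-qat"):
    """Send one user prompt in JSON mode and return the parsed result."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        response_format="json",
    )
    # Assumption: the reply text lives at choices[0].message.content.
    text = response["choices"][0]["message"]["content"]
    return json.loads(text)
```

A call might look like `data = ask_for_json(client, "Give me a JSON object with a name and an age.")`, after which `data` is an ordinary Python dict.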
Custom schema
Include the schema in your prompt and the model will conform to it:
response = client.chat.completions.create(
    model="gemma-3-12b-it-qat",
    messages=[{
        "role": "user",
        "content": (
            "List 3 programming languages. "
            'Respond as JSON matching this schema: '
            '{"languages": [{"name": string, "year": number}]}'
        ),
    }],
    response_format="json",
)
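Because the schema is conveyed through the prompt, it can be worth checking the parsed reply before relying on it. The sketch below validates the `languages` shape from the example above using only the standard library. `parse_languages` is a hypothetical helper, not part of any client API, and it takes the reply text as a plain string.

```python
import json


def parse_languages(text):
    """Parse a JSON-mode reply and check it against the schema in the prompt."""
    data = json.loads(text)
    languages = data.get("languages") if isinstance(data, dict) else None
    if not isinstance(languages, list):
        raise ValueError('expected an object with a "languages" array')
    for item in languages:
        if not isinstance(item, dict):
            raise ValueError("each language must be an object")
        if not isinstance(item.get("name"), str):
            raise ValueError('"name" must be a string')
        year = item.get("year")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(year, bool) or not isinstance(year, (int, float)):
            raise ValueError('"year" must be a number')
    return languages
```

Raising `ValueError` keeps malformed replies from propagating; a caller could catch it and retry the request with a more explicit prompt.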
Streaming with JSON mode
Buffer the full response before parsing — partial JSON is not valid:
import json

chunks = []
for chunk in client.chat.completions.create(
    model="gemma-3-12b-it-qat",
    messages=[{"role": "user", "content": "Give me a JSON object with a joke."}],
    stream=True,
    response_format="json",
):
    # Collect each text delta; treat a missing or None delta as an empty string.
    chunks.append(chunk["choices"][0].get("delta", {}).get("content") or "")

# Parse only once the stream has finished and the JSON is complete.
data = json.loads("".join(chunks))
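The buffering step can also be wrapped in a small reusable function that reports a clear error if the stream ends before the JSON is complete. This is a sketch under assumptions: `collect_json_stream` is a hypothetical name, and chunks are assumed to be dicts shaped like those in the loop above, where the delta or its `content` may be absent or `None`.

```python
import json


def collect_json_stream(chunks):
    """Join streamed deltas and parse them once the stream has ended."""
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0].get("delta", {})
        # Some chunks may carry no text (assumed: missing or None content).
        parts.append(delta.get("content") or "")
    text = "".join(parts)
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        # A stream that was cut short leaves incomplete JSON behind.
        raise ValueError(f"stream did not contain complete JSON: {text!r}") from exc
```

Any iterable of chunks works, so the return value of a `stream=True` call can be passed in directly.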
Tip
Always tell the model in your prompt what JSON shape you expect — the instruction in the prompt is what guides the structure.