Working with Images (Multimodal)¶
Send images alongside text in a chat completion request. The model will analyse the image and respond to your question.
Example¶
import base64

import studiolm

client = studiolm.Client(api_key="sk-...")

# Read the image and encode it as base64 text.
with open("photo.jpg", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()

# Send the question and the image together in a single user message.
response = client.chat.completions.create(
    model="gemma-3-12b-it-qat",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
        ],
    }],
)
print(response["choices"][0]["message"]["content"])
# The same request with curl. -w0 (GNU coreutils) disables line wrapping in the base64 output.
B64=$(base64 -w0 photo.jpg)
curl https://api.studiolm.dev/v1/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"gemma-3-12b-it-qat\",
    \"messages\": [{
      \"role\": \"user\",
      \"content\": [
        {\"type\": \"text\", \"text\": \"What is in this image?\"},
        {\"type\": \"image_url\", \"image_url\": {\"url\": \"data:image/jpeg;base64,${B64}\"}}
      ]
    }]
  }"
Notes¶
- Images are passed as data:image/jpeg;base64,... or data:image/png;base64,... URLs
- Recommended max dimension: 1024 px (larger images are automatically resized)
- Each image counts as approximately 85 tokens
- JPEG and PNG are supported
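
The notes above can be folded into a few small helpers. This is a minimal sketch using only the Python standard library. The function names (image_to_data_url, build_user_message, estimate_image_tokens) are hypothetical and not part of studiolm. It also assumes that one message may carry several image parts, which this page does not state, and it treats the 85-token figure as a rough estimate only.

```python
import base64
import mimetypes
from pathlib import Path

# Formats listed in the notes above.
SUPPORTED_MIME_TYPES = {"image/jpeg", "image/png"}
# Approximate per-image cost quoted in the notes above.
APPROX_TOKENS_PER_IMAGE = 85


def image_to_data_url(path):
    """Return a base64 data URL for a JPEG or PNG file (hypothetical helper)."""
    # The MIME type is guessed from the file extension, not the file contents.
    mime_type, _ = mimetypes.guess_type(str(path))
    if mime_type not in SUPPORTED_MIME_TYPES:
        raise ValueError(f"Unsupported image type for {path}: {mime_type}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_user_message(question, image_paths):
    """Build one user message: a text part followed by one part per image.

    Assumption: several images per message are accepted by the API.
    """
    content = [{"type": "text", "text": question}]
    for path in image_paths:
        content.append(
            {"type": "image_url", "image_url": {"url": image_to_data_url(path)}}
        )
    return {"role": "user", "content": content}


def estimate_image_tokens(image_count):
    """Rough token estimate for the images alone, excluding any text."""
    return image_count * APPROX_TOKENS_PER_IMAGE
```

The dictionary returned by build_user_message has the same shape as the single entry in the messages list of the example request, so it could be passed as messages=[build_user_message("What is in this image?", ["photo.jpg"])].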
Warning
Make sure your selected model supports image input. Check the model list if you are unsure.