Working with Images (Multimodal)

Send images alongside text in a chat completion request by adding an image_url part to the message content. The model analyses the image and responds to your question.

Example

Python SDK

import base64

import studiolm

client = studiolm.Client(api_key="sk-...")

# Read the image and base64-encode it so it can be embedded in a data URL.
with open("photo.jpg", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gemma-3-12b-it-qat",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
        ],
    }],
)
print(response["choices"][0]["message"]["content"])

cURL

# -w0 disables line wrapping (GNU base64). On macOS, use: base64 -i photo.jpg
B64=$(base64 -w0 photo.jpg)

curl https://api.studiolm.dev/v1/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"gemma-3-12b-it-qat\",
    \"messages\": [{
      \"role\": \"user\",
      \"content\": [
        {\"type\": \"text\", \"text\": \"What is in this image?\"},
        {\"type\": \"image_url\", \"image_url\": {\"url\": \"data:image/jpeg;base64,${B64}\"}}
      ]
    }]
  }"
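Both examples build the same message structure by hand. As a minimal sketch, assuming only the message shape shown above, a small helper can assemble it. The names `image_data_url` and `image_message` are illustrative and not part of the SDK. The helper picks the MIME type from the file's leading bytes rather than its extension, so an unsupported or mislabelled file is rejected before any request is sent.

```python
import base64

# Leading "magic" bytes of the two formats this page lists as supported.
_SIGNATURES = {
    b"\xff\xd8\xff": "image/jpeg",
    b"\x89PNG\r\n\x1a\n": "image/png",
}


def image_data_url(data: bytes) -> str:
    """Return a base64 data URL for raw JPEG or PNG bytes."""
    for signature, mime in _SIGNATURES.items():
        if data.startswith(signature):
            encoded = base64.b64encode(data).decode("ascii")
            return f"data:{mime};base64,{encoded}"
    raise ValueError("unsupported image format: expected JPEG or PNG")


def image_message(prompt: str, data: bytes) -> dict:
    """Build a user message with a text part followed by one image part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_data_url(data)}},
        ],
    }
```

The returned dictionary has the same shape as the single entry in the `messages` list of the Python SDK example, so it could be passed in its place.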

Notes

Images are passed as base64 data URLs: data:image/jpeg;base64,... or data:image/png;base64,...
Recommended max dimension: 1024 px (larger images are automatically resized)
Each image counts as approximately 85 tokens
JPEG and PNG are supported
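
To make the size and token notes concrete, here is a minimal sketch of the arithmetic. It assumes that "max dimension" means the longer side and that resizing preserves the aspect ratio. This page does not describe the server's actual resizing algorithm, and `fit_within` and `estimate_image_tokens` are hypothetical names.

```python
MAX_DIMENSION = 1024    # recommended max dimension from the notes above
TOKENS_PER_IMAGE = 85   # approximate per-image cost from the notes above


def fit_within(width: int, height: int, limit: int = MAX_DIMENSION) -> tuple[int, int]:
    """Scale (width, height) so the longer side is at most `limit`.

    Assumption: aspect ratio is preserved and smaller images are left alone.
    """
    if width <= 0 or height <= 0:
        raise ValueError("dimensions must be positive")
    longest = max(width, height)
    if longest <= limit:
        return width, height
    scale = limit / longest
    # Round to whole pixels, never collapsing a side to zero.
    return max(1, round(width * scale)), max(1, round(height * scale))


def estimate_image_tokens(image_count: int) -> int:
    """Rough token estimate for a given number of images."""
    return image_count * TOKENS_PER_IMAGE


print(fit_within(4032, 3024))    # (1024, 768)
print(estimate_image_tokens(3))  # 255
```

Downscaling an image to these dimensions locally before encoding it would reduce the upload size, if the assumptions above hold.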

Warning

Make sure your selected model supports image input. Check the model list if you are unsure.