Working with Images (Multimodal)

To send an image alongside text, include it as an image_url content part in a chat completion request. The model analyses the image and answers your question about it.

Example

import base64

import studiolm

client = studiolm.Client(api_key="sk-...")

with open("photo.jpg", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gemma-3-12b-it-qat",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
        ],
    }],
)
print(response["choices"][0]["message"]["content"])

# Equivalent request with curl. -w0 disables line wrapping in GNU base64.
B64=$(base64 -w0 photo.jpg)

curl https://api.studiolm.dev/v1/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"gemma-3-12b-it-qat\",
    \"messages\": [{
      \"role\": \"user\",
      \"content\": [
        {\"type\": \"text\", \"text\": \"What is in this image?\"},
        {\"type\": \"image_url\", \"image_url\": {\"url\": \"data:image/jpeg;base64,${B64}\"}}
      ]
    }]
  }"

Notes

  • Pass each image as a base64 data URL: data:image/jpeg;base64,... or data:image/png;base64,...
  • Recommended maximum dimension: 1024 px (larger images are resized automatically)
  • Each image counts as approximately 85 tokens
  • JPEG and PNG are supported
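
The notes above can be sketched as a few small helpers that pick the right data URL prefix, reject unsupported formats before a request is sent, and give a rough token estimate. This is a minimal sketch using only the Python standard library. The function names (detect_mime, to_data_url, build_user_message, estimate_image_tokens) are hypothetical and not part of studiolm, and accepting several images in one message is an assumption rather than documented behaviour.

```python
import base64
from pathlib import Path

# Magic-byte prefixes for the two formats the notes list as supported.
_SIGNATURES = {
    b"\xff\xd8\xff": "image/jpeg",
    b"\x89PNG\r\n\x1a\n": "image/png",
}

# Approximate per-image cost, taken from the notes above.
APPROX_TOKENS_PER_IMAGE = 85


def detect_mime(data: bytes) -> str:
    """Return the MIME type for JPEG or PNG bytes, or raise ValueError."""
    for signature, mime in _SIGNATURES.items():
        if data.startswith(signature):
            return mime
    raise ValueError("unsupported image format: expected JPEG or PNG")


def to_data_url(data: bytes) -> str:
    """Encode raw image bytes as a base64 data URL."""
    mime = detect_mime(data)
    b64 = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{b64}"


def build_user_message(text: str, image_paths: list[str]) -> dict:
    """Build one user message: a text part followed by image parts.

    Assumption: the API accepts more than one image_url part per message.
    """
    content = [{"type": "text", "text": text}]
    for path in image_paths:
        url = to_data_url(Path(path).read_bytes())
        content.append({"type": "image_url", "image_url": {"url": url}})
    return {"role": "user", "content": content}


def estimate_image_tokens(message: dict) -> int:
    """Rough token cost of the image parts only, at about 85 tokens each."""
    images = [p for p in message["content"] if p["type"] == "image_url"]
    return len(images) * APPROX_TOKENS_PER_IMAGE
```

The dictionary returned by build_user_message("What is in this image?", ["photo.jpg"]) has the same shape as the message in the example and can be passed in the messages list. Detecting the format from the file's leading bytes rather than its extension avoids sending a mislabelled file.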

Warning

Make sure your selected model supports image input. Check the model list if you are unsure.