One of the most effective ways to give OpenCode useful context is to paste in an image. A picture really is worth a thousand words — instead of describing a design you like or a bug you’re seeing, you can just show it.
This only works if the model you’re using has vision capability. Vision models can look at an image and understand what’s in it: they can describe photographs, extract text from screenshots, and interpret UI details like layout, spacing, typography, and color.
How to add an image
There are two ways to add an image to your OpenCode conversation:
- Drag and drop — drag an image file from your desktop or file manager directly into the OpenCode window
- Paste — copy an image to your clipboard and press Cmd+V (Mac) or Ctrl+V (Windows/Linux)
That’s it. The image is added to your message just like text. You can combine an image with a prompt in the same message.
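Under the hood, vision-capable APIs receive the pasted image as base64-encoded data alongside your text. As a rough illustration of that format, here is a minimal sketch of an Anthropic-style multimodal message; the `build_image_message` helper is hypothetical, and OpenCode builds the equivalent payload for you when you paste:

```python
import base64

def build_image_message(image_bytes: bytes, media_type: str, prompt: str) -> dict:
    """Build an Anthropic Messages API-style user message that pairs
    an image (sent as base64) with a text prompt. Illustrative only --
    OpenCode constructs something equivalent when you paste an image."""
    return {
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": media_type,  # e.g. "image/png"
                    "data": base64.b64encode(image_bytes).decode("ascii"),
                },
            },
            {"type": "text", "text": prompt},
        ],
    }

# Pair a (dummy) PNG with a prompt, just as pasting does in the TUI.
msg = build_image_message(b"\x89PNG...", "image/png", "What do you see in this image?")
print(msg["content"][0]["type"], msg["content"][1]["text"])
```

The image and the prompt travel as sibling content blocks in a single message, which is why combining them in one OpenCode message works naturally.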
Which models support vision
Not all models can interpret images. Here’s what’s confirmed:
| Model family | Provider | Vision support |
|---|---|---|
| Claude (Haiku, Sonnet, Opus — any version) | Anthropic / Zen | Yes |
| GPT-4, GPT-4.1, GPT-5 family | OpenAI / Zen | Yes |
| Gemini 3 Flash, Gemini 3.1 Pro | Google / Zen | Yes |
| Kimi K2.5 | OpenCode Go | Yes |
| Big Pickle, MiMo, Nemotron, MiniMax Free | Zen (free tier) | Not confirmed |
If you’re on a free Zen model and the model doesn’t acknowledge your image or seems to ignore it, that’s expected — switch to a confirmed vision model. Kimi K2.5 via OpenCode Go ($10/month) is the most affordable confirmed option. Any Claude or GPT-4+ model from your own Anthropic or OpenAI key will also work.
What vision is useful for
Here are some concrete ways images help:
Replicate a design — paste a screenshot of a UI you admire and ask OpenCode to recreate it. The model can read the layout, colors, spacing, and typography from the image.
Debug a visual bug — paste a screenshot of something that looks wrong. The model can see what you’re seeing and help diagnose it without you having to describe every detail.
Implement a mockup — paste a wireframe or design file export and ask OpenCode to build it. This is often faster and more accurate than trying to describe the design in words.
Extract text from an image — paste a screenshot containing text (a menu, a dialog, an error message) and the model can read it for you.
Try it
Make sure you’re using a vision-capable model, then try one of these:
- Take a screenshot of any webpage or app and ask: What do you see in this image?
- Paste a screenshot of a UI and ask: How would you describe the design of this interface?
If the model responds with details from the image, vision is working.