Skip to main content
Use this tutorial when you want reliable captions, alt text, or lightweight image metadata.

Best fit

  • accessibility alt text
  • product image descriptions
  • editorial preview text
  • image metadata enrichment
  • vision-capable model
  • structured outputs if your app expects typed fields like alt_text
  • batch when captioning large image collections offline

Workflow

  1. send the image as a data URI
  2. ask for a concise, objective caption
  3. switch to a JSON schema if your app needs structured fields
  4. batch the workflow when the volume grows