Tools

Image Generation

Agents can generate images with DALL-E 3. The image generation tool accepts a natural-language prompt and returns either a temporary URL or base64-encoded image data.

Using image generation

Agents invoke the generate_image tool directly in their tool loop. The tool takes a required prompt plus optional size, quality, and style parameters. By default it returns a temporary URL; with persist enabled it returns a gateway-hosted URL plus a base64 payload.

```json
{
  "tool": "generate_image",
  "params": {
    "prompt": "A minimalist architectural diagram of a microservices system with pastel colors",
    "size": "1792x1024",
    "quality": "hd",
    "style": "natural"
  }
}
```

Supported parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | Yes | | Natural language description of the image to generate (max 4000 characters) |
| size | string | No | 1024x1024 | Output dimensions: 1024x1024, 1792x1024, or 1024x1792 |
| quality | string | No | standard | Render quality: standard or hd |
| style | string | No | vivid | Visual style: vivid (hyper-real) or natural (subdued, realistic) |
| n | number | No | 1 | Number of images to generate (DALL-E 3 supports 1 only) |
| persist | boolean | No | false | Store the image on the gateway and return base64-encoded image data with a gateway URL, rather than only a temporary URL |
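
As a rough illustration of the table above, the following sketch assembles a generate_image call and rejects values the table does not list. The helper name build_generate_image_call and the constants are hypothetical and not part of the gateway. Whether the gateway performs the same checks itself is not stated here, so treat this as optional client-side validation.

```python
from typing import Any

# Allowed values and defaults transcribed from the parameter table.
# The helper and constant names are illustrative assumptions.
ALLOWED_SIZES = {"1024x1024", "1792x1024", "1024x1792"}
ALLOWED_QUALITIES = {"standard", "hd"}
ALLOWED_STYLES = {"vivid", "natural"}
MAX_PROMPT_CHARS = 4000
DEFAULTS = {
    "size": "1024x1024",
    "quality": "standard",
    "style": "vivid",
    "n": 1,
    "persist": False,
}


def build_generate_image_call(prompt: str, **options: Any) -> dict:
    """Return a generate_image tool call, raising ValueError on invalid input."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt is required and must be a non-empty string")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")

    unknown = set(options) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    # Fill in documented defaults, then check each value against the table.
    params = {**DEFAULTS, **options}
    if params["size"] not in ALLOWED_SIZES:
        raise ValueError(f"unsupported size: {params['size']}")
    if params["quality"] not in ALLOWED_QUALITIES:
        raise ValueError(f"unsupported quality: {params['quality']}")
    if params["style"] not in ALLOWED_STYLES:
        raise ValueError(f"unsupported style: {params['style']}")
    if type(params["n"]) is not int or params["n"] != 1:
        raise ValueError("DALL-E 3 supports n=1 only")
    if not isinstance(params["persist"], bool):
        raise ValueError("persist must be a boolean")

    return {"tool": "generate_image", "params": {"prompt": prompt, **params}}
```

Calling build_generate_image_call("A lighthouse at dusk", quality="hd") would produce a payload with the same shape as the JSON example earlier in this section, with the remaining parameters set to their documented defaults.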

Configuration

Image generation requires an OpenAI API key with DALL-E 3 access.

```yaml
tools:
  imageGeneration:
    enabled: true
```

```bash
export OPENAI_API_KEY=sk-...
```
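
A deployment script might want to fail fast when the key is missing. The sketch below is one way to do that; the function name check_openai_key is hypothetical, and it only checks that OPENAI_API_KEY is set and non-blank. It cannot confirm that the key actually has DALL-E 3 access.

```python
import os
from typing import Mapping, Optional


def check_openai_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the configured key, or raise if OPENAI_API_KEY is missing or blank."""
    # Default to the process environment; a mapping can be passed in for testing.
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; image generation cannot run")
    return key
```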

Returned output

By default the tool returns a temporary OpenAI-hosted URL valid for one hour. Set persist: true to have the gateway store the image and serve it from its own CDN; the response then contains a gateway URL that does not expire, together with the base64-encoded image data.

| Mode | Response fields | Expiry |
| --- | --- | --- |
| Default (persist: false) | url (OpenAI CDN link) | 1 hour |
| Persisted (persist: true) | url (Gateway CDN link), base64 (raw data) | No expiry |

HD images cost significantly more than standard quality. Use quality: "standard" for drafts and iteration, reserving quality: "hd" for final outputs where detail matters.
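
To make the two response modes concrete, here is a minimal sketch of client-side handling. It assumes the tool result arrives as a mapping with the url and base64 fields named in the table above; the exact response envelope and the image file format are not specified in this section, and save_if_persisted is a hypothetical helper.

```python
import base64
from pathlib import Path
from typing import Mapping, Optional


def save_if_persisted(response: Mapping[str, str], dest: Path) -> Optional[Path]:
    """Write the image to dest when base64 data is present; otherwise return None."""
    encoded = response.get("base64")
    if encoded is None:
        # Default mode: only a temporary URL is available, so the caller
        # needs to fetch response["url"] before it expires.
        return None
    # Persisted mode: decode the payload and store it locally.
    dest.write_bytes(base64.b64decode(encoded))
    return dest
```

A None result signals that only the one-hour URL was returned, so any download should happen promptly.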