How to Use MiniMax Image-01 for Text-to-Image and Image-to-Image Generation

Purpose

This post shows how to use MiniMax’s image-01 model for text-to-image generation, image-to-image transformation, and aspect ratio control.

Problem

I tried generating images with MiniMax’s older models and got this error:

Terminal window
Error: Model 'M2' has been deprecated. Please use 'M2.7' or 'image-01' instead.

I was using code from an old tutorial:

deprecated-code.py
# This code no longer works
result = minimax.image.generate(
    model="M2",  # Deprecated!
    prompt="A sunset over mountains",
)

MiniMax deprecated M2, M2.1, M2.5, and VL-01 models in the 2026.3.28 release. I needed to switch to image-01.

Environment

  • OpenClaw 2026.3.28 or later
  • MiniMax API key configured
  • Python 3.10+ or Node.js 18+

Solution

MiniMax image-01 offers two generation modes plus aspect ratio control:

  1. Text-to-Image: Describe what you want
  2. Image-to-Image: Transform an existing image
  3. Aspect Ratio Control: Set output dimensions

Here’s the correct way to use image-01:

┌──────────────────┐     ┌──────────────┐     ┌─────────────┐
│ Input Source     │────▶│ image-01     │────▶│ Output      │
│                  │     │ Model        │     │ Image       │
├──────────────────┤     ├──────────────┤     ├─────────────┤
│ Text Description │     │              │     │ 16:9, 1:1,  │
│ OR               │     │ aspect_ratio │     │ 4:3, etc.   │
│ Source Image     │     │ parameter    │     │             │
└──────────────────┘     └──────────────┘     └─────────────┘
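
The two input paths in the diagram can be sketched as a small helper that builds the keyword arguments for `client.image.generate`. This is a minimal sketch, not part of any MiniMax SDK: `build_image_request` is a hypothetical name, and it assumes only the parameter names used in this post (`model`, `prompt`, `source_image`, `aspect_ratio`).

```python
from typing import Optional


def build_image_request(
    prompt: str,
    source_image: Optional[str] = None,
    aspect_ratio: str = "1:1",
) -> dict:
    """Build keyword arguments for an image-01 request (hypothetical helper)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    params = {"model": "image-01", "prompt": prompt, "aspect_ratio": aspect_ratio}
    # A source image is what separates image-to-image from text-to-image
    if source_image is not None:
        params["source_image"] = source_image
    return params


text_request = build_image_request("A sunset over mountains", aspect_ratio="16:9")
img2img_request = build_image_request(
    "Watercolor painting style", source_image="original_cat.png"
)
print(text_request)
print(img2img_request)
```

The resulting dictionaries could then be passed on as `client.image.generate(**text_request)`, assuming the client accepts these keyword arguments as shown in the rest of this post.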

Text-to-Image Generation

I tested text-to-image first. Here’s the working code:

text-to-image.py
import os

from minimax import MiniMaxClient

# Read the key from the environment; Python does not expand "${MINIMAX_API_KEY}"
client = MiniMaxClient(api_key=os.environ["MINIMAX_API_KEY"])

# Generate an image from a text description
result = client.image.generate(
    model="image-01",
    prompt="A cyberpunk cat with neon collar in a futuristic city at night, highly detailed, 4k quality",
    aspect_ratio="16:9",  # Cinematic widescreen format
)

# The result contains the image URL
print(f"Generated image: {result.image_url}")

When I ran this:

Terminal window
Generated image: https://cdn.minimax.com/images/abc123.png

The image matched my prompt. The cyberpunk cat wore a glowing neon collar and stood in a rain-soaked futuristic cityscape.
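
To keep a local copy, the returned URL can be downloaded with the standard library. This is a minimal sketch, assuming the image URL is a plain HTTPS link that can be fetched without extra authentication (the post does not confirm this). `filename_from_url` and `save_image` are hypothetical helper names.

```python
import os
import posixpath
import urllib.request
from urllib.parse import urlparse


def filename_from_url(url: str) -> str:
    """Return the last path segment of the URL, or a fallback name."""
    name = posixpath.basename(urlparse(url).path)
    return name or "image.png"


def save_image(url: str, directory: str = ".") -> str:
    """Download the image at `url` into `directory` and return the local path."""
    path = os.path.join(directory, filename_from_url(url))
    urllib.request.urlretrieve(url, path)
    return path


print(filename_from_url("https://cdn.minimax.com/images/abc123.png"))
```

Calling `save_image(result.image_url)` would then write the file next to the script.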

Aspect Ratio Options

I tested different aspect ratios:

aspect-ratios.py
# Square format - good for profile pictures
result = client.image.generate(
    model="image-01",
    prompt="Portrait of a woman, soft lighting, professional",
    aspect_ratio="1:1",
)

# Portrait - good for phone wallpapers
result = client.image.generate(
    model="image-01",
    prompt="Mountain landscape with lake reflection",
    aspect_ratio="9:16",
)

# Landscape - good for presentations
result = client.image.generate(
    model="image-01",
    prompt="Team meeting in modern office",
    aspect_ratio="4:3",
)

# Cinematic - good for video thumbnails
result = client.image.generate(
    model="image-01",
    prompt="Action scene with explosions",
    aspect_ratio="16:9",
)

I compared the outputs:

aspect-ratio-comparison.txt
| Aspect Ratio | Dimensions | Best Use Case             |
|--------------|------------|---------------------------|
| 1:1          | 1024x1024  | Profile pictures, icons   |
| 4:3          | 1024x768   | Presentations, blogs      |
| 16:9         | 1024x576   | Video thumbnails, banners |
| 9:16         | 576x1024   | Phone wallpapers, stories |
| 21:9         | 1024x439   | Cinematic posters         |
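
The dimensions in this table are consistent with fixing the longer side at 1024 pixels and rounding the shorter side. That is an observation from the numbers above, not documented MiniMax behavior, so treat the following as a sketch. `dimensions_for` is a hypothetical helper name.

```python
def dimensions_for(aspect_ratio: str, long_side: int = 1024) -> tuple[int, int]:
    """Return (width, height) with the longer side fixed at `long_side`."""
    w, h = (int(part) for part in aspect_ratio.split(":"))
    if w <= 0 or h <= 0:
        raise ValueError(f"invalid aspect ratio: {aspect_ratio!r}")
    if w >= h:
        # Landscape or square: width is the long side
        return long_side, round(long_side * h / w)
    # Portrait: height is the long side
    return round(long_side * w / h), long_side


for ratio in ("1:1", "4:3", "16:9", "9:16", "21:9"):
    width, height = dimensions_for(ratio)
    print(f"{ratio}: {width}x{height}")
```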

Image-to-Image Transformation

I then tested image-to-image. I wanted to transform an existing image:

image-to-image.py
# Transform an existing image with a new style
result = client.image.generate(
    model="image-01",
    source_image="original_cat.png",  # My source image
    prompt="Same cat but in watercolor painting style, artistic, soft colors",
    aspect_ratio="1:1",  # Keep square format
)

I noticed a key difference from text-to-image:

image-mode-comparison.txt
┌─────────────────────────────────────────────────────────────┐
│                       Mode Comparison                       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  TEXT-TO-IMAGE:                                             │
│    prompt only → model creates from scratch                 │
│    example: "A cyberpunk cat..."                            │
│                                                             │
│  IMAGE-TO-IMAGE:                                            │
│    source_image + prompt → model transforms existing        │
│    example: original_cat.png + "watercolor style"           │
│                                                             │
│  KEY: Keep prompts simple for image-to-image                │
│       Complex prompts can distort the source too much       │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Mistake I Made

At first, I used overly complex prompts for image-to-image:

wrong-complex-prompt.py
# WRONG: Too complex for image-to-image
result = client.image.generate(
    model="image-01",
    source_image="my_cat.png",
    prompt="Transform into a cyberpunk cat with neon collar, add futuristic city background, rain effects, dramatic lighting, 4k quality, artstation trending",
)

The result distorted my original cat too much. The model tried to add too many new elements.

The fix was to keep prompts focused on transformation:

correct-focused-prompt.py
# CORRECT: Focus on style transformation only
result = client.image.generate(
    model="image-01",
    source_image="my_cat.png",
    prompt="Watercolor painting style, soft artistic brushstrokes",
)

This preserved the original cat’s pose while applying the watercolor effect.
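
One rough way to catch overloaded image-to-image prompts before sending them is to count comma-separated clauses. This is only a heuristic sketch: clause count is a crude proxy for complexity, the threshold of 3 is an arbitrary assumption, and `count_clauses` and `is_focused` are hypothetical helper names.

```python
def count_clauses(prompt: str) -> int:
    """Count the non-empty comma-separated clauses in a prompt."""
    return len([clause for clause in prompt.split(",") if clause.strip()])


def is_focused(prompt: str, max_clauses: int = 3) -> bool:
    """Return True when the prompt has at most `max_clauses` clauses."""
    return count_clauses(prompt) <= max_clauses


complex_prompt = (
    "Transform into a cyberpunk cat with neon collar, add futuristic city "
    "background, rain effects, dramatic lighting, 4k quality, artstation trending"
)
focused_prompt = "Watercolor painting style, soft artistic brushstrokes"

print(count_clauses(complex_prompt), is_focused(complex_prompt))
print(count_clauses(focused_prompt), is_focused(focused_prompt))
```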

OpenClaw Configuration

I integrated image-01 into my OpenClaw config:

openclaw-config.yaml
minimax:
  api_key: ${MINIMAX_API_KEY}
  image:
    model: image-01
    default_aspect_ratio: "16:9"
    timeout_seconds: 30
  chat:
    model: M2.7  # Use M2.7 for text generation
    temperature: 0.7

Important: I use M2.7 for chat/text generation and image-01 for image generation. They serve different purposes.

Common Issues

Issue 1: Using Deprecated Models

When I tried older models:

Terminal window
Error: Model 'M2.5' not found. Available models: M2.7, image-01

The fix was simple:

fix-deprecated.py
# WRONG
model = "M2.5"

# CORRECT
image_model = "image-01"  # For images
chat_model = "M2.7"       # For text/chat
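
When model names come from old configs or tutorials, the swap can be automated. This is a minimal sketch using only the model names mentioned in this post; `resolve_model` is a hypothetical helper, not part of any MiniMax or OpenClaw API.

```python
DEPRECATED_MODELS = {"M2", "M2.1", "M2.5", "VL-01"}
CURRENT_MODELS = {"image": "image-01", "chat": "M2.7"}


def resolve_model(name: str, task: str) -> str:
    """Return `name`, or the current model for `task` if `name` is deprecated."""
    if task not in CURRENT_MODELS:
        raise ValueError(f"unknown task: {task!r}")
    if name in DEPRECATED_MODELS:
        return CURRENT_MODELS[task]
    return name


print(resolve_model("M2.5", "image"))
print(resolve_model("M2", "chat"))
```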

Issue 2: Missing Aspect Ratio

I forgot aspect ratio once:

missing-aspect-ratio.py
# No aspect_ratio specified
result = client.image.generate(
    model="image-01",
    prompt="A landscape scene",
)

I got a 1:1 square image unexpectedly. The fix:

specify-aspect-ratio.py
# Always specify aspect_ratio for predictable output
result = client.image.generate(
model="image-01",
prompt="A landscape scene with mountains",
aspect_ratio="16:9" # Widescreen for landscapes
)
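
To make the ratio impossible to forget, a thin wrapper can require it as a keyword-only argument. This is a sketch under assumptions: `generate_image` is a hypothetical wrapper, `client` is assumed to expose `image.generate` as used throughout this post, and `SUPPORTED_RATIOS` contains only the ratios listed in this post, which may not be the full set the API accepts.

```python
SUPPORTED_RATIOS = {"1:1", "4:3", "16:9", "9:16", "21:9"}  # ratios listed in this post


def generate_image(client, prompt: str, *, aspect_ratio: str, **extra):
    """Call client.image.generate with image-01 and a mandatory aspect ratio."""
    if aspect_ratio not in SUPPORTED_RATIOS:
        raise ValueError(f"unsupported aspect_ratio: {aspect_ratio!r}")
    return client.image.generate(
        model="image-01",
        prompt=prompt,
        aspect_ratio=aspect_ratio,
        **extra,  # e.g. source_image for image-to-image
    )
```

Because `aspect_ratio` has no default, calling `generate_image(client, "A landscape scene")` raises a `TypeError` instead of silently producing a square image.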

Issue 3: API Key Not Set

When my API key was missing:

Terminal window
Error: MINIMAX_API_KEY environment variable not set

I fixed it by setting the environment variable:

set-api-key.sh
export MINIMAX_API_KEY="your-api-key-here"
# Or in .env file
echo "MINIMAX_API_KEY=your-api-key-here" >> .env
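
On the Python side, the key can be checked up front so a missing variable fails with a clear message. This is a minimal sketch; `load_api_key` is a hypothetical helper. Note that Python does not read a `.env` file on its own, so this only sees the variable if the shell or some loader has exported it.

```python
import os
from typing import Mapping


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the MiniMax API key from `env`, or raise if it is missing."""
    key = env.get("MINIMAX_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "MINIMAX_API_KEY is not set; export it before running the script"
        )
    return key
```

Passing `env` explicitly, for example `load_api_key({"MINIMAX_API_KEY": "test"})`, makes the function easy to exercise without touching the real environment.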

Best Practices

DO

Specify aspect ratio for every request

aspect_ratio="16:9" # Always specify

Keep image-to-image prompts simple

prompt="Watercolor style" # Focus on transformation

Use M2.7 for text, image-01 for images

chat: { model: M2.7 }
image: { model: image-01 }

DON’T

Don’t use deprecated models

# WRONG
model = "M2" # Deprecated!
# CORRECT
model = "image-01"

Don’t skip aspect ratio

# WRONG - unpredictable output size
client.image.generate(model="image-01", prompt="...")
# CORRECT - predictable dimensions
client.image.generate(model="image-01", prompt="...", aspect_ratio="16:9")

Don’t use complex prompts for image-to-image

# WRONG - distorts source too much
prompt="Add cyberpunk city background, rain, neon lights..."
# CORRECT - focused transformation
prompt="Cyberpunk color palette style"

MiniMax image-01 differs from other image generation models:

model-comparison.txt
┌─────────────────┬────────────────────┬─────────────────────────┐
│ Model           │ Capabilities       │ Notes                   │
├─────────────────┼────────────────────┼─────────────────────────┤
│ DALL-E 3        │ Text-to-image      │ High quality, slow      │
│ Stable Diffusion│ Text + img2img     │ Open source, flexible   │
│ Midjourney      │ Text-to-image      │ Artistic, subscription  │
│ MiniMax image-01│ Text + img2img     │ Fast, integrated API    │
│                 │ + aspect ratio     │                         │
└─────────────────┴────────────────────┴─────────────────────────┘

MiniMax image-01 combines text-to-image and image-to-image in one model with aspect ratio control. This simplifies the stack - one API for all image generation needs.

Summary

In this post, I showed how to use MiniMax image-01 for text-to-image and image-to-image generation. The key points are:

  • Use the image-01 model (M2, M2.1, M2.5, and VL-01 are deprecated)
  • Specify aspect_ratio for predictable output dimensions
  • Keep image-to-image prompts simple - focus on style transformation
  • Use M2.7 for text generation, image-01 for image generation
  • Configure OpenClaw with separate model settings for chat vs image

MiniMax image-01 unifies text-to-image and image-to-image with aspect ratio control. One model handles all image generation needs without juggling multiple services.

Final Words + More Resources

My intention with this article was to share my knowledge and experience to help others. If you want to contact me, you can reach me by email.

Here are also the most important links from this article along with some further resources that will help you in this scope:

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!
