Complete Guide to Running z-image-turbo on RunPod Serverless | Building an API Environment

For advanced users who want to use z-image-turbo for API-based bulk generation and automation, this guide explains how to set up a RunPod Serverless environment.

If you’re a beginner, we recommend starting with ConoHa AI Canvas first.

What Is RunPod Serverless?

RunPod is a cloud GPU platform. Using its Serverless feature, you can run AI inference behind an API endpoint.

Advantages:

  • Pay-as-you-go billing (zero cost when idle)
  • Auto-scaling support
  • Callable via API for automation and batch processing
  • Parallel execution with multiple workers possible

Disadvantages:

  • Technical knowledge required (Docker, API)
  • Latency of a few seconds to tens of seconds on cold start

For cost comparisons with other services, see the Cloud GPU Comparison article.

Setup Steps

Step 1: Create a RunPod Account

  1. Visit RunPod and create an account
  2. Register a credit card (for pay-as-you-go billing)
  3. Add initial credits ($10+ recommended)

Step 2: Create a Serverless Endpoint

  1. In the RunPod dashboard, navigate to the Serverless section
  2. Click New Endpoint
  3. Enter the following settings:
| Setting | Recommended Value | Description |
| --- | --- | --- |
| Endpoint Name | z-image-turbo | Any name |
| Docker Image | z-image-turbo compatible image | Docker image containing the model |
| GPU Type | RTX A4000 / RTX 3090 | A4000 for cost efficiency |
| Min Workers | 0 | Zero cost when idle |
| Max Workers | 3 | Upper limit for concurrent requests |
| Idle Timeout | 5 seconds | How long a worker stays warm after its last job |
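For reference, the Docker Image setting expects an image that bundles the model behind a RunPod serverless handler. A minimal sketch of such a handler using the official runpod Python SDK; here generate_image is a hypothetical placeholder for your actual z-image-turbo inference code, not part of any SDK:

```python
# handler.py -- minimal RunPod serverless handler sketch.
# generate_image() is a hypothetical placeholder for your z-image-turbo
# inference code; only runpod.serverless.start() is real SDK API.
import base64
import io

import runpod
from PIL import Image


def generate_image(prompt, width, height, steps):
    # Placeholder: swap in actual z-image-turbo inference here.
    return Image.new("RGB", (width, height))


def handler(job):
    params = job["input"]
    image = generate_image(
        prompt=params["prompt"],
        width=params.get("width", 1024),
        height=params.get("height", 1024),
        steps=params.get("steps", 8),
    )
    # Return the image as base64 so it survives the JSON response.
    buf = io.BytesIO()
    image.save(buf, format="PNG")
    return {"image_base64": base64.b64encode(buf.getvalue()).decode()}


runpod.serverless.start({"handler": handler})
```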

Step 3: Get an API Key

  1. In the RunPod dashboard, go to Settings → API Keys
  2. Generate a new API key
  3. Also note down the Endpoint ID

Step 4: Set Environment Variables

```bash
# Save to ~/.config/z-image/.env
RUNPOD_API_KEY=your_api_key_here
RUNPOD_ENDPOINT_ID=your_endpoint_id_here
```
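Before sending a real job, you can sanity-check the key and endpoint ID against the endpoint's /health route, which reports worker and queue status. A minimal sketch in Python (assumes the two variables above are exported in your environment and that requests is installed):

```python
# check_health.py -- verify credentials against the /health route.
import os

import requests

api_key = os.environ["RUNPOD_API_KEY"]
endpoint_id = os.environ["RUNPOD_ENDPOINT_ID"]

resp = requests.get(
    f"https://api.runpod.ai/v2/{endpoint_id}/health",
    headers={"Authorization": f"Bearer {api_key}"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())  # reports worker and job queue counts
```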

Step 5: Test Request

Use curl to test the API:

```bash
curl -X POST "https://api.runpod.ai/v2/${RUNPOD_ENDPOINT_ID}/runsync" \
  -H "Authorization: Bearer ${RUNPOD_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "prompt": "a beautiful Japanese woman, portrait, natural lighting, 85mm lens",
      "width": 1024,
      "height": 1024,
      "steps": 8,
      "sampler": "euler",
      "scheduler": "ddim_uniform"
    }
  }'
```
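The same request can be sent from Python, including response handling. Note that the shape of the output field is defined entirely by the handler inside your Docker image; the image_base64 key below matches the handler sketch from Step 2 and is an assumption, not a RunPod standard:

```python
# generate.py -- synchronous generation via /runsync, saving the result.
import base64
import os

import requests

api_key = os.environ["RUNPOD_API_KEY"]
endpoint_id = os.environ["RUNPOD_ENDPOINT_ID"]

resp = requests.post(
    f"https://api.runpod.ai/v2/{endpoint_id}/runsync",
    headers={"Authorization": f"Bearer {api_key}"},
    json={
        "input": {
            "prompt": "a beautiful Japanese woman, portrait, natural lighting, 85mm lens",
            "width": 1024,
            "height": 1024,
            "steps": 8,
            "sampler": "euler",
            "scheduler": "ddim_uniform",
        }
    },
    timeout=300,
)
resp.raise_for_status()
result = resp.json()

# "output" shape is handler-specific; "image_base64" assumes the
# handler sketch from Step 2.
with open("output.png", "wb") as f:
    f.write(base64.b64decode(result["output"]["image_base64"]))
```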

Cost Optimization

GPU Selection

| GPU | Rate (/sec) | Speed | Cost Efficiency |
| --- | --- | --- | --- |
| RTX A4000 | $0.00016 | Good | – |
| RTX 3090 | $0.00022 | Very Good | – |
| RTX A5000 | $0.00028 | Very Good | – |
| A100 | $0.00076 | Fastest | △ (overkill) |

z-image-turbo can generate in 8 steps, so expensive GPUs aren’t necessary. RTX A4000 is the best cost-performance option.
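As a rough illustration, assuming a generation takes about 10 seconds of billed time on an RTX A4000 (an illustrative figure, not a measured benchmark): 10 s × $0.00016/s = $0.0016 per image, or roughly 625 images per dollar.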

Worker Configuration Tips

  • Min Workers = 0: Zero cost when idle. However, cold starts will occur.
  • Min Workers = 1: Keeps one worker running at all times, for workloads where cold-start latency is unacceptable.
  • Max Workers: Upper limit for concurrent requests. Set according to your batch processing parallelism.

Batch Processing

Using the RunPod API’s asynchronous endpoint (/run), you can send multiple requests in parallel for batch processing.
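A minimal sketch of that pattern: submit each prompt to /run, collect the returned job IDs, then poll /status/{job_id} until every job reaches a terminal state. Both routes are standard RunPod serverless API; the prompts here are placeholders:

```python
# batch.py -- submit several jobs via /run, then poll /status/{id}.
import os
import time

import requests

API = f"https://api.runpod.ai/v2/{os.environ['RUNPOD_ENDPOINT_ID']}"
HEADERS = {"Authorization": f"Bearer {os.environ['RUNPOD_API_KEY']}"}

# Placeholder prompts; replace with your own batch.
prompts = [
    "portrait, natural lighting, 85mm lens",
    "city street at night, neon signs",
    "mountain lake at dawn, mist",
]

# Submit every job without waiting (asynchronous /run endpoint).
job_ids = []
for prompt in prompts:
    resp = requests.post(
        f"{API}/run",
        headers=HEADERS,
        json={"input": {"prompt": prompt, "steps": 8}},
        timeout=30,
    )
    resp.raise_for_status()
    job_ids.append(resp.json()["id"])

# Poll until every job reaches a terminal state.
pending = set(job_ids)
while pending:
    time.sleep(2)
    for job_id in sorted(pending):
        status = requests.get(f"{API}/status/{job_id}", headers=HEADERS, timeout=30)
        status.raise_for_status()
        state = status.json()["status"]
        if state in ("COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"):
            pending.discard(job_id)
            print(job_id, state)
```

Set Max Workers to match the parallelism you want here; requests beyond that limit simply wait in the queue.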

Prompt Configuration

For prompt writing, see the related prompt guide articles on this site.

The same parameter settings used in the ComfyUI workflow can also be used with the API.

Troubleshooting

Cold Start Is Slow

  • Set Min Workers to 1 to keep a worker always running (increased cost)
  • Enable the FlashBoot option (if supported for the GPU)

Images Not Generating

  • Check that the API key is correct
  • Check that the Endpoint ID is correct
  • Check the endpoint status in the RunPod dashboard

Costs Higher Than Expected

  • Shorten the Idle Timeout (5 seconds recommended)
  • Verify Min Workers is set to 0
  • Delete unused endpoints

Summary

z-image-turbo environment setup on RunPod Serverless:

  1. Create account → Configure endpoint → Get API key
  2. RTX A4000 is the best cost-performance option
  3. Min Workers = 0 for zero idle cost
  4. curl or script for API requests