AI Image Generation Resolution & Aspect Ratio Guide | How to Choose the Best Size

In AI image generation, resolution and aspect ratio are critical parameters that directly affect image quality. Set incorrectly, they can cause distorted anatomy or a broken composition. This article covers recommended sizes per model, suitable aspect ratios by use case, and upscaling techniques.

Why Resolution and Aspect Ratio Matter

The Relationship Between a Model’s Training Resolution and Generation Quality

AI image generation models are trained on images at a specific resolution. For example, Stable Diffusion 1.5 was trained on 512×512 pixel images, and SDXL on 1024×1024 pixel images. Generating at sizes significantly different from the training resolution can degrade quality.

In particular, a size much larger than the training resolution can cause the same subject to appear several times in the frame, or cause human body proportions to break down.

The Influence of Aspect Ratio on Composition

The aspect ratio (width-to-height ratio) strongly influences the composition of the generated image. A square (1:1) format tends to produce head-and-shoulders (bust) shots, though full-body or long shots are also possible depending on the prompt. A landscape (16:9) format tends to produce wider compositions that include more background. Choosing an aspect ratio suited to your purpose is the quickest way to get the composition you intend.

Recommended Resolutions by Model

Each model has a base resolution from training, and its recommended sizes are derived from that base.

SD 1.5

Aspect ratio | Resolution | Use case
1:1          | 512×512    | Base size
2:3          | 512×768    | For portraits
3:2          | 768×512    | For landscapes and horizontal compositions

SD 1.5’s training resolution is 512×512. Quality degradation becomes prominent once a side exceeds 768 pixels, so using an upscaler is recommended when larger images are needed.

SDXL

Aspect ratio | Resolution | Use case
1:1          | 1024×1024  | Base size
2:3          | 832×1216   | For portraits
3:2          | 1216×832   | For landscapes and horizontal compositions
9:16         | 768×1344   | For smartphone wallpapers
16:9         | 1344×768   | For wide compositions

SDXL was trained with 1024×1024 as the baseline, and width and height combinations that keep the total pixel count around 1 megapixel (roughly 1 million pixels) are stable.

Note: each resolution is an approximation chosen to keep the total pixel count around 1 MP, so it may differ slightly from the exact aspect ratio. For example, 768×1344 is exactly 4:7 rather than 9:16.
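Sizes of roughly 1 MP like the ones listed above can be derived mechanically. The following is a minimal Python sketch of one way to do it; the function name `resolution_for_ratio`, the 1,000,000-pixel target, and the rounding to multiples of 64 are assumptions made for illustration, not requirements stated by any model.

```python
import math


def resolution_for_ratio(ratio_w, ratio_h, target_pixels=1_000_000, multiple=64):
    """Return (width, height) close to target_pixels for the given aspect ratio.

    Both sides are rounded to the nearest multiple of `multiple`
    (64 is an assumed, commonly used step size).
    """
    # Unrounded size satisfying width * height == target_pixels
    # and width / height == ratio_w / ratio_h.
    ideal_w = math.sqrt(target_pixels * ratio_w / ratio_h)
    ideal_h = target_pixels / ideal_w
    width = max(multiple, round(ideal_w / multiple) * multiple)
    height = max(multiple, round(ideal_h / multiple) * multiple)
    return width, height


for ratio in [(1, 1), (2, 3), (3, 2), (9, 16), (16, 9)]:
    w, h = resolution_for_ratio(*ratio)
    print(f"{ratio[0]}:{ratio[1]} -> {w}x{h} ({w * h:,} pixels)")
```

Under these assumptions the function reproduces the SDXL sizes listed for 1:1, 2:3, 3:2, 9:16, and 16:9. Other step sizes or pixel targets give slightly different results, so treat the output as a starting point.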

SD3 / SD3.5

Aspect ratio | Resolution | Use case
1:1          | 1024×1024  | Base size

The SD3 series is designed around 1024×1024. When changing aspect ratios, you can use SDXL resolutions as a guide.

Flux

Aspect ratio | Resolution          | Use case
1:1          | 1024×1024           | Base size
Any          | ~1 MP total pixels  | Free ratio

Flux is highly flexible with respect to aspect ratio. Keeping the total pixel count around 1 million yields stable quality across a wide range of aspect ratios.
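The per-model sizes in this section can also be kept as a small lookup table. The sketch below is illustrative only: the keys `sd15`, `sdxl`, and `sd3`, the name `closest_recommended`, and the use of a log-ratio distance to pick the nearest shape are all assumptions. The sizes themselves are the ones listed in this article, with SDXL sizes reused for SD3 as suggested above.

```python
import math

# Sizes taken from the tables in this article.
SDXL_SIZES = [(1024, 1024), (832, 1216), (1216, 832), (768, 1344), (1344, 768)]
RECOMMENDED = {
    "sd15": [(512, 512), (512, 768), (768, 512)],
    "sdxl": SDXL_SIZES,
    "sd3": SDXL_SIZES,  # the article suggests using SDXL sizes as a guide
}


def closest_recommended(model, ratio_w, ratio_h):
    """Return the recommended (width, height) whose shape is closest to the ratio."""
    target = math.log(ratio_w / ratio_h)
    # Compare in log space so 2:1 and 1:2 are equally far from 1:1.
    return min(
        RECOMMENDED[model],
        key=lambda size: abs(math.log(size[0] / size[1]) - target),
    )


print(closest_recommended("sdxl", 16, 9))  # (1344, 768)
print(closest_recommended("sd15", 16, 9))  # (768, 512)
print(closest_recommended("sdxl", 4, 5))   # (832, 1216)
```

Flux is left out of the table because any ratio near 1 MP is described as workable, so a fixed list adds little there.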

Recommended Aspect Ratios by Use Case

The optimal aspect ratio depends on where you plan to use the generated image.

Use case                             | Aspect ratio | SDXL recommended resolution | Notes
Social media posts (Instagram, etc.) | 1:1          | 1024×1024                   | Ideal for feed posts
Social media posts (Instagram, etc.) | 4:5          | 896×1120                    | Portrait posts occupy more screen space
Blog thumbnails                      | 16:9         | 1344×768                    | Also suitable as OGP images
Portraits                            | 2:3          | 832×1216                    | Full body to upper body fits naturally
PC wallpapers                        | 16:9         | 1344×768                    | Assuming upscaling
Ultra-wide wallpapers                | 21:9         | 1536×640                    | Upscaling required; about 2.4:1, with both sides kept to multiples of 8
Smartphone wallpapers                | 9:16         | 768×1344                    | Assuming upscaling

For wallpaper use cases, the common approach is to output at the resolutions above and then enlarge to the final resolution using an upscaler described below.
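Because the generated size rarely matches the final wallpaper ratio exactly, it helps to work out the enlargement factor and the leftover crop in advance. The sketch below does that arithmetic; the function name `upscale_plan` and the 3840×2160 target are assumptions chosen for illustration.

```python
import math
from fractions import Fraction


def upscale_plan(gen_w, gen_h, final_w, final_h):
    """Return (scale, scaled_size, crop) so the upscaled image covers the final size.

    scale       -- enlargement factor to apply to the generated image
    scaled_size -- (width, height) after enlarging, rounded up to whole pixels
    crop        -- (x, y) pixels to trim in total from width and height
    """
    # Use the larger of the two factors so neither side ends up too small.
    # Fractions avoid floating-point rounding surprises before the ceil.
    scale = max(Fraction(final_w, gen_w), Fraction(final_h, gen_h))
    scaled_w = math.ceil(gen_w * scale)
    scaled_h = math.ceil(gen_h * scale)
    return float(scale), (scaled_w, scaled_h), (scaled_w - final_w, scaled_h - final_h)


scale, scaled, crop = upscale_plan(1344, 768, 3840, 2160)
print(f"scale {scale:.3f}, scaled {scaled}, crop {crop}")
# scale 2.857, scaled (3840, 2195), crop (0, 35)
```

In this example the 1344×768 output is slightly taller than true 16:9, so after enlarging to full width a 35-pixel strip is trimmed from the height.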

What Happens When Generating at Sizes Different from Training Resolution?

Common Problems

Specifying sizes significantly different from the training resolution tends to cause the following issues:

  • Anatomy distortion: Human body proportions break down
  • Composition breakdown: Unintended zoom, or the same subject appearing multiple times
  • Detail breakdown: More frequent errors such as the wrong number of fingers or distorted faces

These problems are particularly pronounced when specifying 1024×1024 or larger directly in SD 1.5.
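A quick way to see how far a request strays from the training resolution is to compare pixel counts. The helper below is a small sketch; the name `pixel_ratio` and the dictionary keys are assumptions, and the article gives no numeric threshold beyond the sizes already mentioned, so the function only reports the ratio.

```python
# Training side lengths as stated in this article.
TRAINING_SIDE = {"sd15": 512, "sdxl": 1024}


def pixel_ratio(model, width, height):
    """Return requested pixels divided by the model's training pixel count."""
    base_pixels = TRAINING_SIDE[model] ** 2
    return (width * height) / base_pixels


print(pixel_ratio("sd15", 1024, 1024))  # 4.0
print(pixel_ratio("sd15", 512, 768))    # 1.5
print(pixel_ratio("sdxl", 1344, 768))   # 0.984375
```

Requesting 1024×1024 from SD 1.5 asks for four times the pixels it was trained on, whereas the recommended SDXL sizes stay close to a ratio of 1.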

Solution: Using Hires.fix

Hires.fix (High Resolution Fix) is a feature that first generates an image at the training resolution, then upscales it and runs denoising again. This yields a high-resolution image while keeping the composition from breaking down.

  1. Confirm the composition at the training resolution (e.g., 512×512)
  2. Upscale by the specified multiplier (e.g., 2x)
  3. Denoise the upscaled image again at the chosen denoise strength

A denoise strength of 0.4–0.6 is typical. A value that is too low leaves the result blurry; a value that is too high changes the composition.
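For readers who drive AUTOMATIC1111 through its HTTP API rather than the UI, the three steps above map onto a handful of request fields. The sketch below only builds the request body and does not send it. The field names (`enable_hr`, `hr_scale`, `hr_upscaler`, `denoising_strength`) are the ones that API is understood to use, and the helper name, prompt, step count, and upscaler name are assumptions, so verify everything against the `/docs` page of your own install.

```python
import json


def hires_fix_payload(prompt, base_width=512, base_height=512,
                      scale=2.0, denoise=0.5, steps=25, upscaler="Latent"):
    """Build a txt2img request body with Hires.fix enabled (assumed field names)."""
    if not 0.0 <= denoise <= 1.0:
        raise ValueError("denoise must be between 0 and 1")
    return {
        "prompt": prompt,
        # Step 1: first pass at the training resolution.
        "width": base_width,
        "height": base_height,
        "steps": steps,
        # Step 2: enlarge by the chosen multiplier.
        "enable_hr": True,
        "hr_scale": scale,
        "hr_upscaler": upscaler,  # available names depend on the install
        # Step 3: strength of the second denoising pass.
        "denoising_strength": denoise,
    }


payload = hires_fix_payload("a lighthouse at dusk")
final_size = (int(payload["width"] * payload["hr_scale"]),
              int(payload["height"] * payload["hr_scale"]))
print(json.dumps(payload, indent=2))
print(final_size)  # (1024, 1024)
```

The body would typically be posted to the `/sdapi/v1/txt2img` endpoint of a server started with the `--api` flag.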

Upscaling Techniques

Here are the main methods for making generated images even higher resolution.

Hires.fix

As described above, this is the built-in upscaling feature used during generation. It comes standard in WebUIs like AUTOMATIC1111 and Forge. No additional installation is required and it’s easy to use, but VRAM consumption increases.

Ultimate SD Upscale

An extension that combines img2img with tile splitting. By splitting the image into tiles (small areas) and processing them in order, large images can be generated while keeping VRAM usage down. If tile boundary seams are noticeable, adjust the overlap settings.
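The tile-splitting idea can be illustrated independently of any particular extension. The sketch below computes overlapping tile rectangles for an image; it is a generic illustration with hypothetical names and defaults (`tile_boxes`, 512-pixel tiles, 64-pixel overlap) and is not the actual layout logic of Ultimate SD Upscale.

```python
def tile_boxes(width, height, tile=512, overlap=64):
    """Return (left, top, right, bottom) boxes that cover the image with overlap."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be non-negative and smaller than tile")
    stride = tile - overlap

    def starts(length):
        # A single tile is enough when the image is no larger than the tile.
        if length <= tile:
            return [0]
        positions = list(range(0, length - tile, stride))
        positions.append(length - tile)  # final tile sits flush with the edge
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


boxes = tile_boxes(2048, 2048)
print(len(boxes))  # 25
print(boxes[0])    # (0, 0, 512, 512)
print(boxes[-1])   # (1536, 1536, 2048, 2048)
```

A larger overlap gives neighbouring tiles more shared context, which is why raising it tends to hide seams at the cost of processing more tiles.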

Tiled Diffusion (MultiDiffusion)

A method that splits the image into overlapping tiles, denoises the tiles at each sampling step, and blends them back together. It is similar in purpose to Ultimate SD Upscale, but differs in that the diffusion process itself runs at the tile level. VRAM consumption can be reduced further by combining it with Tiled VAE.
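When diffusion runs per tile, overlapping regions receive more than one prediction and have to be reconciled. A common approach in MultiDiffusion-style methods is to average the overlapping values at every step. The one-dimensional sketch below shows only that merge step, with a hypothetical function name and toy numbers standing in for per-tile model outputs.

```python
def merge_tiles_1d(length, tiles):
    """Average overlapping tile values into one sequence.

    tiles -- list of (start, values) pairs, where values are per-position
             predictions for the tile beginning at index `start`.
    """
    total = [0.0] * length
    count = [0] * length
    for start, values in tiles:
        for offset, value in enumerate(values):
            total[start + offset] += value
            count[start + offset] += 1
    # Positions not covered by any tile are left at 0.0.
    return [t / c if c else 0.0 for t, c in zip(total, count)]


merged = merge_tiles_1d(6, [(0, [1.0] * 4), (2, [3.0] * 4)])
print(merged)  # [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]
```

Positions 2 and 3 are covered by both tiles, so they end up halfway between the two predictions; real implementations do the same over 2D latents and may weight tile centres more heavily than edges.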

External Upscalers

A method that uses AI-based super-resolution models to enlarge images after generation.

Tool               | Features
Real-ESRGAN        | Versatile, supports both photorealistic and illustration
4x-UltraSharp      | Strong detail enhancement
SwinIR             | Swin Transformer-based upscaler
Topaz Gigapixel AI | Commercial software, easy-to-use GUI

External upscalers are independent of the generation process and have the advantage of being applicable to any model’s generated images.

Summary

Resolution and aspect ratio settings are fundamental elements that influence AI image generation quality.

  • Match the model’s training resolution: this is the basic principle for stable quality
  • Choose an aspect ratio suited to your use case to make the intended composition easier to get
  • Use upscalers when high resolution is needed, and generate near the training resolution

Start by generating at each model’s recommended resolution, then adjust the aspect ratio to suit your use case.