Conclusions
- All 3 god prompts fit within 75 tokens: ablation testing removed unnecessary elements, reducing Summer Festival to 44 tokens, Morning Bed to 54 tokens, and Cafe Snap to 25 tokens
- Simple, non-contradictory environments are the key to stability: fewer elements means fewer failures, and consistent lighting and location descriptions matter
- Hand management contributes to stability: techniques like holding cotton candy or resting a chin on hands help stabilize hand rendering
- Don’t add unnecessary quality words: `coherent anatomy`, `natural skin texture`, and `8K` have no effect in z-image-turbo
- Style specification at the front is most effective: declaring the photo style at the start, like `A Polaroid instant photo` or `An intimate close-up portrait`, stabilizes composition and atmosphere
Each element’s necessity was verified one by one via ablation testing, and these are the minimal versions with unnecessary elements removed. Even generating 9 images in a row, every single one hits the mark.
Selection Criteria
- The first-draft prompt produced the intended image with no revisions
- Generating 9 images, every one was stable at goal quality
- Unnecessary elements removed via ablation testing
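The ablation procedure referred to above (remove one element, regenerate, compare) can be sketched as a leave-one-out loop. This is only an illustration: the helper names `join_elements` and `leave_one_out` are hypothetical, the element list is assembled from phrases quoted in this article rather than taken from the author's actual prompt, and the image generation and visual comparison steps are not shown.

```python
from typing import Dict, List


def join_elements(elements: List[str]) -> str:
    """Join prompt elements into one comma-separated prompt string."""
    return ", ".join(elements)


def leave_one_out(elements: List[str]) -> Dict[str, str]:
    """Map each element to the prompt that results from removing only that element."""
    variants: Dict[str, str] = {}
    for i, removed in enumerate(elements):
        remaining = elements[:i] + elements[i + 1:]
        variants[removed] = join_elements(remaining)
    return variants


# Hypothetical element list built from phrases quoted in this article;
# it is NOT the author's actual prompt.
elements = [
    "A Polaroid instant photo",
    "summer festival",
    "yukata",
    "cotton candy in hand",
    "coherent anatomy",
]

for removed, prompt in leave_one_out(elements).items():
    print(f"without {removed!r}: {prompt}")
```

Each variant would then be generated several times and compared against the full prompt. An element whose removal changes nothing is a candidate for deletion.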
How Token Count Is Measured
Token counts in this article are measured values using the CLIP tokenizer (openai/clip-vit-large-patch14). Word count and token count do not match. For details, see Prompt Basics - How to Count Tokens.
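A minimal sketch of how such a count could be reproduced, assuming the Hugging Face `transformers` package is installed and the tokenizer files can be downloaded or are cached locally. The function names are hypothetical, and whether the article's figures exclude the start and end markers is an assumption (the sketch excludes them).

```python
from typing import Callable, Dict, List

CLIP_MODEL_ID = "openai/clip-vit-large-patch14"  # tokenizer named in the article


def load_clip_tokenize() -> Callable[[str], List[str]]:
    """Return a function that splits a prompt into CLIP sub-word tokens.

    Assumes `transformers` is installed; the import is lazy so the rest of
    this sketch works without it.
    """
    from transformers import CLIPTokenizer

    tokenizer = CLIPTokenizer.from_pretrained(CLIP_MODEL_ID)
    # tokenize() returns sub-word tokens without the start/end markers.
    return tokenizer.tokenize


def count_tokens(prompt: str, tokenize: Callable[[str], List[str]]) -> int:
    """Count prompt tokens, excluding start/end markers (assumption)."""
    return len(tokenize(prompt))


def compare_counts(prompt: str, tokenize: Callable[[str], List[str]]) -> Dict[str, int]:
    """Report word count next to token count to show that they can differ."""
    return {"words": len(prompt.split()), "tokens": count_tokens(prompt, tokenize)}


# Usage (needs network access or a local cache, so it is left commented out):
# tokenize = load_clip_tokenize()
# print(compare_counts("A Polaroid instant photo", tokenize))
```

Passing the tokenizer in as a function keeps the counting logic independent of any one library, so a different tokenizer can be swapped in if the model's text encoder differs.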
God Prompt 1: Summer Festival Polaroid
Intent: A Polaroid photo of a woman in a yukata holding cotton candy and smiling under festival lanterns.
Minimal Version — 9 Sample Images
[Image grid: 9 samples from the minimal version]
Elements Removed by Ablation Testing
See detailed test results here
| Removed element | Reason |
|---|---|
| `paper lantern warm light` | Lanterns naturally appear from the `summer festival` association (test result) |
| `food stalls blurred in background` | `summer festival` alone produces stalls |
| `Polaroid instant film look, slightly faded colors, soft vignette, warm nostalgic tint, fixed focus.` | The opening `A Polaroid instant photo` is sufficient |
| `coherent anatomy.` | No effect in z-image-turbo |
75 tokens → 44 tokens (31 tokens removed). The Polaroid border, faded colors, lanterns, yukata, and cotton candy are all preserved.
Comparison with original
Original version samples (for reference):
[Image row: 3 samples from the original version]
Why It’s Stable
- `A Polaroid instant photo` is extremely effective in CLIP: the training data contains large amounts of Polaroid photography, so white borders, faded colors, and vignette are reproduced all at once
- The components of “summer festival” are simple and unambiguous: `paper lantern` + `yukata` + `dusk` pins down the scene with no ambiguity
- `cotton candy in hand` contributes to hand stability: giving the subject something to hold stabilizes finger rendering
- 44 tokens leaves plenty of room below the 75-token limit: fitting entirely within one chunk means every element gets full effect
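The one-chunk argument can be illustrated with a small sketch. It assumes the pipeline splits long prompts into fixed 75-token chunks, which is how the article describes the limit. The function names are hypothetical, and placeholder tokens stand in for real tokenizer output.

```python
from typing import List, Sequence

CHUNK_SIZE = 75  # usable tokens per chunk, per the 75-token limit in the article


def split_into_chunks(tokens: Sequence[str], chunk_size: int = CHUNK_SIZE) -> List[List[str]]:
    """Split a token sequence into consecutive chunks of at most chunk_size tokens."""
    return [list(tokens[i:i + chunk_size]) for i in range(0, len(tokens), chunk_size)]


def headroom(token_count: int, chunk_size: int = CHUNK_SIZE) -> int:
    """Tokens left before a prompt spills into a second chunk (negative if over)."""
    return chunk_size - token_count


# Placeholder tokens: only the counts matter here, not the token text.
minimal = ["tok"] * 44   # Summer Festival minimal version, per the article
too_long = ["tok"] * 85  # Morning Bed original version, per the article

print(len(split_into_chunks(minimal)), headroom(len(minimal)))
print([len(chunk) for chunk in split_into_chunks(too_long)], headroom(len(too_long)))
```

Under this assumption, a prompt with negative headroom has its trailing elements pushed into a second chunk, which is the situation the minimal versions avoid.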
God Prompt 2: Morning Bed Intimate Portrait
Intent: An intimate shot of a woman lying in bed with rumpled white sheets, lit by morning light through curtains.
Minimal Version — 9 Sample Images
[Image grid: 9 samples from the minimal version]
Elements Removed by Ablation Testing
| Removed element | Reason |
|---|---|
| `intimate portrait quality, shallow depth of field, soft bokeh background, gentle lighting on face.` | Shallow bokeh and intimate quality are maintained from the opening `An intimate close-up portrait` (test result) |
| `coherent anatomy, natural skin texture.` | No effect in z-image-turbo |
85 tokens → 54 tokens (31 tokens removed). Even with the trailing quality instructions removed, the opening An intimate close-up portrait sufficiently defines the composition, bokeh, and intimate atmosphere. Now fits within 75 tokens.
Comparison with original
Original version samples (for reference):
[Image row: 3 samples from the original version]
Why It’s Stable
- `An intimate close-up portrait` simultaneously determines composition and atmosphere: “close-up” and “intimate” are covered in one efficient phrase
- The environment is simple with no contradictions: just `in bed, white sheets, morning light through curtains`
- `chin resting on hands` prevents hand rendering failures: the chin-on-hands pose stabilizes finger depiction
- `half-closed eyes` controls expression: the half-closed eyes stably convey a “just woke up” / “drowsy” mood
God Prompt 3: Cafe Window Snap
Intent: A natural, casual-looking shot — like a friend caught zoning out at a cafe window seat, taken on a smartphone.
Minimal Version — 9 Sample Images
[Image grid: 9 samples from the minimal version]
Elements Removed by Ablation Testing
| Removed element | Reason |
|---|---|
| `A candid iPhone snapshot of an actress in her everyday life.` (entire opening sentence) | Scene description tags sufficiently define composition and atmosphere (test result) |
| `The photo feels imperfect and unposed: ...` (21-word block) | Removed along with opening; scene description tags are sufficient |
| `photorealistic, snapshot aesthetic.` | z-image-turbo is photorealistic by default (test result) |
| `natural skin texture, coherent anatomy.` | No effect in z-image-turbo |
83 tokens → 25 tokens (58 tokens removed), a massive reduction. The opening natural language sentence, quality keywords, and unnecessary modifiers are all gone. Scene description tags alone reproduce the composition, lighting, and atmosphere.
Comparison with original
Original version samples (for reference):
[Image row: 3 samples from the original version]
Why It’s Stable
- Scene description tags are specific and unambiguous: `small cafe window seat` + `natural overcast daylight` + `beige oversized knit sweater` pins down the scene
- `looking out window` shifts the gaze away from camera: gives the candid, natural-moment nuance
- `actress` guides facial direction: steers toward expressive, attractive facial features (test result)
- 25 tokens leaves a huge margin below the 75-token limit: fits entirely within one chunk, every element gets full effect
Common Patterns Across God Prompts
| Feature | Summer Festival Polaroid | Morning Bed | Cafe Snap |
|---|---|---|---|
| CLIP token count | 44 | 54 | 25 |
| Reduction from original | -31 tokens | -31 tokens | -58 tokens |
| Within 75 tokens | ◎ | ◎ | ◎ |
| Environment complexity | Low | Low | Low |
| Hand management | Holding cotton candy | Chin on hands | N/A (hands not visible) |
| Contradicting elements | None | None | None |
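The reduction figures can be checked with a few lines of arithmetic. The token counts below are the ones reported in this article. The `reduction` helper and the percentage column are additions for illustration only.

```python
from typing import Dict, Tuple

# Token counts as reported in this article: (original, minimal).
TOKEN_COUNTS: Dict[str, Tuple[int, int]] = {
    "Summer Festival Polaroid": (75, 44),
    "Morning Bed": (85, 54),
    "Cafe Snap": (83, 25),
}


def reduction(original: int, minimal: int) -> Tuple[int, float]:
    """Return (tokens removed, percent of the original removed, rounded to 0.1)."""
    removed = original - minimal
    return removed, round(100 * removed / original, 1)


for name, (original, minimal) in TOKEN_COUNTS.items():
    removed, percent = reduction(original, minimal)
    print(f"{name}: {original} -> {minimal} ({removed} removed, {percent}%)")
```

Expressed as a share of the original, Cafe Snap loses roughly 70% of its tokens, while the other two lose roughly 36% to 41%.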
The 5 Conditions for a God Prompt
- Simple environment: fewer elements means fewer failures
- Hand management: give the subject something to hold, or use a pose where hands aren’t visible
- No contradicting instructions: lighting and location descriptions must be consistent
- Within 75 tokens: fitting within one chunk means every element gets full effect
- No unnecessary quality words: `coherent anatomy`, `natural skin texture`, `8K` are ineffective
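Two of the five conditions (the token limit and the quality-word rule) can be checked mechanically. The following is a rough sketch of such a check, not a tool from the article: the function names are hypothetical, the token count is supplied by the caller, and the phrase list contains only the three quality words named above. The remaining conditions still need a human eye.

```python
import re
from typing import List

# Quality phrases the article reports as having no effect in z-image-turbo.
INEFFECTIVE_PHRASES = ("coherent anatomy", "natural skin texture", "8k")
TOKEN_LIMIT = 75


def find_ineffective_phrases(prompt: str) -> List[str]:
    """Return the listed quality phrases that occur in the prompt (case-insensitive)."""
    lowered = prompt.lower()
    return [
        phrase
        for phrase in INEFFECTIVE_PHRASES
        if re.search(rf"\b{re.escape(phrase)}\b", lowered)
    ]


def lint_prompt(prompt: str, token_count: int) -> List[str]:
    """Collect warnings for the two conditions that can be checked automatically."""
    warnings: List[str] = []
    if token_count > TOKEN_LIMIT:
        warnings.append(f"{token_count} tokens exceeds the {TOKEN_LIMIT}-token limit")
    for phrase in find_ineffective_phrases(prompt):
        warnings.append(f"quality phrase with no reported effect: {phrase!r}")
    return warnings


# Hypothetical prompt and token count, for illustration only.
for warning in lint_prompt("An intimate close-up portrait, coherent anatomy, 8K", 85):
    print(warning)
```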