Conclusion
- face mask and surgical mask produce very similar outputs. Both render as disposable non-woven masks; face mask tends to produce more color variation (light blue, black, white)
- masquerade mask has a large impact on outfits. Without an explicit outfit in the prompt, a Japanese-style kimono appeared in all 3 images. However, specifying “casual outfit” or “black evening dress” reduced kimono appearances to 0/3 in each case
- gas mask renders as a full-face military-style gas mask that almost completely hides the face. Without an outfit specification, kimono appeared in all 3 images, but specifying “casual outfit” suppressed kimono to 0/3
- ski mask triggers a full ski outfit transformation. Goggles, helmet, and ski jacket appear as a set, with the background shifting to a ski resort
- Explicit outfit keywords are effective at overriding side effects. Specifying an outfit completely suppressed the kimono effect from masquerade mask and gas mask (0/9 kimono appearances across the three outfit-specified conditions)
What This Article Covers
- Output differences across mask types specified in English prompts
- Whether and how much mask types affect outfits and backgrounds as side effects
- Rendering stability of each mask (reproducibility across 3 seeds)
- Whether specifying an outfit keyword can override the kimono side effect from masks
Experimental Conditions
| Item | Value |
|---|---|
| Model | z-image-turbo |
| Steps | 8 |
| Sampler | euler |
| Scheduler | ddim_uniform |
| CFG | 1.0 |
| Image size | 1024×1024 |
| Seeds | 42, 123, 789 (3 fixed seeds) |
Base Prompt
1girl, 32yo japanese actress, {MASK}, standing, looking at viewer, indoor
{MASK} is swapped per condition.
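The substitution can be sketched in Python as follows. This is a minimal illustration, not the tooling actually used for the article: the names `BASE_PROMPT`, `MASKS`, `SEEDS`, and `build_jobs` are hypothetical, and the sketch only builds prompt strings without calling any image-generation API.

```python
from itertools import product

# Base prompt from this article; {MASK} is the only part that changes.
BASE_PROMPT = "1girl, 32yo japanese actress, {MASK}, standing, looking at viewer, indoor"

# Mask keywords for Conditions A-E and the three fixed seeds.
MASKS = ["face mask", "surgical mask", "masquerade mask", "gas mask", "ski mask"]
SEEDS = [42, 123, 789]


def build_jobs(masks=MASKS, seeds=SEEDS):
    """Return one (prompt, seed) pair per combination of mask keyword and seed."""
    return [
        (BASE_PROMPT.replace("{MASK}", mask), seed)
        for mask, seed in product(masks, seeds)
    ]


if __name__ == "__main__":
    for prompt, seed in build_jobs():
        print(seed, prompt)
```

Five mask keywords across three seeds give 15 prompt and seed pairs, one per generated image in Experiment 1.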
Condition A: face mask (2 tokens)
1girl, 32yo japanese actress, face mask, standing, looking at viewer, indoor
Observations
- All 3 images rendered a disposable non-woven mask covering the nose and mouth
- Mask colors varied: light blue (seed42), black (seed123), white (seed789)
- Outfits were casual clothing (knit, coat, striped t-shirt) — no outfit impact from the mask
- Background was inside a train station (seed42) or against a wall (seed123, seed789), broadly following the indoor instruction
- Only seed42 included bystanders in the background, producing a candid snapshot-style composition
Condition B: surgical mask (2 tokens)
1girl, 32yo japanese actress, surgical mask, standing, looking at viewer, indoor
Observations
- All 3 images rendered a light blue non-woven mask. Unlike face mask, the color is consistently light blue
- Mask shape and placement are nearly identical to face mask — the only visible difference is the color consistency
- Outfits were casual (knit, coat + black top, checked jacket)
- All backgrounds were against a wall, without the station interior variation seen in face mask seed42
- Overall, slightly higher compositional and background stability than face mask
Condition C: masquerade mask (2 tokens)
1girl, 32yo japanese actress, masquerade mask, standing, looking at viewer, indoor
Observations
- All 3 images rendered a Venetian-style mask covering the eyes, with gold edging throughout
- Mask designs varied: black + gold trim (seed42, seed789), pale green + gold trim + floral pattern (seed123)
- All outfits changed to Japanese kimono. Despite no outfit specification in the prompt, patterned kimono appeared in combination with masquerade mask (3/3 images)
- Seed42 included a hand touching the mask, adding a masquerade ball gesture on top of the “looking at viewer” instruction
- Backgrounds consistently showed Japanese-style interiors (picture frames, walls)
- Composition shifted toward bust-up, with the face more zoomed in than face mask/surgical mask
Condition D: gas mask (2 tokens)
1girl, 32yo japanese actress, gas mask, standing, looking at viewer, indoor
Observations
- All 3 images rendered a full-face military gas mask with transparent goggle section and cylindrical filters on both sides
- Mask color is consistently olive green across all 3, with high reproducibility
- All outfits changed to Japanese kimono. Similar to masquerade mask, though seed42 also added leather straps over the kimono from the mask’s harness
- Face visibility is the lowest of all 5 conditions: the eyes are only barely visible through the goggles
- Backgrounds are Japanese-style wall settings, similar to the masquerade mask results
Condition E: ski mask (2 tokens)
1girl, 32yo japanese actress, ski mask, standing, looking at viewer, indoor
Observations
- While “ski mask” normally refers to a balaclava, all 3 images rendered it as a ski face covering + goggles as part of full ski gear
- Mask portions varied: helmet + goggles + blue cloth mask (seed42), goggles + black neck warmer mask (seed123), goggles + white non-woven mask (seed789)
- Outfits changed to ski wear (jacket, goggles, helmet). Gray ski jackets appeared in all 3 images
- Backgrounds changed to a ski resort setting, rendered as an indoor ski facility (most prominent in seed42). The indoor instruction was followed, but interpreted as a ski facility interior
- The strongest outfit + background side effect of all 5 conditions — essentially overwrites the entire scene context with “ski resort”
Experiment 2: Testing Side Effect Override
Experiment 1 confirmed kimono side effects from masquerade mask and gas mask. Can these be suppressed by explicitly specifying an outfit keyword?
Experimental Conditions
Same model, parameters, and seeds as Experiment 1. The only change is that an outfit keyword is added right after the mask keyword.
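As a rough sketch of how the Experiment 2 prompts are assembled, the outfit keyword can be treated as an optional extra placed immediately after the mask keyword. The function name `build_prompt` and the `CONDITIONS` mapping are hypothetical names introduced for illustration only; the prompt text itself comes from the conditions listed below.

```python
BASE_PROMPT = "1girl, 32yo japanese actress, {MASK}, standing, looking at viewer, indoor"


def build_prompt(mask, outfit=None):
    """Fill the {MASK} slot, placing the optional outfit keyword right after the mask."""
    slot = mask if outfit is None else f"{mask}, {outfit}"
    return BASE_PROMPT.replace("{MASK}", slot)


# Conditions F, G, and H as (mask keyword, outfit keyword) pairs.
CONDITIONS = {
    "F": ("masquerade mask", "casual outfit"),
    "G": ("masquerade mask", "black evening dress"),
    "H": ("gas mask", "casual outfit"),
}

if __name__ == "__main__":
    for label, (mask, outfit) in CONDITIONS.items():
        print(label, build_prompt(mask, outfit))
```

Calling `build_prompt` without an outfit reproduces the Experiment 1 prompt for the same mask, so the two experiments differ only in the added keyword.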
Condition F: masquerade mask + casual outfit
1girl, 32yo japanese actress, masquerade mask, casual outfit, standing, looking at viewer, indoor
Observations
- All 3 images rendered eye-covering masks: plain gold (seed42), black decorative (seed123), white decorative (seed789)
- Kimono appeared in 0/3 images. A stark contrast to Condition C (masquerade mask alone), where kimono appeared in all 3
- Outfits were casual: denim jacket + striped t-shirt + denim shorts (seed42), gray overcoat + black top + jeans (seed123), striped long t-shirt (seed789)
- Mask designs are simpler than the gold-trimmed Venetian style seen in Condition C
- Backgrounds are wall settings, different from the Japanese-style interiors in Condition C
Condition G: masquerade mask + black evening dress
1girl, 32yo japanese actress, masquerade mask, black evening dress, standing, looking at viewer, indoor
Observations
- All 3 images rendered black Venetian masks — all 3 featured feather decorations, more elaborate than Condition C
- Kimono appeared in 0/3 images. The black evening dress specification suppressed kimono
- All 3 images rendered black long dresses, with variation in silhouette: sequined V-neck with slit (seed42), V-neck mermaid line (seed123), corset-style strapless (seed789)
- Only seed789 used a sideways pose, slightly deviating from “looking at viewer”
- Backgrounds were hotel lobby (seed42) and wall settings (seed123, seed789) — different from the Japanese-style interiors in C
- The masquerade mask + dress combination strongly evokes a masquerade ball atmosphere
Condition H: gas mask + casual outfit
1girl, 32yo japanese actress, gas mask, casual outfit, standing, looking at viewer, indoor
Observations
- All 3 images rendered full-face military gas masks. The shape (olive green body + cylindrical filters on both sides) is identical to Condition D
- Kimono appeared in 0/3 images. A stark contrast to Condition D (gas mask alone), where kimono appeared in all 3
- Outfits were casual: gray t-shirt + beige chinos (seed42), gray hoodie (seed123), gray sweatshirt + jeans (seed789)
- Gas mask shape and color unchanged by the addition of outfit keywords — no influence on the mask rendering itself
- Backgrounds are wall settings, different from the Japanese-style walls in Condition D
Summary
Cross-Comparison: Experiment 1 (Mask Type)
[Image grid: face mask, surgical mask, masquerade mask, gas mask, and ski mask (rows) × seeds 42, 123, and 789 (columns)]
Outfit and background side effects vary significantly by mask type.
- No side effects: face mask, surgical mask (outfit and background follow the base prompt)
- Outfit side effects: masquerade mask, gas mask (outfit changed to kimono in 3/3 images)
- Outfit + background side effects: ski mask (outfit changed to ski wear and background to a ski resort in 3/3 images)
face mask and surgical mask produce nearly identical outputs, so they are largely interchangeable. The one difference is color: surgical mask produced a light blue mask in all 3 images, so choose it when you want a consistently light blue mask, and face mask when color variation is acceptable.
Cross-Comparison: Experiment 2 (Side Effect Override)
[Image grid: masquerade mask (no outfit), masquerade mask + casual outfit, masquerade mask + black evening dress, gas mask (no outfit), and gas mask + casual outfit (rows) × seeds 42, 123, and 789 (columns)]
Explicitly specifying outfit keywords completely suppressed the kimono side effect from masquerade mask and gas mask.
| Condition | Kimono Rate | Outfit Side Effect |
|---|---|---|
| masquerade mask (no outfit) | 3/3 | Yes |
| masquerade mask + casual outfit | 0/3 | No |
| masquerade mask + black evening dress | 0/3 | No |
| gas mask (no outfit) | 3/3 | Yes |
| gas mask + casual outfit | 0/3 | No |
- Adding outfit keywords reduced kimono appearances to 0/9 images across the three outfit-specified conditions
- Even an abstract specification like “casual outfit” is effective enough for the override
- Adding outfit keywords showed no apparent influence on the mask itself (shape/color unchanged)
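The 0/9 figure is simply the three outfit-specified conditions pooled together. A minimal sketch of that arithmetic, using the counts from the table above, is shown here; `KIMONO_COUNTS` and `pooled_rate` are hypothetical names introduced for illustration. With only three seeds per condition, these counts are small samples.

```python
# Kimono counts per condition as (images with kimono, images generated),
# copied from the Kimono Rate table in this article.
KIMONO_COUNTS = {
    "masquerade mask (no outfit)": (3, 3),
    "masquerade mask + casual outfit": (0, 3),
    "masquerade mask + black evening dress": (0, 3),
    "gas mask (no outfit)": (3, 3),
    "gas mask + casual outfit": (0, 3),
}


def pooled_rate(counts, with_outfit):
    """Sum kimono hits and image totals over conditions with or without an outfit keyword."""
    selected = [
        value
        for name, value in counts.items()
        if ("(no outfit)" not in name) == with_outfit
    ]
    hits = sum(h for h, _ in selected)
    total = sum(t for _, t in selected)
    return hits, total


if __name__ == "__main__":
    print("without outfit keyword: %d/%d" % pooled_rate(KIMONO_COUNTS, with_outfit=False))
    print("with outfit keyword: %d/%d" % pooled_rate(KIMONO_COUNTS, with_outfit=True))
```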
Why does masquerade mask produce a kimono? And ski mask just drags the whole scene to a ski resort. But then one keyword, casual outfit, makes the kimono disappear entirely. The overriding power of outfit keywords is wild. The key takeaway here is “mask-type keywords can have a big impact on outfits, so you should always specify an outfit at the same time.” Worth remembering that even a vague spec like casual outfit works fine.