Z-Anime
Generate high-quality anime-style images from natural-language prompts. Z-Anime is a full fine-tune of Alibaba’s Z-Image Base on anime aesthetics — not a LoRA merge, but a fully retrained 6B-parameter diffusion transformer (SeeSee21/Z-Anime on Hugging Face, Apache-2.0). It excels at expressive characters, rich lighting, and detailed line work, and works best with descriptive natural-language prompts rather than tag lists.
What it’s good at
- Cinematic anime portraits with detailed eyes, hair, and skin shading.
- Full-body characters, action scenes, and atmospheric backgrounds.
- Fantasy, sci-fi, slice-of-life, and shōnen/shōjo styles.
- Strong prompt adherence: describe what you want in full sentences and the output follows the description closely.
- Full negative-prompt support for steering away from unwanted artifacts.
Inputs
- prompt: natural-language description of the image you want. Describe the subject, setting, lighting, color palette, and style. Longer, more descriptive prompts produce better results than tag lists.
- negative_prompt (optional): things to avoid. Leave blank to disable. A common starter is “low quality, worst quality, blurry, extra fingers, bad anatomy, text, watermark”.
- aspect_ratio: choose square, portrait, landscape, tall, or wide. Portrait (832×1216) is the default and suits character art best; landscape (1216×832) suits scenes; tall (768×1344) suits full-body shots; wide (1344×768) suits cinematic compositions.
- num_inference_steps: number of denoising steps. The default of 36 is the sweet spot. Lower values (20-28) trade quality for speed; higher values (50-80) give only marginal improvements and take longer to generate.
- guidance_scale: how strongly to follow the prompt. The default of 4.0 is balanced, and values between 3.0 and 5.0 generally work well. Values above 7.0 risk oversaturation and rigid compositions.
- seed: set to -1 for a random seed, or pin a specific number to reproduce results across runs.
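The inputs above can be collected into a single payload before a request is made. The following is a minimal sketch, not part of Z-Anime or any client library: the helper name `build_inputs`, the `ASPECT_RATIOS` table, and the validation rules are assumptions made for illustration, and the hosted model may enforce different limits. It applies only the defaults and options stated in this README and does not call the model.

```python
# Hypothetical helper: assembles the inputs described in this README into a dict.
# The validation rules below are illustrative assumptions, not the model's own.

# Options listed for aspect_ratio. Pixel sizes are the ones quoted in this
# README; "square" has no stated size there, so it maps to None.
ASPECT_RATIOS = {
    "square": None,
    "portrait": (832, 1216),
    "landscape": (1216, 832),
    "tall": (768, 1344),
    "wide": (1344, 768),
}


def build_inputs(
    prompt,
    negative_prompt="",
    aspect_ratio="portrait",
    num_inference_steps=36,
    guidance_scale=4.0,
    seed=-1,
):
    """Return a dict of model inputs using the README's documented defaults."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")
    if num_inference_steps < 1:
        raise ValueError("num_inference_steps must be at least 1")
    if guidance_scale < 0:
        raise ValueError("guidance_scale must not be negative")
    if seed < -1:
        raise ValueError("seed must be -1 (random) or a non-negative integer")
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,  # blank string disables it
        "aspect_ratio": aspect_ratio,
        "num_inference_steps": num_inference_steps,
        "guidance_scale": guidance_scale,
        "seed": seed,
    }


example = build_inputs(
    "A young witch reads by a window in warm afternoon light, close-up portrait.",
    negative_prompt="low quality, worst quality, blurry, extra fingers, bad anatomy, text, watermark",
    seed=1234,  # pinned so the same inputs can be replayed later
)
print(example)
```

Validating before sending keeps typos such as an unsupported aspect ratio from reaching the model, and pinning `seed` in the payload makes a run repeatable.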
Output
Returns a single PNG image at the chosen aspect ratio.
Prompting tips
- Write in natural sentences, not comma-separated tags.
- Lead with the most important details — Z-Anime weights early prompt tokens more heavily.
- Specify lighting (“warm afternoon light”, “soft rim lighting”), composition (“close-up portrait”, “wide cinematic shot”), and style cues (“expressive eyes with detailed reflections”, “fine line work”) to lift quality.
- For more consistent characters across multiple generations, pin the seed and keep the character description unchanged.
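The tips above can be applied mechanically when prompts are built in code. The sketch below is one possible approach, assuming a hypothetical helper named `compose_prompt` (it is not part of Z-Anime): it places the subject first, then composition, lighting, and style, and joins them as full sentences instead of comma-separated tags.

```python
# Hypothetical helper: orders prompt parts so the most important detail
# (the subject) comes first, and emits sentences rather than a tag list.

def compose_prompt(subject, composition="", lighting="", style=""):
    """Join the given parts into a sentence-based prompt, subject first."""
    if not subject.strip():
        raise ValueError("subject must not be empty")
    sentences = []
    for part in (subject, composition, lighting, style):
        # Normalise whitespace and trailing periods so each part ends with one.
        text = part.strip().rstrip(".").strip()
        if text:
            sentences.append(text[0].upper() + text[1:] + ".")
    return " ".join(sentences)


prompt = compose_prompt(
    subject="a silver-haired swordswoman stands on a rooftop at dusk",
    composition="the shot is a close-up portrait",
    lighting="soft rim lighting outlines her hair",
    style="her expressive eyes show detailed reflections",
)
print(prompt)
```

Empty parts are skipped, so the same helper works for a short prompt with only a subject and for a fully specified one.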
Use cases
- Concept art and character design for games, comics, and animation.
- Storyboarding and reference imagery for creative projects.
- Social media and marketing visuals with an anime aesthetic.
- Personal art exploration and rapid prototyping of visual ideas.
Limitations
- The model is anime-focused — photorealistic or non-anime styles are not its strength.
- Very long prompts (over ~512 tokens) are truncated.
- As with all diffusion models, hands and fine text can occasionally render imperfectly; a negative prompt helps.
- Output is non-deterministic without a fixed seed.
License
The model wrapper is Apache-2.0 (see GitHub repo). Underlying weights are governed by the upstream SeeSee21/Z-Anime model card.