About this model
Qwen Edit 2511 (Qwen-Image-Edit-2511) is the image-editing variant in Alibaba Tongyi Lab's Qwen-Image line, an instruction-based editing model built on the 20B-parameter Qwen-Image diffusion transformer (MMDiT) architecture. It needs no masks or control points: given a source image and a text prompt, it handles spatial reasoning and pixel-level transformation internally for tasks such as object removal, background replacement, material swaps, and lighting adjustments. Architecturally, the input image is fed both into Qwen2.5-VL for visual semantic control and into a VAE encoder for appearance control, which lets the model support low-level appearance edits as well as higher-level semantic edits such as style transfer and object rotation.
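As a rough illustration of the mask-free interface described above, the sketch below packages the only two inputs the model is said to need, a source image and a text instruction, into a JSON payload. The field names (`model`, `image_b64`, `prompt`), the `EditRequest` type, and the model identifier string are all hypothetical; this is not the DashScope schema or any third-party provider's schema, so check the provider's documentation before adapting it.

```python
import base64
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EditRequest:
    """Hypothetical request shape: one source image and one instruction, no mask."""
    model: str
    image_b64: str
    prompt: str


def build_edit_request(image_bytes: bytes, prompt: str,
                       model: str = "qwen-image-edit-2511") -> EditRequest:
    """Validate the two inputs and package them for a JSON-based endpoint.

    The default model string is an assumed placeholder, not a confirmed ID.
    """
    instruction = prompt.strip()
    if not image_bytes:
        raise ValueError("source image is empty")
    if not instruction:
        raise ValueError("prompt is empty")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return EditRequest(model=model, image_b64=encoded, prompt=instruction)


def to_json(request: EditRequest) -> str:
    """Serialize the request; keys are sorted so the output is deterministic."""
    return json.dumps(asdict(request), sort_keys=True)


if __name__ == "__main__":
    fake_png = b"\x89PNG\r\n\x1a\n"  # placeholder bytes, not a complete image
    req = build_edit_request(fake_png, "Replace the background with a beach at sunset")
    print(to_json(req))
```

The payload carries no mask or control-point field: under the description above, the edit region is inferred from the instruction rather than supplied by the caller.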
The developers position this release as an enhanced version of the earlier Qwen-Image-Edit-2509 checkpoint. Stated improvements include reduced image drift, better character consistency, integrated LoRA capabilities, stronger industrial-design generation, and stronger geometric reasoning. Notably, Qwen-Image-Edit-2511 bakes selected popular community-created LoRAs directly into the base model, so their effects are available without extra tuning.
Compared with the foundational Qwen Image generation model and the Qwen Image 2 editing sibling, the 2511 build emphasizes consistency and identity preservation across edits. The weights are openly released, and the model is also accessible through Alibaba Cloud's DashScope API and third-party serverless endpoints.
This About section is AI-generated (Claude Opus 4.8) from public sources, with no human editing. It may contain inaccuracies; verify critical details against the sources listed below.
Data sources: Venice API · HuggingFace · Wikipedia