Alibaba · 🖌️ Inpainting

Qwen Image 2

anonymized
Quick reference
Qwen Image 2 — TLDR
  • 🆕 Editing variant of Alibaba's Qwen Image 2 family, released 2026
  • 🔧 High-fidelity inpainting: object insertion, removal, and replacement
  • 🎯 Mask and instruction-driven edits that preserve surrounding composition
  • 📚 Strong in-image text rendering across English and Chinese typography
  • 👁️ Dual-path design: semantic understanding plus appearance fidelity
  • 💬 Natural-language editing instructions for low- and high-level changes
  • 🏢 Built by Alibaba's Qwen team, sibling to a Pro edit tier
💰 Pricing
$0.050
per edit
📅 On Venice since
Mar 4, 2026
92 days ago
Provider

Alibaba Group is a Chinese multinational technology company founded in 1999 and headquartered in Hangzhou, Zhejiang. Originally built around e-commerce and cloud computing, Alibaba has become one of the most prolific contributors to open-weight AI research,…

46 models on Venice
17 text · 16 video · 5 image · 4 inpaint · 2 embedding · 2 tts
Since Jan 11, 2025

About this model

Qwen Image 2 (edit) is the inpainting and image-editing member of Alibaba's Qwen Image 2 family, released in March 2026. It pairs with the text-to-image model Qwen Image 2 and the higher-fidelity Qwen Image 2 Pro edit tier, focusing on mask-guided inpainting, object editing, and text manipulation while keeping the rest of the frame intact.
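To illustrate what "keeping the rest of the frame intact" means in mask-guided editing, here is a minimal sketch of generic mask compositing. It is an illustrative assumption, not Qwen Image 2's actual pipeline: the `composite` helper and the toy grayscale grids are hypothetical and exist only to show that pixels outside the mask pass through unchanged.

```python
def composite(original, edited, mask):
    """Blend two same-sized grayscale grids pixel by pixel.

    mask = 1 takes the edited pixel, mask = 0 keeps the original pixel,
    and values in between blend the two linearly.
    Hypothetical helper for illustration only.
    """
    return [
        [m * e + (1 - m) * o for o, e, m in zip(row_o, row_e, row_m)]
        for row_o, row_e, row_m in zip(original, edited, mask)
    ]


# Toy 2x3 "images": the original is uniform 10, the edit is uniform 99.
original = [[10, 10, 10], [10, 10, 10]]
edited = [[99, 99, 99], [99, 99, 99]]

# Only two pixels are marked for editing.
mask = [[0, 1, 0], [0, 0, 1]]

result = composite(original, edited, mask)
print(result)
```

Only the two masked pixels take the edited value; every other pixel keeps its original value, which is the property the description above refers to.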

Architecturally, the Qwen Image editing line feeds the input image to two encoders: a vision-language encoder for semantic control and a VAE encoder for appearance fidelity. This enables both low-level edits (adding, removing, or modifying elements) and high-level semantic changes such as style transfer.

The same-family predecessor, the original editing model Qwen Edit 2511, was built on the 20B Qwen-Image backbone, which introduced precise bilingual text editing. Qwen Image 2 (edit) continues that lineage as the editing arm of the second-generation family, carrying instruction-driven inpainting forward into a refreshed release.

The line continues to emphasize in-image text rendering in English and Chinese, with accurate typography and style matching across both languages. This capability is documented for the editing model and carried forward from the first-generation Qwen Image model. Widely circulated benchmark figures for this release come from third-party blogs rather than Alibaba's own materials, so they are omitted here pending primary confirmation.

This About section is AI-generated from public sources (Claude Opus 4.8), with no human editing. It may contain inaccuracies; verify critical details against the sources listed below.

Data sources: Venice API · HuggingFace · Wikipedia — enrichment updated 1d ago