{"id":779,"date":"2026-04-23T03:40:51","date_gmt":"2026-04-22T19:40:51","guid":{"rendered":"https:\/\/connectword.dpdns.org\/?p=779"},"modified":"2026-04-23T03:40:51","modified_gmt":"2026-04-22T19:40:51","slug":"alibaba-qwen-team-releases-qwen3-6-27b-a-dense-open-weight-model-outperforming-397b-moe-on-agentic-coding-benchmarks","status":"publish","type":"post","link":"https:\/\/connectword.dpdns.org\/?p=779","title":{"rendered":"Alibaba Qwen Team Releases Qwen3.6-27B: A Dense Open-Weight Model Outperforming 397B MoE on Agentic Coding Benchmarks"},"content":{"rendered":"<p>Alibaba\u2019s Qwen Team has released Qwen3.6-27B, the first dense open-weight model in the Qwen3.6 family \u2014 and arguably the most capable 27-billion-parameter model available today for coding agents. It brings substantial improvements in agentic coding, a novel Thinking Preservation mechanism, and a hybrid architecture that blends Gated DeltaNet linear attention with traditional self-attention \u2014 all under an Apache 2.0 license.<\/p>\n<p>The release comes weeks after Qwen3.6-35B-A3B, a sparse Mixture-of-Experts (MoE) model with only 3B active parameters, which itself followed the broader Qwen3.5 series. Qwen3.6-27B is the family\u2019s second model and the first fully dense variant \u2014 and on several key benchmarks, it actually outperforms both Qwen3.6-35B-A3B and the much larger Qwen3.5-397B-A17B MoE model. The Qwen team describes the release as prioritizing \u201cstability and real-world utility,\u201d shaped by direct community feedback rather than benchmark optimization.<\/p>\n<p>The Qwen team has released two weight variants on Hugging Face Hub: <code>Qwen\/Qwen3.6-27B<\/code> in BF16 and <code>Qwen\/Qwen3.6-27B-FP8<\/code>, which applies fine-grained FP8 quantization with a block size of 128 and delivers performance metrics nearly identical to those of the original model. 
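Fine-grained (block-wise) quantization of this kind is simple to picture: one scale factor is stored per contiguous block of 128 weights instead of per tensor. The NumPy mock-up below is illustrative only \u2014 it uses integer rounding as a stand-in for true float8 e4m3 rounding and is not the released quantization code:

```python
import numpy as np

BLOCK = 128
FP8_E4M3_MAX = 448.0  # largest finite magnitude representable in float8 e4m3

def quantize_blockwise(w):
    # One scale per contiguous block of 128 weights: each block is mapped
    # into the FP8 dynamic range using its own max-abs value.
    blocks = w.reshape(-1, BLOCK)
    scales = np.abs(blocks).max(axis=1, keepdims=True) / FP8_E4M3_MAX
    q = np.clip(np.round(blocks / scales), -FP8_E4M3_MAX, FP8_E4M3_MAX)
    return q, scales

def dequantize_blockwise(q, scales):
    # Reverse the per-block scaling and restore the flat weight vector.
    return (q * scales).reshape(-1)

rng = np.random.default_rng(0)
w = rng.standard_normal(4 * BLOCK).astype(np.float32)
q, scales = quantize_blockwise(w)
w_hat = dequantize_blockwise(q, scales)
max_abs_err = float(np.abs(w - w_hat).max())  # small: per-block scales are tight
```

Because each 128-value block gets its own scale, an outlier in one block cannot blow up the quantization error of every other block \u2014 the usual motivation for fine-grained schemes over per-tensor scaling.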
Both variants are compatible with SGLang (&gt;=0.5.10), vLLM (&gt;=0.19.0), KTransformers, and Hugging Face Transformers.<\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1722\" height=\"918\" data-attachment-id=\"79223\" data-permalink=\"https:\/\/www.marktechpost.com\/2026\/04\/22\/alibaba-qwen-team-releases-qwen3-6-27b-a-dense-open-weight-model-outperforming-397b-moe-on-agentic-coding-benchmarks\/screenshot-2026-04-22-at-12-39-13-pm-2\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/04\/Screenshot-2026-04-22-at-12.39.13-PM-1.png\" data-orig-size=\"1722,918\" data-comments-opened=\"0\" data-image-meta='{\"aperture\":\"0\",\"credit\":\"\",\"camera\":\"\",\"caption\":\"\",\"created_timestamp\":\"0\",\"copyright\":\"\",\"focal_length\":\"0\",\"iso\":\"0\",\"shutter_speed\":\"0\",\"title\":\"\",\"orientation\":\"0\"}' data-image-title=\"Screenshot 2026-04-22 at 12.39.13\u202fPM\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/04\/Screenshot-2026-04-22-at-12.39.13-PM-1-1024x546.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/04\/Screenshot-2026-04-22-at-12.39.13-PM-1.png\" alt=\"\" class=\"wp-image-79223\" \/><figcaption class=\"wp-element-caption\">https:\/\/qwen.ai\/blog?id=qwen3.6-27b<\/figcaption><\/figure>\n<\/div>\n<h3 class=\"wp-block-heading\"><strong>What\u2019s New: Two Key Features<\/strong><\/h3>\n<p><strong>Agentic Coding<\/strong> is the first major upgrade. The model has been specifically optimized to handle frontend workflows and repository-level reasoning \u2014 tasks that require understanding a large codebase, navigating file structures, editing across multiple files, and producing consistent, runnable output. 
On QwenWebBench, an internal bilingual (EN\/CN) front-end code generation benchmark spanning seven categories \u2014 Web Design, Web Apps, Games, SVG, Data Visualization, Animation, and 3D \u2014 Qwen3.6-27B scores 1487, a significant jump from 1068 for Qwen3.5-27B and 1397 for Qwen3.6-35B-A3B. On NL2Repo, which tests repository-level code generation, the model scores 36.2 versus 27.3 for Qwen3.5-27B. On SWE-bench Verified \u2014 the community standard for autonomous software engineering agents \u2014 it reaches 77.2, up from 75.0, and is competitive with Claude 4.5 Opus\u2019s 80.9.<\/p>\n<p><strong>Thinking Preservation<\/strong> is the second, and arguably more architecturally interesting, addition. By default, most LLMs retain only the chain-of-thought (CoT) reasoning generated for the current user message; reasoning from earlier turns is discarded. Qwen3.6 introduces a new option \u2014 enabled via <code>\"chat_template_kwargs\": {\"preserve_thinking\": True}<\/code> in the API \u2014 to retain and leverage thinking traces from historical messages across the entire conversation. For iterative agent workflows, this is practically significant: the model carries forward previous reasoning context rather than re-deriving it each turn. This can reduce overall token consumption by minimizing redundant reasoning and also improve KV cache utilization.<\/p>\n<h3 class=\"wp-block-heading\"><strong>Under the Hood: A Hybrid Architecture<\/strong><\/h3>\n<p>Qwen3.6-27B is a causal language model with a vision encoder: it is natively multimodal, supporting text, image, and video inputs, and was trained through both pre-training and post-training stages.<\/p>\n<p>The model has 27B parameters distributed across 64 layers, with a hidden dimension of 5120 and a padded vocabulary of 248,320 tokens. 
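Returning briefly to the Thinking Preservation switch described above: on an OpenAI-compatible serving endpoint (such as one backed by vLLM or SGLang), it would travel as an extra field in the request body. The sketch below only constructs such a payload; the <code>chat_template_kwargs<\/code> entry is from the release notes, while the endpoint shape and surrounding fields are assumptions:

```python
import json

# Hypothetical request body for an OpenAI-compatible /v1/chat/completions
# endpoint. Only the chat_template_kwargs field is documented for this
# release; the rest of the request is a generic illustration.
payload = {
    "model": "Qwen/Qwen3.6-27B",
    "messages": [
        {"role": "user", "content": "Audit utils.py for duplicated logic."},
    ],
    # Keep chain-of-thought from earlier turns in the rendered prompt,
    # instead of discarding it on each new user message.
    "chat_template_kwargs": {"preserve_thinking": True},
}

body = json.dumps(payload)  # what would be POSTed to the endpoint
```

With the official <code>openai<\/code> Python client, a non-standard field like this is typically forwarded via the client\u2019s <code>extra_body<\/code> argument rather than placed at the top level of the call.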
The layer layout follows a distinctive repeating pattern: 16 blocks, each structured as <code>3 \u00d7 (Gated DeltaNet \u2192 FFN) \u2192 1 \u00d7 (Gated Attention \u2192 FFN)<\/code>. This means three out of every four sublayers use Gated DeltaNet \u2014 a form of linear attention \u2014 with only every fourth sublayer using standard Gated Attention.<\/p>\n<p><strong>What is Gated DeltaNet?<\/strong> Traditional self-attention computes relationships between every token pair, which scales quadratically (O(n\u00b2)) with sequence length \u2014 expensive for long contexts. Linear attention mechanisms like DeltaNet approximate this with linear complexity (O(n)), making them significantly faster and more memory-efficient. Gated DeltaNet adds a gating mechanism on top, essentially learning when to update or retain information, similar in spirit to LSTM gating but applied to the attention computation. In Qwen3.6-27B, Gated DeltaNet sublayers use 48 linear attention heads for values (V) and 16 for queries and keys (QK), with a head dimension of 128.<\/p>\n<p>The Gated Attention sublayers use 24 attention heads for queries (Q) and only 4 for keys and values (KV) \u2014 a configuration that significantly reduces KV cache memory at inference time. These layers have a head dimension of 256 and use Rotary Position Embedding (RoPE) with a rotation dimension of 64. The FFN intermediate dimension is 17,408.<\/p>\n<p>The model also uses Multi-Token Prediction (MTP), trained with multiple prediction steps. At inference time, this enables speculative decoding \u2014 where the model drafts multiple candidate tokens and verifies them in parallel \u2014 improving throughput without compromising quality.<\/p>\n<h3 class=\"wp-block-heading\"><strong>Context Window: 262K Native, 1M with YaRN<\/strong><\/h3>\n<p>Natively, Qwen3.6-27B supports a context length of 262,144 tokens \u2014 enough to hold a large codebase or a book-length document. 
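The hybrid stack described above is easy to make concrete. The toy sketch below enumerates the 3:1 interleaving and runs one step of a generic gated delta-rule recurrence \u2014 a decayed fast-weight state plus an error-correcting write \u2014 which is the family of update Gated DeltaNet belongs to, not Qwen\u2019s exact formulation:

```python
import numpy as np

# The 64-layer stack: 16 repeats of (3 x Gated DeltaNet -> 1 x Gated Attention),
# so 48 of 64 sublayers use linear attention.
layers = (["gated_deltanet"] * 3 + ["gated_attention"]) * 16

def gated_delta_step(S, k, v, alpha, beta):
    """One recurrent step of a generic gated delta rule.

    S (d_k x d_v) is the fast-weight state carried across tokens;
    alpha in [0, 1] gates how much old state is retained, and beta
    scales the error-correcting (delta-rule) write for this token.
    """
    v_pred = S.T @ k                                 # what the state predicts for key k
    S = alpha * S + beta * np.outer(k, v - v_pred)   # decay, then correct the error
    return S

# Tiny demo: store one (k, v) association, then read it back with the same key.
S = np.zeros((4, 4))
k = np.array([1.0, 0.0, 0.0, 0.0])
v = np.array([0.0, 2.0, 0.0, 0.0])
S = gated_delta_step(S, k, v, alpha=1.0, beta=1.0)
readout = S.T @ k  # recovers v, since k is unit-norm
```

Because the state <code>S<\/code> has fixed size regardless of sequence length, each token costs O(1) to process \u2014 which is exactly why such sublayers scale linearly where softmax attention scales quadratically.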
For tasks exceeding this, the model supports YaRN (Yet another RoPE extensioN) scaling, which extends the window up to 1,010,000 tokens. The Qwen team advises keeping the context length at 128K tokens or more to preserve the model\u2019s thinking capabilities.<\/p>\n<h3 class=\"wp-block-heading\"><strong>Benchmark Performance<\/strong><\/h3>\n<p>On agentic coding benchmarks, the gains over Qwen3.5-27B are substantial. SWE-bench Pro scores 53.5 versus 51.2 for Qwen3.5-27B and 50.9 for the much larger Qwen3.5-397B-A17B \u2014 meaning the 27B dense model exceeds a 397B MoE on this task. SWE-bench Multilingual scores 71.3 versus 69.3 for Qwen3.5-27B. Terminal-Bench 2.0, evaluated under a 3-hour timeout with 32 CPUs and 48 GB RAM, reaches 59.3 \u2014 matching Claude 4.5 Opus exactly and outperforming Qwen3.6-35B-A3B (51.5). SkillsBench Avg5 shows the most striking gain: 48.2 versus 27.2 for Qwen3.5-27B, a 77% relative improvement, also well above Qwen3.6-35B-A3B\u2019s 28.7.<\/p>\n<p>On reasoning benchmarks, GPQA Diamond reaches 87.8 (up from 85.5), AIME26 hits 94.1 (up from 92.6), and LiveCodeBench v6 scores 83.9 (up from 80.7).<\/p>\n<p>Vision-language benchmarks show consistent parity or improvement over Qwen3.5-27B. 
VideoMME (with subtitles) reaches 87.7, AndroidWorld (visual agent benchmark) scores 70.3, and VlmsAreBlind \u2014 which probes for common visual understanding failure modes \u2014 scores 97.0.<\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1456\" height=\"1412\" data-attachment-id=\"79226\" data-permalink=\"https:\/\/www.marktechpost.com\/2026\/04\/22\/alibaba-qwen-team-releases-qwen3-6-27b-a-dense-open-weight-model-outperforming-397b-moe-on-agentic-coding-benchmarks\/screenshot-2026-04-22-at-12-40-26-pm\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/04\/Screenshot-2026-04-22-at-12.40.26-PM.png\" data-orig-size=\"1456,1412\" data-comments-opened=\"0\" data-image-meta='{\"aperture\":\"0\",\"credit\":\"\",\"camera\":\"\",\"caption\":\"\",\"created_timestamp\":\"0\",\"copyright\":\"\",\"focal_length\":\"0\",\"iso\":\"0\",\"shutter_speed\":\"0\",\"title\":\"\",\"orientation\":\"0\"}' data-image-title=\"Screenshot 2026-04-22 at 12.40.26\u202fPM\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/04\/Screenshot-2026-04-22-at-12.40.26-PM-1024x993.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/04\/Screenshot-2026-04-22-at-12.40.26-PM.png\" alt=\"\" class=\"wp-image-79226\" \/><figcaption class=\"wp-element-caption\">https:\/\/qwen.ai\/blog?id=qwen3.6-27b<\/figcaption><\/figure>\n<\/div>\n<h3 class=\"wp-block-heading\"><strong>Key Takeaways<\/strong><\/h3>\n<ul class=\"wp-block-list\">\n<li><strong>Qwen3.6-27B is Alibaba\u2019s first dense open-weight model in the Qwen3.6 family<\/strong>, built to prioritize real-world coding utility over benchmark performance \u2014 licensed under Apache 2.0.<\/li>\n<li><strong>The model introduces Thinking Preservation<\/strong>, a new feature that retains reasoning traces across conversation history, reducing 
redundant token generation and improving KV cache efficiency in multi-turn agent workflows.<\/li>\n<li><strong>Agentic coding performance is the key strength<\/strong> \u2014 Qwen3.6-27B scores 77.2 on SWE-bench Verified, 59.3 on Terminal-Bench 2.0 (matching Claude 4.5 Opus), and 1487 on QwenWebBench, outperforming both its predecessor Qwen3.5-27B and the larger Qwen3.5-397B-A17B MoE model on several tasks.<\/li>\n<li><strong>The architecture uses a hybrid Gated DeltaNet + Gated Attention layout<\/strong> across 64 layers \u2014 three out of every four sublayers use efficient linear attention (Gated DeltaNet), with Multi-Token Prediction (MTP) enabling speculative decoding at serving time.<\/li>\n<li><strong>Two weight variants are available on Hugging Face Hub<\/strong> \u2014 <code>Qwen3.6-27B<\/code> (BF16) and <code>Qwen3.6-27B-FP8<\/code> (fine-grained FP8 with block size 128) \u2014 both supporting SGLang, vLLM, KTransformers, and Hugging Face Transformers, with a native 262,144-token context window extensible to 1,010,000 tokens via YaRN.<\/li>\n<\/ul>\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\" \/>\n<p>Check out the <strong><a href=\"https:\/\/qwen.ai\/blog?id=qwen3.6-27b\" target=\"_blank\" rel=\"noreferrer noopener\">Technical details<\/a>, <a href=\"https:\/\/huggingface.co\/Qwen\/Qwen3.6-27B\" target=\"_blank\" rel=\"noreferrer noopener\">Qwen\/Qwen3.6-27B<\/a><\/strong> and <strong><a href=\"https:\/\/huggingface.co\/Qwen\/Qwen3.6-27B-FP8\" target=\"_blank\" rel=\"noreferrer noopener\">Qwen\/Qwen3.6-27B-FP8<\/a><\/strong>.<\/p>\n<p>The post <a href=\"https:\/\/www.marktechpost.com\/2026\/04\/22\/alibaba-qwen-team-releases-qwen3-6-27b-a-dense-open-weight-model-outperforming-397b-moe-on-agentic-coding-benchmarks\/\">Alibaba Qwen Team Releases Qwen3.6-27B: A Dense Open-Weight Model Outperforming 397B MoE on Agentic Coding Benchmarks<\/a> appeared first on <a href=\"https:\/\/www.marktechpost.com\/\">MarkTechPost<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>Alibaba\u2019s Qwen Team has 
releas&hellip;<\/p>\n","protected":false},"author":1,"featured_media":780,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-779","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts\/779","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=779"}],"version-history":[{"count":0,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts\/779\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/media\/780"}],"wp:attachment":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=779"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=779"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=779"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}