{"id":1017,"date":"2026-06-02T17:07:48","date_gmt":"2026-06-02T09:07:48","guid":{"rendered":"https:\/\/connectword.dpdns.org\/?p=1017"},"modified":"2026-06-02T17:07:48","modified_gmt":"2026-06-02T09:07:48","slug":"alibabas-qwen-team-launches-qwen3-7-plus-adding-vision-deep-reasoning-tool-invocation-and-autonomous-iteration-on-the-bailian-platform","status":"publish","type":"post","link":"https:\/\/connectword.dpdns.org\/?p=1017","title":{"rendered":"Alibaba\u2019s Qwen Team Launches Qwen3.7-Plus, Adding Vision, Deep Reasoning, Tool Invocation, and Autonomous Iteration on the Bailian Platform"},"content":{"rendered":"<p class=\"wp-block-paragraph\">Alibaba\u2019s Qwen team has released <a href=\"https:\/\/qwen.ai\/blog?id=qwen3.7-plus\" target=\"_blank\" rel=\"noreferrer noopener\">Qwen3.7-Plus<\/a>. The model is now available through Alibaba Cloud\u2019s Bailian platform. International users access Bailian as Model Studio, a console that offers API services to external developers. The release follows Alibaba\u2019s May unveiling of the Qwen3.7 generation.<\/p>\n<h2 class=\"wp-block-heading\"><strong>Qwen3.7-Plus<\/strong><\/h2>\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/modelstudio.console.alibabacloud.com\/ap-southeast-1?tab=doc#\/doc\/?type=model&amp;url=2840914_2&amp;modelId=qwen3.7-plus&amp;serviceSite=international\" target=\"_blank\" rel=\"noreferrer noopener\">Qwen3.7-Plus<\/a> is a multimodal large language model. The model understands images and video alongside written prompts. Its sibling, Qwen3.7-Max, is text-only.<\/p>\n<p class=\"wp-block-paragraph\">This is visual understanding, not generation. The model reads images and video; it does not create them. Alibaba\u2019s image and video generation work sits in separate model families.<\/p>\n<p class=\"wp-block-paragraph\">The Alibaba team describes the release as a step in multimodal hybrid agent technology. An agent is a model that plans and acts across steps. 
Building on image and video understanding, Qwen3.7-Plus adds five abilities. <strong>These are deep reasoning, self-programming, tool invocation, verification and testing, and autonomous iteration.<\/strong><\/p>\n<p class=\"wp-block-paragraph\">Deep reasoning means the model works through problems step by step. Self-programming means the model writes and revises its own code. Tool invocation means it calls external functions or APIs. Verification and testing means it runs outputs and checks results. Autonomous iteration means it loops until the task is done. Together, they describe a model built to act, not just answer.<\/p>\n<h2 class=\"wp-block-heading\"><strong>The Vision Case<\/strong><\/h2>\n<p class=\"wp-block-paragraph\">Qwen3.7-Plus is the multimodal half of the 3.7 family. Its preview already posted measurable vision results. In Vision Arena, Qwen3.7-Plus-Preview ranked #16 overall. That placed Alibaba as the #5 lab in vision. The model rank and the lab rank are separate figures.<\/p>\n<p class=\"wp-block-paragraph\">Vision Arena is a neutral leaderboard run by LM Arena. Users vote on image-understanding answers in blind matchups. The #16 result sits behind the top US labs but still within the competitive field. For image-heavy work, this is the signal that matters. Think OCR at scale, chart reading, or video-frame analysis.<\/p>\n<p class=\"wp-block-paragraph\">The text-only Max sibling anchors the generation\u2019s reasoning. Max scored 56.6 on the Artificial Analysis Intelligence Index. That was the highest placement for a Chinese model at release. 
<\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1918\" height=\"1122\" data-attachment-id=\"80253\" data-permalink=\"https:\/\/www.marktechpost.com\/2026\/06\/02\/alibabas-qwen-team-launches-qwen3-7-plus-adding-vision-deep-reasoning-tool-invocation-and-autonomous-iteration-on-the-bailian-platform\/screenshot-2026-06-02-at-2-02-16-am\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/06\/Screenshot-2026-06-02-at-2.02.16-AM.png\" data-orig-size=\"1918,1122\" data-comments-opened=\"0\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;,&quot;alt&quot;:&quot;&quot;}\" data-image-title=\"Screenshot 2026-06-02 at 2.02.16\u202fAM\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/06\/Screenshot-2026-06-02-at-2.02.16-AM-1024x599.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/06\/Screenshot-2026-06-02-at-2.02.16-AM.png\" alt=\"\" class=\"wp-image-80253\" \/><figcaption class=\"wp-element-caption\">https:\/\/qwen.ai\/blog?id=qwen3.7-plus<\/figcaption><\/figure>\n<\/div>\n<h2 class=\"wp-block-heading\"><strong>The Agentic Loop<\/strong><\/h2>\n<p class=\"wp-block-paragraph\">The clear shift in Qwen3.7 is its agentic focus. The Alibaba team is positioning the models for long-running tasks. Bailian, the host platform, adds two relevant pieces.<\/p>\n<p class=\"wp-block-paragraph\">The first is an Agentic RL (reinforcement learning) mechanism. 
The platform uses real-world execution feedback to refine model accuracy over time. The second is a set of built-in safety guardrails. These keep autonomous tools inside preset operational limits. That detail matters when an agent runs commands or edits files.<\/p>\n<h2 class=\"wp-block-heading\"><strong>Marktechpost\u2019s Visual Explainer<\/strong><\/h2>\n<div class=\"mtp-root\">\n<div class=\"mtp-card\">\n<div class=\"mtp-bar\">\n      <span class=\"mtp-brand\">AI Models \u00b7 Field Guide<\/span><br \/>\n      <span class=\"mtp-count\"><b>1<\/b> \/ 7<\/span>\n    <\/div>\n<div class=\"mtp-stage\">\n<p>      <!-- 1 --><\/p>\n<section class=\"mtp-slide is-on\">\n<div class=\"mtp-eyebrow\">Alibaba Qwen \u00b7 June 2, 2026<\/div>\n<h1 class=\"mtp-title\">Qwen3.7-Plus<small>Alibaba\u2019s multimodal agent model, now on Bailian<\/small><\/h1>\n<div class=\"mtp-meta\">A multimodal large language model with <b>image and video understanding<\/b>, deep reasoning, and agentic features. Available via API on Alibaba Cloud\u2019s <b>Bailian<\/b> platform, accessed internationally as <b>Model Studio<\/b>.<\/div>\n<div class=\"mtp-hint\">Use the arrows or swipe to explore \u2192<\/div>\n<\/section>\n<p>      <!-- 2 --><\/p>\n<section class=\"mtp-slide\">\n<div class=\"mtp-kick\">01 \u00b7 What it is<\/div>\n<h2 class=\"mtp-h\">A multimodal large language model<\/h2>\n<ul class=\"mtp-list\">\n<li><b>Multimodal<\/b> \u2014 <span>it reads images and video, alongside text input.<\/span><\/li>\n<li>Visual <b>understanding, not generation<\/b> <span>\u2014 it reads media, it does not create it.<\/span><\/li>\n<li>The multimodal sibling to the <b>text-only Qwen3.7-Max<\/b>.<\/li>\n<li>Alibaba describes it as <b>multimodal hybrid agent<\/b> technology.<\/li>\n<\/ul>\n<\/section>\n<p>      <!-- 3 --><\/p>\n<section class=\"mtp-slide\">\n<div class=\"mtp-kick\">02 \u00b7 Capabilities<\/div>\n<h2 class=\"mtp-h\">Five abilities beyond seeing<\/h2>\n<ul class=\"mtp-list\">\n<li><b>Deep 
reasoning<\/b> <span>\u2014 works through problems step by step.<\/span><\/li>\n<li><b>Self-programming<\/b> <span>\u2014 writes and revises its own code.<\/span><\/li>\n<li><b>Tool invocation<\/b> <span>\u2014 calls external functions or APIs.<\/span><\/li>\n<li><b>Verification and testing<\/b> <span>\u2014 runs outputs and checks results.<\/span><\/li>\n<li><b>Autonomous iteration<\/b> <span>\u2014 loops until the task is done.<\/span><\/li>\n<\/ul>\n<\/section>\n<p>      <!-- 4 --><\/p>\n<section class=\"mtp-slide\">\n<div class=\"mtp-kick\">03 \u00b7 Vision benchmarks<\/div>\n<h2 class=\"mtp-h\">Where it stands on vision<\/h2>\n<ul class=\"mtp-list\">\n<li>The preview ranked <b>#16 overall<\/b> in Vision Arena (LM Arena).<\/li>\n<li>That placed Alibaba as the <b>#5 lab<\/b> in vision.<\/li>\n<li>Model rank and lab rank are <b>separate figures<\/b>.<\/li>\n<li>Relevant for OCR, chart reading, and video-frame analysis.<\/li>\n<\/ul>\n<div class=\"mtp-note\">For reference, the text-only Max sibling scored <b>56.6<\/b> on the Artificial Analysis Intelligence Index, the highest Chinese model at release.<\/div>\n<\/section>\n<p>      <!-- 5 --><\/p>\n<section class=\"mtp-slide\">\n<div class=\"mtp-kick\">04 \u00b7 The agentic loop<\/div>\n<h2 class=\"mtp-h\">Built for long-running tasks<\/h2>\n<ul class=\"mtp-list\">\n<li>Bailian adds an <b>Agentic RL<\/b> (reinforcement learning) mechanism.<\/li>\n<li>It uses <b>real-world execution feedback<\/b> to refine accuracy.<\/li>\n<li>Built-in <b>safety guardrails<\/b> keep autonomous tools within limits.<\/li>\n<li>That matters when an agent runs commands or edits files.<\/li>\n<\/ul>\n<\/section>\n<p>      <!-- 6 --><\/p>\n<section class=\"mtp-slide\">\n<div class=\"mtp-kick\">05 \u00b7 Confirmed vs unconfirmed<\/div>\n<h2 class=\"mtp-h\">What we know today<\/h2>\n<div class=\"mtp-cols\">\n<div class=\"mtp-col ok\">\n<h3>Confirmed<\/h3>\n<ul>\n<li>Image and video understanding<\/li>\n<li>Agentic feature 
set<\/li>\n<li>Bailian API access<\/li>\n<li>Proprietary, API-only<\/li>\n<\/ul><\/div>\n<div class=\"mtp-col\">\n<h3>Not yet published<\/h3>\n<ul>\n<li>Public price sheet<\/li>\n<li>Context window size<\/li>\n<li>Output token limits<\/li>\n<li>Open weights<\/li>\n<\/ul><\/div>\n<\/div>\n<\/section>\n<p>      <!-- 7 --><\/p>\n<section class=\"mtp-slide\">\n<div class=\"mtp-kick\">06 \u00b7 Why it matters<\/div>\n<h2 class=\"mtp-h\">The practical read<\/h2>\n<ul class=\"mtp-list\">\n<li>A <b>vision-capable agent backend<\/b> through one API.<\/li>\n<li>Suits workloads mixing images, video, and tool use.<\/li>\n<li>A leaderboard rank shows <b>promise, not a guarantee<\/b>.<\/li>\n<li>Validate accuracy on your own data before committing.<\/li>\n<\/ul>\n<\/section><\/div>\n<div class=\"mtp-nav\">\n<div class=\"mtp-dots\"><\/div>\n<div class=\"mtp-arrows\">\n        <button class=\"mtp-btn\" aria-label=\"Previous slide\">\u2039<\/button><br \/>\n        <button class=\"mtp-btn\" aria-label=\"Next slide\">\u203a<\/button>\n      <\/div>\n<\/div>\n<\/div>\n<div class=\"mtp-tag\">\n    <span class=\"mtp-logo\">Marktech<span>post<\/span><\/span><br \/>\n    <span class=\"mtp-tag-txt\">AI research, news, and developer signal for engineers and data scientists. 
Read more at <a href=\"https:\/\/www.marktechpost.com\/\" target=\"_blank\" rel=\"noopener\">marktechpost.com<\/a>.<\/span>\n  <\/div>\n<\/div>\n<h2 class=\"wp-block-heading\"><strong>Key Takeaways<\/strong><\/h2>\n<ul class=\"wp-block-list\">\n<li>Alibaba released Qwen3.7-Plus, a multimodal model now available via API on its Bailian platform (Model Studio).<\/li>\n<li>It understands images and video as input \u2014 understanding, not generation \u2014 and adds agentic features.<\/li>\n<li>Capabilities include deep reasoning, self-programming, tool invocation, verification and testing, and autonomous iteration.<\/li>\n<li>Its preview ranked #16 in Vision Arena, making Alibaba the #5 lab in vision.<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n<\/p><p class=\"wp-block-paragraph\">\n<\/p><p class=\"wp-block-paragraph\">Check out\u00a0the\u00a0<strong><a href=\"https:\/\/qwen.ai\/blog?id=qwen3.7-plus\" target=\"_blank\" rel=\"noreferrer noopener\">Technical details<\/a>.<\/strong><\/p>\n<p>The post <a href=\"https:\/\/www.marktechpost.com\/2026\/06\/02\/alibabas-qwen-team-launches-qwen3-7-plus-adding-vision-deep-reasoning-tool-invocation-and-autonomous-iteration-on-the-bailian-platform\/\">Alibaba\u2019s Qwen Team Launches Qwen3.7-Plus, Adding Vision, Deep Reasoning, Tool Invocation, and Autonomous Iteration on the Bailian Platform<\/a> appeared first on <a href=\"https:\/\/www.marktechpost.com\/\">MarkTechPost<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>Alibaba\u2019s Qwen team has 
releas&hellip;<\/p>\n","protected":false},"author":1,"featured_media":1018,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-1017","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts\/1017","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1017"}],"version-history":[{"count":0,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts\/1017\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/media\/1018"}],"wp:attachment":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1017"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1017"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1017"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}