{"id":406,"date":"2026-02-13T07:24:03","date_gmt":"2026-02-12T23:24:03","guid":{"rendered":"https:\/\/connectword.dpdns.org\/?p=406"},"modified":"2026-02-13T07:24:03","modified_gmt":"2026-02-12T23:24:03","slug":"openai-releases-a-research-preview-of-gpt-5-3-codex-spark-a-15x-faster-ai-coding-model-delivering-over-1000-tokens-per-second-on-cerebras-hardware","status":"publish","type":"post","link":"https:\/\/connectword.dpdns.org\/?p=406","title":{"rendered":"OpenAI Releases a Research Preview of GPT\u20115.3-Codex-Spark: A 15x Faster AI Coding Model Delivering Over 1000 Tokens Per Second on Cerebras Hardware"},"content":{"rendered":"<p>OpenAI just launched a new research preview called <strong>GPT-5.3 Codex-Spark<\/strong>. This model is built for one thing: extreme speed. While the standard GPT-5.3 Codex focuses on deep reasoning, Spark is designed for near-instant response times. It is the result of a deep hardware-software integration between OpenAI and Cerebras.<\/p>\n<p>The results are striking. Spark is <strong>15x faster<\/strong> than the flagship GPT-5.3 Codex and consistently delivers over <strong>1000 tokens per second<\/strong>. This speed effectively removes the delay between a developer\u2019s thought and the model\u2019s code output.<\/p>\n<h3 class=\"wp-block-heading\"><strong>The Hardware: Wafer-Scale Engineering<\/strong><\/h3>\n<p>The massive performance jump is powered by the <strong>Cerebras Wafer-Scale Engine 3 (WSE-3)<\/strong>. Traditional AI models run on clusters of small GPUs. These GPUs must communicate with each other over cables, which creates a bottleneck that slows the model down.<\/p>\n<p>The <strong>WSE-3<\/strong> is different. It is a single, giant chip the size of a whole silicon wafer. Because the entire model lives on one piece of silicon, there are no cables to slow it down. 
<strong>This architecture provides:<\/strong><\/p>\n<ul class=\"wp-block-list\">\n<li>Massive on-chip memory.<\/li>\n<li>Ultra-high bandwidth.<\/li>\n<li>Low-latency compute.<\/li>\n<\/ul>\n<p>By using the <strong>Cerebras CS-3 system<\/strong>, OpenAI can run inference at speeds that traditional GPU clusters cannot reach.<\/p>\n<h3 class=\"wp-block-heading\"><strong>Software Optimizations and Low Latency<\/strong><\/h3>\n<p>Speed is not just about the chip. OpenAI re-engineered the way the model communicates with your computer. They moved away from traditional request methods and introduced a <strong>persistent WebSocket connection<\/strong>.<\/p>\n<p><strong>This change leads to several technical improvements:<\/strong><\/p>\n<ol start=\"1\" class=\"wp-block-list\">\n<li><strong>Round-Trip Time (RTT):<\/strong> Client-server overhead is reduced by <strong>80%<\/strong>.<\/li>\n<li><strong>Time-to-First-Token (TTFT):<\/strong> This is improved by <strong>50%<\/strong>, meaning the code starts appearing almost the moment you hit enter.<\/li>\n<li><strong>Per-Token Overhead:<\/strong> Internal processing time per token is cut by <strong>30%<\/strong>.<\/li>\n<\/ol>\n<p>These optimizations allow for \u2018Real-Time Steering.\u2019 You can interrupt the model while it is typing and redirect its logic without waiting for the full block to finish.<\/p>\n<h3 class=\"wp-block-heading\"><strong>The Trade-offs: Speed vs. Reasoning<\/strong><\/h3>\n<p>GPT-5.3 Codex-Spark is optimized for throughput, not deep complexity. It is a \u2018smaller\u2019 model than the flagship GPT-5.3 Codex. 
Because of this, it has lower reasoning depth.<\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1598\" height=\"1148\" data-attachment-id=\"77863\" data-permalink=\"https:\/\/www.marktechpost.com\/2026\/02\/12\/openai-releases-a-research-preview-of-gpt-5-3-codex-spark-a-15x-faster-ai-coding-model-delivering-over-1000-tokens-per-second-on-cerebras-hardware\/swe-bench-pro-2\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/02\/SWE-Bench-Pro-1.png\" data-orig-size=\"1598,1148\" data-comments-opened=\"1\" data-image-meta='{\"aperture\":\"0\",\"credit\":\"\",\"camera\":\"\",\"caption\":\"\",\"created_timestamp\":\"0\",\"copyright\":\"\",\"focal_length\":\"0\",\"iso\":\"0\",\"shutter_speed\":\"0\",\"title\":\"\",\"orientation\":\"0\"}' data-image-title=\"SWE-Bench Pro\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/02\/SWE-Bench-Pro-1-300x216.png\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/02\/SWE-Bench-Pro-1-1024x736.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/02\/SWE-Bench-Pro-1.png\" alt=\"\" class=\"wp-image-77863\" \/><figcaption class=\"wp-element-caption\">https:\/\/openai.com\/index\/introducing-gpt-5-3-codex-spark\/<\/figcaption><\/figure>\n<\/div>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"756\" height=\"878\" data-attachment-id=\"77865\" data-permalink=\"https:\/\/www.marktechpost.com\/2026\/02\/12\/openai-releases-a-research-preview-of-gpt-5-3-codex-spark-a-15x-faster-ai-coding-model-delivering-over-1000-tokens-per-second-on-cerebras-hardware\/terminal-bench-2-0-2\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/02\/Terminal-Bench-2.0-1.png\" data-orig-size=\"756,878\" 
data-comments-opened=\"1\" data-image-meta='{\"aperture\":\"0\",\"credit\":\"\",\"camera\":\"\",\"caption\":\"\",\"created_timestamp\":\"0\",\"copyright\":\"\",\"focal_length\":\"0\",\"iso\":\"0\",\"shutter_speed\":\"0\",\"title\":\"\",\"orientation\":\"0\"}' data-image-title=\"Terminal-Bench 2.0\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/02\/Terminal-Bench-2.0-1-258x300.png\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/02\/Terminal-Bench-2.0-1.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/02\/Terminal-Bench-2.0-1.png\" alt=\"\" class=\"wp-image-77865\" \/><figcaption class=\"wp-element-caption\">https:\/\/openai.com\/index\/introducing-gpt-5-3-codex-spark\/<\/figcaption><\/figure>\n<\/div>\n<p><strong>Developers should be aware of these performance differences:<\/strong><\/p>\n<ul class=\"wp-block-list\">\n<li><strong>Benchmarks:<\/strong> Spark scores lower on <strong>SWE-Bench Pro<\/strong> and <strong>Terminal-Bench 2.0<\/strong> compared to the flagship model. It may struggle with very complex, multi-file architecture changes.<\/li>\n<li><strong>Security:<\/strong> Under OpenAI\u2019s <strong>Preparedness Framework<\/strong>, the flagship GPT-5.3 Codex is rated as <strong>\u2018High\u2019 capability<\/strong> for cybersecurity. <strong>Spark does not meet this high threshold.<\/strong> It should not be used for sensitive security logic or autonomous authentication tasks.<\/li>\n<\/ul>\n<h3 class=\"wp-block-heading\"><strong>Quick Specs and Access<\/strong><\/h3>\n<p>Spark is available now for <strong>ChatGPT Pro<\/strong> users and developers. 
You can access it through the following tools:<\/p>\n<ul class=\"wp-block-list\">\n<li><strong>Codex App:<\/strong> Use the model picker to select \u2018Spark.\u2019<\/li>\n<li><strong>VS Code Extension:<\/strong> Integrated directly into the composer.<\/li>\n<li><strong>CLI:<\/strong> Access it via the command <code>codex --model gpt-5.3-codex-spark<\/code>.<\/li>\n<\/ul>\n<figure class=\"wp-block-table is-style-stripes\">\n<table class=\"has-fixed-layout\">\n<thead>\n<tr>\n<td><strong>Feature<\/strong><\/td>\n<td><strong>GPT-5.3 Codex-Spark<\/strong><\/td>\n<td><strong>GPT-5.3 Codex (Flagship)<\/strong><\/td>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td><strong>Tokens per Second<\/strong><\/td>\n<td><strong>1000+<\/strong><\/td>\n<td>~70<\/td>\n<\/tr>\n<tr>\n<td><strong>Context Window<\/strong><\/td>\n<td><strong>128k<\/strong><\/td>\n<td>128k<\/td>\n<\/tr>\n<tr>\n<td><strong>Hardware<\/strong><\/td>\n<td><strong>Cerebras WSE-3<\/strong><\/td>\n<td>NVIDIA GPU Clusters<\/td>\n<\/tr>\n<tr>\n<td><strong>Best For<\/strong><\/td>\n<td>Fast Iteration<\/td>\n<td>Deep Reasoning \/ Security<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/figure>\n<h3 class=\"wp-block-heading\"><strong>Key Takeaways<\/strong><\/h3>\n<ul class=\"wp-block-list\">\n<li><strong>Great Speed:<\/strong> Spark is <strong>15x faster<\/strong> than the flagship GPT-5.3 Codex, delivering an unprecedented throughput of over <strong>1,000 tokens per second<\/strong> to enable near-instant code generation.<\/li>\n<li><strong>Custom Silicon Infrastructure:<\/strong> This is OpenAI\u2019s first model to run on <strong>Cerebras Wafer-Scale Engine 3 (WSE-3)<\/strong> hardware rather than traditional NVIDIA GPUs, using \u2018wafer-scale\u2019 memory to eliminate data bottlenecks.<\/li>\n<li><strong>Drastic Latency Reduction:<\/strong> The integration of a <strong>persistent WebSocket connection<\/strong> reduces client-server round-trip overhead by <strong>80%<\/strong> and improves the time-to-first-token by 
<strong>50%<\/strong>.<\/li>\n<li><strong>Real-Time Steering:<\/strong> Designed for \u2018micro-iterations,\u2019 the model\u2019s speed allows developers to <strong>interrupt and redirect<\/strong> logic in real time, shifting the workflow from batch processing to live pair programming.<\/li>\n<li><strong>Targeted Capability Trade-offs:<\/strong> While faster, Spark has lower reasoning depth than the flagship model and does <strong>not<\/strong> meet the \u2018High capability\u2019 threshold for cybersecurity in OpenAI\u2019s Preparedness Framework, making it unsuitable for sensitive authentication or security tasks.<\/li>\n<\/ul>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n<p>Check out the\u00a0<strong><a href=\"https:\/\/openai.com\/index\/introducing-gpt-5-3-codex-spark\/\" target=\"_blank\" rel=\"noreferrer noopener\">technical details here<\/a>.\u00a0<\/strong>Also,\u00a0feel free to follow us on\u00a0<strong><a href=\"https:\/\/x.com\/intent\/follow?screen_name=marktechpost\" target=\"_blank\" rel=\"noreferrer noopener\"><mark>Twitter<\/mark><\/a><\/strong>\u00a0and don\u2019t forget to join our\u00a0<strong><a href=\"https:\/\/www.reddit.com\/r\/machinelearningnews\/\" target=\"_blank\" rel=\"noreferrer noopener\">100k+ ML SubReddit<\/a><\/strong>\u00a0and subscribe to\u00a0<strong><a href=\"https:\/\/www.aidevsignals.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">our Newsletter<\/a><\/strong>. Wait! 
Are you on Telegram?\u00a0<strong><a href=\"https:\/\/t.me\/machinelearningresearchnews\" target=\"_blank\" rel=\"noreferrer noopener\">Now you can join us on Telegram as well.<\/a><\/strong><\/p>\n<p>The post <a href=\"https:\/\/www.marktechpost.com\/2026\/02\/12\/openai-releases-a-research-preview-of-gpt-5-3-codex-spark-a-15x-faster-ai-coding-model-delivering-over-1000-tokens-per-second-on-cerebras-hardware\/\">OpenAI Releases a Research Preview of GPT\u20115.3-Codex-Spark: A 15x Faster AI Coding Model Delivering Over 1000 Tokens Per Second on Cerebras Hardware<\/a> appeared first on <a href=\"https:\/\/www.marktechpost.com\/\">MarkTechPost<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>OpenAI just launched a new res&hellip;<\/p>\n","protected":false},"author":1,"featured_media":407,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-406","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts\/406","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=406"}],"version-history":[{"count":0,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts\/406\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/media\/407"}],"wp:attachment":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2F
media&parent=406"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=406"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=406"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}