{"id":991,"date":"2026-05-28T01:09:38","date_gmt":"2026-05-27T17:09:38","guid":{"rendered":"https:\/\/connectword.dpdns.org\/?p=991"},"modified":"2026-05-28T01:09:38","modified_gmt":"2026-05-27T17:09:38","slug":"nvidia-releases-polar-a-token-faithful-rollout-framework-for-grpo-training-across-codex-claude-code-and-qwen-code","status":"publish","type":"post","link":"https:\/\/connectword.dpdns.org\/?p=991","title":{"rendered":"NVIDIA Releases Polar, a Token-Faithful Rollout Framework for GRPO Training Across Codex, Claude Code, and Qwen Code"},"content":{"rendered":"<p class=\"wp-block-paragraph\">Reinforcement learning for language agents is growing more complex. Agents now manage multi-turn tool use, long-running contexts, and multi-agent orchestration. The main engineering challenge is connecting existing agent software to training pipelines without breaking how those tools work.<\/p>\n<p class=\"wp-block-paragraph\">NVIDIA\u2019s research team introduced <strong><a href=\"https:\/\/arxiv.org\/pdf\/2605.24220\" target=\"_blank\" rel=\"noreferrer noopener\">Polar<\/a><\/strong>, a rollout framework that lets researchers run reinforcement learning over any agent harness without modifying that harness.<\/p>\n<h2 class=\"wp-block-heading\"><strong>The Core Problem Polar Solves<\/strong><\/h2>\n<p class=\"wp-block-paragraph\">An \u2018agent harness\u2019 is a tool like Codex CLI, Claude Code, Qwen Code, or Pi. These harnesses manage system prompts, tool formatting, context engineering, and how the agent submits patches. These details directly affect agent behavior at evaluation time.<\/p>\n<p class=\"wp-block-paragraph\">Traditional RL infrastructure requires harness logic to be rewritten behind a framework-owned environment API \u2014 typically <code>env.init()<\/code>, <code>env.step()<\/code>, <code>env.reset()<\/code> in the OpenAI Gym style. Every new harness requires new integration code. 
That integration can also lose execution details specific to the native harness path.<\/p>\n<p class=\"wp-block-paragraph\">Polar\u2019s key observation is that every LLM-based agent must call a model. That model API boundary is a common interface outside the agent itself. Instead of integrating inside the harness, Polar places a proxy at that boundary.<\/p>\n<h2 class=\"wp-block-heading\"><strong>How the Proxy Works<\/strong><\/h2>\n<p class=\"wp-block-paragraph\"><strong>For each incoming model request, the gateway proxy performs four steps:<\/strong><\/p>\n<ol class=\"wp-block-list\">\n<li><strong>Detect the provider API<\/strong> \u2014 using the request path and headers, it distinguishes Anthropic Messages, OpenAI Chat Completions, OpenAI Responses, and Google generateContent-style calls.<\/li>\n<li><strong>Normalize the request<\/strong> \u2014 converts roles, content parts, tool definitions, and generation parameters into the OpenAI Chat Completions shape used by the local inference server.<\/li>\n<li><strong>Capture token-level data<\/strong> \u2014 stores request messages, response messages, prompt token IDs, sampled response token IDs, finish reason, and log probabilities.<\/li>\n<li><strong>Return the provider shape<\/strong> \u2014 transforms the response back into the schema the harness expects.<\/li>\n<\/ol>\n<p class=\"wp-block-paragraph\">For streaming requests, Polar obtains a non-streaming upstream response and emits a synthetic provider-shaped stream. 
This preserves compatibility with harnesses that expect server-sent events while ensuring complete token capture.<\/p>\n<p class=\"wp-block-paragraph\">The only required change to an existing harness is pointing its model base URL at the gateway.<\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1334\" height=\"872\" data-attachment-id=\"80140\" data-permalink=\"https:\/\/www.marktechpost.com\/2026\/05\/27\/nvidia-releases-polar-a-token-faithful-rollout-framework-for-grpo-training-across-codex-claude-code-and-qwen-code\/screenshot-2026-05-27-at-10-08-51-am-2\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-27-at-10.08.51-AM-1.png\" data-orig-size=\"1334,872\" data-comments-opened=\"0\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;,&quot;alt&quot;:&quot;&quot;}\" data-image-title=\"Screenshot 2026-05-27 at 10.08.51\u202fAM\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-27-at-10.08.51-AM-1-1024x669.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/05\/Screenshot-2026-05-27-at-10.08.51-AM-1.png\" alt=\"\" class=\"wp-image-80140\" \/><figcaption class=\"wp-element-caption\">https:\/\/arxiv.org\/pdf\/2605.24220<\/figcaption><\/figure>\n<\/div>\n<h2 class=\"wp-block-heading\"><strong>Architecture: Rollout Server and Gateway Nodes<\/strong><\/h2>\n<p class=\"wp-block-paragraph\"><strong>Polar has two core components<\/strong>:<\/p>\n<p 
class=\"wp-block-paragraph\">The <strong>rollout server<\/strong> accepts a <code>TaskRequest<\/code> and expands it into <code>num_samples<\/code> independent sessions. Each session carries a session ID, task ID, timeout budget, runtime specification, agent specification, trajectory builder, evaluator, and callback URL. The server dispatches sessions to gateway nodes and accepts callbacks when sessions complete.<\/p>\n<p class=\"wp-block-paragraph\"><strong>Gateway nodes<\/strong> own the lifecycle of each session \u2014 starting the runtime, running the harness, building trajectories, evaluating output, and teardown. The gateway also hosts the proxy endpoint for that session\u2019s model calls, keeping completion capture tied to the session registry.<\/p>\n<p class=\"wp-block-paragraph\">Within each gateway, isolated worker pools handle INIT, RUNNING, and POSTRUN stages. A bounded READY buffer holds initialized runtimes until a run slot is available. CPU-heavy runtime preparation and evaluator prewarm proceed off the critical path, without blocking active GPU-bound agent execution. If a harness times out after model calls have been captured, the gateway still enters POSTRUN so partial traces can be recovered.<\/p>\n<p class=\"wp-block-paragraph\">Built-in evaluators include a session-completion reward, a configurable test-on-output evaluator, and a SWE-Bench\/SWE-Gym harness evaluator. Custom evaluators can be added through a registry interface.<\/p>\n<p class=\"wp-block-paragraph\">Polar currently supports Docker and rootless Apptainer runtimes. Built-in harness shortcuts include <code>codex<\/code>, <code>claude_code<\/code>, <code>gemini_cli<\/code>, <code>qwen_code<\/code>, <code>opencode<\/code>, and <code>pi<\/code>. <\/p>\n<h2 class=\"wp-block-heading\"><strong>Trajectory Reconstruction: Per Request vs. 
Prefix Merging<\/strong><\/h2>\n<p class=\"wp-block-paragraph\">After a session completes, Polar reconstructs trainable trajectories from captured model calls. <\/p>\n<p class=\"wp-block-paragraph\"><strong>Two strategies are available:<\/strong><\/p>\n<p class=\"wp-block-paragraph\">The <strong><code>per_request<\/code><\/strong> builder treats every model call as one independent trace. It is lossless per individual call but fragments multi-turn sessions. A single coding problem can produce hundreds of per-request traces, increasing the burden on downstream trainers.<\/p>\n<p class=\"wp-block-paragraph\">The <strong><code>prefix_merging<\/code><\/strong> builder reconstructs longer traces where the harness session preserves append-only conversation histories. It partitions completions into ordered chains by verifying a strict token-prefix relation between adjacent completions. Sub-agents, context compaction boundaries, and parallel agent branches naturally form separate chains. Within each merged trace, only sampled assistant tokens are marked trainable. Canonical interstitial tokens receive a loss mask of zero.<\/p>\n<h3 class=\"wp-block-heading\"><strong>Ablation Results<\/strong><\/h3>\n<p class=\"wp-block-paragraph\">The research team benchmarks both strategies on the same model, hardware, and topology over three training steps.<\/p>\n<figure class=\"wp-block-table\">\n<table class=\"has-fixed-layout\">\n<thead>\n<tr>\n<th>Metric<\/th>\n<th><code>per_request<\/code><\/th>\n<th><code>prefix_merging<\/code><\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Trainer updates<\/td>\n<td>1,185<\/td>\n<td>218<\/td>\n<\/tr>\n<tr>\n<td>Wall-clock time<\/td>\n<td>189.5 min<\/td>\n<td>35.2 min<\/td>\n<\/tr>\n<tr>\n<td>Speedup<\/td>\n<td>\u2014<\/td>\n<td><strong>5.39\u00d7<\/strong><\/td>\n<\/tr>\n<tr>\n<td>Avg. 
rollout GPU utilization<\/td>\n<td>20.4%<\/td>\n<td>87.7%<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/figure>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n<h2 class=\"wp-block-heading\"><strong>SWE-Bench Verified Results<\/strong><\/h2>\n<p class=\"wp-block-paragraph\">Training uses standard GRPO on the Qwen3.5-4B base model. The dataset is SkyRL-v0-293-data SWE-Gym (293 tasks, 1 epoch, rollout batch size 4, 16 samples per prompt) with the Slime trainer. All experiments use <code>prefix_merging<\/code> for trajectory construction.<\/p>\n<h4 class=\"wp-block-heading\"><strong>Training Rollout Reward Progress (pass@1)<\/strong><\/h4>\n<figure class=\"wp-block-table\">\n<table class=\"has-fixed-layout\">\n<thead>\n<tr>\n<th>Harness<\/th>\n<th>First 10 Steps<\/th>\n<th>Last 10 Steps<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Codex<\/td>\n<td>9.5%<\/td>\n<td>54.5%<\/td>\n<\/tr>\n<tr>\n<td>Claude Code<\/td>\n<td>28.8%<\/td>\n<td>67.0%<\/td>\n<\/tr>\n<tr>\n<td>Qwen Code<\/td>\n<td>61.6%<\/td>\n<td>66.0%<\/td>\n<\/tr>\n<tr>\n<td>Pi<\/td>\n<td>61.6%<\/td>\n<td>76.2%<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/figure>\n<h4 class=\"wp-block-heading\"><strong>SWE-Bench Verified Final Scores<\/strong><\/h4>\n<figure class=\"wp-block-table\">\n<table class=\"has-fixed-layout\">\n<thead>\n<tr>\n<th>Harness<\/th>\n<th>Base<\/th>\n<th>Polar RL<\/th>\n<th>Gain<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Codex<\/td>\n<td>3.8%<\/td>\n<td>26.4%<\/td>\n<td>+22.6 pts<\/td>\n<\/tr>\n<tr>\n<td>Claude Code<\/td>\n<td>29.8%<\/td>\n<td>34.6%<\/td>\n<td>+4.8 pts<\/td>\n<\/tr>\n<tr>\n<td>Qwen Code<\/td>\n<td>34.6%<\/td>\n<td>35.2%<\/td>\n<td>+0.6 pts<\/td>\n<\/tr>\n<tr>\n<td>Pi<\/td>\n<td>34.2%<\/td>\n<td>40.4%<\/td>\n<td>+6.2 pts<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/figure>\n<p class=\"wp-block-paragraph\">The largest gain is under Codex. Codex presents an unfamiliar action protocol and patch-submission style to a Qwen model not originally trained on that harness. 
Polar attaches the reward signal to the actual sampled tokens flowing through the Codex execution path, so GRPO optimizes the behavior the model uses at evaluation time. Under the native Qwen Code harness, where the base model is already well-aligned, Polar still delivers a 0.6 point gain.<\/p>\n<h2 class=\"wp-block-heading\"><strong>Offline SFT Data Generation<\/strong><\/h2>\n<p class=\"wp-block-paragraph\">Polar can also serve as a distributed offline data generation service with no changes to the runtime. The research team demonstrates this using Qwen3.5-122B-A10B on an 8\u00d7H100 server (TP=8, max_model_len=32,768) with the pi harness against 1,638 instances from seven SWE-Gym repositories.<\/p>\n<p class=\"wp-block-paragraph\">A trajectory is accepted into the SFT corpus only if the SWE-Bench evaluation harness confirms the agent\u2019s patch resolves every <code>FAIL_TO_PASS<\/code> test and leaves every <code>PASS_TO_PASS<\/code> test green.<\/p>\n<figure class=\"wp-block-table\">\n<table class=\"has-fixed-layout\">\n<thead>\n<tr>\n<th>Repository<\/th>\n<th>Attempts<\/th>\n<th>Accepted<\/th>\n<th>Rate<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>getmoto\/moto<\/td>\n<td>343<\/td>\n<td>184<\/td>\n<td>53.6%<\/td>\n<\/tr>\n<tr>\n<td>python\/mypy<\/td>\n<td>257<\/td>\n<td>101<\/td>\n<td>39.3%<\/td>\n<\/tr>\n<tr>\n<td>conan-io\/conan<\/td>\n<td>71<\/td>\n<td>27<\/td>\n<td>38.0%<\/td>\n<\/tr>\n<tr>\n<td>pydantic\/pydantic<\/td>\n<td>81<\/td>\n<td>24<\/td>\n<td>29.6%<\/td>\n<\/tr>\n<tr>\n<td>iterative\/dvc<\/td>\n<td>219<\/td>\n<td>45<\/td>\n<td>20.5%<\/td>\n<\/tr>\n<tr>\n<td>pandas-dev\/pandas<\/td>\n<td>477<\/td>\n<td>98<\/td>\n<td>19.7%<\/td>\n<\/tr>\n<tr>\n<td>dask\/dask<\/td>\n<td>141<\/td>\n<td>25<\/td>\n<td>17.7%<\/td>\n<\/tr>\n<tr>\n<td><strong>Total<\/strong><\/td>\n<td><strong>1,638<\/strong><\/td>\n<td><strong>504<\/strong><\/td>\n<td><strong>30.8%<\/strong><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/figure>\n<p class=\"wp-block-paragraph\">The run cost 
roughly 64 GPU-hours. Accepted trajectories average 104 messages per session and 51 assistant turns. <\/p>\n<h2 class=\"wp-block-heading\"><strong>Framework Comparison<\/strong><\/h2>\n<figure class=\"wp-block-table\">\n<table class=\"has-fixed-layout\">\n<thead>\n<tr>\n<th>System<\/th>\n<th>Async RL<\/th>\n<th>Async Rollout Staging<\/th>\n<th>Rollout as Service<\/th>\n<th>Harness Agnostic<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td><strong>Polar<\/strong><\/td>\n<td>\u2713<\/td>\n<td>\u2713<\/td>\n<td>\u2713<\/td>\n<td>\u2713<\/td>\n<\/tr>\n<tr>\n<td>ProRL Agent<\/td>\n<td>\u2713<\/td>\n<td>\u2713<\/td>\n<td>\u2713<\/td>\n<td>\u2717<\/td>\n<\/tr>\n<tr>\n<td>SkyRL-Agent<\/td>\n<td>\u2713<\/td>\n<td>\u2713<\/td>\n<td>\u2717<\/td>\n<td>partial<\/td>\n<\/tr>\n<tr>\n<td>PRIME-RL<\/td>\n<td>\u2713<\/td>\n<td>\u2717<\/td>\n<td>\u2717<\/td>\n<td>\u2717<\/td>\n<\/tr>\n<tr>\n<td>Agent Lightning<\/td>\n<td>partial<\/td>\n<td>\u2717<\/td>\n<td>partial<\/td>\n<td>partial<\/td>\n<\/tr>\n<tr>\n<td>rLLM<\/td>\n<td>partial<\/td>\n<td>\u2717<\/td>\n<td>\u2717<\/td>\n<td>\u2717<\/td>\n<\/tr>\n<tr>\n<td>OpenClaw-RL<\/td>\n<td>\u2713<\/td>\n<td>\u2717<\/td>\n<td>\u2717<\/td>\n<td>partial<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/figure>\n<p class=\"wp-block-paragraph\">Polar is the only system in this comparison with first-class support across all four properties.<\/p>\n<h2 class=\"wp-block-heading\"><strong>Strengths and Limitations<\/strong><\/h2>\n<h4 class=\"wp-block-heading\"><strong>Strengths<\/strong><\/h4>\n<ul class=\"wp-block-list\">\n<li>No harness code changes required \u2014 the proxy intercepts at the model API boundary<\/li>\n<li>Provider-agnostic: supports Anthropic, OpenAI Chat, OpenAI Responses, and Google API formats natively<\/li>\n<li><code>prefix_merging<\/code> reduces trainer updates from 1,185 to 218 and cuts wall-clock time 5.39\u00d7<\/li>\n<li>Works for both online RL and offline SFT data generation with the same runtime<\/li>\n<li>Harness-native RL delivers 
large gains for unfamiliar execution paths \u2014 22.6 pts on Codex<\/li>\n<li>Partial traces are recovered when a harness times out mid-session<\/li>\n<li>Released as open source under NeMo Gym<\/li>\n<\/ul>\n<h4 class=\"wp-block-heading\"><strong>Limitations<\/strong><\/h4>\n<ul class=\"wp-block-list\">\n<li>Reward design, evaluator quality, and distribution shift remain the researcher\u2019s responsibility<\/li>\n<li>Requires the harness to support a configurable model base URL<\/li>\n<li>Token-level capture depends on the serving stack supplying reliable token IDs and log probabilities<\/li>\n<li><code>per_request<\/code> strategy produced reward hacking in experiments due to noisy credit assignment at the session level; session normalization and PRM-style credit assignment are on the roadmap<\/li>\n<\/ul>\n<h2 class=\"wp-block-heading\"><strong>Key Takeaways<\/strong><\/h2>\n<ul class=\"wp-block-list\">\n<li>Polar trains LLM agents via a model API proxy \u2014 no harness code changes required<\/li>\n<li>Supports Anthropic Messages, OpenAI Chat Completions, OpenAI Responses, and Google generateContent APIs<\/li>\n<li>Using GRPO on Qwen3.5-4B, Polar improves SWE-Bench Verified by up to 22.6 points across four coding harnesses<\/li>\n<li><code>prefix_merging<\/code> trajectory reconstruction delivers a 5.39\u00d7 wall-clock speedup over <code>per_request<\/code><\/li>\n<li>Generated 504 accepted SFT trajectories from 1,638 attempts (30.8%) at ~64 GPU-hours; released under Apache-2.0<\/li>\n<li>Rewrites ProRL Agent; registered as a NeMo Gym environment<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n<\/p><p class=\"wp-block-paragraph\">\n<\/p><p class=\"wp-block-paragraph\">Check out\u00a0the\u00a0<strong><a href=\"https:\/\/arxiv.org\/pdf\/2605.24220\" target=\"_blank\" rel=\"noreferrer noopener\">Paper<\/a> <\/strong>and<strong> <a href=\"https:\/\/github.com\/NVIDIA-NeMo\/ProRL-Agent-Server\" 
target=\"_blank\" rel=\"noreferrer noopener\">GitHub Repo<\/a>.<\/strong><\/p>\n<p>The post <a href=\"https:\/\/www.marktechpost.com\/2026\/05\/27\/nvidia-releases-polar-a-token-faithful-rollout-framework-for-grpo-training-across-codex-claude-code-and-qwen-code\/\">NVIDIA Releases Polar, a Token-Faithful Rollout Framework for GRPO Training Across Codex, Claude Code, and Qwen Code<\/a> appeared first on <a href=\"https:\/\/www.marktechpost.com\/\">MarkTechPost<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>Reinforcement learning for 
lan&hellip;<\/p>\n","protected":false},"author":1,"featured_media":992,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-991","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts\/991","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=991"}],"version-history":[{"count":0,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts\/991\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/media\/992"}],"wp:attachment":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=991"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=991"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=991"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}