{"id":1033,"date":"2026-06-04T03:15:07","date_gmt":"2026-06-03T19:15:07","guid":{"rendered":"https:\/\/connectword.dpdns.org\/?p=1033"},"modified":"2026-06-04T03:15:07","modified_gmt":"2026-06-03T19:15:07","slug":"how-to-build-a-document-intelligence-backend-with-iii-using-workers-functions-and-cron-triggers","status":"publish","type":"post","link":"https:\/\/connectword.dpdns.org\/?p=1033","title":{"rendered":"How to Build a Document Intelligence Backend with iii Using Workers, Functions, and Cron Triggers"},"content":{"rendered":"<p class=\"wp-block-paragraph\">In this tutorial, we build a document-intelligence workflow with <a href=\"https:\/\/github.com\/iii-hq\/iii\"><strong>iii<\/strong><\/a>. We begin by installing the iii engine and Python SDK, then start the engine as a background process and connect a Python worker to it. After the setup, we register separate functions for text normalization, tokenization, sentiment analysis, keyword extraction, reporting, and heartbeat tracking. We then combine these functions into a single analysis pipeline and run the same logic via direct invocation, an HTTP endpoint, fire-and-forget execution, and a scheduled cron trigger. Along the way, we also track basic runtime state, making the workflow feel closer to a real backend system than a static notebook demo. 
Check out the <strong><a href=\"https:\/\/github.com\/MARKTECHPOST-AI-MEDIA-INC\/AI-Agents-Projects-Tutorials\/blob\/main\/Distributed%20Systems\/iii_live_document_intelligence_backend_marktechpost.py\" target=\"_blank\" rel=\"noreferrer noopener\">full code here<\/a>.<\/strong><\/p>\n<div class=\"dm-code-snippet dark dm-normal-version default no-background-mobile\">\n<div class=\"control-language\">\n<div class=\"dm-buttons\">\n<div class=\"dm-buttons-left\">\n<div class=\"dm-button-snippet red-button\"><\/div>\n<div class=\"dm-button-snippet orange-button\"><\/div>\n<div class=\"dm-button-snippet green-button\"><\/div>\n<\/div>\n<div class=\"dm-buttons-right\"><a><span class=\"dm-copy-text\">Copy Code<\/span><span class=\"dm-copy-confirmed\">Copied<\/span><span class=\"dm-error-message\">Use a different Browser<\/span><\/a><\/div>\n<\/div>\n<pre class=\" no-line-numbers\"><code class=\" no-wrap language-php\">import os, sys, subprocess, time, socket, json, threading\nfrom collections import Counter\nHOME    = os.path.expanduser(\"~\")\nBIN_DIR = f\"{HOME}\/.local\/bin\"\nos.environ[\"PATH\"] = BIN_DIR + os.pathsep + os.environ.get(\"PATH\", \"\")\ndef sh(cmd):\n   print(f\"$ {cmd}\")\n   subprocess.run(cmd, shell=True, check=True)\nif not os.path.exists(f\"{BIN_DIR}\/iii\"):\n   sh(f\"curl -fsSL https:\/\/install.iii.dev\/iii\/main\/install.sh | BIN_DIR={BIN_DIR} sh\")\nsh(f\"{sys.executable} -m pip install -q iii-sdk requests\")\nIII = f\"{BIN_DIR}\/iii\"\nsh(f\"{III} --version\")<\/code><\/pre>\n<\/div>\n<\/div>\n<p class=\"wp-block-paragraph\">We start by importing the required Python modules and adding the local binary directory for the iii engine to PATH. We define a small helper function to run shell commands and install the iii engine if it is not already available. 
We also install the Python SDK and requests package, then verify the iii installation by checking its version.<\/p>\n<div class=\"dm-code-snippet dark dm-normal-version default no-background-mobile\">\n<div class=\"control-language\">\n<div class=\"dm-buttons\">\n<div class=\"dm-buttons-left\">\n<div class=\"dm-button-snippet red-button\"><\/div>\n<div class=\"dm-button-snippet orange-button\"><\/div>\n<div class=\"dm-button-snippet green-button\"><\/div>\n<\/div>\n<div class=\"dm-buttons-right\"><a><span class=\"dm-copy-text\">Copy Code<\/span><span class=\"dm-copy-confirmed\">Copied<\/span><span class=\"dm-error-message\">Use a different Browser<\/span><\/a><\/div>\n<\/div>\n<pre class=\" no-line-numbers\"><code class=\" no-wrap language-php\">WS_URL, HTTP_URL = \"ws:\/\/localhost:49134\", \"http:\/\/localhost:3111\"\nengine_log = open(\"\/tmp\/iii-engine.log\", \"w\")\nengine = subprocess.Popen([III, \"--use-default-config\"],\n                         stdout=engine_log, stderr=subprocess.STDOUT)\ndef wait_port(host, port, timeout=90):\n   end = time.time() + timeout\n   while time.time() &lt; end:\n       with socket.socket() as s:\n           s.settimeout(1)\n           try:\n               s.connect((host, port)); return True\n           except OSError:\n               time.sleep(0.5)\n   return False\nassert wait_port(\"localhost\", 49134), \"engine never came up \u2014 see \/tmp\/iii-engine.log\"\nprint(f\"\u2713 engine up \u2014 WS {WS_URL} | HTTP {HTTP_URL}\")\nfrom iii import register_worker\ntry:\n   from iii import TriggerAction\nexcept Exception:\n   TriggerAction = None\nworker = register_worker(WS_URL)\n_STATE = {\"docs_analyzed\": 0, \"heartbeats\": 0, \"keyword_totals\": Counter()}\n_LOCK  = threading.Lock()\nPOSITIVE = {\"good\",\"great\",\"love\",\"excellent\",\"happy\",\"fast\",\"reliable\",\"amazing\",\"best\",\"win\"}\nNEGATIVE = 
{\"bad\",\"terrible\",\"hate\",\"slow\",\"broken\",\"sad\",\"worst\",\"bug\",\"crash\",\"fail\"}<\/code><\/pre>\n<\/div>\n<\/div>\n<p class=\"wp-block-paragraph\">We launch the iii engine as a background process and wait for its WebSocket port to become available. We then connect a Python worker to the running engine and prepare optional support for fire-and-forget triggers. We also define a shared in-memory state, a thread lock, and simple positive and negative word sets for sentiment analysis.<\/p>\n<div class=\"dm-code-snippet dark dm-normal-version default no-background-mobile\">\n<div class=\"control-language\">\n<div class=\"dm-buttons\">\n<div class=\"dm-buttons-left\">\n<div class=\"dm-button-snippet red-button\"><\/div>\n<div class=\"dm-button-snippet orange-button\"><\/div>\n<div class=\"dm-button-snippet green-button\"><\/div>\n<\/div>\n<div class=\"dm-buttons-right\"><a><span class=\"dm-copy-text\">Copy Code<\/span><span class=\"dm-copy-confirmed\">Copied<\/span><span class=\"dm-error-message\">Use a different Browser<\/span><\/a><\/div>\n<\/div>\n<pre class=\" no-line-numbers\"><code class=\" no-wrap language-php\">def normalize(data):\n   return {\"text\": (data.get(\"text\") or \"\").strip().lower()}\ndef tokenize(data):\n   text   = data.get(\"text\", \"\")\n   cleaned = \"\".join(c if (c.isalnum() or c.isspace()) else \" \" for c in text)\n   tokens = [t for t in cleaned.split() if t]\n   return {\"tokens\": tokens, \"count\": len(tokens)}\ndef sentiment(data):\n   toks  = data.get(\"tokens\", [])\n   pos   = sum(t in POSITIVE for t in toks)\n   neg   = sum(t in NEGATIVE for t in toks)\n   score = pos - neg\n   label = \"positive\" if score &gt; 0 else \"negative\" if score &lt; 0 else \"neutral\"\n   return {\"label\": label, \"score\": score, \"pos\": pos, \"neg\": neg}\ndef keywords(data):\n   toks = data.get(\"tokens\", [])\n   stop = {\"the\",\"a\",\"an\",\"is\",\"it\",\"to\",\"of\",\"and\",\"in\",\"for\",\"on\",\"how\"}\n   freq = Counter(t 
for t in toks if t not in stop and len(t) &gt; 2)\n   return {\"keywords\": freq.most_common(data.get(\"top_n\", 5))}\ndef analyze(data):\n   norm = worker.trigger({\"function_id\": \"text::normalize\", \"payload\": {\"text\": data.get(\"text\",\"\")}})\n   toks = worker.trigger({\"function_id\": \"text::tokenize\",  \"payload\": norm})\n   sent = worker.trigger({\"function_id\": \"text::sentiment\", \"payload\": toks})\n   keys = worker.trigger({\"function_id\": \"text::keywords\",  \"payload\": {**toks, \"top_n\": data.get(\"top_n\", 5)}})\n   with _LOCK:\n       _STATE[\"docs_analyzed\"] += 1\n       for k, c in keys[\"keywords\"]:\n           _STATE[\"keyword_totals\"][k] += c\n       n = _STATE[\"docs_analyzed\"]\n   return {\"tokens\": toks[\"count\"], \"sentiment\": sent, \"keywords\": keys[\"keywords\"], \"docs_analyzed\": n}\ndef report(data):\n   with _LOCK:\n       return {\"docs_analyzed\": _STATE[\"docs_analyzed\"],\n               \"heartbeats\":    _STATE[\"heartbeats\"],\n               \"top_keywords_all_docs\": _STATE[\"keyword_totals\"].most_common(5)}\ndef http_analyze(data):\n   body   = data.get(\"body\") or {}\n   result = worker.trigger({\"function_id\": \"pipeline::analyze\", \"payload\": body})\n   return {\"status_code\": 200, \"body\": result, \"headers\": {\"Content-Type\": \"application\/json\"}}\ndef heartbeat(data):\n   with _LOCK:\n       _STATE[\"heartbeats\"] += 1\n   return {\"ok\": True}\nfor fid, fn in [\n   (\"text::normalize\", normalize), (\"text::tokenize\", tokenize),\n   (\"text::sentiment\", sentiment), (\"text::keywords\", keywords),\n   (\"pipeline::analyze\", analyze), (\"stats::report\", report),\n   (\"http::analyze\", http_analyze), (\"cron::heartbeat\", heartbeat),\n]:\n   worker.register_function(fid, fn)<\/code><\/pre>\n<\/div>\n<\/div>\n<p class=\"wp-block-paragraph\">We define the core functions used in the text-analysis workflow, including normalization, tokenization, sentiment detection, and keyword 
extraction. We then create an analysis function that routes each step through the iii engine with worker.trigger instead of calling the Python functions directly. We also add reporting, HTTP handling, and heartbeat functions before registering all of them with the worker.<\/p>\n<div class=\"dm-code-snippet dark dm-normal-version default no-background-mobile\">\n<div class=\"control-language\">\n<div class=\"dm-buttons\">\n<div class=\"dm-buttons-left\">\n<div class=\"dm-button-snippet red-button\"><\/div>\n<div class=\"dm-button-snippet orange-button\"><\/div>\n<div class=\"dm-button-snippet green-button\"><\/div>\n<\/div>\n<div class=\"dm-buttons-right\"><a><span class=\"dm-copy-text\">Copy Code<\/span><span class=\"dm-copy-confirmed\">Copied<\/span><span class=\"dm-error-message\">Use a different Browser<\/span><\/a><\/div>\n<\/div>\n<pre class=\" no-line-numbers\"><code class=\" no-wrap language-php\">worker.register_trigger({\"type\": \"http\", \"function_id\": \"http::analyze\",\n                        \"config\": {\"api_path\": \"\/analyze\", \"http_method\": \"POST\"}})\ncron_ok = False\ntry:\n   worker.register_trigger({\"type\": \"cron\", \"function_id\": \"cron::heartbeat\",\n                            \"config\": {\"schedule\": \"*\/2 * * * * *\"}})\n   cron_ok = True\nexcept Exception as e:\n   print(\"cron trigger skipped:\", e)\ntry:\n   worker.connect()\nexcept Exception:\n   pass\ntime.sleep(2)<\/code><\/pre>\n<\/div>\n<\/div>\n<p class=\"wp-block-paragraph\">We register an HTTP trigger so that the analysis pipeline can be invoked via a POST request to \/analyze. We also try to register a cron trigger that runs the heartbeat function on a fixed schedule (every two seconds, assuming the engine reads the six-field expression with a leading seconds field), and we skip it with a printed message if registration raises an exception, for example because the engine build does not support that trigger schema. 
We then connect the worker and pause briefly so the registered functions and triggers are ready to use.<\/p>\n<div class=\"dm-code-snippet dark dm-normal-version default no-background-mobile\">\n<div class=\"control-language\">\n<div class=\"dm-buttons\">\n<div class=\"dm-buttons-left\">\n<div class=\"dm-button-snippet red-button\"><\/div>\n<div class=\"dm-button-snippet orange-button\"><\/div>\n<div class=\"dm-button-snippet green-button\"><\/div>\n<\/div>\n<div class=\"dm-buttons-right\"><a><span class=\"dm-copy-text\">Copy Code<\/span><span class=\"dm-copy-confirmed\">Copied<\/span><span class=\"dm-error-message\">Use a different Browser<\/span><\/a><\/div>\n<\/div>\n<pre class=\" no-line-numbers\"><code class=\" no-wrap language-php\">print(\"\\n=== A) Direct invocation \u2014 orchestrated through the engine ===\")\ndocs = [\n   \"iii makes the backend amazing and fast, I love how reliable it is\",\n   \"The legacy gateway was slow and broken, a terrible buggy experience\",\n   \"Workers register functions and triggers; the engine routes every call\",\n]\nfor d in docs:\n   r = worker.trigger({\"function_id\": \"pipeline::analyze\", \"payload\": {\"text\": d, \"top_n\": 4}})\n   print(f\"  [{r['sentiment']['label']:&gt;8}] tokens={r['tokens']:&gt;2}  keywords={r['keywords']}\")\nprint(\"\\n=== B) The SAME function over HTTP (:3111) \u2014 zero handler changes ===\")\nimport requests\ntry:\n   resp = requests.post(f\"{HTTP_URL}\/analyze\",\n                        json={\"text\": \"great great product, best ever\", \"top_n\": 3}, timeout=10)\n   print(\"  HTTP\", resp.status_code, \"-&gt;\", resp.json())\nexcept Exception as e:\n   print(\"  HTTP call failed (engine HTTP module\/version?):\", e)\nprint(\"\\n=== C) Fire-and-forget invocation ===\")\nif TriggerAction:\n   worker.trigger({\"function_id\": \"pipeline::analyze\",\n                   \"payload\": {\"text\": \"async win, no waiting\"},\n                   \"action\": TriggerAction.Void()})\n   print(\"  dispatched (no result awaited)\")\nelse:\n   print(\"  TriggerAction not in this SDK build \u2014 skipping\")\nprint(\"\\n=== D) Cron trigger firing on its own ===\")\nif cron_ok:\n   time.sleep(5)\n   print(\"  heartbeats so far:\",\n         worker.trigger({\"function_id\": \"stats::report\", \"payload\": {}})[\"heartbeats\"])\nelse:\n   print(\"  cron not registered on this engine build\")\nprint(\"\\n=== E) Aggregate state report ===\")\nprint(json.dumps(worker.trigger({\"function_id\": \"stats::report\", \"payload\": {}}), indent=2))\nprint(\"\\nTraces\/metrics: run `iii console` locally, or scrape Prometheus at :9464\")\nprint(\"engine log tail:\")\nprint(subprocess.run([\"tail\", \"-n\", \"8\", \"\/tmp\/iii-engine.log\"],\n                    capture_output=True, text=True).stdout)<\/code><\/pre>\n<\/div>\n<\/div>\n<p class=\"wp-block-paragraph\">We test the complete iii workflow by sending sample text documents through the registered analysis pipeline. We then call the same logic through HTTP, try fire-and-forget execution, and check whether the cron heartbeat is running. Finally, we print the aggregate state report and show the engine log tail for basic runtime visibility.<\/p>\n<p class=\"wp-block-paragraph\">In conclusion, we have a working iii system that processes text using modular, registered functions rather than a single fixed script. We analyzed sample documents, exposed the pipeline through HTTP, tested async-style execution, tracked heartbeat activity, and printed an aggregate state report. The tutorial keeps the example readable while showing the main working pattern of iii: define functions once, register them with a worker, and reuse them through different triggers and execution paths. 
It also shows how small functions can be cleanly connected as the workflow grows into something more production-ready.<\/p>\n<p class=\"wp-block-paragraph\">\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n<\/p>\n<p>The post <a href=\"https:\/\/www.marktechpost.com\/2026\/06\/03\/how-to-build-a-document-intelligence-backend-with-iii-using-workers-functions-and-cron-triggers\/\">How to Build a Document Intelligence Backend with iii Using Workers, Functions, and Cron Triggers<\/a> appeared first on <a href=\"https:\/\/www.marktechpost.com\/\">MarkTechPost<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>In this tutorial, we build a d&hellip;<\/p>\n","protected":false},"author":1,"featured_media":29,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-1033","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts\/1033","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1033"}],"version-history":[{"count":0,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts\/1033\/revisions"}],"wp:featuredmedi
a":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/media\/29"}],"wp:attachment":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1033"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1033"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1033"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}