{"id":190,"date":"2025-12-26T13:33:15","date_gmt":"2025-12-26T05:33:15","guid":{"rendered":"https:\/\/connectword.dpdns.org\/?p=190"},"modified":"2025-12-26T13:33:15","modified_gmt":"2025-12-26T05:33:15","slug":"a-coding-implementation-on-building-self-organizing-zettelkasten-knowledge-graphs-and-sleep-consolidation-mechanisms","status":"publish","type":"post","link":"https:\/\/connectword.dpdns.org\/?p=190","title":{"rendered":"A Coding Implementation on Building Self-Organizing Zettelkasten Knowledge Graphs and Sleep-Consolidation Mechanisms"},"content":{"rendered":"<p>In this tutorial, we dive into agentic AI by building a \u201cZettelkasten\u201d memory system, a \u201cliving\u201d architecture that organizes information in a way loosely modeled on human associative memory. We move beyond standard retrieval methods to construct a dynamic knowledge graph in which an agent autonomously decomposes inputs into atomic facts, links them semantically, and even \u201csleeps\u201d to consolidate memories into higher-order insights. Using Google\u2019s Gemini, we implement a solution that handles real-world API rate limits, so that our agent not only stores data but also tracks the evolving context of our projects. 
Check out the\u00a0<strong><a href=\"https:\/\/github.com\/Marktechpost\/AI-Tutorial-Codes-Included\/blob\/main\/Agentic%20AI%20Memory\/Agentic_Zettelkasten_Memory_Martechpost.ipynb\" target=\"_blank\" rel=\"noreferrer noopener\">FULL CODES here<\/a><\/strong>.<\/p>\n<div class=\"dm-code-snippet dark dm-normal-version default no-background-mobile\">\n<div class=\"control-language\">\n<div class=\"dm-buttons\">\n<div class=\"dm-buttons-left\">\n<div class=\"dm-button-snippet red-button\"><\/div>\n<div class=\"dm-button-snippet orange-button\"><\/div>\n<div class=\"dm-button-snippet green-button\"><\/div>\n<\/div>\n<div class=\"dm-buttons-right\"><a><span class=\"dm-copy-text\">Copy Code<\/span><span class=\"dm-copy-confirmed\">Copied<\/span><span class=\"dm-error-message\">Use a different Browser<\/span><\/a><\/div>\n<\/div>\n<pre class=\" no-line-numbers\"><code class=\" no-wrap language-php\">!pip install -q -U google-generativeai networkx pyvis scikit-learn numpy\n\n\nimport os\nimport json\nimport uuid\nimport time\nimport getpass\nimport random\nimport networkx as nx\nimport numpy as np\nimport google.generativeai as genai\nfrom dataclasses import dataclass, field\nfrom typing import List\nfrom sklearn.metrics.pairwise import cosine_similarity\nfrom IPython.display import display, HTML\nfrom pyvis.network import Network\nfrom google.api_core import exceptions\n\n\ndef retry_with_backoff(func, *args, **kwargs):\n   # Call func, retrying with exponential backoff plus jitter on quota errors.\n   max_retries = 5\n   base_delay = 5\n  \n   for attempt in range(max_retries):\n       try:\n           return func(*args, **kwargs)\n       except exceptions.ResourceExhausted:\n           wait_time = base_delay * (2 ** attempt) + random.uniform(0, 1)\n           print(f\"   \u23f3 Quota limit hit. 
Cooling down for {wait_time:.1f}s...\")\n           time.sleep(wait_time)\n       except Exception as e:\n           # Some quota errors surface as generic exceptions that mention HTTP 429.\n           if \"429\" in str(e):\n               wait_time = base_delay * (2 ** attempt) + random.uniform(0, 1)\n               print(f\"   \u23f3 Quota limit hit (HTTP 429). Cooling down for {wait_time:.1f}s...\")\n               time.sleep(wait_time)\n           else:\n               print(f\"   \u26a0 Unexpected Error: {e}\")\n               return None\n   print(\"   \u274c Max retries reached.\")\n   return None\n\n\nprint(\"Enter your Google AI Studio API Key (Input will be hidden):\")\nAPI_KEY = getpass.getpass()\n\n\ngenai.configure(api_key=API_KEY)\nMODEL_NAME = \"gemini-2.5-flash\" \nEMBEDDING_MODEL = \"models\/text-embedding-004\"\n\n\nprint(f\"\u2705 API Key configured. Using model: {MODEL_NAME}\")<\/code><\/pre>\n<\/div>\n<\/div>\n<p>We begin by importing the libraries for graph management and model interaction and by reading the API key through a hidden prompt. We then define a retry_with_backoff helper that catches rate-limit errors (ResourceExhausted or an HTTP 429 message) and retries up to five times with exponentially growing, jittered delays, so our agent pauses and recovers when the API quota is exceeded during heavy processing. 
<\/p>\n<div class=\"dm-code-snippet dark dm-normal-version default no-background-mobile\">\n<div class=\"control-language\">\n<div class=\"dm-buttons\">\n<div class=\"dm-buttons-left\">\n<div class=\"dm-button-snippet red-button\"><\/div>\n<div class=\"dm-button-snippet orange-button\"><\/div>\n<div class=\"dm-button-snippet green-button\"><\/div>\n<\/div>\n<div class=\"dm-buttons-right\"><a><span class=\"dm-copy-text\">Copy Code<\/span><span class=\"dm-copy-confirmed\">Copied<\/span><span class=\"dm-error-message\">Use a different Browser<\/span><\/a><\/div>\n<\/div>\n<pre class=\" no-line-numbers\"><code class=\" no-wrap language-php\">@dataclass\nclass MemoryNode:\n   id: str\n   content: str\n   type: str\n   embedding: List[float] = field(default_factory=list)\n   timestamp: int = 0\n\n\nclass RobustZettelkasten:\n   def __init__(self):\n       self.graph = nx.Graph()\n       self.model = genai.GenerativeModel(MODEL_NAME)\n       self.step_counter = 0\n\n\n   def _get_embedding(self, text):\n       result = retry_with_backoff(\n           genai.embed_content,\n           model=EMBEDDING_MODEL,\n           content=text\n       )\n       # Fall back to a 768-dimensional zero vector if the embedding call fails.\n       return result['embedding'] if result else [0.0] * 768<\/code><\/pre>\n<\/div>\n<\/div>\n<p>We define the MemoryNode data class to hold each note\u2019s content, type, vector embedding, and timestamp. We then initialize the main RobustZettelkasten class, which holds the network graph and the Gemini generative model, and add a _get_embedding helper that calls the Gemini embedding model behind our semantic search and falls back to a 768-dimensional zero vector if the call fails. 
<\/p>\n<div class=\"dm-code-snippet dark dm-normal-version default no-background-mobile\">\n<div class=\"control-language\">\n<div class=\"dm-buttons\">\n<div class=\"dm-buttons-left\">\n<div class=\"dm-button-snippet red-button\"><\/div>\n<div class=\"dm-button-snippet orange-button\"><\/div>\n<div class=\"dm-button-snippet green-button\"><\/div>\n<\/div>\n<div class=\"dm-buttons-right\"><a><span class=\"dm-copy-text\">Copy Code<\/span><span class=\"dm-copy-confirmed\">Copied<\/span><span class=\"dm-error-message\">Use a different Browser<\/span><\/a><\/div>\n<\/div>\n<pre class=\" no-line-numbers\"><code class=\" no-wrap language-php\">   def _atomize_input(self, text):\n       prompt = f\"\"\"\n       Break the following text into independent atomic facts.\n       Output JSON: {{ \"facts\": [\"fact1\", \"fact2\"] }}\n       Text: \"{text}\"\n       \"\"\"\n       response = retry_with_backoff(\n           self.model.generate_content,\n           prompt,\n           generation_config={\"response_mime_type\": \"application\/json\"}\n       )\n       # If the call fails or returns malformed JSON, keep the raw text as one fact.\n       try:\n           return json.loads(response.text).get(\"facts\", []) if response else [text]\n       except Exception:\n           return [text]\n\n\n   def _find_similar_nodes(self, embedding, top_k=3, threshold=0.45):\n       if not self.graph.nodes: return []\n      \n       # Compare only against nodes that actually carry an embedding.\n       nodes = [n for n in self.graph.nodes(data=True) if len(n[1]['data'].embedding) &gt; 0]\n       if not nodes: return []\n       embeddings = [n[1]['data'].embedding for n in nodes]\n\n\n       sims = cosine_similarity([embedding], embeddings)[0]\n       sorted_indices = np.argsort(sims)[::-1]\n      \n       results = []\n       for idx in 
sorted_indices[:top_k]:\n           if sims[idx] &gt; threshold:\n               results.append((nodes[idx][0], sims[idx]))\n       return results\n\n\n   def add_memory(self, user_input):\n       self.step_counter += 1\n       print(f\"\\n\ud83e\udde0 [Step {self.step_counter}] Processing: '{user_input}'\")\n      \n       facts = self._atomize_input(user_input)\n      \n       for fact in facts:\n           print(f\"   -&gt; Atom: {fact}\")\n           emb = self._get_embedding(fact)\n           # Look up neighbors before inserting the node so it cannot match itself.\n           candidates = self._find_similar_nodes(emb)\n          \n           node_id = str(uuid.uuid4())[:6]\n           node = MemoryNode(id=node_id, content=fact, type='fact', embedding=emb, timestamp=self.step_counter)\n           self.graph.add_node(node_id, data=node, title=fact, label=fact[:15]+\"...\")\n          \n           if candidates:\n               context_str = \"\\n\".join([f\"ID {c[0]}: {self.graph.nodes[c[0]]['data'].content}\" for c in candidates])\n               prompt = f\"\"\"\n               I am adding: \"{fact}\"\n               Existing Memory:\n               {context_str}\n              \n               Are any of these directly related? 
If yes, provide the relationship label.\n               JSON: {{ \"links\": [{{ \"target_id\": \"ID\", \"rel\": \"label\" }}] }}\n               \"\"\"\n               response = retry_with_backoff(\n                   self.model.generate_content,\n                   prompt,\n                   generation_config={\"response_mime_type\": \"application\/json\"}\n               )\n              \n               if response:\n                   try:\n                       links = json.loads(response.text).get(\"links\", [])\n                       for link in links:\n                           if self.graph.has_node(link['target_id']):\n                               self.graph.add_edge(node_id, link['target_id'], label=link['rel'])\n                               print(f\"      \ud83d\udd17 Linked to {link['target_id']} ({link['rel']})\")\n                   except Exception:\n                       # Skip linking when the model returns malformed JSON.\n                       pass\n          \n           time.sleep(1)<\/code><\/pre>\n<\/div>\n<\/div>\n<p>We construct an ingestion pipeline that decomposes each user input into atomic facts so that individual details are stored as separate, linkable notes. We embed each fact, retrieve up to three existing nodes whose cosine similarity exceeds 0.45, and ask the model to label any direct relationships, building a knowledge graph in real time that mimics associative memory. 
<\/p>\n<div class=\"dm-code-snippet dark dm-normal-version default no-background-mobile\">\n<div class=\"control-language\">\n<div class=\"dm-buttons\">\n<div class=\"dm-buttons-left\">\n<div class=\"dm-button-snippet red-button\"><\/div>\n<div class=\"dm-button-snippet orange-button\"><\/div>\n<div class=\"dm-button-snippet green-button\"><\/div>\n<\/div>\n<div class=\"dm-buttons-right\"><a><span class=\"dm-copy-text\">Copy Code<\/span><span class=\"dm-copy-confirmed\">Copied<\/span><span class=\"dm-error-message\">Use a different Browser<\/span><\/a><\/div>\n<\/div>\n<pre class=\" no-line-numbers\"><code class=\" no-wrap language-php\">   def consolidate_memory(self):\n       print(f\"\\n\ud83d\udca4 [Consolidation Phase] Reflecting...\")\n       # A node with at least two links anchors a cluster worth summarizing.\n       high_degree_nodes = [n for n, d in self.graph.degree() if d &gt;= 2]\n       processed_clusters = set()\n\n\n       for main_node in high_degree_nodes:\n           neighbors = list(self.graph.neighbors(main_node))\n           cluster_ids = tuple(sorted([main_node] + neighbors))\n          \n           if cluster_ids in processed_clusters: continue\n           processed_clusters.add(cluster_ids)\n          \n           cluster_content = [self.graph.nodes[n]['data'].content for n in cluster_ids]\n          \n           prompt = f\"\"\"\n           Generate a single high-level insight summary from these facts.\n           Facts: {json.dumps(cluster_content)}\n           JSON: {{ \"insight\": \"Your insight here\" }}\n           \"\"\"\n           response = retry_with_backoff(\n               self.model.generate_content,\n               prompt,\n 
              generation_config={\"response_mime_type\": \"application\/json\"}\n           )\n          \n           if response:\n               try:\n                   insight_text = json.loads(response.text).get(\"insight\")\n                   if insight_text:\n                       insight_id = f\"INSIGHT-{uuid.uuid4().hex[:4]}\"\n                       print(f\"   \u2728 Insight: {insight_text}\")\n                       emb = self._get_embedding(insight_text)\n                      \n                       insight_node = MemoryNode(id=insight_id, content=insight_text, type='insight', embedding=emb)\n                       self.graph.add_node(insight_id, data=insight_node, title=f\"INSIGHT: {insight_text}\", label=\"INSIGHT\", color=\"#ff7f7f\")\n                       self.graph.add_edge(insight_id, main_node, label=\"abstracted_from\")\n               except Exception:\n                   continue\n           time.sleep(1)\n\n\n   def answer_query(self, query):\n       print(f\"\\n\ud83d\udd0d Querying: '{query}'\")\n       emb = self._get_embedding(query)\n       candidates = self._find_similar_nodes(emb, top_k=2)\n      \n       if not candidates:\n           print(\"No relevant memory found.\")\n           return\n\n\n       # Gather each direct match plus its one-hop neighbors and edge labels.\n       relevant_context = set()\n       for node_id, score in candidates:\n           node_content = self.graph.nodes[node_id]['data'].content\n           relevant_context.add(f\"- {node_content} (Direct Match)\")\n           for n1 in self.graph.neighbors(node_id):\n               rel = self.graph[node_id][n1].get('label', 'related')\n               content = self.graph.nodes[n1]['data'].content\n               relevant_context.add(f\"  - linked via '{rel}' to: {content}\")\n              \n    
   context_text = \"\\n\".join(relevant_context)\n       prompt = f\"\"\"\n       Answer based ONLY on context.\n       Question: {query}\n       Context:\n       {context_text}\n       \"\"\"\n       response = retry_with_backoff(self.model.generate_content, prompt)\n       if response:\n           print(f\"\ud83e\udd16 Agent Answer:\\n{response.text}\")<\/code><\/pre>\n<\/div>\n<\/div>\n<p>We implement the cognitive functions of our agent, enabling it to \u201csleep\u201d and consolidate each cluster formed by a node with at least two links and its neighbors into a higher-order insight. We also define the query logic, which retrieves the two closest nodes, pulls in their directly linked neighbors along with the relationship labels, and answers from that combined context.<\/p>\n<div class=\"dm-code-snippet dark dm-normal-version default no-background-mobile\">\n<div class=\"control-language\">\n<div class=\"dm-buttons\">\n<div class=\"dm-buttons-left\">\n<div class=\"dm-button-snippet red-button\"><\/div>\n<div class=\"dm-button-snippet orange-button\"><\/div>\n<div class=\"dm-button-snippet green-button\"><\/div>\n<\/div>\n<div class=\"dm-buttons-right\"><a><span class=\"dm-copy-text\">Copy Code<\/span><span class=\"dm-copy-confirmed\">Copied<\/span><span class=\"dm-error-message\">Use a different Browser<\/span><\/a><\/div>\n<\/div>\n<pre class=\" no-line-numbers\"><code class=\" no-wrap language-php\">   def show_graph(self):\n       try:\n           net = Network(notebook=True, cdn_resources='remote', height=\"500px\", width=\"100%\", bgcolor='#222222', font_color='white')\n           for n, data in self.graph.nodes(data=True):\n               color = 
\"#97c2fc\" if data['data'].type == 'fact' else \"#ff7f7f\"\n               net.add_node(n, label=data.get('label', ''), title=data['data'].content, color=color)\n           for u, v, data in self.graph.edges(data=True):\n               net.add_edge(u, v, label=data.get('label', ''))\n           net.show(\"memory_graph.html\")\n           # Load the generated file; HTML(\"memory_graph.html\") would render the name as text.\n           display(HTML(filename=\"memory_graph.html\"))\n       except Exception as e:\n           print(f\"Graph visualization error: {e}\")\n\n\nbrain = RobustZettelkasten()\n\n\nevents = [\n   \"The project 'Apollo' aims to build a dashboard for tracking solar panel efficiency.\",\n   \"We chose React for the frontend because the team knows it well.\",\n   \"The backend must be Python to support the data science libraries.\",\n   \"Client called. They are unhappy with React performance on low-end devices.\",\n   \"We are switching the frontend to Svelte for better performance.\"\n]\n\n\nprint(\"--- PHASE 1: INGESTION ---\")\nfor event in events:\n   brain.add_memory(event)\n   time.sleep(2)\n\n\nprint(\"--- PHASE 2: CONSOLIDATION ---\")\nbrain.consolidate_memory()\n\n\nprint(\"--- PHASE 3: RETRIEVAL ---\")\nbrain.answer_query(\"What is the current frontend technology for Apollo and why?\")\n\n\nprint(\"--- PHASE 4: VISUALIZATION ---\")\nbrain.show_graph()<\/code><\/pre>\n<\/div>\n<\/div>\n<p>We wrap up by adding a visualization method that generates an interactive HTML graph of our agent\u2019s memory, allowing us to inspect the nodes and edges. Finally, we run a test scenario built from a five-event project timeline to check that the system links concepts, generates insights, and retrieves the right context.<\/p>\n<p>In conclusion, we now have a working \u201cLiving Memory\u201d prototype that goes beyond simple database storage. 
By enabling our agent to actively link related concepts and reflect on its experiences during a \u201cconsolidation\u201d phase, we address the problem of fragmented context in long-running AI interactions. This system illustrates that capable agents need not only processing power but also a structured, evolving memory, paving the way for more capable, personalized autonomous agents.<\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n<p>The post <a href=\"https:\/\/www.marktechpost.com\/2025\/12\/25\/a-coding-implementation-on-building-self-organizing-zettelkasten-knowledge-graphs-and-sleep-consolidation-mechanisms\/\">A Coding Implementation on Building Self-Organizing Zettelkasten Knowledge Graphs and Sleep-Consolidation Mechanisms<\/a> appeared first on <a href=\"https:\/\/www.marktechpost.com\/\">MarkTechPost<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>In this tutorial, we dive into&hellip;<\/p>\n","protected":false},"author":1,"featured_media":29,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-190","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts\/190","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=190"}],"version-history":[{"count":0,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts\/190\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/media\/29"}],"wp:attachment":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=190"}],"wp:term":[{"taxonomy":"category","embeddable":t
rue,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=190"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=190"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}