{"id":988,"date":"2026-05-28T16:07:09","date_gmt":"2026-05-28T08:07:09","guid":{"rendered":"https:\/\/connectword.dpdns.org\/?p=988"},"modified":"2026-05-28T16:07:09","modified_gmt":"2026-05-28T08:07:09","slug":"a-coding-guide-to-implement-a-pgvector-powered-semantic-hybrid-sparse-and-quantized-vector-search-system","status":"publish","type":"post","link":"https:\/\/connectword.dpdns.org\/?p=988","title":{"rendered":"A Coding Guide to Implement a pgvector-Powered Semantic, Hybrid, Sparse, and Quantized Vector Search System"},"content":{"rendered":"<p class=\"wp-block-paragraph\">In this tutorial, we build a complete<a href=\"https:\/\/github.com\/pgvector\/pgvector-python\"> <strong>pgvector<\/strong><\/a><strong> <\/strong>playground inside Google Colab and explore how PostgreSQL can work as a powerful vector database for modern AI applications. We start by installing PostgreSQL, compiling the pgvector extension, connecting through Psycopg, and registering vector types for smooth Python integration. Then, we create embeddings with SentenceTransformers, store them in PostgreSQL, build HNSW indexes, and run semantic search, filtered search, distance metric comparisons, half-precision storage, binary quantization, sparse vector search, hybrid retrieval, and vector aggregation. Through this workflow, we learn how pgvector supports practical retrieval-augmented generation, recommendation, similarity search, and hybrid search systems using only open-source tools.<\/p>\n<div class=\"dm-code-snippet dark dm-normal-version default no-background-mobile\">\n<div class=\"control-language\">\n<div class=\"dm-buttons\">\n<div class=\"dm-buttons-left\">\n<div class=\"dm-button-snippet red-button\"><\/div>\n<div class=\"dm-button-snippet orange-button\"><\/div>\n<div class=\"dm-button-snippet green-button\"><\/div>\n<\/div>\n<div class=\"dm-buttons-right\"><a><span class=\"dm-copy-text\">Copy Code<\/span><span class=\"dm-copy-confirmed\">Copied<\/span><span class=\"dm-error-message\">Use a different Browser<\/span><\/a><\/div>\n<\/div>\n<pre class=\" no-line-numbers\"><code class=\" no-wrap language-php\">import os\nimport subprocess\nimport sys\nimport time\ndef sh(cmd: str, check: bool = True):\n   \"\"\"Run a shell command, streaming a compact log.\"\"\"\n   print(f\"  $ {cmd}\")\n   return subprocess.run(cmd, shell=True, check=check,\n                         stdout=subprocess.DEVNULL, stderr=subprocess.STDOUT)\nprint(\"[0\/10] Installing PostgreSQL + building pgvector (\u22481\u20132 min)...\")\nsh(\"apt-get -qq update\")\nsh(\"apt-get -qq install -y postgresql postgresql-contrib \"\n  \"postgresql-server-dev-all build-essential git\")\nif not os.path.exists(\"\/tmp\/pgvector\"):\n   sh(\"git clone --depth 1 https:\/\/github.com\/pgvector\/pgvector.git \/tmp\/pgvector\")\nsh(\"cd \/tmp\/pgvector &amp;&amp; make &amp;&amp; make install\")\nsh(\"service postgresql start\")\ntime.sleep(3)\nsh(\"\"\"sudo -u postgres psql -c \"ALTER USER postgres PASSWORD 'postgres';\" \"\"\")\nprint(\"[0\/10] Installing Python packages...\")\nsh(f\"{sys.executable} -m pip install -q pgvector psycopg[binary] \"\n  f\"sentence-transformers numpy\")<\/code><\/pre>\n<\/div>\n<\/div>\n<p class=\"wp-block-paragraph\">We set up the complete PostgreSQL and pgvector environment. We install the required system packages, clone and build pgvector from source, start the PostgreSQL service, and configure the database password. We also install the Python dependencies needed to connect to PostgreSQL and work with vector embeddings.<\/p>\n<div class=\"dm-code-snippet dark dm-normal-version default no-background-mobile\">\n<div class=\"control-language\">\n<div class=\"dm-buttons\">\n<div class=\"dm-buttons-left\">\n<div class=\"dm-button-snippet red-button\"><\/div>\n<div class=\"dm-button-snippet orange-button\"><\/div>\n<div class=\"dm-button-snippet green-button\"><\/div>\n<\/div>\n<div class=\"dm-buttons-right\"><a><span class=\"dm-copy-text\">Copy Code<\/span><span class=\"dm-copy-confirmed\">Copied<\/span><span class=\"dm-error-message\">Use a different Browser<\/span><\/a><\/div>\n<\/div>\n<pre class=\" no-line-numbers\"><code class=\" no-wrap language-php\">import numpy as np\nimport psycopg\nfrom pgvector import HalfVector, SparseVector\nfrom pgvector.psycopg import register_vector\nfrom sentence_transformers import SentenceTransformer\nprint(\"n[1\/10] Connecting and enabling the 'vector' extension...\")\nconn = psycopg.connect(\n   \"host=127.0.0.1 port=5432 dbname=postgres user=postgres password=postgres\",\n   autocommit=True,\n)\nconn.execute(\"CREATE EXTENSION IF NOT EXISTS vector\")\nregister_vector(conn)\nver = conn.execute(\"SELECT extversion FROM pg_extension WHERE extname='vector'\").fetchone()[0]\nprint(f\"      pgvector version: {ver}\")\nprint(\"n[2\/10] Loading embedding model + encoding corpus...\")\nmodel = SentenceTransformer(\"all-MiniLM-L6-v2\")\nDIM = model.get_sentence_embedding_dimension()\ncorpus = [\n   (\"Octopuses have three hearts and blue blood.\",             \"animals\"),\n   (\"Transformers revolutionized natural language processing.\",\"technology\"),\n   (\"Quantum computers exploit superposition and entanglement.\",\"technology\"),\n   (\"GPUs accelerate deep learning by parallelizing matrix math.\",\"technology\"),\n   (\"Sourdough bread relies on wild yeast and lactobacilli.\",  \"food\"),\n   (\"Dark chocolate contains flavonoid antioxidants.\",         \"food\"),\n   (\"A black hole's gravity is so strong light cannot escape.\",\"space\")\n]\ncontents   = [c for c, _ in corpus]\ncategories = [k for _, k in corpus]\nembeddings = model.encode(contents, normalize_embeddings=True)\nconn.execute(\"DROP TABLE IF EXISTS documents\")\nconn.execute(f\"\"\"\n   CREATE TABLE documents (\n       id        bigserial PRIMARY KEY,\n       content   text,\n       category  text,\n       embedding vector({DIM})\n   )\n\"\"\")\nwith conn.cursor() as cur:\n   cur.executemany(\n       \"INSERT INTO documents (content, category, embedding) VALUES (%s, %s, %s)\",\n       list(zip(contents, categories, [np.asarray(e) for e in embeddings])),\n   )\nprint(f\"      Inserted {len(corpus)} documents with {DIM}-d embeddings.\")<\/code><\/pre>\n<\/div>\n<\/div>\n<p class=\"wp-block-paragraph\">We connect to PostgreSQL, enable the pgvector extension, and register vector support with Psycopg. We load the SentenceTransformers model, define a small text corpus, generate normalized embeddings, and create a PostgreSQL table for storing documents. We then insert each document with its category and vector representation so that we can perform semantic search later.<\/p>\n<div class=\"dm-code-snippet dark dm-normal-version default no-background-mobile\">\n<div class=\"control-language\">\n<div class=\"dm-buttons\">\n<div class=\"dm-buttons-left\">\n<div class=\"dm-button-snippet red-button\"><\/div>\n<div class=\"dm-button-snippet orange-button\"><\/div>\n<div class=\"dm-button-snippet green-button\"><\/div>\n<\/div>\n<div class=\"dm-buttons-right\"><a><span class=\"dm-copy-text\">Copy Code<\/span><span class=\"dm-copy-confirmed\">Copied<\/span><span class=\"dm-error-message\">Use a different Browser<\/span><\/a><\/div>\n<\/div>\n<pre class=\" no-line-numbers\"><code class=\" no-wrap language-php\">print(\"n[3\/10] Building HNSW index and running semantic search...\")\nconn.execute(\n   \"CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops) \"\n   \"WITH (m = 16, ef_construction = 64)\"\n)\nconn.execute(\"SET hnsw.ef_search = 100\")\ndef semantic_search(query: str, k: int = 4):\n   q = np.asarray(model.encode(query, normalize_embeddings=True))\n   return conn.execute(\n       \"SELECT content, category, embedding &lt;=&gt; %s AS distance \"\n       \"FROM documents ORDER BY distance LIMIT %s\",\n       (q, k),\n   ).fetchall()\nfor content, cat, dist in semantic_search(\"animals that are unusually quick\"):\n   print(f\"      {dist:.3f}  [{cat:&lt;10}] {content}\")\nprint(\"n[4\/10] Filtered search (only category = 'space')...\")\nq = np.asarray(model.encode(\"objects with extreme gravity\", normalize_embeddings=True))\nrows = conn.execute(\n   \"SELECT content, embedding &lt;=&gt; %s AS distance \"\n   \"FROM documents WHERE category = %s ORDER BY distance LIMIT 3\",\n   (q, \"space\"),\n).fetchall()\nfor content, dist in rows:\n   print(f\"      {dist:.3f}  {content}\")\nprint(\"n[5\/10] Same query under different distance metrics (top hit each)...\")\nq = np.asarray(model.encode(\"brewing a hot caffeinated drink\", normalize_embeddings=True))\nfor op, label in [(\"&lt;-&gt;\", \"L2\"), (\"&lt;=&gt;\", \"cosine\"), (\"&lt;#&gt;\", \"neg-inner\"), (\"&lt;+&gt;\", \"L1\")]:\n   content, score = conn.execute(\n       f\"SELECT content, embedding {op} %s AS s FROM documents ORDER BY s LIMIT 1\", (q,)\n   ).fetchone()\n   print(f\"      {label:&lt;10} {score:+.3f}  {content}\")<\/code><\/pre>\n<\/div>\n<\/div>\n<p class=\"wp-block-paragraph\">We build an HNSW index on the embedding column to enable faster, more efficient vector search. We define a semantic search function that converts a query into an embedding and retrieves the most similar documents using cosine similarity. We also perform metadata-filtered search and compare different pgvector distance operators such as L2, cosine, negative inner product, and L1.<\/p>\n<div class=\"dm-code-snippet dark dm-normal-version default no-background-mobile\">\n<div class=\"control-language\">\n<div class=\"dm-buttons\">\n<div class=\"dm-buttons-left\">\n<div class=\"dm-button-snippet red-button\"><\/div>\n<div class=\"dm-button-snippet orange-button\"><\/div>\n<div class=\"dm-button-snippet green-button\"><\/div>\n<\/div>\n<div class=\"dm-buttons-right\"><a><span class=\"dm-copy-text\">Copy Code<\/span><span class=\"dm-copy-confirmed\">Copied<\/span><span class=\"dm-error-message\">Use a different Browser<\/span><\/a><\/div>\n<\/div>\n<pre class=\" no-line-numbers\"><code class=\" no-wrap language-php\">print(\"n[6\/10] Half-precision storage with halfvec...\")\nconn.execute(f\"ALTER TABLE documents ADD COLUMN IF NOT EXISTS embedding_half halfvec({DIM})\")\nconn.execute(\"UPDATE documents SET embedding_half = embedding::halfvec\")\nconn.execute(\n   \"CREATE INDEX ON documents USING hnsw (embedding_half halfvec_cosine_ops)\"\n)\nq_half = HalfVector(model.encode(\"the galaxy we live in\", normalize_embeddings=True))\nrows = conn.execute(\n   \"SELECT content, embedding_half &lt;=&gt; %s AS d FROM documents ORDER BY d LIMIT 2\",\n   (q_half,),\n).fetchall()\nfor content, d in rows:\n   print(f\"      {d:.3f}  {content}\")\nprint(\"n[7\/10] Binary quantization (Hamming) + exact re-rank...\")\nconn.execute(\n   f\"CREATE INDEX ON documents \"\n   f\"USING hnsw ((binary_quantize(embedding)::bit({DIM})) bit_hamming_ops)\"\n)\nq = np.asarray(model.encode(\"parallel hardware for AI training\", normalize_embeddings=True))\nrerank_sql = f\"\"\"\n   SELECT content, candidates.embedding &lt;=&gt; %(q)s AS exact_distance\n   FROM (\n       SELECT content, embedding\n       FROM documents\n       ORDER BY binary_quantize(embedding)::bit({DIM})\n             &lt;~&gt; binary_quantize(%(q)s)::bit({DIM})\n       LIMIT 8\n   ) AS candidates\n   ORDER BY exact_distance\n   LIMIT 3\n\"\"\"\nfor content, d in conn.execute(rerank_sql, {\"q\": q}).fetchall():\n   print(f\"      {d:.3f}  {content}\")\nprint(\"n[8\/10] Native sparse vectors...\")\nconn.execute(\"DROP TABLE IF EXISTS sparse_items\")\nconn.execute(\"CREATE TABLE sparse_items (id bigserial PRIMARY KEY, embedding sparsevec(10))\")\nsparse_data = [\n   SparseVector({0: 1.0, 3: 2.0, 7: 1.5}, 10),\n   SparseVector({1: 0.5, 3: 1.0, 9: 3.0}, 10),\n   SparseVector({0: 0.2, 4: 2.5, 7: 0.8}, 10),\n]\nwith conn.cursor() as cur:\n   cur.executemany(\"INSERT INTO sparse_items (embedding) VALUES (%s)\",\n                   [(v,) for v in sparse_data])\nquery_sparse = SparseVector({0: 1.0, 7: 1.0}, 10)\nrows = conn.execute(\n   \"SELECT id, embedding, embedding &lt;#&gt; %s AS neg_ip \"\n   \"FROM sparse_items ORDER BY neg_ip LIMIT 3\",\n   (query_sparse,),\n).fetchall()\nfor _id, vec, neg_ip in rows:\n   print(f\"      id={_id}  inner_product={-neg_ip:.2f}  nnz_indices={vec.indices()}\")<\/code><\/pre>\n<\/div>\n<\/div>\n<p class=\"wp-block-paragraph\">We explore advanced pgvector storage and retrieval techniques beyond standard dense vectors. We convert embeddings into half-precision vectors to reduce storage, use binary quantization with Hamming search for fast candidate retrieval, and then re-rank results with full-precision vectors. We also create sparse vectors and query them using inner-product similarity, which is useful for keyword-weighted or SPLADE-style retrieval.<\/p>\n<div class=\"dm-code-snippet dark dm-normal-version default no-background-mobile\">\n<div class=\"control-language\">\n<div class=\"dm-buttons\">\n<div class=\"dm-buttons-left\">\n<div class=\"dm-button-snippet red-button\"><\/div>\n<div class=\"dm-button-snippet orange-button\"><\/div>\n<div class=\"dm-button-snippet green-button\"><\/div>\n<\/div>\n<div class=\"dm-buttons-right\"><a><span class=\"dm-copy-text\">Copy Code<\/span><span class=\"dm-copy-confirmed\">Copied<\/span><span class=\"dm-error-message\">Use a different Browser<\/span><\/a><\/div>\n<\/div>\n<pre class=\" no-line-numbers\"><code class=\" no-wrap language-php\">print(\"n[9\/10] Hybrid search (vector + full-text) via RRF...\")\nuser_query = \"fast animal\"\nqvec = np.asarray(model.encode(user_query, normalize_embeddings=True))\nhybrid_sql = \"\"\"\nWITH semantic AS (\n   SELECT id, RANK() OVER (ORDER BY embedding &lt;=&gt; %(qvec)s) AS rank\n   FROM documents\n   ORDER BY embedding &lt;=&gt; %(qvec)s\n   LIMIT 20\n),\nkeyword AS (\n   SELECT d.id,\n          RANK() OVER (ORDER BY ts_rank_cd(to_tsvector('english', d.content), q) DESC) AS rank\n   FROM documents d, plainto_tsquery('english', %(qtext)s) AS q\n   WHERE to_tsvector('english', d.content) @@ q\n   LIMIT 20\n)\nSELECT d.content,\n      COALESCE(1.0 \/ (60 + semantic.rank), 0.0)\n    + COALESCE(1.0 \/ (60 + keyword.rank),  0.0) AS rrf_score\nFROM documents d\nLEFT JOIN semantic ON d.id = semantic.id\nLEFT JOIN keyword  ON d.id = keyword.id\nWHERE semantic.id IS NOT NULL OR keyword.id IS NOT NULL\nORDER BY rrf_score DESC\nLIMIT 4\n\"\"\"\nfor content, score in conn.execute(hybrid_sql, {\"qvec\": qvec, \"qtext\": user_query}).fetchall():\n   print(f\"      {score:.5f}  {content}\")\nprint(\"n[10\/10] Aggregating vectors with AVG (category centroid)...\")\ncentroid = conn.execute(\n   \"SELECT AVG(embedding) FROM documents WHERE category = %s\", (\"food\",)\n).fetchone()[0]\ntypical = conn.execute(\n   \"SELECT content, embedding &lt;=&gt; %s AS d FROM documents \"\n   \"WHERE category = %s ORDER BY d LIMIT 1\",\n   (np.asarray(centroid), \"food\"),\n).fetchone()\nprint(f\"      Centroid dim = {len(centroid)}\")\nprint(f\"      Most representative 'food' doc: {typical[0]}\")\nprint(\"n<img decoding=\"async\" src=\"https:\/\/s.w.org\/images\/core\/emoji\/17.0.2\/72x72\/2705.png\" alt=\"\u2705\" class=\"wp-smiley\" \/> Done. You now have a working pgvector playground inside Colab.\")\nprint(\"   Try editing `corpus`, the queries, or swap in your own embedding model.\")<\/code><\/pre>\n<\/div>\n<\/div>\n<p class=\"wp-block-paragraph\">We combine semantic vector search with PostgreSQL full-text search using Reciprocal Rank Fusion. We retrieve results from both semantic and keyword rankings, merge their scores, and produce a stronger hybrid search output. Finally, we compute the average embedding for a category and use it as a centroid to find the most representative document in that group.<\/p>\n<p class=\"wp-block-paragraph\">In conclusion, we have a working pgvector-based retrieval system that runs entirely in Google Colab, without external services or API keys. We used PostgreSQL not just as a traditional relational database, but as a flexible vector search engine that supports dense vectors, half-precision vectors, binary-quantized retrieval, sparse vectors, full-text search, and aggregation. We also observed how metadata filtering, HNSW indexing, Reciprocal Rank Fusion, and centroid-based analysis make pgvector useful for real-world AI search pipelines.<\/p>\n<p class=\"wp-block-paragraph\">\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n<\/p><p class=\"wp-block-paragraph\">\n<\/p><p class=\"wp-block-paragraph\">Check out\u00a0the\u00a0<strong><a href=\"https:\/\/github.com\/Marktechpost\/AI-Agents-Projects-Tutorials\/blob\/main\/Databases\/pgvector_semantic_hybrid_search_marktechpost.py\" target=\"_blank\" rel=\"noreferrer noopener\">Full Codes with Notebook here<\/a>.\u00a0<\/strong>Also,\u00a0feel free to follow us on\u00a0<strong><a href=\"https:\/\/x.com\/intent\/follow?screen_name=marktechpost\" target=\"_blank\" rel=\"noreferrer noopener\"><mark>Twitter<\/mark><\/a><\/strong>\u00a0and don\u2019t forget to join our\u00a0<strong><a href=\"https:\/\/www.reddit.com\/r\/machinelearningnews\/\" target=\"_blank\" rel=\"noreferrer noopener\">150k+ ML SubReddit<\/a><\/strong>\u00a0and Subscribe to\u00a0<strong><a href=\"https:\/\/www.aidevsignals.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">our Newsletter<\/a><\/strong>. Wait! are you on telegram?\u00a0<strong><a href=\"https:\/\/t.me\/machinelearningresearchnews\" target=\"_blank\" rel=\"noreferrer noopener\">now you can join us on telegram as well.<\/a><\/strong><\/p>\n<p class=\"wp-block-paragraph\">Need to partner with us for promoting your GitHub Repo OR Hugging Face Page OR Product Release OR Webinar etc.?\u00a0<strong><a href=\"https:\/\/forms.gle\/wbash1wF6efRj8G58\" target=\"_blank\" rel=\"noreferrer noopener\"><mark>Connect with us<\/mark><\/a><\/strong><\/p>\n<p>The post <a href=\"https:\/\/www.marktechpost.com\/2026\/05\/28\/a-coding-guide-to-implement-a-pgvector-powered-semantic-hybrid-sparse-and-quantized-vector-search-system\/\">A Coding Guide to Implement a pgvector-Powered Semantic, Hybrid, Sparse, and Quantized Vector Search System<\/a> appeared first on <a href=\"https:\/\/www.marktechpost.com\/\">MarkTechPost<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>In this tutorial, we build a c&hellip;<\/p>\n","protected":false},"author":1,"featured_media":29,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-988","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts\/988","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=988"}],"version-history":[{"count":0,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts\/988\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/media\/29"}],"wp:attachment":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=988"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=988"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=988"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}