{"id":396,"date":"2026-02-10T23:25:27","date_gmt":"2026-02-10T15:25:27","guid":{"rendered":"https:\/\/connectword.dpdns.org\/?p=396"},"modified":"2026-02-10T23:25:27","modified_gmt":"2026-02-10T15:25:27","slug":"alibaba-open-sources-zvec-an-embedded-vector-database-bringing-sqlite-like-simplicity-and-high-performance-on-device-rag-to-edge-applications","status":"publish","type":"post","link":"https:\/\/connectword.dpdns.org\/?p=396","title":{"rendered":"Alibaba Open-Sources Zvec: An Embedded Vector Database Bringing SQLite-like Simplicity and High-Performance On-Device RAG to Edge Applications"},"content":{"rendered":"<p>Alibaba Tongyi Lab research team has released \u2018Zvec\u2019, an open source, in-process vector database that targets edge and on-device retrieval workloads. It is positioned as \u2018the SQLite of vector databases\u2019 because it runs as a library inside your application and does not require any external service or daemon. It is designed for retrieval augmented generation (RAG), semantic search, and agent workloads that must run locally on laptops, mobile devices, or other constrained edge hardware.<\/p>\n<p>The core idea is simple. Many applications now need vector search and metadata filtering but do not want to run a separate vector database service. Traditional server style systems are heavy for desktop tools, mobile apps, or command line utilities. 
An embedded engine that behaves like SQLite but for embeddings fits this gap.<\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"2012\" height=\"828\" data-attachment-id=\"77830\" data-permalink=\"https:\/\/www.marktechpost.com\/2026\/02\/10\/alibaba-open-sources-zvec-an-embedded-vector-database-bringing-sqlite-like-simplicity-and-high-performance-on-device-rag-to-edge-applications\/screenshot-2026-02-10-at-7-16-14-am-2\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/02\/Screenshot-2026-02-10-at-7.16.14-AM-1.png\" data-orig-size=\"2012,828\" data-comments-opened=\"1\" data-image-meta='{\"aperture\":\"0\",\"credit\":\"\",\"camera\":\"\",\"caption\":\"\",\"created_timestamp\":\"0\",\"copyright\":\"\",\"focal_length\":\"0\",\"iso\":\"0\",\"shutter_speed\":\"0\",\"title\":\"\",\"orientation\":\"0\"}' data-image-title=\"Screenshot 2026-02-10 at 7.16.14\u202fAM\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/02\/Screenshot-2026-02-10-at-7.16.14-AM-1-300x123.png\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/02\/Screenshot-2026-02-10-at-7.16.14-AM-1-1024x421.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/02\/Screenshot-2026-02-10-at-7.16.14-AM-1.png\" alt=\"\" class=\"wp-image-77830\" \/><figcaption class=\"wp-element-caption\">https:\/\/zvec.org\/en\/blog\/introduction\/<\/figcaption><\/figure>\n<\/div>\n<h3 class=\"wp-block-heading\"><strong>Why embedded vector search matters for RAG<\/strong><\/h3>\n<p>RAG and semantic search pipelines need more than a bare index. They need vectors, scalar fields, full CRUD, and safe persistence. 
Local knowledge bases change as files, notes, and project states change.<\/p>\n<p>Index libraries such as Faiss provide approximate nearest neighbor search but do not handle scalar storage, crash recovery, or hybrid queries. You end up building your own storage and consistency layer. Embedded extensions such as DuckDB-VSS add vector search to DuckDB but expose fewer index and quantization options and weaker resource control for edge scenarios. Service based systems such as Milvus or managed vector clouds require network calls and separate deployment, which is often overkill for on-device tools.<\/p>\n<p>Zvec is aimed specifically at these local scenarios. It gives you a vector-native engine with persistence, resource governance, and RAG oriented features, packaged as a lightweight library.<\/p>\n<h3 class=\"wp-block-heading\"><strong>Core architecture: in-process and vector-native<\/strong><\/h3>\n<p>Zvec is implemented as an embedded library. You install it with <code>pip install zvec<\/code> and open collections directly in your Python process. There is no external server or RPC layer. You define schemas, insert documents, and run queries through the Python API.<\/p>\n<p>The engine is built on Proxima, Alibaba Group\u2019s high performance, production grade, battle tested vector search engine. Zvec wraps Proxima with a simpler API and an embedded runtime. 
The project is released under the Apache 2.0 license.<\/p>\n<p>Current support covers Python 3.10 to 3.12 on Linux x86_64, Linux ARM64, and macOS ARM64.<\/p>\n<p><strong>The design goals are explicit:<\/strong><\/p>\n<ul class=\"wp-block-list\">\n<li>Embedded execution in process<\/li>\n<li>Vector native indexing and storage<\/li>\n<li>Production ready persistence and crash safety<\/li>\n<\/ul>\n<p>This makes it suitable for edge devices, desktop applications, and zero-ops deployments.<\/p>\n<h3 class=\"wp-block-heading\"><strong>Developer workflow: from install to semantic search<\/strong><\/h3>\n<p>The quickstart documentation shows a short path from install to query.<\/p>\n<ol class=\"wp-block-list\">\n<li>Install the package:<br \/><code>pip install zvec<\/code><\/li>\n<li>Define a <code>CollectionSchema<\/code> with one or more vector fields and optional scalar fields.<\/li>\n<li>Call <code>create_and_open<\/code> to create or open the collection on disk.<\/li>\n<li>Insert <code>Doc<\/code> objects that contain an ID, vectors, and scalar attributes.<\/li>\n<li>Build an index and run a <code>VectorQuery<\/code> to retrieve nearest neighbors.<\/li>\n<\/ol>\n<p><strong>Example:<\/strong><\/p>\n<div class=\"dm-code-snippet dark dm-normal-version default no-background-mobile\">\n<div 
class=\"control-language\">\n<pre class=\" no-line-numbers\"><code class=\" no-wrap language-php\">import zvec\n\n# Define collection schema\nschema = zvec.CollectionSchema(\n    name=\"example\",\n    vectors=zvec.VectorSchema(\"embedding\", zvec.DataType.VECTOR_FP32, 4),\n)\n\n# Create collection\ncollection = zvec.create_and_open(path=\".\/zvec_example\", schema=schema)\n\n# Insert documents\ncollection.insert([\n    zvec.Doc(id=\"doc_1\", vectors={\"embedding\": [0.1, 0.2, 0.3, 0.4]}),\n    zvec.Doc(id=\"doc_2\", vectors={\"embedding\": [0.2, 0.3, 0.4, 0.1]}),\n])\n\n# Search by vector similarity\nresults = collection.query(\n    zvec.VectorQuery(\"embedding\", vector=[0.4, 0.3, 0.3, 0.1]),\n    topk=10\n)\n\n# Results: list of {'id': str, 'score': float, ...}, sorted by relevance\nprint(results)<\/code><\/pre>\n<\/div>\n<\/div>\n<p>Results come back as a list of dictionaries that include IDs and similarity scores, sorted by relevance. This is enough to build a local semantic search or RAG retrieval layer on top of any embedding model.<\/p>\n<h3 class=\"wp-block-heading\"><strong>Performance: VectorDBBench and 8,000+ QPS<\/strong><\/h3>\n<p>Zvec is optimized for high throughput and low latency on CPUs. 
It uses multithreading, cache friendly memory layouts, SIMD instructions, and CPU prefetching.<\/p>\n<p>In <a href=\"https:\/\/zilliz.com\/vdbbench-leaderboard?dataset=vectorSearch\" target=\"_blank\" rel=\"noreferrer noopener\">VectorDBBench<\/a> on the Cohere 10M dataset, with comparable hardware and matched recall, Zvec reports more than 8,000 QPS. This is more than 2\u00d7 the previous leaderboard #1, ZillizCloud, while also substantially reducing index build time in the same setup.<\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1620\" height=\"988\" data-attachment-id=\"77827\" data-permalink=\"https:\/\/www.marktechpost.com\/2026\/02\/10\/alibaba-open-sources-zvec-an-embedded-vector-database-bringing-sqlite-like-simplicity-and-high-performance-on-device-rag-to-edge-applications\/screenshot-2026-02-10-at-7-00-44-am-2\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/02\/Screenshot-2026-02-10-at-7.00.44-AM-1.png\" data-orig-size=\"1620,988\" data-comments-opened=\"1\" data-image-meta='{\"aperture\":\"0\",\"credit\":\"\",\"camera\":\"\",\"caption\":\"\",\"created_timestamp\":\"0\",\"copyright\":\"\",\"focal_length\":\"0\",\"iso\":\"0\",\"shutter_speed\":\"0\",\"title\":\"\",\"orientation\":\"0\"}' data-image-title=\"Screenshot 2026-02-10 at 7.00.44\u202fAM\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/02\/Screenshot-2026-02-10-at-7.00.44-AM-1-300x183.png\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/02\/Screenshot-2026-02-10-at-7.00.44-AM-1-1024x625.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/02\/Screenshot-2026-02-10-at-7.00.44-AM-1.png\" alt=\"\" class=\"wp-image-77827\" \/><figcaption 
class=\"wp-element-caption\">https:\/\/zvec.org\/en\/blog\/introduction\/<\/figcaption><\/figure>\n<\/div>\n<p>These metrics suggest that an embedded library can reach cloud level performance for high volume similarity search, as long as the workload resembles the benchmark conditions.<\/p>\n<h3 class=\"wp-block-heading\"><strong>RAG capabilities: CRUD, hybrid search, fusion, reranking<\/strong><\/h3>\n<p>The feature set is tuned for RAG and agentic retrieval.<\/p>\n<p><strong>Zvec supports:<\/strong><\/p>\n<ul class=\"wp-block-list\">\n<li>Full CRUD on documents so the local knowledge base can change over time.<\/li>\n<li>Schema evolution to adjust index strategies and fields.<\/li>\n<li>Multi vector retrieval for queries that combine several embedding channels.<\/li>\n<li>A built in reranker that supports weighted fusion and Reciprocal Rank Fusion (RRF).<\/li>\n<li>Scalar vector hybrid search that pushes scalar filters into the index execution path, with optional inverted indexes for scalar attributes.<\/li>\n<\/ul>\n<p>This allows you to build on device assistants that mix semantic retrieval, filters such as user, time, or type, and multiple embedding models, all within one embedded engine.<\/p>\n<h3 class=\"wp-block-heading\"><strong>Key Takeaways<\/strong><\/h3>\n<ul class=\"wp-block-list\">\n<li>Zvec is an embedded, in-process vector database positioned as the \u2018SQLite of vector databases\u2019 for on-device and edge RAG workloads.<\/li>\n<li>It is built on Proxima, Alibaba\u2019s high performance, production grade, battle tested vector search engine, and is released under Apache 2.0 with Python support on Linux x86_64, Linux ARM64, and macOS ARM64.<\/li>\n<li>Zvec reports &gt;8,000 QPS on VectorDBBench with the Cohere 10M dataset, more than 2\u00d7 the previous leaderboard #1 (ZillizCloud), while also reducing index build time.<\/li>\n<li>The engine provides explicit resource governance via 64 MB streaming writes, optional mmap mode, experimental 
<code>memory_limit_mb<\/code>, and configurable <code>concurrency<\/code>, <code>optimize_threads<\/code>, and <code>query_threads<\/code> for CPU control.<\/li>\n<li>Zvec is RAG ready with full CRUD, schema evolution, multi vector retrieval, built in reranking (weighted fusion and RRF), and scalar vector hybrid search with optional inverted indexes, plus an ecosystem roadmap targeting LangChain, LlamaIndex, DuckDB, PostgreSQL, and real device deployments.<\/li>\n<\/ul>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n<p>Check out the\u00a0<strong><a href=\"https:\/\/zvec.org\/en\/blog\/introduction\/\" target=\"_blank\" rel=\"noreferrer noopener\">Technical details<\/a> <\/strong>and<strong> <a href=\"https:\/\/github.com\/alibaba\/zvec\" target=\"_blank\" rel=\"noreferrer noopener\">Repo<\/a>.<\/strong><\/p>\n<p>The post <a href=\"https:\/\/www.marktechpost.com\/2026\/02\/10\/alibaba-open-sources-zvec-an-embedded-vector-database-bringing-sqlite-like-simplicity-and-high-performance-on-device-rag-to-edge-applications\/\">Alibaba Open-Sources Zvec: An Embedded Vector Database Bringing SQLite-like Simplicity and High-Performance On-Device RAG to Edge Applications<\/a> appeared first on <a href=\"https:\/\/www.marktechpost.com\/\">MarkTechPost<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>Alibaba Tongyi Lab research te&hellip;<\/p>\n","protected":false},"author":1,"featured_media":397,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-396","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts\/396","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=396"}],"version-history":[{"count":0,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts\/396\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/media\/397"}],"wp:attachment":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=39
6"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=396"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=396"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}