Monthly Archives: May 2025

Scoped Vector Search with the MyVector Plugin for MySQL – Part I


Semantic Search with SQL Simplicity and Operational Control

Introduction

Vector search is redefining how we work with unstructured and semantic data. Until recently, integrating it into traditional relational databases like MySQL required external services, extra infrastructure, or awkward workarounds. That changes with the MyVector plugin — a native vector indexing and search extension purpose-built for MySQL.

Whether you’re enhancing search for user-generated content, improving recommendation systems, or building AI-driven assistants, MyVector makes it possible to store, index, and search vector embeddings directly inside MySQL — with full support for SQL syntax, indexing, and filtering.

What Is MyVector?

The MyVector plugin adds native support for vector data types and approximate nearest neighbor (ANN) indexes in MySQL. It allows you to:

  • Define VECTOR(n) columns to store dense embeddings (e.g., 384-dimensional vectors from a compact sentence-embedding model such as all-MiniLM-L6-v2)
  • Index them using INDEX(column) VECTOR, which builds an HNSW-based structure
  • Run fast semantic queries using distance functions like L2_DISTANCE, COSINE_DISTANCE, and INNER_PRODUCT
  • Use full SQL syntax to filter, join, and paginate vector results alongside traditional columns
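To make the three distance functions named above concrete, here is a minimal Python sketch of the math they conventionally compute. It is a reference for the definitions only, not MyVector's implementation. The Python function names simply mirror the SQL names, and whether the plugin's L2_DISTANCE returns the plain or the squared Euclidean distance is an assumption to check against its documentation.

```python
import math
from typing import Sequence


def _check_same_dim(a: Sequence[float], b: Sequence[float]) -> None:
    # Vectors stored in a VECTOR(n) column all share the same dimension n.
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product: larger means more similar (for comparable norms)."""
    _check_same_dim(a, b)
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance: smaller means more similar."""
    _check_same_dim(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity: 0 for same direction, 1 for orthogonal."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)
```

Cosine distance ignores vector length, so it is the usual choice when embeddings are not normalized. For unit-length embeddings, all three measures rank neighbors in the same order.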

By leveraging HNSW, MyVector can answer ANN queries in milliseconds, even on tables with millions of rows, all from within MySQL.

Most importantly, it integrates directly into your existing MySQL setup: there is no new stack, no sync jobs, and no third-party dependencies.

Scoped Vector Search: The Real-World Requirement

In most production applications, you rarely want to search across all data. You need to scope vector comparisons to a subset — a single user’s data, a tenant’s records, or a relevant tag.

MyVector makes this easy by combining vector operations with standard SQL filters.
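The idea of scoping can be sketched in a few lines of Python: filter rows by an ordinary column first, then rank only the survivors by vector distance. This is an exact, brute-force illustration of the semantics, not how MyVector executes a query. The `Row` fields (`id`, `user_id`, `embedding`) and the sample data are hypothetical.

```python
import heapq
import math
from dataclasses import dataclass
from typing import Iterable, List, Sequence, Tuple


@dataclass
class Row:
    # Hypothetical table layout: a key, a scoping column, and an embedding.
    id: int
    user_id: int
    embedding: Sequence[float]


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def scoped_search(
    rows: Iterable[Row], query: Sequence[float], user_id: int, k: int
) -> List[Tuple[float, int]]:
    """Return up to k (distance, row id) pairs, nearest first, for one user."""
    # Step 1: the scalar filter, analogous to a WHERE clause on user_id.
    in_scope = (r for r in rows if r.user_id == user_id)
    # Step 2: compare vectors only for rows that passed the filter.
    scored = ((l2_distance(r.embedding, query), r.id) for r in in_scope)
    # Step 3: keep the k smallest distances, analogous to ORDER BY ... LIMIT k.
    return heapq.nsmallest(k, scored)


rows = [
    Row(1, 42, [0.0, 0.0]),
    Row(2, 42, [1.0, 1.0]),
    Row(3, 7, [0.1, 0.1]),  # closest overall, but owned by another user
    Row(4, 42, [5.0, 5.0]),
]
top_two = scoped_search(rows, query=[0.1, 0.1], user_id=42, k=2)
```

Row 3 is the nearest vector to the query overall, yet it never appears in `top_two` because it falls outside the scope. That is the behavior a tenant- or user-scoped search needs.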

Under the Hood: HNSW and Query Performance

MyVector uses the HNSW algorithm for vector indexing. HNSW constructs a multi-layered proximity graph that enables extremely fast approximate nearest neighbor search with high recall. Key properties:

  • Fast ANN queries without external services
  • Scoped filtering before vector comparison
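  • Layered traversal: a search starts in the sparse top layer and descends, so the number of hops grows roughly logarithmically with the number of vectors
  • Dynamic index support: you can insert, update, and delete vectors, and reindex as needed
  • Configurable parameters such as M (graph connectivity) and ef_search (size of the candidate list explored per query) let you trade speed against accuracy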
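To show what ef_search controls, the following Python sketch runs a best-first search over a single proximity graph, which is roughly what HNSW does within one layer. It is a simplified teaching model under stated assumptions, not MyVector's code: the graph is built by brute force, there is only one layer, and the names `build_knn_graph` and `search_layer` are made up for this example.

```python
import heapq
from typing import Dict, List, Sequence, Set

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_knn_graph(points: List[Vector], m: int) -> Dict[int, Set[int]]:
    """Link each point to its m nearest neighbours (undirected, brute force)."""
    graph: Dict[int, Set[int]] = {i: set() for i in range(len(points))}
    for i, p in enumerate(points):
        others = (j for j in range(len(points)) if j != i)
        for j in sorted(others, key=lambda j: sq_dist(p, points[j]))[:m]:
            graph[i].add(j)
            graph[j].add(i)
    return graph


def search_layer(points: List[Vector], graph: Dict[int, Set[int]],
                 query: Vector, entry: int, ef: int, k: int) -> List[int]:
    """Best-first search that keeps at most ef results; returns the k nearest."""
    d0 = sq_dist(points[entry], query)
    visited = {entry}
    candidates = [(d0, entry)]   # min-heap: closest unexplored node first
    best = [(-d0, entry)]        # max-heap via negation: the ef best so far
    while candidates:
        d, node = heapq.heappop(candidates)
        if d > -best[0][0]:
            break  # nearest candidate is worse than the worst kept result
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            dn = sq_dist(points[nb], query)
            if len(best) < ef or dn < -best[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(best, (-dn, nb))
                if len(best) > ef:
                    heapq.heappop(best)  # drop the current worst result
    ranked = sorted((-neg_d, n) for neg_d, n in best)
    return [n for _, n in ranked][:k]
```

A larger `ef` keeps more candidates alive, so the search explores more of the graph. That costs time but lowers the chance of stopping in a local minimum, which is the speed-versus-accuracy trade-off described above. The `m` argument plays the role of M: more links per node give the search more routes at the cost of a larger index.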

What’s Next

This post introduces the foundational concept of scoped vector search using MyVector and HNSW. In Part II, we’ll walk through practical schema design patterns, embedding workflows, and hybrid search strategies that combine traditional full-text matching with deep semantic understanding — using nothing but SQL.