Scoped Vector Search with the MyVector Plugin for MySQL — Part III

From Concepts to Production: Real-World Patterns, Query Plans, and What’s Next

In Part I, we introduced scoped vector search in MySQL using the MyVector plugin, focusing on how semantic similarity and SQL filtering work together.

In Part II, we explored schema design, embedding strategies, HNSW indexing, hybrid queries, and tuning — and closed with a promise to show real-world usage and execution behavior.

This final part completes the series.


Semantic Search with Explicit Scope

In real systems, semantic search is almost never global. Results must be filtered by tenant, user, or domain before ranking by similarity.

SELECT id, title
FROM knowledge_base
WHERE tenant_id = 42
ORDER BY
myvector_distance(embedding, ?, 'COSINE')
LIMIT 10;

This follows the same pattern introduced earlier in the series:

  • SQL predicates define scope
  • Vector distance defines relevance
  • MySQL remains in control of execution
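
Because the scope predicate is an ordinary SQL filter, it can be backed by an ordinary secondary index. A minimal sketch, assuming the knowledge_base table from the query above does not yet have an index on tenant_id (the index name idx_kb_tenant is made up for illustration):

```sql
-- Hypothetical index name; tenant_id is the scope column from the query above.
CREATE INDEX idx_kb_tenant ON knowledge_base (tenant_id);
```

With such an index in place, MySQL can narrow the candidate rows to a single tenant before it evaluates myvector_distance for the ORDER BY.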

Real-Time Document Recall (Chunk-Based Retrieval)

Document-level embeddings are often too coarse. Most AI workflows retrieve chunks.

SELECT chunk_text
FROM document_chunks
WHERE document_id = ?
ORDER BY
myvector_distance(chunk_embedding, ?, 'L2')
LIMIT 6;

This query pattern is commonly used for:

  • Knowledge-base lookups
  • Assistant context retrieval
  • Pre-RAG recall stages

Chat Message Memory and Re-Ranking

Chronological chat history is rarely useful on its own. Semantic re-ranking allows systems to recall relevant prior messages.

SELECT message
FROM chat_history
WHERE session_id = ?
ORDER BY
myvector_distance(message_embedding, ?, 'COSINE')
LIMIT 8;

The result set can be fed directly into an LLM prompt as conversational memory.
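
If the prompt should still read as a conversation, the recalled messages can be put back into chronological order after the semantic cut. This is a sketch only: it assumes chat_history has a created_at timestamp column, which the original query does not show.

```sql
-- Inner query: pick the 8 most relevant messages for this session.
-- Outer query: restore chronological order for the prompt.
-- created_at is an assumed column, not taken from the original schema.
SELECT recalled.message
FROM (
  SELECT message, created_at
  FROM chat_history
  WHERE session_id = ?
  ORDER BY myvector_distance(message_embedding, ?, 'COSINE')
  LIMIT 8
) AS recalled
ORDER BY recalled.created_at;
```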


Using MyVector in RAG Pipelines

MyVector integrates naturally into Retrieval-Augmented Generation workflows by acting as the retrieval layer.

SELECT id, content
FROM documents
WHERE MYVECTOR_IS_ANN(
'mydb.documents.embedding',
'id',
?
)
LIMIT 12;

At this point:

  • Embeddings are generated externally
  • Retrieval happens inside MySQL
  • Generation happens downstream

No additional vector database is required.


Query Execution and Fallback Behavior

ANN Execution Path (HNSW Enabled)

Once an HNSW index is created and loaded, MySQL uses the ANN execution path provided by the plugin.
Candidate IDs are retrieved first, followed by row lookups.

This behavior is visible via EXPLAIN.


Brute-Force Fallback (No HNSW Index)

When no ANN index is available, MyVector falls back to deterministic KNN evaluation.

SELECT id
FROM documents
ORDER BY
myvector_distance(embedding, ?, 'L2')
LIMIT 20;

This results in a full scan and sort — slower, but correct and predictable.

Understanding this fallback is critical for production sizing and diagnostics.
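
One way to check which path a query takes is to run both query shapes from this post through EXPLAIN and compare the plans. The sketch below assumes a session variable @query_vec that already holds the query embedding in the same binary format as the embedding column. The exact plan text depends on the MySQL and MyVector versions, so no output is shown.

```sql
-- Brute-force shape: the plan should reflect the full scan and sort described above.
-- @query_vec is an assumed session variable holding the query embedding.
EXPLAIN FORMAT=TREE
SELECT id
FROM documents
ORDER BY myvector_distance(embedding, @query_vec, 'L2')
LIMIT 20;

-- ANN shape: same table, using the MYVECTOR_IS_ANN predicate from the RAG example.
EXPLAIN FORMAT=TREE
SELECT id, content
FROM documents
WHERE MYVECTOR_IS_ANN('mydb.documents.embedding', 'id', @query_vec)
LIMIT 12;
```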


Project Update: MyVector v1.26.1

The project continues to move quickly.

MyVector v1.26.1 is now available, introducing enhanced Docker support for:

  • MySQL 8.4 LTS
  • MySQL 9.0

This release significantly improves:


Stop Moving Data — Start Searching It Where It Lives

Across all three parts, the conclusion is consistent:

Vector search does not require a separate database.

With MyVector, you can:

  • Keep data in MySQL
  • Apply strict SQL scoping
  • Use ANN when available
  • Fall back safely when it isn’t

All with observable execution plans and predictable behavior.


Join the Community

Development happens in the open:

Feedback and contributions are welcome.


Next Up: Powering AI-Ready MySQL — When MyVector Meets ProxySQL

The next step is production architecture.

In the next post, we’ll explore:

  • Integrated MCP Server
  • Improved Full Text Search operations
  • Routing vector-heavy queries with ProxySQL
  • Isolating ANN workloads from OLTP traffic
  • Designing AI-ready MySQL deployments that scale safely

MyVector brings semantic search into MySQL.
ProxySQL helps it run at scale.

Stay tuned…

Fiddling with MySQL MCP Server During Holidays

I got the MySQL MCP Server up and running, and it was time to try it for the first time against a sample data set outside of its test suite.

Let’s start with Stack Overflow and the value of its data for modern LLMs and AI systems. As most of you already know, the number of questions posted in forums has dropped significantly since OpenAI launched ChatGPT.

Source: https://x.com/alojoh/status/2009366134513127462

This is probably true for traditional search engines like Google, although they’re already adopting built-in AI solutions. Other sources, such as Reddit, also provided valuable, quick, and easy information to the masses.

I wanted to spin up a subset of StackExchange questions, loaded into a local MySQL server, and put the MCP server to the test to find out what’s going on.

The results are not only interesting but also intriguing, because of what they say about the questions and answers on these forums. If these questions and answers don’t exist, how will LLMs be trained to give precise answers? They may otherwise rely on other, outdated sources and start hallucinating.


So, first, we spin up the latest version of mysql-mcp-server and configure it to work with Claude Desktop per the instructions in the documentation. Next, get the sample dataset from the StackExchange data dump site. You can ask Cursor, Gemini, or Claude to help set up this sample data in your local MySQL database server. Here’s the GitHub repository for this data set as well.

My prompt: Explore Stack Exchange database


🚁 Stack Exchange: Drones & Model Aircraft – Complete Analysis

📊 Community Overview

Site: Drones and Model Aircraft Stack Exchange
Period: April 2020 – March 2024 (4 years)
Status: 🚨 Declining but quality community

Metric                  Value
Total Questions         908
Total Answers           1,277
Total Users             5,251
Total Votes             9,983
Avg Answers/Question    1.41
Acceptance Rate         49.45%
Avg Question Views      757

I will not copy and paste the rest of the conversation. It’s a big, long one. However, the results are amazing on the subject of Drones.

As someone who has flown drones for almost a decade, I find the following questions seriously intriguing.

🔥 Most Viral Questions (5K+ Views)

Title                                              Views    Score  Answers
At what current should I charge a LiPo battery?    24,278   9      4
Why do brushless motors have 3 wires?              17,834   8      2
What do ESC beeps mean?                            16,444   5      2
How come DJI Fly App is NOT in Play Store?         15,826   12     4
How to configure servo in Betaflight?              12,052   5      2

Prompt: Why do multirotors usually have four propellers?

ACCEPTED ANSWER (27 upvotes) – by Kenn Sebesta

TL;DR: Four is the mathematical minimum for simple, robust control


The Physics: 4 Degrees of Freedom

For hovering aircraft, we need to control 4 degrees of freedom (DoF):

  1. Roll (tilt left/right)
  2. Pitch (tilt forward/backward)
  3. Yaw (rotate clockwise/counterclockwise)
  4. Thrust (up/down)

(We can control horizontal movement (x-y) by combining roll/pitch with thrust)

These are some samples. If you run it yourself, there are amazing insights into the data set. Not only that, but you can question the results, trends, and answers.

Going back to MySQL: as with any other database, the queries generated out of the box were poorly optimized. We dug in a bit further and discovered that most of the queries it ran for the above results were inefficient.

1. Row Count Query – 100x Faster ⚡

Problem: 9 separate full table scans
Solution: Use information_schema.TABLES metadata
Speedup: 2000ms → 20ms
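
A sketch of the metadata approach, assuming the data was loaded into a schema named stackexchange (the schema name is a placeholder). For InnoDB tables, TABLE_ROWS is an estimate rather than an exact count, which is usually acceptable for an overview like this.

```sql
-- One metadata lookup instead of one COUNT(*) scan per table.
-- 'stackexchange' is a placeholder schema name.
SELECT TABLE_NAME, TABLE_ROWS
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'stackexchange'
ORDER BY TABLE_ROWS DESC;
```

If exact numbers matter, COUNT(*) on the specific table is still the reliable option.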

2. Top Users with Posts – 50x Faster ⚡⚡

Problem: Joins ALL posts for ALL users before filtering
Solution: Filter to top 15 users FIRST, then join
Speedup: 800ms → 15ms
✅ Tested successfully – Works perfectly!
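
The rewrite can be sketched as a derived table that picks the users first. The source does not say how "top" is defined, so this example assumes ranking by reputation. It also assumes snake_case column names (display_name, reputation, owner_user_id) for the loaded dump, which may differ in your schema.

```sql
-- Step 1 (derived table): keep only the 15 highest-reputation users.
-- Step 2 (outer join): attach posts for those 15 users only.
-- Column names and the ranking column are assumptions.
SELECT u.id, u.display_name, u.reputation, COUNT(p.id) AS post_count
FROM (
  SELECT id, display_name, reputation
  FROM users
  ORDER BY reputation DESC
  LIMIT 15
) AS u
LEFT JOIN posts AS p ON p.owner_user_id = u.id
GROUP BY u.id, u.display_name, u.reputation
ORDER BY u.reputation DESC;
```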

3. Monthly Activity – 10x Faster ⚡

Problem: YEAR() and MONTH() functions prevent index usage
Solution: Use DATE_FORMAT() and index-friendly WHERE
Speedup: 500ms → 50ms
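
A sketch of the index-friendly form, assuming a posts table with an indexed creation_date column (both names are assumptions about the loaded schema). The WHERE clause compares the bare column against a range, so an index on creation_date can be used. DATE_FORMAT() appears only in the select list and grouping.

```sql
-- Range predicate on the raw column keeps the filter index-friendly.
-- DATE_FORMAT() is applied only to rows that passed the filter.
SELECT DATE_FORMAT(creation_date, '%Y-%m') AS activity_month,
       COUNT(*) AS posts_created
FROM posts
WHERE creation_date >= '2023-01-01'
  AND creation_date <  '2024-01-01'
GROUP BY activity_month
ORDER BY activity_month;
```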

4. Vote Percentages – 10x Faster ⚡

Problem: Subquery recalculates total count for each group
Solution: Calculate once with CTE or window function
Speedup: 400ms → 40ms
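
The window-function variant might look like the following on MySQL 8.0 or later. It assumes a votes table with a vote_type_id column. The grand total is computed once over the grouped result instead of through a repeated subquery.

```sql
-- SUM(COUNT(*)) OVER () is the total across all groups, computed once.
-- Table and column names are assumptions about the loaded schema.
SELECT vote_type_id,
       COUNT(*) AS votes,
       ROUND(100 * COUNT(*) / SUM(COUNT(*)) OVER (), 2) AS pct_of_total
FROM votes
GROUP BY vote_type_id
ORDER BY votes DESC;
```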

5. Answer Distribution – 3x Faster ⚡

Problem: CASE expression in GROUP BY
Solution: Group by integer bucket IDs, then label
Speedup: 300ms → 100ms
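
One possible shape for the bucket approach: group on a small integer first, then attach labels to the handful of resulting rows. The bucket boundaries and the column names answer_count and post_type_id are assumptions for illustration. In the Stack Exchange data dump, post type 1 denotes a question.

```sql
-- Inner query: group by an integer bucket (0, 1, 2, or 3 meaning "3 or more").
-- Outer query: label the few bucket rows afterwards.
SELECT CASE b.bucket_id
         WHEN 0 THEN '0 answers'
         WHEN 1 THEN '1 answer'
         WHEN 2 THEN '2 answers'
         ELSE '3+ answers'
       END AS answer_bucket,
       b.questions
FROM (
  SELECT LEAST(answer_count, 3) AS bucket_id,
         COUNT(*) AS questions
  FROM posts
  WHERE post_type_id = 1
  GROUP BY bucket_id
) AS b
ORDER BY b.bucket_id;
```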

This led me to create an optimization guide for the Claude agent.

In conclusion, the MCP servers are great resources to explore data sets. With some experimentation and guidance, they can reveal highly valuable analytics use cases. These include marketing and sales data that would normally take too much time and material to cover.

Next up is token usage. If you are also wondering, “Where have all my tokens gone using these AI tools?” I have some thoughts on that topic, too.

2025 Rewind and Thank You

I’m grateful to all my professional and personal networks for this year. It has been full of tears, sweat, and blood all over my face once again. Let’s not worry about that. I want to start with a big Thank You to all of you who made this year possible.

Here is what stood out in 2025, just before we hit 2026.

Oracle ACE Pro 

I was thrilled to be nominated to the Oracle ACE Program as an ACE Pro in April. This recognition opened doors to launch a technical blog series on vector search and AI integration with MySQL.

Project Antalya at Altinity, Inc. 

We announced native Iceberg catalog and Parquet support on S3 for ClickHouse. This pushes the boundaries of what’s possible with open lakehouse analytics.

MySQL MCP Server 

Introduced a lightweight, secure MySQL MCP server bridging relational databases and LLMs. Practical AI integration starts with safety and observability.

FOSDEM & MySQL’s 30th Birthday 

I have one of my busiest agendas in ten years. It includes the MySQL Devroom Committee, a talk, and an O’Reilly book signing for #mysqlcookbook4e. Additionally, there are 6 talks from Altinity.

O’Reilly Recognition 

After 50+ hours of flights for conferences, I came home to O’Reilly’s all-time recognition for the MySQL Cookbook. It was a moment I won’t forget.

Sailing While Working 

Once again, months at sea with salt, humidity, and wind were challenging. We handled tickets, RCAs, and meetings. We even recorded a podcast on ferry maneuvering. Born to sail, forced to work, making it work anyway.

I am immensely grateful to the #MySQL, #ClickHouse, and #opensource communities. Thank you to my co-authors Sveta Smirnova and Ibrar Ahmed. I also thank my nominator, Vinicius Grippa. I appreciate the Altinity team and every conference organizer who gave me a stage this year.

Recognition is an invitation to contribute more, not a finish line. Looking forward to more open-source collaboration in 2026.

If you’re passionate about open-source databases, MySQL, ClickHouse, or AI integration, or just want to connect, reach out.

#opensource #mysql #clickhouse #oracleacepro #ai #vectorsearch #sailing #LinkedInRewind #Coauthor #2025wrapped

Introducing Lightweight MySQL MCP Server: Secure AI Database Access


A lightweight, secure, and extensible MCP (Model Context Protocol) server for MySQL designed to bridge the gap between relational databases and large language models (LLMs).

I’m releasing a new open-source project: mysql-mcp-server, a lightweight server that connects MySQL to AI tools via the Model Context Protocol (MCP). It’s designed to make MySQL safely accessible to language models: structured, read-only, and fully auditable.

This project started out of a practical need: as LLMs become part of everyday development workflows, there’s growing interest in using them to explore database schemas, write queries, or inspect real data. But exposing production databases directly to AI tools is a risk, especially without guardrails.

mysql-mcp-server offers a simple, secure solution. It provides a minimal but powerful MCP server that speaks directly to MySQL, while enforcing safety, observability, and structure.

What it does

mysql-mcp-server allows tools that speak MCP, such as Claude Desktop, to interact with MySQL in a controlled, read-only environment. It currently supports:

  • Listing databases, tables, and columns
  • Describing table schemas
  • Running parameterized SELECT queries with row limits
  • Introspecting indexes, views, triggers (optional tools)
  • Handling multiple connections through DSNs
  • Optional vector search support if using MyVector
  • Running as either a local MCP-compatible binary or a remote REST API server

By default, it rejects any unsafe operations such as INSERT, UPDATE, or DROP. The goal is to make the server safe enough to be used locally or in shared environments without unintended side effects.
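
The server-side checks can be complemented at the database layer by pointing the DSN at an account that only holds SELECT. This is a general MySQL hardening sketch, not something the project is documented to require. The account name, host pattern, password, and schema name are placeholders.

```sql
-- Hypothetical read-only account for the MCP server's DSN.
-- Replace the name, host, password, and schema with your own values.
CREATE USER 'mcp_reader'@'%' IDENTIFIED BY 'change-me';
GRANT SELECT ON mydb.* TO 'mcp_reader'@'%';
```

With this in place, a write statement that slipped past the server would still be refused by MySQL itself.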

Why this matters

As more developers, analysts, and teams adopt LLMs for querying and documentation, there’s a gap between conversational interfaces and real database systems. Model Context Protocol helps bridge that gap by defining a set of safe, predictable tools that LLMs can use.

mysql-mcp-server brings that model to MySQL in a way that respects production safety while enabling exploration, inspection, and prototyping. It’s helpful in local development, devops workflows, support diagnostics, and even hybrid RAG scenarios when paired with a vector index.

Getting started

You can run it with Docker:

docker run -e MYSQL_DSN='user:pass@tcp(mysql-host:3306)/' \
  -p 7788:7788 ghcr.io/askdba/mysql-mcp-server:latest

Or install via Homebrew:

brew install askdba/tap/mysql-mcp-server
mysql-mcp-server

Once running, you can connect any MCP-compatible client (like Claude Desktop) to the server and begin issuing structured queries.

Use cases

  • Developers inspecting unfamiliar databases during onboarding
  • Data teams writing and validating SQL queries with AI assistance
  • Local RAG applications using MySQL and vector search with MyVector
  • Support and SRE teams needing read-only access for troubleshooting

Roadmap and contributions

This is an early release and still evolving. Planned additions include:

  • More granular introspection tools (e.g., constraints, stored procedures)
  • Connection pooling and config profiles
  • Structured logging and tracing
  • More examples for integrating with LLM environments

If you’re working on anything related to MySQL, open-source AI tooling, or database accessibility, I’d be glad to collaborate.

Learn more

If you have feedback, ideas, or want to contribute, the project is open and active. Pull requests, bug reports, and discussions are all welcome.

Why Should Successful People Be Mentored?


Not knowing whether you need a mentor or not at the beginning.

People often assume that mentorship is something you outgrow. They imagine it as a ladder. It’s something you climb early in your career. Once you reach a certain level of success, you step off and stand on your own.

In reality, the opposite is true. The higher you go, the more critical mentorship becomes. I’ve learned this repeatedly throughout my career in open source, in leadership, and in life.


Success Doesn’t Eliminate Blind Spots

When you achieve success, you start to hear less honest feedback. People around you become careful with their words. Colleagues hesitate to challenge your ideas. Slowly, your perspective narrows. This happens not out of arrogance, but because it’s hard to see what no one reflects back at you.

That’s where a mentor makes all the difference. A good mentor isn’t impressed by your title or your achievements. They see you as the person behind the professional identity. They’ll challenge your assumptions and remind you that growth never stops, no matter how far you’ve come.


Learning Never Ends

The world around us moves too fast for anyone to claim mastery. Technologies evolve, leadership philosophies change, and the definition of success itself shifts over time.

Mentorship keeps you learning. It introduces you to new ways of thinking, new perspectives, and new generations. It forces you to stay curious, and curiosity is what keeps leaders relevant.

In my years working with global database communities, I’ve seen brilliant engineers become stagnant simply because they stopped seeking input. The best ones? They’re still asking questions, still open to being mentored.


Every Step Forward Is New Territory

No matter how experienced you are, every stage of career growth is unexplored terrain. Each new role, responsibility, or challenge introduces conditions you’ve never faced. There are new dynamics, new expectations, and sometimes, new vulnerabilities.

Mentors are the ones who’ve walked those paths already. They know where the turns are, where you might stumble, and how to prepare for what’s coming next. They help you see beyond the horizon of your current comfort zone.

That foresight, the ability to anticipate the next chapter of your journey, is one of the most valuable gifts mentorship offers. It opens your mind to possibilities you have never considered. It helps you approach the unknown with clarity rather than fear.


The Lonely Space at the Top

Leadership is often described as empowering, and it is, but it’s also lonely. You carry responsibilities that few others truly understand. You can’t always be vulnerable with your team or share the full weight of the decisions you make.

Having a mentor gives you a space to breathe. Someone who listens without judging, who helps you find balance when everything feels heavy. Sometimes, mentorship isn’t about advice at all. It’s about presence and perspective. It’s about being reminded that you’re not alone in figuring it out.


From Achievement to Legacy

There’s a point where success stops being about how much you achieve and starts being about what you enable. Mentorship helps you make that shift.

It turns experience into impact. It teaches you how to guide others. It shows you how to pass on lessons without ego. You learn how to translate hard-earned wisdom into something that outlives your career. Every time I’ve been mentored, I’ve become a better mentor myself. I think that is the real cycle of growth.


The Real Value of Mentorship

I’ve come to see mentorship not as a career stage, but as a lifelong relationship with learning. It keeps you honest. It keeps you grounded. And it ensures that success doesn’t harden into comfort.

If anything, mentorship is a mirror. It helps you stay true to your principles. It also connects you to your evolution and your humanity.

No matter how much experience I gain, I’ll always seek mentors. Because the moment I stop learning from others is the moment I stop growing.

This is why I still believe in mentorship even after a successful career. In conclusion, have I had a dedicated mentor in my career? The short answer is no, but I’ve had role models along the way. I’ve used them as my mentors and always asked them what I would do if I were them.


Book Recommendation: https://a.co/d/hc6f6le