tl;dr: A discussion of the pros and cons of storing embedding data alongside the data it represents vs. in an external vector database. I’ll be following this post up shortly with a walkthrough of the boilerplate required to create and search embeddings in SQLite and Postgres.
I’ve been exploring RAG techniques and embeddings, and as part of that I’ve been checking out options for effective embedding generation, storage, and retrieval. As long as I’m able to perform vector searches, I don’t see the value in storing embedding data separately from the relational data it represents.
“Vector Databases: They are single-purpose DBMSs with indexes to accelerate nearest-neighbor search. RDBMSs should soon provide native support for these data structures and search methods using their extendable type system that will render such specialized databases unnecessary” – Stonebraker & Pavlo, 2024.
Do dedicated vector databases make sense?
Vector storage and search is, at this point, essentially commoditized: going forward, it’s not clear to me (or others) how dedicated vector databases can differentiate themselves from bog-standard relational databases with vector search enabled.
Storing vector embeddings with the data they represent is convenient and allows for succinct access to the results of vector-based searches. Some people are finding that keeping a separate vector DB in sync can be “painful at best, even for prototype applications.” Vector database providers are understandably keen to provide value, but given that both storing vectors and searching them are solved problems, there doesn’t appear to be much room in which they could make improvements.
So, even if adding vector storage to your existing database won’t work for you, making a secondary database your least-worst option, there’s no obvious reason not to consider Postgres with pgvector (or pgvectorscale) for that role.
Bonus: LLMs can speak SQL.
Many LLMs compose SQL well, which raises an interesting possibility: LLM agents that compose their own vector-based search queries. The use cases where this makes sense might be minimal at the moment, but it’s an interesting avenue nonetheless, and I want to play around with it.
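As a rough illustration of the idea (everything here is hypothetical: the `call_llm` helper and the `notes` schema are stand-ins, not any particular library or dataset), an agent could be handed the schema and asked to draft its own pgvector query:

```python
# Hypothetical sketch: an LLM drafts its own vector-search SQL.
# `call_llm` is a placeholder for whatever chat-completion client you use.

SCHEMA = """
CREATE TABLE notes (
    id bigserial PRIMARY KEY,
    body text,
    embedding vector(384)   -- pgvector column
);
"""

def draft_vector_query(question: str, call_llm) -> str:
    prompt = (
        "You can query Postgres with the pgvector extension. "
        "The <=> operator returns cosine distance.\n"
        f"Schema:\n{SCHEMA}\n"
        "Write a single SQL query that returns the 5 notes most relevant to: "
        f"{question!r}. Use the placeholder %(query_embedding)s for the "
        "embedding of the question. Return only SQL."
    )
    return call_llm(prompt)
```

The returned SQL would then be executed with the question’s embedding bound to the placeholder, so the model never has to see (or emit) raw vectors.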
The contenders
Currently, I am creating embeddings for a dataset, storing them in SQLite and Postgres, and connecting them to a local LLM via a couple of extensions. Given that SQLite and Postgres aren’t really competing databases, this isn’t going to be much of a comparison so much as a walkthrough of encoding and retrieving vector embeddings from SQL-compatible databases. The extensions in question are sqlite-vec and pgvector.
No love for MySQL
I couldn’t find an equivalent for MySQL, so if you are using MySQL, adding a secondary vector database appears to be your only straightforward option.
Of course, this isn’t likely to be the case for long: nearest-neighbour search is a solved problem, it just needs to be implemented for MySQL. There has also been at least one stab at building one: MySQLvss. At the moment, it doesn’t seem to be maintained (the last commit was six months ago), but perhaps it can provide a starting-off point should you decide to build your own MySQL nearest-neighbour search.
Fwiw, many cloud providers, including Oracle and Google, offer vector search functionality as part of their managed MySQL services.
SQLite: sqlite-vec, -lembed & -rembed
sqlite-vec is a new database extension I learned about during the AI Engineer World’s Fair keynote. It allows the storage and retrieval of vector embeddings and performs vector search over them. It seemed like something nice and shiny, yet practical, to add to my RAG toolset.
Enabling extensions in SQLite can be less than straightforward if you don’t have easy access to the SQLite C API. Enabling extensions via Python’s sqlite3 library requires a Python build with the relevant SQLite feature flags enabled: either recompiling Python, or just using the Python package from Homebrew, which comes with the feature enabled.
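To make that concrete, here’s a minimal sketch of loading sqlite-vec from Python and running a KNN query, based on my reading of the sqlite-vec docs. The table name and the four-dimensional toy vectors are made up for the example, and the exact KNN query syntax may differ between versions, so check the docs.

```python
import sqlite3
import sqlite_vec  # pip install sqlite-vec

db = sqlite3.connect(":memory:")
db.enable_load_extension(True)   # needs a Python/SQLite build that allows extensions
sqlite_vec.load(db)
db.enable_load_extension(False)

# A virtual table holding 4-dimensional float vectors (toy size for the example).
db.execute("CREATE VIRTUAL TABLE vec_items USING vec0(embedding float[4])")

items = [(1, [0.1, 0.1, 0.1, 0.1]), (2, [0.9, 0.8, 0.7, 0.6])]
for rowid, vec in items:
    db.execute(
        "INSERT INTO vec_items(rowid, embedding) VALUES (?, ?)",
        (rowid, sqlite_vec.serialize_float32(vec)),
    )

# Nearest neighbours to a query vector.
rows = db.execute(
    """
    SELECT rowid, distance
    FROM vec_items
    WHERE embedding MATCH ?
    ORDER BY distance
    LIMIT 2
    """,
    (sqlite_vec.serialize_float32([0.1, 0.1, 0.1, 0.1]),),
).fetchall()
print(rows)
```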
sqlite-vec comes with peer extensions (sqlite-lembed and sqlite-rembed) that allow the execution of embedding models locally, or against a model running on a server or third-party service. pgvector just provides the search function, so we’ll have to create the embeddings ourselves before we can use it. It’s a small convenience, but I can see how it would be useful for programmatic generation of embeddings (such as providing an agent semantic search over its interactions with the user and other behaviours).
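For the pgvector path, generating embeddings in application code is straightforward enough. A minimal sketch using the sentence-transformers library (the model name is just a common default, not a recommendation):

```python
from sentence_transformers import SentenceTransformer  # pip install sentence-transformers

# A small, commonly used embedding model; swap in whatever suits your data.
model = SentenceTransformer("all-MiniLM-L6-v2")

texts = [
    "SQLite is a small embedded database.",
    "Postgres is a client-server relational database.",
]

embeddings = model.encode(texts)  # numpy array, shape (2, 384)
print(embeddings.shape)
```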
Additionally, sqlite-vec can be built with WASM, so it can power AI running in the browser, or on embedded devices.
Postgres: pgvector, pgvectorscale
pgvector provides nearest-neighbour vector search for Postgres, so I expect it’s the one I’ll be reaching for more often. Additionally, it provides more sophisticated search options than sqlite-vec: while sqlite-vec only implements cosine similarity, pgvector also offers index types such as HNSW (Hierarchical Navigable Small World) and IVFFlat (Inverted File Flat), as well as other distance metrics such as Euclidean distance.
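A minimal sketch of what that looks like from Python with psycopg (the connection string, table name, and vector dimension are placeholders; the SQL follows the pgvector README):

```python
import psycopg  # pip install "psycopg[binary]"

# Placeholder connection string.
conn = psycopg.connect("dbname=rag_demo", autocommit=True)

conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
conn.execute("""
    CREATE TABLE IF NOT EXISTS items (
        id bigserial PRIMARY KEY,
        body text,
        embedding vector(384)
    )
""")
# Approximate-nearest-neighbour index using cosine distance.
conn.execute(
    "CREATE INDEX IF NOT EXISTS items_embedding_idx "
    "ON items USING hnsw (embedding vector_cosine_ops)"
)

def to_vector_literal(vec):
    """Format a Python list as a pgvector literal, e.g. '[0.1,0.2,...]'."""
    return "[" + ",".join(str(x) for x in vec) + "]"

embedding = [0.0] * 384  # stand-in for a real embedding
conn.execute(
    "INSERT INTO items (body, embedding) VALUES (%s, %s::vector)",
    ("some chunk of text", to_vector_literal(embedding)),
)

# <=> is pgvector's cosine-distance operator; smallest distance = most similar.
rows = conn.execute(
    "SELECT id, body FROM items ORDER BY embedding <=> %s::vector LIMIT 5",
    (to_vector_literal(embedding),),
).fetchall()
print(rows)
```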
pgvectorscale is a faster iteration on pgvector, intended for really large deployments where pgvector might hit performance or scale limitations. It also addresses scalability for large datasets, distributed indexing and querying, and handling billions of vectors efficiently. As I write this, it’s not clear to me whether there are circumstances where it makes sense to stick with plain pgvector at all; hopefully that will become clear as I dig into it.
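If I’m reading the pgvectorscale docs correctly, adopting it is mostly a matter of installing the extension and swapping the index type; the table and queries from the psycopg sketch above stay the same. A hedged sketch (the `vectorscale` extension name and `diskann` index method are my understanding of the project’s README, so verify before relying on them):

```python
# Hedged sketch: pgvectorscale builds on pgvector, so only the index changes.
conn.execute("CREATE EXTENSION IF NOT EXISTS vectorscale CASCADE")  # pulls in pgvector
conn.execute("""
    CREATE INDEX IF NOT EXISTS items_embedding_diskann_idx
    ON items USING diskann (embedding vector_cosine_ops)
""")
```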
Ok, so, these are the tools I’m currently playing with. Next post, I’m going to get into implementing them and working with them.
If you are a software engineer, no doubt you’ve seen some astounding new model or prompt posted to Twitter, or some shamelessly fraudulent product demo, and felt your blood run cold, your wizardly coding powers draining from your fingers.
The reality, however, is that Full Stack engineers are quite a bit closer to the modern AI engineering role today than they might think.
“In numbers, there’s probably going to be significantly more AI Engineers than there are ML engineers / LLM engineers. One can be quite successful in this role without ever training anything.” – Andrej Karpathy
AI engineering is no longer Deep Learning
The field of AI *used* to be an offshoot of Machine Learning (i.e. AI is what we used to refer to as Deep Learning), and at the bottom end of the stack, that is what it is. And certainly, if you want to build models from scratch, that’s what it takes. As little as a year ago, my feeling was that the way to approach AI was the one I had taken, fundamentals first: walk through the Fast.ai lesson series, get a working understanding of Machine Learning processes as they relate to Deep Learning, and build on that.
But the systems and abstractions over the Deep Learning layer are now so powerful (and complex) that building and using them takes an entirely different skill set. AI is being absorbed into actual engineering, and becoming an engineering field of its own. As such, what used to be the most direct and immediately applicable skills have changed. Moreover, as that has happened, the distance between “Full Stack” engineer and “AI” engineer has been shrinking over time.
Notice where the lines are being drawn here? It’s not between you and AI engineering. And AI is going to continue moving to the right on the line above: if the stack is entirely composed or managed by AI, a “Full Stack” engineer who can’t also implement and manage AI pipelines isn’t going to be all that “Full Stack” any more.
All in all, it’s just another tool in the tool box.
The Faustian bargain we made for a job creating shiny new toys was a never-ending supply of new shiny toys. And AI appears particularly Faustian, in that (for now) there’s a never-ending supply of new shiny AI tools, along with FOMO.
“We have no idea what large language models are going to be good at or bad at over any sense of time… The amount we don’t know because of how quickly this has developed is at an all-time high, that lets us experiment and have a sense of play, and do things and not know how the result is going to be, which is fun.” – Dan Becker @dan_s_becker
I’m working at Say Mosaic as Systems Architect of their flagship product: Smart Home in a Box. This has involved designing their cloud infrastructure, and helping improve their home-grown NLP/NLU AI systems. Today, we’re trending on Product Hunt.
I could get used to this trend of trending.
I say! I’m trending on GitHub.
7/12/2016
I released a repo on GitHub on Sunday. And it’s trending today!
All the way down there in the middle.
Every time I find myself feeling a little silly being buoyed by a small bunch of internet points, I remember all the repos I’ve cloned or required, and where they got me.
Is there a bubble?
1/29/2015
I don’t know – you tell me.
Yesterday, my wife met someone on a pool ride who explained he was working at a startup that displays pictures of the food in a restaurant, so you would choose which restaurant to go to depending on which pictures of food you liked. Yes – they are building Tinder for plates of food.
She asked him how they were going to take all the pictures, since most fancy restaurants are going to be switching out their menu once or twice a month.
Fireworks, filmed from a drone.
7/4/2014
Dear Alice and Ryan,
Congratulations on your new fondleslabs, and welcome to the 90% of humanity that now spend all their time staring at their phones. Since I am also too busy staring at my phone to talk to you, I thought that you might appreciate a list of the things I am usually staring at, so you can stare at them too.
Coming up: Schmoozr
5/14/2013
So, I’m working on a new app to go in my portfolio. It’s called Schmoozr (until I find out that name is taken).
Say you go to some event, meet someone there, and decide you want to exchange emails so you can continue the discussion later on. Instead of handing them your email or typing theirs into your phone, you open Schmoozr, and it has a simple form for name, email, and perhaps a note they might want to leave for you about your conversation.