Anatomy of a SQL Engine
10 comments
· April 26, 2025
ignoreusernames
Abde-Notte
Second this - building even a simple engine gives real insight into query planning and execution. Once parsing is handled, the core ideas are a lot more approachable than they seem.
gopalv
This is a great write up about a pull-style volcano SQL engine.
The IR I've used is the Calcite implementation, and this looks conceptually adjacent enough that it makes sense on the first read.
> tmp2/test-branch> explain plan select count() from xy join uv on x = u;
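For readers new to the term, a pull-style (volcano) engine can be sketched with plain Python iterators: each operator pulls rows from its child only when its own consumer asks for one. This is a minimal illustration for the quoted query, not the article's implementation; the operator classes and sample rows are hypothetical.

```python
# Minimal volcano-style sketch: every operator is an iterable that pulls
# rows from its child on demand. All names and data here are hypothetical.

class Scan:
    """Leaf operator: yields rows from an in-memory table."""
    def __init__(self, rows):
        self.rows = rows

    def __iter__(self):
        for row in self.rows:
            yield dict(row)


class NestedLoopJoin:
    """Joins two children by re-scanning the right side for each left row."""
    def __init__(self, left, right, predicate):
        self.left = left
        self.right = right
        self.predicate = predicate

    def __iter__(self):
        for left_row in self.left:
            for right_row in self.right:
                merged = {**left_row, **right_row}
                if self.predicate(merged):
                    yield merged


class Count:
    """Aggregate operator: drains its child and emits a single row."""
    def __init__(self, child):
        self.child = child

    def __iter__(self):
        yield {"count": sum(1 for _ in self.child)}


xy = [{"x": 1, "y": 10}, {"x": 2, "y": 20}, {"x": 3, "y": 30}]
uv = [{"u": 2, "v": 200}, {"u": 3, "v": 300}, {"u": 3, "v": 301}]

# select count() from xy join uv on x = u
plan = Count(NestedLoopJoin(Scan(xy), Scan(uv), lambda row: row["x"] == row["u"]))
result = list(plan)
```

Nothing runs until `list(plan)` pulls from the root, which is the defining trait of the pull model. A real engine would choose a better join algorithm than the nested loop used here.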
One of the helpful things we did was to build a Graphviz dot export for the explain plans, which saved us days and years of work when trying to explain an optimization problem between the physical and logical layers.
My version would end up displayed as SVG like this
https://web.archive.org/web/20190724161156/http://people.apa...
But the Calcite logical plans also have a dot export mode.
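A dot export of this kind can be quite small. The sketch below walks a plan tree and emits Graphviz DOT text; the `PlanNode` class and `to_dot` function are hypothetical and are not taken from Calcite or from the exporter described above.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from itertools import count


@dataclass
class PlanNode:
    """Hypothetical plan node: a label plus child operators."""
    label: str
    children: list[PlanNode] = field(default_factory=list)


def to_dot(root: PlanNode) -> str:
    """Render a plan tree as Graphviz DOT, with edges from parent to child."""
    ids = count()
    lines = ["digraph plan {", "  node [shape=box];"]

    def visit(node: PlanNode) -> str:
        node_id = f"n{next(ids)}"
        escaped = node.label.replace('"', '\\"')  # keep quotes valid inside DOT labels
        lines.append(f'  {node_id} [label="{escaped}"];')
        for child in node.children:
            child_id = visit(child)
            lines.append(f"  {node_id} -> {child_id};")
        return node_id

    visit(root)
    lines.append("}")
    return "\n".join(lines)


plan = PlanNode(
    "Aggregate count()",
    [PlanNode("Join x = u", [PlanNode("Scan xy"), PlanNode("Scan uv")])],
)
dot = to_dot(plan)
```

Saving `dot` to a file and running Graphviz on it (for example `dot -Tsvg plan.dot`) produces an SVG rendering of the tree.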
gavinray
Calcite also has a relatively-unknown web tool for plan visualization that lets you step through execution.
It's a method from "RuleMatchVisualizer":
https://github.com/apache/calcite/blob/36f6dddd894b8b79edeb5...
Here's a screenshot of what the webpage looks like, for anyone curious:
https://github.com/GavinRay97/GraphQLCalcite/blob/92b18a850d...
th0ma5
This is really great!!
Austizzle
Man, this title tripped me up for a minute because I pronounce it with the letters like Ess-Queue-Ell
So the "A" in "A ess-queue-ell" engine felt like it should have been an "An" until I realized it was meant to be pronounced like "sequel"
kreetx
Many (most?) non-native English speakers do pronounce it as ess-queue-ell, especially in their own languages, so yes, the use of "a" instead of "an" does look off from that perspective.
perching_aix
Not necessarily, I see native speakers completely ignore this a lot.
Have you ever considered pronouncing it as squirrel by the way?
jimbokun
Very nice write up enumerating all the stages of SQL query execution. Interesting even if you don’t care about the Dolt database specifically.
I recommend that anyone who works with databases write a simple engine. It's a lot simpler than you may think and it's a great exercise. If using Python, sqlglot (https://github.com/tobymao/sqlglot) lets you skip all the parsing, and it even does some simple optimizations. From the parsed query tree it's pretty straightforward to build a logical plan and execute that. You can even use Python's builtin ast module to convert SQL expressions into Python ones (so no need for a custom interpreter!)
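The ast trick can be sketched with the standard library alone. The tuple-based expression tree below is a hypothetical stand-in for whatever a parser such as sqlglot would hand back, and the helper names are made up for illustration. The sketch also ignores SQL NULL semantics.

```python
import ast

# Hypothetical mini expression tree standing in for a parser's output:
#   ("col", name) | ("lit", value) | (operator, left, right)
COMPARE = {"=": ast.Eq, "<>": ast.NotEq, "<": ast.Lt, ">": ast.Gt}
ARITH = {"+": ast.Add, "-": ast.Sub, "*": ast.Mult}
BOOLEAN = {"and": ast.And, "or": ast.Or}


def to_python_ast(expr):
    """Translate the tuple tree into Python ast nodes (assumes Python 3.9+)."""
    kind = expr[0]
    if kind == "col":
        # A column reference becomes row["name"].
        return ast.Subscript(
            value=ast.Name(id="row", ctx=ast.Load()),
            slice=ast.Constant(value=expr[1]),
            ctx=ast.Load(),
        )
    if kind == "lit":
        return ast.Constant(value=expr[1])
    left, right = to_python_ast(expr[1]), to_python_ast(expr[2])
    if kind in COMPARE:
        return ast.Compare(left=left, ops=[COMPARE[kind]()], comparators=[right])
    if kind in ARITH:
        return ast.BinOp(left=left, op=ARITH[kind](), right=right)
    if kind in BOOLEAN:
        return ast.BoolOp(op=BOOLEAN[kind](), values=[left, right])
    raise ValueError(f"unsupported expression: {kind!r}")


def compile_predicate(expr):
    """Compile the expression once and return a function of a row dict."""
    tree = ast.Expression(body=to_python_ast(expr))
    ast.fix_missing_locations(tree)
    code = compile(tree, "<sql-expr>", "eval")
    return lambda row: eval(code, {}, {"row": row})


# x = u AND x > 1
predicate = compile_predicate(
    ("and", ("=", ("col", "x"), ("col", "u")), (">", ("col", "x"), ("lit", 1)))
)
matches = [row for row in [{"x": 2, "u": 2}, {"x": 1, "u": 1}] if predicate(row)]
```

Because the expression is compiled to Python bytecode once, evaluating it per row is just a function call, which is why no custom interpreter is needed.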