Building AI agents to query your databases
18 comments · March 14, 2025
bob1029
> This abstraction shields users from the complexity of the underlying systems and allows us to add new data sources without changing the user experience.
Cursed mission. These sorts of things do work amazingly well for toy problem domains. But once you get into more complex business logic involving 4-way+ joins, things go sideways fast.
I think it might be possible to have a human in the loop during the SQL authoring phase, but there's no way you can do it cleanly without outside interaction in all cases.
95% correct might sound amazing at first, but it might as well be 0% in practice. You need to be perfectly correct when working with data in bulk via SQL.
mritchie712
Using a semantic layer is the cleanest way to have a human in the loop. A human can validate and create all important metrics (e.g. what does "monthly active users" really mean) then an LLM can use that metric definition whenever asked for MAU.
With a semantic layer, you get the added benefit of writing queries in JSON instead of raw SQL. LLMs are much more consistent at writing a small JSON object vs. hundreds of lines of SQL.
We[0] use cube[1] for this. It's the best open source semantic layer, but there are a couple of closed-source options too.
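To make the shape concrete, a semantic-layer query ends up being a small JSON object instead of SQL. Here's a rough sketch against Cube's REST API; the measure/dimension names, host, and token are all invented for illustration:

```python
import json
import requests

# The metric a human has already defined and validated in the semantic
# layer; the LLM only has to emit this small JSON object, not SQL.
# All names here are hypothetical.
query = {
    "measures": ["Users.monthlyActiveUsers"],
    "timeDimensions": [{
        "dimension": "Users.createdAt",
        "granularity": "month",
        "dateRange": "last 6 months",
    }],
}

resp = requests.get(
    "https://cube.example.com/cubejs-api/v1/load",  # Cube REST load endpoint
    params={"query": json.dumps(query)},
    headers={"Authorization": "<CUBE_API_TOKEN>"},  # placeholder token
)
print(resp.json()["data"])
```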
ozim
A bit of a joke here:
It solves 100% of the cases where some manager requests a dashboard, never to look at it again after one day.
aargh_aargh
There's no smoke without fire.
spolu
Hi, you are right that things can go sideways fast. In practice, though, the data that the typical employee needs is quite simple. So there is definitely a very nice fit for this kind of product across a large number of use-cases, where we see it provide a lot of value internally for employees (self-serve access to data) and data scientists (reducing their load).
For complex queries/use-cases, we instead push our users to create agents that assist them in shaping SQL directly, rather than going straight from text to result/graphs. That pushes them to think more about correctness while still saving them a ton of time (the agent has access to the table schemas etc.), but it's not a good fit for non-technical people of course.
abirch
This works well until it doesn't, and only as long as there is someone responsible for data correctness. E.g. the cardinality between two joined tables should be maintained by a constraint, not by coincidence: there's currently no one in the system with two locations in the employee_to_location table, so the query works right now. Once that happens, the query will return the wrong employee count.
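As a concrete sketch of that failure mode (in-memory DuckDB, invented tables):

```python
import duckdb

con = duckdb.connect()  # in-memory database
con.execute("CREATE TABLE employees (id INT, name TEXT)")
con.execute("CREATE TABLE employee_to_location (employee_id INT, location TEXT)")
con.execute("INSERT INTO employees VALUES (1, 'Ada'), (2, 'Grace')")
# Grace just got a second location -- the join cardinality silently changed.
con.execute("INSERT INTO employee_to_location VALUES (1, 'NYC'), (2, 'SF'), (2, 'London')")

# Naive join counts rows, not employees: returns 3.
print(con.sql("""
    SELECT COUNT(*) FROM employees e
    JOIN employee_to_location l ON l.employee_id = e.id
""").fetchone())

# Guarded version survives the cardinality change: returns 2.
print(con.sql("""
    SELECT COUNT(DISTINCT e.id) FROM employees e
    JOIN employee_to_location l ON l.employee_id = e.id
""").fetchone())
```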
ritz_labringue
It does require writing good instructions for the LLM to properly use the tables, and it works best if you carefully pick beforehand the tables your agent is allowed to use. We have many users who use it for everyday work with real data (definitely not toy problems).
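For a sense of what "good instructions" means here, a hypothetical example (table names, columns, and rules are all invented for illustration):

```python
# Per-table instructions handed to the agent alongside the schema.
# Everything below is a made-up example, not a real schema.
AGENT_TABLE_INSTRUCTIONS = """
You may query ONLY these tables:

- orders(id, user_id, total_cents, created_at)
    * total_cents is in cents; divide by 100.0 for dollars.
    * created_at is UTC.
- users(id, country, created_at)
    * One user can have many orders; join via orders.user_id = users.id.

Never SELECT *; list columns explicitly. Always include a LIMIT.
"""
```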
spolu
Yes, you are perfectly right. Our product pushes users to be selective about the tables they give an agent access to for a given use-case :+1:
The tricky part is correctly supporting multiple systems, each with its own specificities. All the way to Salesforce, which is an entirely different beast in terms of query language. We're working on it right now and will likely follow up with a blog post :+1:
iLoveOncall
If only we had a language to accurately describe what we want to retrieve from the database! Alas, one can only dream!
troupo
> It does require writing good instructions for the LLM to properly use the tables
--- start quote ---
prompt engineering is nothing but an attempt to reverse-engineer a non-deterministic black box for which any of the parameters below are unknown:
- training set
- weights
- constraints on the model
- layers between you and the model that transform both your input and the model's output that can change at any time
- availability of compute for your specific query
- and definitely some more details I haven't thought of
https://dmitriid.com/prompting-llms-is-not-engineering
--- end quote ---
gavinray
Interesting -- I work on a similar tool [1], and the JSON IR representation for your query is similar to the internal IR we used for our data connectors.
mritchie712
We (https://www.definite.app/) solved this a bit differently.
We spin up a data lake and pipelines (we support 500+ integrations / connectors) to populate the data lake for you, then put DuckDB on top as a single query engine to access all your data.
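To sketch what "DuckDB as a single query engine" looks like in practice, assuming the pipelines have landed Parquet files in the lake (paths and column names are invented for illustration):

```python
import duckdb

con = duckdb.connect()
# One SQL dialect over files that different pipelines wrote into the lake.
# The file paths and columns below are hypothetical.
con.sql("""
    SELECT c.channel, COUNT(DISTINCT o.user_id) AS converted_users
    FROM read_parquet('lake/salesforce/opportunities/*.parquet') AS o
    JOIN read_parquet('lake/ads/campaigns/*.parquet') AS c
      ON c.campaign_id = o.campaign_id
    WHERE o.closed_at >= DATE '2025-01-01'
    GROUP BY c.channel
""").show()
```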
yonl
This is really interesting. At my previous company, I built a data lakehouse for operational reporting with recency prioritization (query only recent data, archive the rest). While there was no LLM integration when I left, I've learned from former colleagues that they've since added a lightweight LLM layer on top (though I suspect Dust's implementation is more comprehensive).
Our main requirement was querying recent operational data across daily/weekly/monthly/quarterly timeframes. The data sources included OLTP binlogs, OLAP views, SFDC, and about 15 other marketing platforms. We implemented a datalake with our own query and archival layers. This approach worked well for queries like "conversion rate per channel this quarter" where we needed broad data coverage (all 17 integrations) but manageable depth (a reasonable number of rows scanned).
This architecture also enabled quick solutions for additional use cases, like on-the-fly SFDC data enrichment that our analytics team could handle independently. Later, I learned the team integrated LLMs as they began dumping OLAP views inside the datalake for different query types, and eventually replaced our original query layer with DuckDB.
I believe approaches like these (what I had done as an in-house solution and what Definite may be doing more extensively) are data- and query-pattern-focused first. While it might initially seem like overkill, this approach can withstand organizational complexity challenges, with LLMs serving primarily as an interpretation layer. From skimming the Dust blog, their approach is refreshing, though it seems their product was built primarily for LLM integration rather than focusing first on data management and scale. They likely have internal mechanisms to handle various use cases that weren't detailed in the blog.
lennythedev
Just messed around with the concept last week. With a good enough schema explanation, bigger reasoning models did an amazing job. Definitely something I'm going to use in my dataflows.
LaGrange
I mean, this is generally repulsive, but please, I beg of you, run this exclusively against a read-only replica. I mean, you should have one for exploratory queries _anyway_, but nobody ever does that.
"Validating the query to ensure it's safe and well-formed" all I can say to that is "ROFL. LMAO."
Sayyidalijufri
Cool
Tewboo
I've built AI agents for database queries; they streamline processes but require careful design to avoid overloading the system.
My general method for things like this is to:
1. Get a .dot file of the database. Many tools will export this.
2. Open the .dot in a tool I built for the purpose.
3. Select the tables I'm interested in, and export a subset of the .dot file representing just those tables and relationships.
4. Hand that subset .dot file to the LLM and say, "given this schema, write a query -- here's what I want: <rest of the request here>"
That gets the job done 60% of the time. Sometimes, when there's an incorrect shortcut relationship resulting in the wrong join, I'll have to redirect with something like: "You need to go through <table list> to relate <table X> to <table Y>." That gets my success rate up above 95%. I'm not doing ridiculous queries, but I am doing recursive aggregations successfully.
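A rough sketch of steps 1-4 (the regexes assume one node/edge declaration per line, which a real .dot export may not guarantee; table names are examples):

```python
import re

def subset_dot(dot_text: str, tables: set[str]) -> str:
    """Keep only the selected tables and the edges between them."""
    kept = []
    for line in dot_text.splitlines():
        # Edges: keep only if both endpoints are selected tables.
        edge = re.match(r'\s*"?(\w+)"?\s*->\s*"?(\w+)"?', line)
        if edge:
            if edge.group(1) in tables and edge.group(2) in tables:
                kept.append(line)
            continue
        # Nodes: drop declarations for unselected tables.
        node = re.match(r'\s*"?(\w+)"?\s*\[', line)
        if node and node.group(1) not in tables:
            continue
        kept.append(line)  # graph header, braces, selected nodes
    return "\n".join(kept)

dot = open("schema.dot").read()
schema = subset_dot(dot, {"employees", "employee_to_location", "departments"})
prompt = ("Given this schema, write a query -- here's what I want: "
          "headcount per department.\n\n" + schema)
```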