
Instant SQL for results as you type in DuckDB UI

carlineng

I just watched the author of this feature and blog post give a talk at the DataCouncil conference in Oakland, and it is obvious what a huge amount of craft, ingenuity, and care went into building it. Congratulations to Hamilton and the MotherDuck team for an awesome launch!

ryguyrg

Woohoo! Glad you noticed that. Hamilton is amazing.

wodenokoto

Is that talk available online?

carlineng

Not yet, but I believe the DataCouncil staff recorded it and will post it to their YouTube channel sometime in the next few weeks: https://www.youtube.com/@DataCouncil/videos

ryguyrg

In DuckDB UI and MotherDuck.

Awesome video of the feature: https://youtu.be/aFDUlyeMBc8

Disclaimer: I’m a co-founder at MotherDuck.

rancar2

Thanks for sharing this update with the world and including it in the local UI too.

Feature request: enable tuning of when Instant SQL is run and displayed. The erroring out with flashing updates at nearly every keystroke while expanding a query is distracting for me personally (my brain goes into troubleshooting vs. thinking mode). I like the feature (so I will keep it on by default), but I'd like to have a few modes for it depending on my working context (specifically, tuning the update frequency on separator characters [space, comma], end of statement [semicolon/newline], and injections [paste/autocomplete]).

hamilton

Great feedback! Thanks. We agree about the red errors. It's not helpful when it feels like your editor is screaming at you.

strgcmc

This is probably stupid, but in the hope of helping others by exposing my own ignorance -- I'm having trouble actually installing and running the preview. I've downloaded the preview release duckdb binary itself, then when I try to run "duckdb -ui", I get this error:

    Extension Autoloading Error: An error occurred while trying to automatically install the required extension 'ui': Failed to download extension "ui" at URL "http://extensions.duckdb.org/0069af20ab/osx_arm64/ui.duckdb_..." (HTTP 403) Extension "ui" is an existing extension.

Is it looking to download the preview version of the extension, but getting blocked/unauthorized (hence the 403 forbidden response)? Or is there something about the auto-loading behavior that I'm supposed to disable maybe?

1egg0myegg0

Sorry you hit that! This is actually already working on version 1.2.2. Could you install that version? That should get you going for the moment! We will dig into what you ran into.

strgcmc

All good, v1.2.2 works fine, thank you!

theLiminator

Curious if any thought has been given to open-sourcing the UI? Of course there's no obligation to, though!

hamilton

We do have plans. It's a question of effort, not business / philosophy.

rastignack

That's good to know. I work in a heavily regulated environment, and our data usage is constantly monitored.

Good to know a totally offline tool is being considered.

Thanks for the great tool BTW.

d0100

That would be nice, as it would spare us the effort of replicating the UI ourselves, however half-baked our copy would be.

theLiminator

Thank you, that's awesome to hear!

jakozaur

It would be even better if SQL had pipe syntax. SQL is amazing, but its ordering isn’t intuitive, and only CTEs provide a reliable way to preview intermediate results. With pipes, each step could clearly show intermediate outputs.

Example:

    FROM orders
    |> WHERE order_date >= '2024-01-01'
    |> AGGREGATE SUM(order_amount) AS total_spent GROUP BY customer_id
    |> WHERE total_spent > 1000
    |> INNER JOIN customers USING(customer_id)
    |> CALL ENRICH.APOLLO(EMAIL > customers.email)
    |> AGGREGATE COUNT(*) high_value_customer GROUP BY company.country

tstack

The PRQL[1] syntax is built around pipelines and works pretty well.

I added a similar "get results as you type" feature to the SQLite integration in the Logfile Navigator (lnav)[2]. When entering PRQL queries, the preview will show the results for the current and previous stages of the pipeline. When you move the cursor around, the previews update accordingly. I was waiting years for something like PRQL to implement this since doing it with regular SQL requires more knowledge of the syntax and I didn't want to go down that path.

[1] - https://prql-lang.org [2] - https://lnav.org/2024/03/29/prql-support.html

RyanHamilton

If you want to get started with PRQL, check out qStudio (https://www.timestored.com/qstudio/prql-ide). It allows running PRQL easily against MySQL, PostgreSQL, DuckDB, etc.

hamilton

Obviously one advantage of SQL is everyone knows it. But conceptually, I agree. I think [1]Malloy is also doing some really fantastic work in this area.

This is one of the reasons I'm excited about DuckDB's upcoming [2]PEG parser. If they can pull it off, we could have alternative dialects that run on DuckDB.

[1] https://www.malloydata.dev/ [2] https://duckdb.org/2024/11/22/runtime-extensible-parsers.htm...

da_chicken

While I would certainly agree with you that putting the FROM clause first would be a significant improvement to SQL and was a genuine design mistake, this otherwise feels more like just wanting SQL to be less declarative and more imperative. Wanting it to be more like LINQ and less like relational algebra.

That, I think, is most developers' real sticking point with SQL. It's not an object-relational impedance mismatch between their application and the data store; it's an imperative-declarative impedance mismatch between the language and the way they prefer to think. They are used to thinking about problems in exactly one way, so when they struggle to adapt to a different way of thinking, they assume the familiar way must be the more correct one.

I think this is why the same developers insist that XML/HTML is "just a markup language." Feeding a document into an executable to produce output isn't really significantly different from feeding an imperative language into a compiler. The only real difference is that one is Turing complete, but Turing completeness is not a requirement of programming languages.

metadata

Google SQL has it now:

https://cloud.google.com/blog/products/data-analytics/simpli...

It's pretty neat:

    FROM mydataset.Produce
    |> WHERE sales > 0
    |> AGGREGATE SUM(sales) AS total_sales, COUNT(*) AS num_sales
       GROUP BY item;
Edit: formatting

ryguyrg

note that DuckDB allows that reverse ordering (FROM-first)

    FROM table SELECT foo, bar WHERE zoo = 'goo'

viggity

it makes intellisense/autocomplete work a hell of a lot easier. LINQ in dotnet does the same thing.

crooked-v

I suspect you'll like PRQL: https://github.com/PRQL/prql

wodenokoto

I haven't tested it, but I believe there's a PRQL extension for DuckDB.

NDizzle

This is the stuff nightmares are made out of. Keep that style of coding out of any project I’m involved in, please.

sidpatil

What do you dislike about that style?

bb86754

He/she isn't used to it. Any R, Elixir, or F# developer would be right at home with this syntax.

cdchhs

that syntax is horrendous.

motoboi

DuckDB is missing a killer feature by not having a pipe syntax like Kusto or Google's pipe query syntax.

Why is it a killer feature? First of all, LLMs complete text from left to right. That alone is a killer feature.

But for us meatboxes with less compute power, pipe syntax allows (much better) code completion.

Pipe syntax is delightful to work with and makes going back to SQL a real bummer moment (please insert meme of Katy Perry kissing the earth here).

hantusk

CTEs go a long way towards left to right readability while keeping everything standard SQL.
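
For example, the pipe query above could be staged as CTEs. A rough sketch with duckdb's Python API, assuming the same orders table as in the pipe-syntax example:

    import duckdb

    # Each CTE is one pipeline stage and reads top to bottom.
    # Assumes an "orders" table exists, as in the pipe example above.
    duckdb.sql("""
        WITH filtered AS (
            SELECT * FROM orders WHERE order_date >= '2024-01-01'
        ),
        totals AS (
            SELECT customer_id, SUM(order_amount) AS total_spent
            FROM filtered
            GROUP BY customer_id
        )
        SELECT * FROM totals WHERE total_spent > 1000
    """).show()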

gervwyk

Nothing comes close to the power of MongoDB aggregation pipelines. When used in production apps, it reduces the amount of code significantly for us by doing data modeling as close as possible to the source.

sterlinm

[grizzled kdb+ user considers starting an argument but then thinks better of it]

XCSme

I hope this doesn't work with DELETE queries.

falcor84

Maybe in the next version they could also implement support for DROP, with autocorrect for the nearest (not yet dropped) table name.

munk-a

Or, for extra fun, it autocompletes to DROP TRIGGER and just drops a single random trigger from your database. It'll help counter automation fears by ensuring your DBAs get a wonderful weekend on payroll where, very much in the Easter spirit, they can hunt through the DB looking for the one thing that should be there but isn't!

falcor84

Wow, that's perhaps the most nefarious version of chaos engineering I have ever heard of. Kudos!

clgeoio

LLM-powered queries that run in agent mode, so it can answer questions about your data before you know what to ask.

XCSme

That's actually not a bad idea: LLM autocomplete as you write queries, especially if you first add a comment at the top saying what you want to achieve:

    -- Select all orders for users registered in the last year,
    -- and compute average earnings per user
    SELECT ...

Covenant0028

Vibe SQLing is where it's at

krferriter

DELETED 0 rows. Did you mean `where 1=1`? (click accept to re-run with new where clause)

matsonj

for clarity: Instant SQL won't automatically run queries that write or delete data or metadata. It only runs queries that read data.

d0100

And it's a happy coincidence that json_serialize_sql doesn't work with anything but SELECT queries.
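
You can check that from Python; a quick sketch (the exact JSON and error text may vary by DuckDB version):

    import duckdb

    # A SELECT statement serializes to a JSON AST...
    print(duckdb.sql("SELECT json_serialize_sql('SELECT 42 AS answer')").fetchone()[0])

    # ...while a write statement should come back as an error payload, not an AST.
    print(duckdb.sql("SELECT json_serialize_sql('DELETE FROM t')").fetchone()[0])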

worldsayshi

Can't it just run inside a transaction that isn't committed?
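
I mean something like this (a sketch with duckdb's Python API, not a claim about how Instant SQL actually works):

    import duckdb

    con = duckdb.connect()  # in-memory database
    con.sql("CREATE TABLE t AS SELECT 1 AS x")
    con.sql("BEGIN TRANSACTION")
    con.sql("DELETE FROM t")  # runs, but stays uncommitted
    print(con.sql("SELECT count(*) FROM t").fetchone())  # (0,) inside the transaction
    con.sql("ROLLBACK")  # discard the delete
    print(con.sql("SELECT count(*) FROM t").fetchone())  # (1,) again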

crmi

Young Bobby Tables at it again

ryguyrg

ROFL

codetrotter

ROFL FROM jokes WHERE thats_a_new_one;

jpambrun

I really like DuckDB's notebooks for exploration, and this feature makes them even more awesome, but the fact that I can't share, export, or commit them to a git repo feels extremely limiting. It's neat-ish that it dogfoods and stores them in a DuckDB database. It even seems to store historical versions, but I can't really do anything with that...

akshayka

You can try marimo notebooks, which are stored as pure Python and support SQL cells through duckdb. (I’m one of its authors.)

https://github.com/marimo-team/marimo

hamilton

Definitely something we want too! (I'm the author / lead for the UI)

RyanHamilton

Local markdown-file-based SQL notebooks: https://www.timestored.com/sqlnotebook (Disclaimer: I'm the author.)

ayhanfuat

CTE inspection is amazing. I spend too much time doing that manually.

hamilton

Me too (author of the post here). In fact, I was watching a seasoned data engineer at MotherDuck show me how they would attempt to debug a regex in a CTE. As a longtime SQL user, I felt the pain immediately; haven't we all been there before? Instant SQL followed from that.
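
The manual loop looks something like this (illustrative table and regex, not the actual query from that session): pull the CTE's body out and run it alone until the pattern behaves:

    import duckdb

    # Debugging a regex in a CTE the old way: extract the stage and run it by itself.
    duckdb.sql(r"""
        SELECT raw_line,
               regexp_extract(raw_line, '^(\w+)', 1) AS first_word
        FROM logs
        LIMIT 10
    """).show()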

RobinL

Agreed, definitely an amazing feature. In the Python API you can get somewhere close with this kind of thing:

    input_data = duckdb.sql("SELECT * FROM read_parquet('...')")
    step_1 = duckdb.sql("SELECT ... FROM input_data JOIN ...")
    step_2 = duckdb.sql("SELECT ... FROM step_1")
    final = duckdb.sql("SELECT ... FROM step_2;")

ako

In DataGrip you can select part of a query and execute it to see its result.

biophysboy

If there are any DuckDB engineers here, I just want you to know that your tool has been incredible for my work in bioinformatics/biotech. It has the flexibility/simplicity that biological data (messy, changing constantly) requires.

cess11

At times I've done crude implementations of similar functionality, by basically just taking the current string on change and concatenating with " LIMIT 20" before passing it to the database API and then rerendering a table if the result is an associative array rather than an error message.
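
In Python against duckdb, that crude version is only a few lines. A sketch, with made-up names:

    import duckdb

    con = duckdb.connect()

    def on_change(query: str):
        """Editor on-change handler: cap the rows, swallow errors."""
        try:
            # Concatenate a LIMIT so a half-typed query stays cheap to run.
            rel = con.sql(query.rstrip("; \n") + " LIMIT 20")
            return rel.fetchall() if rel is not None else None
        except duckdb.Error:
            return None  # keep the last good table instead of flashing an error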

I think this would be better if it were combined with information about valid words at the cursor position, which would likely be a bit more involved but achievable by querying the schema and settling on a subset of SQL. It would help people who aren't already fluent in SQL to extract the data they want. Perhaps allow them to click the suggestions to add them to the query.

I've done partial implementations of this too, that query the schema for table or column names. It's very cheap even on large, complex schemas, so it's fine to just throw every change at the database and check what drops out. In practice I didn't get much out of either beyond the fun of hacking up an ephemeral tool, or I would probably have built some small product around it.

jwilber

Amazing work. MotherDuck and the DuckDB ecosystem have done a great job of gathering talented engineers with great taste. Craftsmanship may be the word I'm looking for - I always look forward to their releases.

I spent the first two quarters of 2024 working on observability for a build-the-plane-as-you-fly-it style project. I can't express how useful the CTE preview would have been for debugging.

arrty88

It looks cool, but I wish I could just see the entire table that I'm about to query. I always start my queries with a quick `select * from table limit 10;`, then go about adding the columns and joins.

matsonj

`from my_table`

will do the same!

We are working on making it easy to switch from Instant SQL -> run query -> Instant SQL.

Vaslo

I moved from pandas and SQLite to Polars and DuckDB. Such an improvement in these new tools.

arsalanb

Check out livedocs.com; we built a notebook around Polars and DuckDB. (Disclaimer: I'm the founder.)