Test Postgres in Python Like SQLite

WhyNotHugo

I can't imagine WASM running inside Node.js being faster than native code that's been optimised for decades.

> No PostgreSQL install needed—just Node.js

Postgres is 32MB, Node.js is 63MB. I know that 31MB isn't a huge deal for most folks, but it's hard to see "doesn't require Postgres" as a selling point when you need something else that's twice the size instead.

TheTaytay

This isn't about size. Frankly, this is appealing because we already use both Python and Node package managers, so not needing to reach for another binary install mechanism is a real plus.

wey-gu

Haha yeah, I put an "(I know)" in the README:

> Effortless Setup: No PostgreSQL install needed—just Node.js(I know)!

I just wanted a SQLite-like DX within an hour, so that's how I did it.

And then I thought, why not open source it?

Maybe in v2 I could abstract the actual binary behind the same DX.

jauco

There's also https://testcontainers.com/. I'm not sure about the speed difference, but Testcontainers has never felt slow to me when using it from Node.js unit tests.

hardwaresofton

Ding ding ding. Testcontainers is a fantastic way to write the most important tests (arguably) for your app. Don't test E2E with a mock, just use a real database.

If that feels hard to you (to set up your app pointing to another DB, run a single E2E-testable part of your app with its own DB connection, etc.), fix that.

maartenh

Yep. I create fresh DBs from a fixture DB using Postgres's ability to create a database from a template. Very quick, always correct.
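A minimal sketch of the template approach described above, assuming a fixture database already exists. The helper names (`quote_ident`, `create_from_template_sql`, `drop_sql`, `unique_db_name`) are hypothetical and only build the SQL text; running it is left to whichever driver you already use.

```python
import uuid


def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def unique_db_name(prefix: str = "test") -> str:
    """Generate a throwaway database name so parallel tests do not collide."""
    return f"{prefix}_{uuid.uuid4().hex}"


def create_from_template_sql(new_db: str, template_db: str) -> str:
    """Build the statement that clones template_db into new_db."""
    return (
        f"CREATE DATABASE {quote_ident(new_db)} "
        f"TEMPLATE {quote_ident(template_db)}"
    )


def drop_sql(db: str) -> str:
    """Build the statement that removes a throwaway database after the test."""
    return f"DROP DATABASE IF EXISTS {quote_ident(db)}"
```

Two caveats when executing these: `CREATE DATABASE` cannot run inside a transaction block, so the connection needs autocommit, and the template database must have no other active connections while it is being copied.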

kinow

That's exactly what we are using for our tests in a new pull request to add support for Postgres, https://github.com/BSC-ES/autosubmit/pull/2187

The last GH Actions jobs with SQLite and Python 3.9 took 3m 41s, and the same tests with Postgres took 4m 11s. Running a single test locally in PyCharm also executes in less than 1 second. You notice some bootstrap happening, but once the container image is downloaded locally, it's really quite fast.

wey-gu

Yeah, without py-pglite this would be the only approach. Ideally PGlite could make unit-test cases more flexible and lightweight, but as you mentioned, if containers have never felt slow, it's fine to keep working with them.

And actually, for more E2E-style cases I think it's much better not to use the lite backend.

The non-container solutions tend to handle lifecycle management and isolated environment setup/teardown with elegantly designed abstractions, though I think similar abstractions could be built on top of containers.

Ideally we could eventually have unified abstractions over both container-based and WASM backends to boost DX, with different expectations of speed vs. compatibility.

globular-toast

For Python specifically I use pytest-docker: https://pypi.org/project/pytest-docker/

benpacker

I have this set up and integrated for Node/Bun:

This is an example of a unit test of an API route on a fully isolated WASM-backed Postgres. It takes very few lines of code, and all your API unit tests can run fully in parallel without any shared state: https://github.com/ben-pr-p/bprp-react-router-starter/blob/m...

This is all of the code needed to use Postgres in prod and PGLite in test/dev: https://github.com/ben-pr-p/bprp-react-router-starter/blob/m...

wey-gu

Wow, thanks! Should I use Bun instead of Node now?

benpacker

It’s the same - just saying it works in both.

I like it because I can do full stack development, including the database, with a single system level dependency (Bun).

wey-gu

Ha, thanks, makes sense.

samwillis

Awesome work Wey! Love that you're building this!

I work on PGlite. We have an experimental WASI build that can run in a WASI runtime; it should enable dropping the Node requirement. It lacks error handling at the moment (WASM has no long jump, and Postgres uses that for error handling; Emscripten has hacks that fix this via JS), so we haven't yet pushed it far.

Do ping me on our Discord and I can point you towards it.

Happy to answer any PGlite questions while I'm here!

veggieroll

Yo, this installs npm packages at runtime. Very not cool IMO. You should disclose this prominently in the README.

This is a nice project idea. But, you should use a Python WASM interpreter to run the PostgreSQL WASM.

ptx

How does running PostgreSQL compiled to WebAssembly reduce "the overhead of a full PostgreSQL installation"? Couldn't a native version be configured similarly and avoid the additional overhead of WebAssembly, Node.js and npm?

CamouflagedKiwi

Yes, it can. It's not especially hard to start up a Postgres instance, and with a couple of config tweaks you can improve the startup time. I've had this working nicely at a previous job; it took under a couple of seconds to start.
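The comment doesn't say which config tweaks were used. As an assumption, here is a sketch of settings commonly relaxed for throwaway test clusters, passed to the server as `-c name=value` flags. `TEST_SETTINGS` and `postgres_argv` are hypothetical names; the three settings are real Postgres parameters, but they trade away crash safety and should never be used for data you care about.

```python
# Durability settings often relaxed for disposable test clusters (assumption:
# the original comment does not name its tweaks). Unsafe for real data.
TEST_SETTINGS = {
    "fsync": "off",               # skip flushing writes to disk
    "synchronous_commit": "off",  # do not wait for WAL flush on commit
    "full_page_writes": "off",    # skip full-page images in the WAL
}


def postgres_argv(data_dir: str, port: int, settings: dict = TEST_SETTINGS) -> list:
    """Build the argument list for starting `postgres` with test settings."""
    argv = ["postgres", "-D", data_dir, "-p", str(port)]
    for name, value in settings.items():
        argv += ["-c", f"{name}={value}"]
    return argv
```

The resulting list can be handed directly to `subprocess.Popen`, assuming the `postgres` binary is on `PATH`.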

wey-gu

Thanks~

perrygeo

For Clojure and Java apps, check out Zonky (https://github.com/zonkyio/embedded-postgres). It provides a similar experience on the JVM, but instead of containers or WASM, you're running an embedded native binary.

wey-gu

Thanks! This is the ultimate shape I am going to pursue!

selimnairb

I just use pytest-docker-compose, then I don’t need to bother with NPM. I usually don’t like “magic”, but pytest’s fixtures are so powerful I’m okay with a little bit of “magic”.

laurencerowe

This is running pglite in a node subprocess. Why not just run Postgres itself as a subprocess with a data directory in a tempdir?
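A rough sketch of that alternative, not py-pglite's implementation: a context manager that runs `initdb` into a temporary directory, starts `postgres` as a subprocess, and shuts it down afterwards. It assumes a POSIX system, that `initdb` and `postgres` are on `PATH` (on some distributions they are not), and that the process is not running as root. All function names here are hypothetical.

```python
import contextlib
import signal
import socket
import subprocess
import tempfile
import time


def free_port() -> int:
    """Ask the OS for a currently unused TCP port on localhost."""
    with socket.socket() as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def initdb_argv(data_dir: str) -> list:
    """Arguments for creating a cluster with trust auth and no fsync pass."""
    return ["initdb", "-D", data_dir, "-U", "postgres", "-A", "trust", "--no-sync"]


def server_argv(data_dir: str, port: int) -> list:
    """Arguments for a server bound to localhost, socket file in data_dir."""
    return [
        "postgres", "-D", data_dir, "-p", str(port),
        "-c", "listen_addresses=127.0.0.1",
        "-k", data_dir,
    ]


def wait_for_port(port: int, timeout: float = 15.0) -> None:
    """Poll until something accepts TCP connections on the port.

    This is only a rough readiness signal; `pg_isready` is stricter.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection(("127.0.0.1", port), timeout=0.5):
                return
        except OSError:
            time.sleep(0.05)
    raise TimeoutError(f"nothing listening on port {port} after {timeout}s")


@contextlib.contextmanager
def temp_postgres():
    """Yield a connection URL for a server that lives in a temp directory."""
    with tempfile.TemporaryDirectory() as data_dir:
        subprocess.run(initdb_argv(data_dir), check=True, capture_output=True)
        port = free_port()
        proc = subprocess.Popen(
            server_argv(data_dir, port),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        try:
            wait_for_port(port)
            yield f"postgresql://postgres@127.0.0.1:{port}/postgres"
        finally:
            proc.send_signal(signal.SIGINT)  # SIGINT requests a fast shutdown
            proc.wait(timeout=30)


# Usage (requires the Postgres binaries to be installed):
#     with temp_postgres() as url:
#         ...  # connect with any driver using `url`
```

The data directory and everything in it disappear when the `with` block exits, which gives roughly the same throwaway behaviour as an in-memory SQLite database, at the cost of an `initdb` run per server.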

wey-gu

I think the ultimate version for this use case would carefully wire up the bare-metal binary with an ad-hoc in-memory disk or tempdir :). This could be a future backend of py-pglite (planned for v2).

For now, this was the approach I could hack together in a few hours, and it works.

isoprophlex

Exactly; in the past I've had reasonable success with test fixtures based on temp dirs and a templated docker compose file. It just needs Docker in the environment, which is not too far-fetched.

buremba

Amazing, this is what I was trying to find for the last few weeks! I wonder if it's possible to run WASM directly from Python instead of the subprocess approach, though?

murkt

I wonder if it’s possible to compile Postgres directly into a Python extension instead of WASM. Just import it and go forth, no node dependency, nothing.

samwillis

This is something I've explored as part of my work on PGlite. It's possible but needs quite a bit of work, and would come with some limitations until Postgres upstream makes some changes.

You will need to use an unrolled main loop similar to what we have in PGlite, using the "single user mode" (you likely don't want to fork subprocesses like a normal Postgres). The problems come when you then try to run multiple instances in a single process: Postgres makes heavy use of global variables for state (it can, because it forks for each session), and these would clash if you had multiple instances. There is work happening to possibly make Postgres multi-threaded, which will solve that problem.

The long-term ambition of the PGlite project is to create a libpglite, a low-level embedded Postgres with a C API, that will enable all this. We're quite far off, though. Happy to have people join the project to help make it happen!

wey-gu

Wow, thanks! It should be feasible! As I recall, some databases in Python-first communities (chromadb, milvus-lite) already do something like this.

We could think big and someday do that within the py-pglite project.

Let me put it on the v2 roadmap (much more work to do!).

heinrichhartman

Can you explain this? What is the compilation target? What is the compiler? How does being a "python extension" help?

wey-gu

The target would just be a bare-metal binary, with some changes needed, and the purpose would simply be to optimize for Python DX.