Show HN: Pyper – Concurrent Python Made Simple
10 comments · January 12, 2025
solidasparagus
Nice work! There is a gap when it comes to writing single-machine, concurrent CPU-bound python code. Ray is too big, pykka is threads only, builtins are poorly abstracted. The syntax is also very nice!
But I'm not sure I can use this even though I have a specific use-case that feels like it would work well (high-performance pure Python downloading from cloud object storage). The examples are a bit too simple and I don't understand how I can do more complicated things.
I chunk up my work, run it in parallel and then I need to do a fan-in step to reduce my chunks - how do you do that in Pyper?
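(The fan-in pattern described here is commonly done with a parallel map over chunks followed by a reduce. A minimal stdlib sketch — this is not Pyper's API; `process_chunk` and `fan_in` are made-up names:)

```python
from concurrent.futures import ProcessPoolExecutor
from functools import reduce

def process_chunk(chunk):
    # stand-in for real per-chunk work (e.g. downloading a byte range)
    return sum(chunk)

def fan_in(partials):
    # fan-in / reduce step: combine per-chunk results into one value
    return reduce(lambda a, b: a + b, partials, 0)

if __name__ == "__main__":
    data = list(range(100))
    chunks = [data[i:i + 10] for i in range(0, len(data), 10)]
    with ProcessPoolExecutor(max_workers=4) as pool:
        partials = list(pool.map(process_chunk, chunks))
    print(fan_in(partials))  # 4950
```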
Can the processes have state? Pure functions are nice, but if I'm reaching for multiprocess, I need performance and if I need performance, I'll often want a cache of some sort (I don't want to pickle and re-instantiate a cloud client every time I download some bytes for instance).
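(One common answer to the per-process-state question, using only the stdlib: `multiprocessing.Pool` accepts an `initializer` that runs once per worker process, so an expensive client can live in a module-level global instead of being pickled per task. `FakeCloudClient` below is a hypothetical stand-in:)

```python
import multiprocessing as mp

class FakeCloudClient:
    """Hypothetical stand-in for an expensive-to-construct cloud client."""
    def download(self, key):
        return b"bytes-for-" + key.encode()

_client = None  # per-process cache, filled in once by the initializer

def init_worker():
    # Runs once in each worker process, so the client is built once per
    # process rather than pickled and re-created for every task.
    global _client
    _client = FakeCloudClient()

def fetch(key):
    return _client.download(key)

if __name__ == "__main__":
    with mp.Pool(processes=2, initializer=init_worker) as pool:
        print(pool.map(fetch, ["a", "b"]))
```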
How do exceptions work? Observability? Logs/prints?
Then there's stuff that is probably asking too much from this project, but I get it if I write my own python pipeline so it matters to me - rate limiting WIP, cancellation, progress bars.
But if some of these problems are/were solved and it offers an easy way to use multiprocessing in python, I would probably use it!
globular-toast
Do you really need to reinvent the wheel every time for parallel workloads? Just learn GNU parallel and write single-threaded code.
Concurrency in general isn't about parallelism. It's just about doing multiple things at the same time.
halfcat
> I don't want to pickle and re-instantiate a cloud client every time I download some bytes for instance
Have you tried multiprocessing.shared_memory to address this?
solidasparagus
I haven't played with that much! This isn't really a problem in general for my approach to writing this sort of code - when I use multiprocessing, I use a Process class or a worker task function with a setup step followed by a while loop that pulls from a work/control queue. But in the Pyper functional programming world, it would be a concern.
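(The setup-then-loop worker shape described above looks roughly like this — sketched with threads for brevity; the same structure works with `multiprocessing.Process` and `multiprocessing.Queue`:)

```python
import queue
import threading

SENTINEL = None  # placed on the queue once per worker to signal shutdown

def worker(work_q, results):
    # setup step: per-worker state created exactly once
    session = {"calls": 0}  # stand-in for a real client or cache
    while True:
        item = work_q.get()
        if item is SENTINEL:
            break
        session["calls"] += 1
        results.append(item * 2)  # stand-in for real work

def run(items, n_workers=2):
    work_q = queue.Queue()
    results = []
    threads = [threading.Thread(target=worker, args=(work_q, results))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    for item in items:
        work_q.put(item)
    for _ in threads:
        work_q.put(SENTINEL)
    for t in threads:
        t.join()
    return sorted(results)
```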
IIRC multiprocessing.shared_memory is at a much lower level of abstraction than most Python stuff, so I think I'd need to figure out how to make the client use the shared memory, and I'm not sure if I could.
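(For context on why it feels low-level: `multiprocessing.shared_memory` hands you a raw byte buffer attached by name, not shared Python objects — which is exactly why wiring a client object into it is hard. A minimal round-trip, attaching by name the way a second process would:)

```python
from multiprocessing import shared_memory

def roundtrip(payload: bytes) -> bytes:
    # Create a raw shared block, write bytes into it, then re-attach
    # to it by name, as a second process would.
    shm = shared_memory.SharedMemory(create=True, size=len(payload))
    try:
        shm.buf[:len(payload)] = payload
        other = shared_memory.SharedMemory(name=shm.name)
        try:
            return bytes(other.buf[:len(payload)])
        finally:
            other.close()
    finally:
        shm.close()
        shm.unlink()  # free the block once all handles are closed

print(roundtrip(b"hello"))  # b'hello'
```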
rtpg
You really should dive more into the `multiprocess` support option and highlight how this gets around issues with the GIL. This feels like a major value add, and "does this help with CPU-bound work" being "yes" is a big deal!
I don't really need pipelining that much, but pipelining along with a certain level of durability and easy multiprocessing support? Now we're talking
t43562
...although python 3.13 can be built without the GIL and it really does make threading useful. I did some comparisons with and without.
I suppose one excellent thing about this would be if you could just change 1 parameter and switch from multiprocessing to threaded.
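(The stdlib already gets close to the one-parameter switch with `concurrent.futures`, since thread and process pools share an interface — a sketch:)

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def square(x):
    # module-level so it stays picklable for the process-pool case
    return x * x

def run(items, use_processes=False):
    # switching between threads and processes is a single parameter
    Executor = ProcessPoolExecutor if use_processes else ThreadPoolExecutor
    with Executor(max_workers=4) as pool:
        return list(pool.map(square, items))

print(run([1, 2, 3]))  # [1, 4, 9]
```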
yablak
How does this compare to https://github.com/svenkreiss/pysparkling?
minig33
This is cool - I’ve been looking for something like this. I really liked the syntax of Prefect v1, but it was overcomplicated with execution configuration in subsequent versions. I just want something to help me run async pipelines and prevent AsyncIO weirdness - going to test this out.
grandma_tea
Nice! I'm looking forward to trying it out. This seems very similar to https://github.com/cgarciae/pypeln/
kissgyorgy
Very simple and elegant API!
Hello and happy new year!
We're excited to introduce the Pyper package for concurrency & parallelism in Python. Pyper is a flexible framework for concurrent / parallel data processing, following the functional paradigm.
Source code can be found on [github](https://github.com/pyper-dev/pyper)
Key features:
Intuitive API: Easy to learn, easy to think about. Implements clean abstractions to seamlessly unify threaded, multiprocessed, and asynchronous work.
Functional Paradigm: Python functions are the building blocks of data pipelines. Lets you write clean, reusable code naturally.
Safety: Hides the heavy lifting of underlying task execution and resource clean-up. No more worrying about race conditions, memory leaks, or thread-level error handling.
Efficiency: Designed from the ground up for lazy execution, using queues, workers, and generators.
Pure Python: Lightweight, with zero sub-dependencies.
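(As an illustration of the queues-workers-generators idea behind lazy execution — a hand-rolled sketch of the general pattern, not Pyper's actual internals; `threaded_stage` is a made-up helper:)

```python
import queue
import threading

def threaded_stage(source, fn, n_workers=2):
    """Apply fn to items pulled lazily from `source` using worker
    threads, yielding results as they arrive (order not preserved)."""
    in_q, out_q = queue.Queue(), queue.Queue()
    DONE = object()  # sentinel marking end of the stream

    def worker():
        while True:
            item = in_q.get()
            if item is DONE:
                out_q.put(DONE)
                break
            out_q.put(fn(item))

    workers = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in workers:
        t.start()

    def feed():
        # drain the (possibly lazy) source into the input queue,
        # then signal each worker to stop
        for item in source:
            in_q.put(item)
        for _ in workers:
            in_q.put(DONE)

    threading.Thread(target=feed).start()

    finished = 0
    while finished < n_workers:
        item = out_q.get()
        if item is DONE:
            finished += 1
        else:
            yield item
```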
We'd love to hear any feedback on this project!