Demos of Stream Processing Solving Real-World Problems
4 comments
· March 6, 2025
halfcat
> “stream processing” might sound intimidating… we believe this isn't true
> Getting started: Install Kafka…
I know this is all relative, but running Kafka isn’t realistic for a small team.
Until a business reaches an enterprise level of maturity where everyone is on the same systems, there is always a mess of legacy systems from acquisitions that don’t even emit events. To make those systems work with streaming, you often end up with something like “read the entire source data set, read the entire destination data set, and compare” just to produce a single event saying there’s a new row. And many of these legacy systems lack unique identifiers or modification timestamps, among other problems that make them challenging to integrate with a streaming approach.
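To make the “read everything and compare” pattern concrete, here is a minimal sketch of deriving events from two full snapshots. It assumes rows arrive as plain dicts and, because there is no unique identifier, uses a content hash as a stand-in key. The names `row_fingerprint` and `diff_snapshots` and the event shape are made up for illustration, not taken from any particular tool.

```python
import hashlib
import json
from collections import Counter


def row_fingerprint(row: dict) -> str:
    """Stable content hash, used as a stand-in key when rows have no unique ID."""
    canonical = json.dumps(row, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_snapshots(source_rows, dest_rows):
    """Compare two full snapshots and return synthetic change events.

    Counters (multisets) are used so that exact duplicate rows are
    counted rather than collapsed into one.
    """
    rows_by_fp = {}
    source_counts = Counter()
    for row in source_rows:
        fp = row_fingerprint(row)
        source_counts[fp] += 1
        rows_by_fp[fp] = row

    dest_counts = Counter()
    for row in dest_rows:
        fp = row_fingerprint(row)
        dest_counts[fp] += 1
        rows_by_fp.setdefault(fp, row)

    events = []
    # Counter subtraction keeps only positive counts.
    for fp, n in (source_counts - dest_counts).items():
        events.extend({"type": "row_added", "row": rows_by_fp[fp]} for _ in range(n))
    for fp, n in (dest_counts - source_counts).items():
        events.extend({"type": "row_removed", "row": rows_by_fp[fp]} for _ in range(n))
    return events
```

Note the limitation this exposes: without a real key or modification timestamp, a changed row cannot be told apart from one row being removed and another added, so an update shows up as a `row_removed` plus a `row_added`.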
People like to say, “batch is a special form of streaming, so just use streaming”, and that’s true if you’re always working with nice, modern systems that emit perfect events.
But streaming and event-based systems tend to fall out of sync over time, whether due to bugs in our code, an end user changing a field name in the source system of record, or just the lack of a proper distributed transaction solution. And once the systems fall out of sync, you need a sync mechanism to find and remediate the discrepancies.
So to make the opposing disingenuous argument (joking, but only half):
* Batch is a special case of streaming, so just use streaming
* Streaming is a fragile form of sync, so just use sync
* Sync is a special case of batch, so just use batch
datadrivenangel
And Kafka ends up with the infinite-data / sync issue anyway: if an event stream grows unbounded, it will eventually get too big and you'll run into problems, so you start needing to cull events, which in turn means you need some kind of batch sync to start new consumers.
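A rough sketch of what that batch sync for new consumers tends to look like: load a snapshot taken at a known offset, then replay only the events that are still retained. This is an in-memory illustration, not Kafka client code. `Event`, `Snapshot`, and `bootstrap_consumer` are hypothetical names, and it assumes offsets are contiguous integers and the retained log is sorted by offset.

```python
from dataclasses import dataclass


@dataclass
class Event:
    offset: int
    key: str
    value: object  # None is treated as a deletion in this sketch


@dataclass
class Snapshot:
    last_offset: int  # offset of the last event folded into `state`
    state: dict


def apply_event(state: dict, event: Event) -> None:
    """Fold one event into a key/value state."""
    if event.value is None:
        state.pop(event.key, None)
    else:
        state[event.key] = event.value


def bootstrap_consumer(snapshot: Snapshot, retained_log: list):
    """Rebuild state for a new consumer from a snapshot plus the retained tail.

    Raises if events were culled that the snapshot does not cover, since the
    rebuilt state would silently be wrong in that case.
    """
    if retained_log and retained_log[0].offset > snapshot.last_offset + 1:
        raise ValueError("gap between snapshot and retained log")

    state = dict(snapshot.state)
    last_offset = snapshot.last_offset
    for event in retained_log:
        if event.offset <= snapshot.last_offset:
            continue  # already reflected in the snapshot
        apply_event(state, event)
        last_offset = event.offset
    return state, last_offset
```

The gap check matters most here: once culling is in play, the snapshot cadence and the retention window have to be coordinated, which is exactly the batch machinery the streaming setup was supposed to avoid.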
Or, when you need to store state, you can have the event stream be your entire source of truth and keep the applications stateless... but there be dragons.
null
A nice ad for RisingWave. Stream processing is good, but most actual problems are easier to solve with small batches.