You are here: jensrantil.github.io
sookocheff.com
Kafka is a messaging system. That's it. So why all the hype? In reality, messaging is a hugely important piece of infrastructure for moving data between systems. To see why, let's look at a data pipeline without a messaging system. This system starts with Hadoop for storage and data processing. Hadoop isn't very useful without data, so the first stage in using Hadoop is getting data in. [Figure: Bringing Data in to Hadoop] So far, not a big deal. Unfortunately, in the real world data exists on many systems in parallel, all of which need to interact with Hadoop and with each other. The situation quickly becomes more complex, ending with a system where multiple data systems are talking to one another over many channels. Each of these channels requires its own custom pro...
www.madewithtea.com
This article is about aggregates in stateful stream processing in general. I write about the differences between Apache Spark and Apache Kafka Streams, using concrete code examples. Further, I list the requirements that we might like to see covered by a stream processing framework.
www.ververica.com
Discover Fluss, a unified streaming storage solution for Apache Flink, revolutionizing real-time data processing and analytics with sub-second latency.
endjin.com
Microsoft Fabric unifies data & analytics, building on Azure Synapse Analytics for improved data-level interoperability. Explore its offerings & pros/cons.