tech.scribd.com
Streaming data from Apache Kafka into Delta Lake is an integral part of Scribd's data platform, but has been challenging to manage and scale. We use Spark Structured Streaming jobs to read data from Kafka topics and write that data into Delta Lake tables. This approach gets the job done, but our experience in production has convinced us that a different approach is necessary to efficiently bring data from Kafka to Delta Lake. To serve this need, we created kafka-delta-ingest.

www.decodable.co
Apache Flink and Spark Structured Streaming are two leading real-time processing frameworks. In this blog post, we will talk about why we picked Flink to be the foundation of our platform from 3 different perspectives.

www.morling.dev
Postgres logical replication, while powerful for capturing real-time data changes, presents challenges with TOAST columns, whose values can be absent from data change events in specific situations. This post discusses how Debezium addresses this through its built-in reselect post processor, then explores more robust solutions leveraging Apache Flink's capabilities for stateful stream processing, including Flink SQL and the brand-new process table functions (PTFs) in Flink 2.1.

greg.molnar.io
In this tutorial, I will show you how to use MRSK to deploy a Rails app to a VPS, run Caddy in front of the Docker container to handle SSL, use a hosted database server, run Redis on the same droplet, run a worker to process background jobs