github.com
Practical Data Engineering: A Hands-On Real-Estate Project Guide - ssp-data/practical-data-engineering

rmoff.net
[AI summary] A blog post from September 2025 highlights various data engineering, AI, and cloud computing topics, including recent developments in tools like DuckDB, Iceberg, and Kafka, along with insights on AI's growing impact.

dagshub.com
Explore the top ML workflow and pipeline tools, including tools from Netflix, to enhance your data science projects' efficiency and impact.

jack-vanlightly.com
In today's post I want to walk through a fascinating indexing technique for data lakehouses which flips the role of the index in open table formats like Apache Iceberg and Delta Lake. We are going to turn the tables on two key points:
1. Indexes are primarily for reads. Indexes are usually framed as read optimizations paid for by write overhead: they make read queries fast, but inserts and updates slower. That isn't the full story, as indexes also support writes, for example through faster uniqueness enforcement and reduced lock contention (such as by avoiding range locks during table scans), but the dominant mental model is that indexing serves reads while writes pay the bill.
2. OTFs don't use tree-based indexes. Open-table format indexes are data-skipping ind...