Tag: apache-iceberg
All the articles with the tag "apache-iceberg".
-
Data This Week #15
Spark Declarative Pipelines for financial lakehouses, ten AWS Glue & Iceberg fixes, MOR as an architectural shift, DuckDB's Quack protocol, SQL fraud patterns, Kafka checkpoint patterns, and the LLM-for-validation debate.
-
Data This Week #14
Flink CDC streaming ELT from MySQL to Kafka, the LLM engineer's stack map, Ursa's diskless Kafka fork, Iceberg write mechanics, Instacart's billion-product search, Jikkou 1.0, and the AI knowledge-base debate.
-
Data This Week #12
Cold Postgres data to S3 lakehouse, Databricks Lakeflow Designer, vector databases & HNSW indexing, Salesforce migration best practices, SwiftLake for Iceberg, and data observability lessons.
-
Data This Week #11
Iceberg cross-account migrations, DuckLake 1.0 metadata, IaC for data engineers, Redshift Iceberg writes, agent-data patterns, LARQL for LLM graph queries, and the Dagster pricing debate.
-
Data This Week #10
Data product lifecycle, semantic context layer for LLM agents, Netflix's Druid interval caching, Ursa Kafka storage engine, Iceberg v3 VARIANT type, and Ministack vs LocalStack.
-
Data This Week #9
DuckLake's 926x Iceberg speedup, Expedia's Trino Gateway for workload routing, Ontul unified SQL engine, PostgreSQL memory myths, and a 6-tier FFLIIP streaming lakehouse deep dive.
-
Data This Week #5
Spark DAG compilation deep dive, query federation with StarRocks, Pinterest's CDC migration, CyberArk AI with Iceberg, Databricks Zerobus Ingest, and data quality tooling debates.
-
Data This Week #4
How OpenAI scales PostgreSQL for ChatGPT, Dropbox's enterprise RAG, 3x faster Spark on Iceberg, dbt with DuckDB, local AWS Lakehouse setups, and Alibaba's new tool ZVec.