Challenges and Lessons Learned While Building a Real-Time Lakehouse using Apache Iceberg and Kafka
How do you build a large-scale data lakehouse architecture that makes data available for business analytics in real time, while being more cost-effective, more flexible and faster than the previous proprietary solution? With Python, Kafka and Iceberg, of course!
We built a large-scale data lakehouse based on Apache Iceberg for the Schwarz Group, Europe's largest retailer. The system collects business data from thousands of stores, warehouses and offices across Europe.
In this talk, we will present our architecture, the challenges we faced, and how Apache Iceberg is shaping up to be the data lakehouse format of the future.