2024-11-16, LT9
Language: English
This session explores how enterprises can build robust transactional data lakes using the open-source Delta Lake format and Python tools. The presenters will first discuss the exponential growth in enterprise data volumes and how data lakes provide a compelling solution to cost-effectively retain and extract value from vast amounts of structured and unstructured data.
The session outlines key limitations of traditional data lakes, such as the lack of database-like capabilities for efficient updates and the difficulty of maintaining performance at scale and ensuring data consistency.
To address these challenges, the session will showcase how the Delta Lake format, along with complementary Python tools like PySpark and Delta-rs, can be leveraged on AWS to build highly optimized and manageable data lake architectures. The presenters will dive into two real-world use cases, covering both large-scale batch processing and smaller-scale data workloads, highlighting best practices and architectural patterns for Python developers.
Alan is an Assistant Technical Manager at ATAL Engineering Group. He specialises in developing total solutions for smart buildings, covering energy optimisation, air conditioning, intelligent control, and automation.
Jacky Kwok is an Enterprise Solutions Architect at Amazon Web Services, Hong Kong. With more than 10 years of experience, he is proficient in a wide range of technology stacks, such as Java, Python, Node.js, MySQL, and PostgreSQL. Jacky is a seasoned architect with extensive hands-on experience in application development, data analytics, and solution architecture.