Databricks: Manage Data with Delta Lake

Prem Vishnoi(cloudvala)
2 min read · Feb 10, 2024

What is Delta Lake?

Delta Lake is an open-source project that enables building a data lakehouse on top of existing cloud storage.

Delta Lake Is Not…
• Proprietary technology
• Storage format
• Storage medium
• Database service or data warehouse

Delta Lake Is…
• Open source
• Builds upon standard data formats
• Optimized for cloud object storage
• Built for scalable metadata handling
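
To make "builds upon standard data formats" more concrete: a Delta table keeps its data in Parquet files and records every change as a commit in a transaction log stored under `_delta_log`. The sketch below is a toy model of that idea using only the Python standard library. It is not the Delta Lake implementation; the helper names `write_commit` and `live_files` are made up for illustration, and real log entries carry many more fields than the bare `path` used here.

```python
import json
import tempfile
from pathlib import Path


def write_commit(log_dir, version, actions):
    """Write one commit as a JSON-lines file named by its zero-padded version."""
    path = Path(log_dir) / f"{version:020d}.json"
    path.write_text("\n".join(json.dumps(a) for a in actions))


def live_files(log_dir):
    """Replay every commit in version order to find the current data files."""
    files = set()
    for commit in sorted(Path(log_dir).glob("*.json")):
        for line in commit.read_text().splitlines():
            action = json.loads(line)
            if "add" in action:
                files.add(action["add"]["path"])
            elif "remove" in action:
                # A remove is logical: the file leaves the current table state.
                files.discard(action["remove"]["path"])
    return files


with tempfile.TemporaryDirectory() as tmp:
    log_dir = Path(tmp) / "_delta_log"
    log_dir.mkdir()
    write_commit(log_dir, 0, [{"add": {"path": "part-0.parquet"}}])
    write_commit(log_dir, 1, [{"add": {"path": "part-1.parquet"}}])
    write_commit(log_dir, 2, [{"remove": {"path": "part-0.parquet"}},
                              {"add": {"path": "part-2.parquet"}}])
    current = live_files(log_dir)

print(sorted(current))  # ['part-1.parquet', 'part-2.parquet']
```

The point of the toy is that the table's state is whatever the log says it is, not whatever files happen to sit in the storage directory.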

Delta Lake brings ACID to object storage
• Atomicity means that every transaction either succeeds completely or fails completely.
• Consistency guarantees relate to how a given state of the data is observed by simultaneous operations.
• Isolation refers to how simultaneous operations conflict with one another. The isolation guarantees that Delta Lake provides differ from those of other systems.
• Durability means that committed changes are permanent.
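
One way to picture atomicity and isolation together is a log where each version number can be claimed by exactly one writer. The following is a minimal sketch of that idea, assuming a local directory stands in for object storage and using exclusive file creation as the "only one writer wins" primitive. The functions `try_commit` and `commit_with_retry` are hypothetical names, and the code is a simplified illustration, not how Delta Lake itself is implemented.

```python
import json
import os
import tempfile


def try_commit(log_dir, version, actions):
    """Attempt to publish `actions` as commit number `version`.

    O_CREAT | O_EXCL makes creation fail if the file already exists,
    so at most one writer can claim a given version.
    Simplification: the content is written after the file is created,
    so a reader could briefly see an empty commit file.
    """
    path = os.path.join(log_dir, f"{version:020d}.json")
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
    except FileExistsError:
        return False  # another writer committed this version first
    with os.fdopen(fd, "w") as f:
        f.write("\n".join(json.dumps(a) for a in actions))
    return True


def commit_with_retry(log_dir, actions, max_attempts=5):
    """Optimistic concurrency: on conflict, retry at the next free version.

    A real system would first check whether the winning commit logically
    conflicts with this one; this toy skips that check.
    """
    for _ in range(max_attempts):
        version = len(os.listdir(log_dir))
        if try_commit(log_dir, version, actions):
            return version
    raise RuntimeError("too many conflicting commits")


with tempfile.TemporaryDirectory() as log_dir:
    first = try_commit(log_dir, 0, [{"add": {"path": "a.parquet"}}])
    second = try_commit(log_dir, 0, [{"add": {"path": "b.parquet"}}])
    retried = commit_with_retry(log_dir, [{"add": {"path": "b.parquet"}}])

print(first, second, retried)  # True False 1
```

The second writer loses the race for version 0 without corrupting anything, then lands its change as version 1.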

Problems solved by ACID
• Hard to append data
• Modification of existing data is difficult
• Jobs failing midway
• Real-time operations are hard
• Costly to keep historical data versions
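
Several of the problems listed above can be illustrated with a small in-memory model: appends and overwrites become new versions, a job that dies before committing leaves readers unaffected, and older versions stay readable. `ToyTable` and its `fail_before_commit` flag are invented for this sketch and are not part of any Delta Lake API.

```python
class ToyTable:
    """In-memory stand-in for a versioned table (not the Delta Lake API)."""

    def __init__(self):
        self.storage = {}  # file name -> rows; plays the role of object storage
        self.commits = []  # commits[v] = file names visible at version v

    def write(self, name, rows, overwrite=False, fail_before_commit=False):
        self.storage[name] = rows  # step 1: write the data file
        if fail_before_commit:
            raise RuntimeError("job crashed before committing")
        previous = [] if overwrite or not self.commits else self.commits[-1]
        self.commits.append(previous + [name])  # step 2: publish a new version

    def read(self, version=None):
        """Read the latest version, or an earlier one if `version` is given."""
        if not self.commits:
            return []
        files = self.commits[-1 if version is None else version]
        return [row for f in files for row in self.storage[f]]


table = ToyTable()
table.write("f0", [1, 2])  # version 0
table.write("f1", [3])     # version 1: append
try:
    table.write("f2", [99], fail_before_commit=True)
except RuntimeError:
    pass  # f2 sits in storage but was never committed
after_crash = table.read()

table.write("f3", [10], overwrite=True)  # version 2: logical overwrite
latest = table.read()
as_of_v1 = table.read(version=1)

print(after_crash, latest, as_of_v1)  # [1, 2, 3] [10] [1, 2, 3]
```

Because an overwrite only changes which files the newest version points to, the earlier version remains readable without copying any data.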

Written by Prem Vishnoi(cloudvala)

Head of Data and ML experienced in designing, implementing, and managing large-scale data infrastructure. Skilled in ETL, data modeling, and cloud computing
