-
Book Overview & Buying
-
Table Of Contents
-
Feedback & Rating

Data Engineering with Apache Spark, Delta Lake, and Lakehouse
By :

Delta Lake also supports the upsert operation using the MERGE
clause. The idea of the upsert is simply to merge data in the existing table with new data originating from the source. It is very common in a lakehouse to ingest incremental data and merge it with the pre-existing table in Delta Lake. The MERGE
process is repeated each time new data gets ingested and is instrumental in creating the single truth of data in the silver layer of the lakehouse.
Important Note
An UPSERT
operation is a mix of updates and inserts in the same operation. Using the MERGE
clause, you can perform an insert in a table if it does not exist or simply update existing data if it does.
MERGE
process, let's find out the current number of rows in the store_orders
table:%sql SELECT count(*) FROM store_orders;
This results in the following output:
Figure 6.30 – Checking the count of rows in sales_orders