Data Engineering with Databricks Cookbook

By: Pulkit Chadha

Overview of this book

Written by a Senior Solutions Architect at Databricks, Data Engineering with Databricks Cookbook will show you how to effectively use Apache Spark, Delta Lake, and Databricks for data engineering, starting with a comprehensive introduction to data ingestion and loading with Apache Spark. What makes this book unique is its recipe-based approach, which will help you put your knowledge to use straight away and tackle common problems. You’ll be introduced to various data manipulation and data transformation solutions, find out how to manage and optimize Delta tables, and get to grips with ingesting and processing streaming data. The book will also show you how to resolve performance problems in Apache Spark apps and Delta Lake. Advanced recipes later in the book will teach you how to use Databricks to implement DataOps and DevOps practices, as well as how to orchestrate and schedule data pipelines using Databricks Workflows. You’ll also go through the full process of setting up and configuring Unity Catalog for data governance. By the end of this book, you’ll be well-versed in building reliable and scalable data pipelines using modern data engineering technologies.
Table of Contents (16 chapters)
Part 1 – Working with Apache Spark and Delta Lake
Part 2 – Data Engineering Capabilities within Databricks

The evolving landscape of data engineering

In recent years, the field of data engineering has undergone a transformative shift, with the demand for efficient, scalable, and collaborative solutions reaching unprecedented heights. This book addresses this shift by delving into the intricacies of three technologies: Apache Spark, a versatile engine for big data processing; Databricks, a collaborative, cloud-based platform; and Delta Lake, an open-source storage layer that enhances the reliability and consistency of your data workflows.

A pragmatic approach to data engineering

This cookbook goes beyond being a compilation of recipes; it’s a pragmatic guide aimed at empowering you to overcome real-world data engineering challenges. The recipes provided are designed to be practical, with step-by-step instructions, code snippets, and detailed explanations to facilitate a hands-on learning experience. Whether you are a seasoned data engineer or just embarking on your data journey, the book offers valuable insights and practical solutions to integrate these cutting-edge technologies seamlessly into your projects.

Key features

Some of the key objectives of this book are as follows:

  • In-depth recipes for the entire data engineering life cycle: Navigate through a comprehensive set of recipes covering data extraction, transformation, loading, and effective management within a Lakehouse architecture
  • Practical learning: Embrace a hands-on approach with detailed instructions, code examples, and explanations to ensure you gain practical expertise in applying these technologies to real-world scenarios
  • Best practices and optimization: Benefit from industry best practices and expert tips to optimize your data engineering workflows, building scalable, efficient, and easy-to-maintain solutions
  • Real-world challenges and solutions: Explore recipes addressing common challenges faced by data engineers in actual projects, providing practical insights for implementation
  • Collaboration and seamless integration: Leverage the collaborative capabilities of Databricks and learn how to seamlessly integrate these technologies into your existing data infrastructure, fostering a more efficient and collaborative environment

Embark on a journey to master the art of data engineering with Apache Spark, Databricks, and Delta Lake. This cookbook is not just a guide; it’s your companion in navigating the complexities of modern data engineering. Happy cooking!
