Data Engineering with Databricks Cookbook

To follow along with the examples in this chapter, you will need to have the following:
If on Mac, make sure to choose the binary associated with your chip (Intel Chip or Apple Chip).
git clone https://github.com/PacktPublishing/Data-Engineering-with-Databricks-Cookbook.git
$ sh build.sh
Note
This may take several minutes the first time since it has to download and install Spark and all other supporting libraries on the base images.
Run docker-compose from the root folder of the cloned repository:
$ docker-compose up
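Once the containers are running, one way to confirm that the services are reachable is to probe the ports listed in the service descriptions of this section (9092, 8888, 4040, 8080, 7077, and 8081). The following is a minimal sketch, not part of the book's repository: the helper name port_status is hypothetical, and it assumes that the compose file publishes these ports on localhost and that nc (netcat) is installed with support for the -z and -w options.

```shell
#!/bin/sh
# Ports taken from the service descriptions in this section.
PORTS="9092 8888 4040 8080 7077 8081"

# port_status HOST PORT: print "open" if a TCP connection succeeds,
# otherwise "closed". Hypothetical helper; assumes `nc` supports
# -z (connect without sending data) and -w (timeout in seconds).
# A missing `nc` binary is also reported as "closed".
port_status() {
  if nc -z -w 2 "$1" "$2" >/dev/null 2>&1; then
    echo "open"
  else
    echo "closed"
  fi
}

# Report one line per port.
for port in $PORTS; do
  echo "localhost:$port $(port_status localhost "$port")"
done
```

Run it from a second terminal while docker-compose up is still in the foreground. A port reported as closed may simply mean that the corresponding container has not finished starting.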
This docker-compose file creates a multi-container application that consists of the following services:
- A service on port 9092. It allows plaintext listeners and has some custom configuration options.
- A service on ports 8888 and 4040 that shares a local volume with the other services. It has a custom image, which includes Spark 3.4.1.
- A service on ports 8080 and 7077 that shares a local volume with the other services. It has a custom image, which includes Spark 3.4.1.
- Services on port 8081. They have custom images, which include Spark 3.4.1 and some environment variables to specify the worker cores and memory.
To run this docker-compose file, you need to have the following minimum system requirements:
The system requirements in order to follow the recipes are as follows:
| Software/Hardware covered in the book | OS requirements |
| --- | --- |
| Docker Engine version 18.02.0+ | Windows, Mac OS X, and Linux (any) |
| Docker Compose version 1.25.5+ | |
| Docker Desktop | |
| Git | |
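The minimum versions above can be checked from a terminal before building the images. The following is a minimal sketch rather than anything shipped with the book: the helper names version_ge and check_min are hypothetical, and it assumes a sort command that supports the -V (version sort) option, as GNU coreutils does.

```shell
#!/bin/sh
# Minimum versions taken from the requirements table.
MIN_DOCKER="18.02.0"
MIN_COMPOSE="1.25.5"

# version_ge A B: succeeds when version A is greater than or equal to B.
# Hypothetical helper; relies on `sort -V` ordering the two strings.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_min NAME INSTALLED MINIMUM: print a one-line verdict.
# A leading "v" in the installed version is stripped before comparing.
check_min() {
  name="$1"; installed="${2#v}"; minimum="$3"
  if [ -z "$installed" ]; then
    echo "$name: not found or not running"
  elif version_ge "$installed" "$minimum"; then
    echo "$name: $installed (ok, needs $minimum+)"
  else
    echo "$name: $installed (too old, needs $minimum+)"
  fi
}

# Query the installed tools; errors are silenced so a missing tool
# shows up as "not found or not running" instead of aborting.
check_min "Docker Engine" "$(docker version --format '{{.Server.Version}}' 2>/dev/null)" "$MIN_DOCKER"
check_min "Docker Compose" "$(docker-compose version --short 2>/dev/null)" "$MIN_COMPOSE"
```

The Docker Engine query asks the daemon for its version, so Docker must be running for that line to report a number.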
If you are using the digital version of this book, we advise you to type the code yourself or access the code via the GitHub repository (link available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.