The Definitive Guide to Data Integration

By: BONNEFOY, CHAIZE, Raphaël MANSUY, Mehdi TAZI

Overview of this book

The Definitive Guide to Data Integration is an indispensable resource for navigating the complexities of modern data integration. Focusing on the latest tools, techniques, and best practices, this guide helps you master data integration and unleash the full potential of your data. It begins by examining the challenges and key concepts of data integration, such as managing huge volumes of data and dealing with different data types. You’ll gain a deep understanding of the modern data stack and its architecture, as well as the pivotal role of open-source technologies in shaping the data landscape. Delving into the layers of the modern data stack, you’ll cover data sources, types, storage, integration techniques, transformation, and processing. The book also offers insights into data exposition and APIs, ingestion and storage strategies, data preparation and analysis, workflow management, monitoring, data quality, and governance. Packed with practical use cases, real-world examples, and a glimpse into the future of data integration, The Definitive Guide to Data Integration is an essential resource for data eclectics. By the end of this book, you’ll have gained the knowledge and skills needed to optimize your data usage and excel in the ever-evolving world of data.
Table of Contents (19 chapters)

To get the most out of this book

Before beginning, it’s important to know that this book assumes you have a foundational understanding of data sources and types, including relational databases, NoSQL, flat files, and APIs. You should be familiar with basic data formats such as CSV, JSON, and XML. The book builds on these basics to explore data integration models, architectures, and patterns, with practical applications across various industries. Having prior experience with SQL and understanding its role in data transformation will be beneficial. Additionally, knowledge of data storage technologies and architectures will help you make the most of the content.

Software/hardware covered in the book | Operating system requirements
SQL and data transformation | Windows, macOS, or Linux
Massively parallel processing systems | Windows, macOS, or Linux
Spark for data transformation | Windows, macOS, or Linux
Data storage technologies (data warehouses, data lakes, and object storage) | Windows, macOS, or Linux
Data modeling techniques | Windows, macOS, or Linux
Data integration models (ETL and ELT) | Windows, macOS, or Linux
Data exposition technologies (Streams, REST APIs, and GraphQL) | Windows, macOS, or Linux

If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book’s GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.

The following are some additional installation instructions and information:

  • You should have a stable internet connection to access the online resources and repositories mentioned in the book.
  • Familiarize yourself with basic command-line operations as they are commonly used in setting up and managing data environments.
  • You may need to install a database system that supports SQL, such as MySQL or PostgreSQL, to follow the practical examples.
  • For massively parallel processing systems and Spark, ensure that Java is installed on your system as it is required for running Spark-based applications.
  • It’s recommended to have a code editor or an Integrated Development Environment (IDE) that supports database management and big data processing, such as PyCharm, Jupyter, or Visual Studio Code, to facilitate code writing and testing.
  • The versions of software and examples provided are current as of the book’s publication. You should always check for the latest versions to ensure compatibility and access to the latest features.
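
As a quick way to check the prerequisites listed above, the following minimal sketch looks for a few command-line tools on your PATH. It is not taken from the book: the tool names in TOOLS (java for Spark, psql and mysql as database clients) and the helper find_tools are assumptions chosen for illustration, so adjust them to the software you actually install.

```python
import shutil

# Hypothetical list of executables to look for; these names are assumptions
# for illustration and are not prescribed by the book.
TOOLS = ["java", "psql", "mysql"]


def find_tools(names):
    """Map each executable name to its full path on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools(TOOLS).items():
        status = path if path else "not found on PATH"
        print(f"{name}: {status}")
```

A tool reported as not found may still be installed but missing from your PATH, so treat the output as a hint rather than a definitive check.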
