The Definitive Guide to Data Integration

By: BONNEFOY, CHAIZE, Raphaël MANSUY, Mehdi TAZI

Overview of this book

The Definitive Guide to Data Integration is an indispensable resource for navigating the complexities of modern data integration. Focusing on the latest tools, techniques, and best practices, this guide helps you master data integration and unleash the full potential of your data. It begins by examining the challenges and key concepts of data integration, such as managing huge volumes of data and dealing with different data types. You’ll gain a deep understanding of the modern data stack and its architecture, as well as the pivotal role of open-source technologies in shaping the data landscape. Delving into the layers of the modern data stack, you’ll cover data sources, types, storage, integration techniques, transformation, and processing. The book also offers insights into data exposition and APIs, ingestion and storage strategies, data preparation and analysis, workflow management, monitoring, data quality, and governance. Packed with practical use cases, real-world examples, and a glimpse into the future of data integration, The Definitive Guide to Data Integration is an essential resource for data eclectics. By the end of this book, you’ll have gained the knowledge and skills needed to optimize your data usage and excel in the ever-evolving world of data.
Table of Contents (19 chapters)

What this book covers

Chapter 1, Introduction to Our Data Integration Journey, explores data integration’s evolution and significance, discussing the proliferation of data sources and the evolving landscape. It tackles the complexities and opportunities in modern data integration and outlines the book’s purpose and vision.

Chapter 2, Introducing Data Integration, covers the definition of data integration, the modern data stack, and strategies in data integration. It details the role of data in businesses and examines the techniques, tools, and technologies used in data integration processes.

Chapter 3, Architecture and History of Data Integration, traces the history of data integration, the impact of open-source technologies, and various architectures. It discusses the future of data integration, highlighting trends such as real-time and AI-driven integrations.

Chapter 4, Data Sources and Types, discusses the variety of data sources including relational and NoSQL databases, flat files, and APIs. It also explores different data types and formats, emphasizing their importance and challenges in data integration processes.

Chapter 5, Columnar Data Formats and Comparisons, focuses on columnar data formats, contrasting them with traditional row-based methods and emphasizing their advantages in analytics. It explores the challenges of working with different data formats and the necessity of data format conversion.

Chapter 6, Data Storage Technologies and Architectures, delves into data storage technologies such as data warehouses, lakes, and object storage, discussing their strengths and weaknesses. It also covers various data architectures and their impact on data integration, including physical and logical layers, data modeling, and partitioning.

Chapter 7, Data Ingestion and Storage Strategies, covers the goals and strategies of data ingestion, outlining efficient, scalable, and adaptable methods for diverse data sources. It also discusses data storage and modeling techniques, along with ways to optimize storage performance and define tailored strategies.

Chapter 8, Data Integration Techniques, explores different data integration models and architectures, covering point-to-point integration, middleware, batch, micro-batching, and real-time approaches. It also discusses common data integration patterns such as ETL and ELT and organizational models for data management.

Chapter 9, Data Transformation and Processing, introduces various data transformation techniques including filters, aggregations, and joins. It delves into SQL’s role in data transformation and massively parallel processing systems, discussing their applications and challenges in data processing.

Chapter 10, Transformation Patterns, Cleansing, and Normalization, explores transformation patterns such as lambda and kappa architectures, their pros and cons, and their applications in data pipelines. It delves into data cleansing and normalization, which are crucial for good data quality and consistency in integration.

Chapter 11, Data Exposition and APIs, covers strategic motives for data exposure in analytics, seamless data exchange, and the role of various data exposition technologies. It focuses on APIs and strategies for data exposure, and compares different data exposure solutions.

Chapter 12, Data Preparation and Analysis, discusses the importance of data preparation, strategies for selecting data transformations, and key concepts in reporting and self-analysis, all of which are crucial for effective decision-making and business insights.

Chapter 13, Workflow Management, Monitoring, and Data Quality, examines workflow and event management, monitoring in data stacks, the significance of data quality and observability, and data governance and compliance in managing data assets.

Chapter 14, Lineage, Governance, and Compliance, explores the significance of data lineage in decision-making and compliance, techniques for visualizing data journeys, and the importance of adhering to regulations with robust governance frameworks.

Chapter 15, Various Architecture Use Cases, discusses data integration in scenarios such as real-time, cloud-based, geospatial, and IoT data analysis, covering the specific challenges, tools, and techniques for each use case.

Chapter 16, Prospects and Challenges, focuses on the future of data integration within the modern data stack, highlighting emerging trends, challenges, and opportunities, and provides guidance for further learning in data integration.
