Fundamentals of Analytics Engineering

By: Dumky De Wilde, Kassapian, Gligorevic, Juan Manuel Perafan, Lasse Benninga, Ricardo Angel Granados Lopez, Taís Laurindo Pereira
4.7 (3)

Overview of this book

Written by a team of seven industry experts, Fundamentals of Analytics Engineering introduces everything from foundational concepts to advanced skills you need to get started as an analytics engineer. After covering data ingestion and techniques for data quality and scalability, you’ll learn about data cleaning and transformation, data modeling, SQL query optimization and reuse, and serving data across different platforms. With this knowledge, you will implement a simple data platform from ingestion to visualization, using tools like Airbyte Cloud, Google BigQuery, dbt, and Tableau. You’ll also get to grips with strategies for data integrity, with a focus on data quality and observability, along with collaborative coding practices like version control with Git. You’ll learn about advanced principles like CI/CD and automating workflows, about gathering, scoping, and documenting business requirements, and about data governance. By the end of this book, you’ll be armed with the essential techniques and best practices for developing scalable analytics solutions from end to end.
Table of Contents (23 chapters)
1. Prologue
2. Part 1: Introduction to Analytics Engineering
5. Part 2: Building Data Pipelines
11. Part 3: Hands-On Guide to Building a Data Platform
13. Part 4: DataOps
17. Part 5: Data Strategy
21. Index

Summary

In this chapter, we have looked at both the problems of data quality and the potential solutions to improve it.

We saw that data quality issues come from three key areas. The first is the source system, where data might be incomplete, unreliable, or inconsistent. The second is the infrastructure and pipelines that process and transform data as it is ingested, where data can arrive too late, be corrupted, or go missing. Mistakes can also creep into the transformations themselves, for example, when we get the granularity or precision of the data wrong. Finally, data quality issues can arise from problems with data governance, most notably from inconsistent definitions and documentation or a lack of access management, cost management, or metadata management. All of this leads to a misunderstanding of the data and its lineage and dependencies, which, in turn, leads to suboptimal decision-making and increased...
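
To make a few of the problems named in this summary concrete, here is a minimal sketch in plain Python of three checks: completeness (missing values), timeliness (data that arrives too late), and granularity (more than one row per key). It is an illustration only, not the book's method or any particular tool's API. The function name `check_quality` and the fields `order_id`, `amount`, and `loaded_at` are hypothetical and chosen purely for the example.

```python
from datetime import datetime, timedelta, timezone


def check_quality(rows, key, required, loaded_at_field, max_age, now):
    """Return a dict mapping an issue name to the indexes of offending rows.

    All names here are hypothetical; this is a sketch, not a library API.
    """
    issues = {"incomplete": [], "stale": [], "duplicate_key": []}
    seen_keys = set()
    for index, row in enumerate(rows):
        # Completeness: every required field must be present and non-null.
        if any(row.get(field) is None for field in required):
            issues["incomplete"].append(index)

        # Timeliness: flag rows loaded longer ago than the allowed maximum age.
        loaded_at = row.get(loaded_at_field)
        if loaded_at is not None and now - loaded_at > max_age:
            issues["stale"].append(index)

        # Granularity: expect exactly one row per key, so flag repeats.
        row_key = row.get(key)
        if row_key in seen_keys:
            issues["duplicate_key"].append(index)
        seen_keys.add(row_key)
    return issues


# Hypothetical sample data with one issue of each kind.
now = datetime(2024, 1, 10, tzinfo=timezone.utc)
rows = [
    {"order_id": 1, "amount": 10.0, "loaded_at": now - timedelta(hours=1)},
    {"order_id": 2, "amount": None, "loaded_at": now - timedelta(hours=1)},
    {"order_id": 2, "amount": 5.0, "loaded_at": now - timedelta(days=3)},
]
issues = check_quality(
    rows,
    key="order_id",
    required=["order_id", "amount"],
    loaded_at_field="loaded_at",
    max_age=timedelta(days=1),
    now=now,
)
print(issues)
```

With this sample, the second row is flagged as incomplete, and the third row is flagged both as stale and as a duplicate of the key `order_id = 2`. In practice, checks like these are usually expressed as tests inside the transformation tooling rather than as hand-written loops.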
