Driving Data Quality with Data Contracts

By: Andrew Jones

Overview of this book

Despite the passage of time and the evolution of technology and architecture, the challenges we face in building data platforms persist. Our data often remains unreliable and untrusted, and fails to deliver the promised value. With Driving Data Quality with Data Contracts, you’ll discover the potential of data contracts to transform how you build your data platforms, finally overcoming these enduring problems. You’ll learn how establishing contracts as the interface allows you to explicitly assign responsibility and accountability for the data to those who know it best, the data generators, and give them the autonomy to generate and manage data as required. The book will show you how data contracts ensure that consumers get quality data with clearly defined expectations, enabling them to build on that data with confidence to deliver valuable analytics, performant ML models, and trusted data-driven products. By the end of this book, you’ll have gained a comprehensive understanding of how data contracts can revolutionize your organization’s data culture and provide a competitive advantage by unlocking the real value within your data.
Table of Contents (16 chapters)
Part 1: Why Data Contracts?
Part 2: Driving Data Culture Change with Data Contracts
Part 3: Designing and Implementing a Data Architecture Based on Data Contracts

Summary

In this chapter, we walked through a sample implementation of a data-contract-driven architecture and used it to show the concepts from the rest of the book in action. We started by defining a contract in a custom YAML-based interface, and then used that contract to drive several applications and services.
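
To make the idea of a single contract driving several services more concrete, here is a minimal sketch in Python. The contract layout (`name`, `owner`, `version`, `fields`), the type names, and the sample values are all assumptions made for illustration, not the format used in the chapter. The dictionary stands in for the result of parsing the YAML file, for example with PyYAML's `yaml.safe_load`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str
    type: str              # hypothetical type names: "string", "integer", ...
    required: bool = True
    description: str = ""


@dataclass(frozen=True)
class DataContract:
    name: str
    owner: str
    version: int
    fields: tuple


def parse_contract(raw: dict) -> DataContract:
    """Build a DataContract from an already-parsed YAML document."""
    fields = tuple(Field(**f) for f in raw["fields"])
    names = [f.name for f in fields]
    # A contract with two fields of the same name is ambiguous, so reject it.
    if len(names) != len(set(names)):
        raise ValueError("duplicate field names in contract")
    return DataContract(raw["name"], raw["owner"], raw["version"], fields)


# Stand-in for a parsed YAML contract; every value here is invented.
RAW_CONTRACT = {
    "name": "customer",
    "owner": "customer-team",
    "version": 1,
    "fields": [
        {"name": "id", "type": "string", "description": "Unique customer ID"},
        {"name": "email", "type": "string", "required": False},
        {"name": "created_at", "type": "timestamp"},
    ],
}

contract = parse_contract(RAW_CONTRACT)
```

Every downstream tool can then read the same `contract` object, which is what keeps the table, the libraries, and the registry consistent with one another.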

The first of those was a BigQuery table, which acts as the interface between the data generators and the consumers. We introduced Pulumi, an infrastructure as code (IaC) tool, and showed how it can create and manage resources driven by the data contract.
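
As a rough illustration of how a contract could drive the table definition, the sketch below turns a list of contract fields into a BigQuery schema document in JSON. The field layout and the `BIGQUERY_TYPES` mapping are assumptions for this example, not the chapter's actual code.

```python
import json

# Hypothetical mapping from contract type names to BigQuery column types.
BIGQUERY_TYPES = {
    "string": "STRING",
    "integer": "INTEGER",
    "number": "FLOAT",
    "boolean": "BOOLEAN",
    "timestamp": "TIMESTAMP",
}

# Invented sample fields, standing in for the parsed contract.
FIELDS = [
    {"name": "id", "type": "string", "description": "Unique customer ID"},
    {"name": "age", "type": "integer", "required": False},
    {"name": "created_at", "type": "timestamp"},
]


def to_bigquery_schema(fields: list) -> str:
    """Render contract fields as a BigQuery JSON schema string."""
    columns = []
    for f in fields:
        columns.append({
            "name": f["name"],
            "type": BIGQUERY_TYPES[f["type"]],
            # Fields are treated as required unless the contract says otherwise.
            "mode": "REQUIRED" if f.get("required", True) else "NULLABLE",
            "description": f.get("description", ""),
        })
    return json.dumps(columns, indent=2)


schema_json = to_bigquery_schema(FIELDS)
```

In a Pulumi program written in Python, a string like `schema_json` could be passed as the `schema` argument of `pulumi_gcp.bigquery.Table`, alongside a `dataset_id` and `table_id`. That wiring is left out here so the sketch runs without cloud credentials.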

We then showed how, by converting our data contract to JSON Schema, an open standard, we can easily produce libraries to help the data generators publish data that matches the schema and passes the data quality checks we defined.
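
The conversion step can be sketched as follows. This is a simplified illustration under assumed names: the contract layout, the `JSON_SCHEMA_TYPES` mapping, and the `checks` key (holding standard JSON Schema keywords such as `minimum` or `pattern`) are hypothetical, not the book's implementation.

```python
# Hypothetical mapping from contract type names to JSON Schema fragments.
JSON_SCHEMA_TYPES = {
    "string": {"type": "string"},
    "integer": {"type": "integer"},
    "number": {"type": "number"},
    "boolean": {"type": "boolean"},
    "timestamp": {"type": "string", "format": "date-time"},
}

# Invented sample contract, standing in for the parsed YAML file.
CONTRACT = {
    "name": "customer",
    "fields": [
        {"name": "id", "type": "string", "checks": {"minLength": 1}},
        {"name": "age", "type": "integer", "required": False,
         "checks": {"minimum": 0}},
        {"name": "created_at", "type": "timestamp"},
    ],
}


def to_json_schema(contract: dict) -> dict:
    """Convert a contract into a JSON Schema document (draft 2020-12)."""
    properties = {}
    required = []
    for f in contract["fields"]:
        prop = dict(JSON_SCHEMA_TYPES[f["type"]])  # copy, so the mapping stays intact
        if "description" in f:
            prop["description"] = f["description"]
        # Data quality checks are carried over as JSON Schema keywords.
        prop.update(f.get("checks", {}))
        properties[f["name"]] = prop
        if f.get("required", True):
            required.append(f["name"])
    return {
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "title": contract["name"],
        "type": "object",
        "properties": properties,
        "required": required,
        "additionalProperties": False,
    }


json_schema = to_json_schema(CONTRACT)
```

Because JSON Schema is an open standard, a document like `json_schema` can be handed to an off-the-shelf validator in most languages, which is what makes generating client libraries for data generators practical.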

That same JSON Schema was then used to populate a schema registry. We showed how that allows the schemas to be easily accessible through its rich API and also looked...
