Sign In Start Free Trial
Account

Add to playlist

Create a Playlist

Modal Close icon
You need to login to use this feature.
  • Book Overview & Buying Database Design and Modeling with Google Cloud
  • Table Of Contents Toc
  • Feedback & Rating feedback
Database Design and Modeling with Google Cloud

Database Design and Modeling with Google Cloud

By : Sukumaran
4.9 (7)
close
close
Database Design and Modeling with Google Cloud

Database Design and Modeling with Google Cloud

4.9 (7)
By: Sukumaran

Overview of this book

In the age of lightning-speed delivery, customers want everything developed, built, and delivered at high speed and at scale. Knowledge, design, and choice of database is critical in that journey, but there is no one-size-fits-all solution. This book serves as a comprehensive and practical guide for data professionals who want to design and model their databases efficiently. The book begins by taking you through business, technical, and design considerations for databases. Next, it takes you on an immersive structured database deep dive for both transactional and analytical real-world use cases using Cloud SQL, Spanner, and BigQuery. As you progress, you’ll explore semi-structured and unstructured database considerations with practical applications using Firestore, cloud storage, and more. You’ll also find insights into operational considerations for databases and the database design journey for taking your data to AI with Vertex AI APIs and generative AI examples. By the end of this book, you will be well-versed in designing and modeling data and databases for your applications using Google Cloud.
Table of Contents (18 chapters)
close
close
1
Part 1:Database Model: Business and Technical Design Considerations
4
Part 2:Structured Data
8
Part 3:Semi-Structured, Unstructured Data, and NoSQL Design
11
Part 4:DevOps and Databases
13
Part 5:Data to AI

Business aspect

Business requirements are the starting point for your application and also for choosing your database system. There are four stages in the life cycle of data in its business application that help determine the choice of database system:

  • Data ingestion
  • Storage
  • Process
  • Visualize

The following diagram represents the attributes in the four stages of data and the categories of questions in each stage in the life cycle of your data:

Figure 1.1 – Representation of the four stages of data and the categories of questions in each stage

Figure 1.1 – Representation of the four stages of data and the categories of questions in each stage

Let’s look at some of these attributes in detail. Some of them are in the business attributes category, while others are technical.

Ingestion

This is the first stage in the data life cycle and it is all about acquiring (bringing in) data from different sources in one place into your system. In this stage, the questions that arise are bucketed into three categories:

  • What type of data are you bringing in?
  • What is the purpose of this data?
  • What is the structure of your data?

Let’s take a look at each in detail.

Types of data

There are broadly three types of data we will be dealing with that highly influence the choice of database and storage.

Application data

This is the kind of data that is generated or downloaded as part of the application’s content and can contain transactional data that is generated by users and applications – for example, online retail applications, log data from applications, event data, and clickstream data. Let’s take a look at a specific example – consider a banking application in which user A transfers money from their account to user B’s account. In this case, the user data, such as the account ID, name, bank details, the recipient’s name, and transaction date, constitute the application data.

Live stream and real-time stream data

This data comes from real-time sources such as streaming data, which comes in continuously from data sources such as sensor data. These can also be event data responses and can be very frequent compared to batch data processing. It refers to data that is immediately available and not delayed by a system or process. The term real-time stream refers to streams of real-time data that are gathered and stored or processed as they come in. This includes monitoring data such as CPU utilization, memory consumption, Internet of Things (IoT) devices data such as humidity and pressure, and automated real-time environmental temperature monitoring data.

Batch data

This is data that comes in as bulk at scheduled intervals and could be event-triggered. For example, batch data is transactional data that comes in from applications after a transaction and is stored for use in later stages of the data life cycle. This can include data extracted from one application for use in another at a later point, data migration use cases, and file uploads for processing later. Such applications may not be designed for real-time operations on the data.

The purpose of data

The specific use case and the nature of implementing applications using the data being ingested is a critical factor in determining the choice and design of the database. There may be cases where the type and ingestion mode of data fall into a different choice of database design, whereas its functional use case would imply a different purpose. For example, you could have data streamed in from live events or housekeeping data coming in real-time from transactions, but the specific use case you are designing for might only involve visualization, analytical, or ML functionalities. So, make sure you understand what purpose you are solving with the data that is being ingested in a specific mode and type.

The structure of data

The structure of the data is a crucial factor in deciding the choice and design of a database. There are three widely recognized categories:

  • Structured
  • Semi-structured
  • Unstructured

Let’s briefly explore these three categories.

Structured data

This type of data is typically composed of rows and columns; rows are entities or records and columns are attributes. Structured data is organized in such a way that you can be sure that the data structure will be consistent for the most part throughout the life cycle of that data, except for the possible addition or removal of some attributes altogether. This kind of data is mostly transactional or analytical.

Semi-structured data

Semi-structured data does not follow a fixed tabular format – that is, a column-row structure. Instead, it stores schema attributes along with data. The attributes for semi-structured data could vary for each record. The major differentiating factor for each kind of semi-structured data is the way they are accessed.

Unstructured data

Unstructured data includes images, audio files, and so on. Unstructured data does not have a definite schema or data model. The amount of unstructured data is much larger than that of structured data. So, the methods by which we store such data are more important than ever. Here are some examples of unstructured data:

  • Text
  • Audio
  • Video
  • Images
  • Other binary large objects (BLOBs)

Now that we have had a sneak peek into the structure of data, be sure to include functional and design questions based on these categories while designing your database and application model.

bookmark search playlist download font-size

Change the font size

margin-width

Change margin width

day-mode

Change background colour

Close icon Search
Country selected

Close icon Your notes and bookmarks

Delete Bookmark

Modal Close icon
Are you sure you want to delete it?
Cancel
Yes, Delete

Confirmation

Modal Close icon
claim successful

Buy this book with your credits?

Modal Close icon
Are you sure you want to buy this book with one of your credits?
Close
YES, BUY