PostgreSQL 12 High Availability Cookbook

By: Shaun Thomas

Overview of this book

Databases are nothing without the data they store. In the event of an outage or technical catastrophe, immediate recovery is essential. This updated edition teaches the important concepts of node architecture design, as well as techniques such as using repmgr for failover automation. From cluster layout and hardware selection to software stacks and horizontal scalability, this PostgreSQL cookbook will help you build a PostgreSQL cluster that will survive crashes, resist data corruption, and grow smoothly with customer demand. You'll start by understanding how to plan a PostgreSQL database architecture that is resistant to outages and scalable, as it is the scaffolding on which everything rests. With the bedrock established, you'll cover the topics that PostgreSQL database administrators need to know to manage a highly available cluster. This includes configuration, troubleshooting, monitoring and alerting, backups through proxies, failover automation, and other considerations that are essential for a healthy PostgreSQL cluster. You'll then learn to use multi-master replication to maximize server availability, and the final chapters will guide you through managing major version upgrades without downtime. By the end of this book, you'll have learned how to build an efficient and adaptive PostgreSQL 12 database cluster.

Setting expectations with RPO

RPO, or Recovery Point Objective, is a common term in business continuity. In the context of a database system, it describes the amount of data that may be lost following an unexpected outage by the time the system is operational again. It's important to understand this at an early stage because it will drive decisions such as node count, data synchronization methods, and backup technologies.
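To make the definition concrete, here is a minimal sketch that treats RPO as a time window: the gap between the newest change that survived recovery and the moment the outage began. The function names and the incident timestamps are hypothetical, chosen only for illustration; how you obtain the "last recovered" timestamp depends on your own tooling.

```python
from datetime import datetime, timedelta


def data_loss_window(last_recovered: datetime, outage_start: datetime) -> timedelta:
    """Span of lost writes: from the newest recoverable change to the outage."""
    # Clamp at zero in case the recovered change is stamped after the outage.
    return max(outage_start - last_recovered, timedelta(0))


def meets_rpo(last_recovered: datetime, outage_start: datetime, rpo: timedelta) -> bool:
    """True when the lost window fits inside the agreed RPO."""
    return data_loss_window(last_recovered, outage_start) <= rpo


# Hypothetical incident: the newest change present after recovery was written
# at 14:02:50, and the outage began at 14:03:00.
outage_start = datetime(2020, 1, 15, 14, 3, 0)
last_recovered = datetime(2020, 1, 15, 14, 2, 50)

loss = data_loss_window(last_recovered, outage_start)
print(loss)                                                            # 0:00:10
print(meets_rpo(last_recovered, outage_start, timedelta(seconds=30)))  # True
print(meets_rpo(last_recovered, outage_start, timedelta(seconds=5)))   # False
```

Expressing RPO as a duration rather than a row count keeps the conversation with decision makers in terms they already use: seconds, minutes, or hours.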

In this recipe, we will examine the ingredients for concocting a comprehensive RPO that will influence the PostgreSQL cluster composition itself.

Getting ready

The first thing we need to do is set expectations. These are most often defined by upper management or some other decision-making entity. Data loss is never desirable but is unavoidable in catastrophic scenarios. How much data loss can the business tolerate under these circumstances? Seconds, minutes, or hours' worth?

This recipe will mainly focus on information gathering from key individuals, so make sure it's possible to at least email anyone involved with the application stack. Hardware purchases depend on budget proposals, so it may even be necessary to interact with VP and C-level executives as well. Even if we don't do this right away, try to determine the extent of influence available to you.

How to do it...

Since we're dealing with many vectors, we should iterate over them if possible. Try to follow a process like this:

  1. Seek the input of major decision makers:
       • VP and C-level executives involved with technology
       • Product manager
       • Application designers and architects
       • Infrastructure team lead
  2. Find an amount of tolerable data loss, measured in time, that will satisfy most or all of the above.
  3. Follow the rest of the advice in this chapter to find a suitable architecture.
  4. Try to determine a rough cost for this and the closest alternative.
  5. Present one or more designs and cost estimates to decision makers.
  6. Document the final RPO decision and architecture as reference material.
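The step of finding an amount of time that satisfies most or all stakeholders can be sketched as a small tally. This is only an illustration: the stakeholder answers below are hypothetical placeholders, and `satisfied_by` is a helper name invented for this example.

```python
from datetime import timedelta

# Hypothetical answers: the largest window of lost data each stakeholder
# says the business could tolerate.
tolerances = {
    "VP of technology": timedelta(0),
    "Product manager": timedelta(seconds=30),
    "Application architect": timedelta(minutes=5),
    "Infrastructure team lead": timedelta(minutes=1),
}


def satisfied_by(rpo, tolerances):
    """Names of stakeholders whose stated tolerance is at least `rpo`."""
    return sorted(name for name, limit in tolerances.items() if rpo <= limit)


# The strictest answer is the only RPO guaranteed to satisfy everyone.
strictest = min(tolerances.values())

# Show how many people each candidate RPO would satisfy.
for rpo in sorted(set(tolerances.values())):
    happy = satisfied_by(rpo, tolerances)
    print(f"RPO {rpo}: satisfies {len(happy)}/{len(tolerances)}: {', '.join(happy)}")
```

A table like this makes the trade-off visible early: relaxing the target from the strictest answer shows exactly whose expectations would need to be renegotiated.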

How it works...

Decision makers such as the technology VP, CEO, and CTO have the final word in most cases. Their input is vital and should be considered a requirement before taking a single step further. Keep in mind that these people are likely not familiar with the technical feasibility of their demands at this level of implementation detail. When asked a question such as "How much data can we lose in a major outage?" they're probably going to say "None!" Regardless, this is a vital first step for reasons that will shortly become apparent.

Then, we simply traverse the stack of people who helped define the features the application stack fulfills, those who designed and implemented it, and whoever may be in charge of the requisite hardware and network where everything runs. Perhaps the design has a built-in tolerance for certain amounts of loss. Perhaps inherent queues or caches act as a sort of buffer for data backend difficulties. Maybe the design assumes there are multiple data systems all ingesting the same stream for redundancy. The architecture and those who built it are the best sources of this information.

Once we know the maximum amount of data the backend can lose before being restored, we must apply what we learn from the rest of this chapter and choose one or two best-case designs that can deliver that promise. The point here is that we will be executing this recipe several times until everyone agrees to all inherent design costs and limitations before continuing.

The best way to estimate cost is to take the chosen database server architectures and tally a gross cost for each element. The next chapter, on hardware planning, describes in detail how to do this. We don't have to be exact here; the goal is to have some numbers we can present to decision makers. Do they still want zero RPO if it costs 10x as much as a design that risks ten seconds of data loss? Are they willing to compromise on a hybrid design?
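As a rough sketch of that tally, the snippet below sums a gross cost per element for two candidate designs and reports each total relative to the cheapest. Every design name, element, and price here is a hypothetical placeholder, not a figure from this book; substitute your own estimates.

```python
# Hypothetical designs and per-element costs, for illustration only.
designs = {
    "ten-second RPO": {
        "database servers": 2 * 8000,
        "network": 2000,
        "backup storage": 2000,
    },
    "zero RPO": {
        "database servers": 4 * 8000,
        "second site link": 12000,
        "backup storage": 6000,
    },
}


def gross_cost(elements):
    """Sum the rough cost of every element in one design."""
    return sum(elements.values())


totals = {name: gross_cost(elements) for name, elements in designs.items()}
cheapest = min(totals.values())

# List designs from cheapest to most expensive with a cost multiplier.
for name, total in sorted(totals.items(), key=lambda item: item[1]):
    print(f"{name}: {total:,} ({total / cheapest:.1f}x the cheapest)")
```

The multiplier is usually the number that matters in the meeting, since it frames the stricter RPO as a concrete premium over the alternative.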

Once we have chosen a final structure, possibly the most important step is to produce a document describing that architecture, why it was chosen, the known limitations, and the RPO it delivers. Present this document to decision makers and encourage them to sign it if possible. Save it in any corporate documentation management system available, and make sure it's one of the first things people see regarding the database cluster layer. This document will single-handedly answer multiple questions about the capabilities of the database cluster, all while acting as a reference specification.
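One way to keep that document consistent is to capture its required parts in a small structure before writing it up. The sketch below is an assumption about how such a record might look, not a format prescribed by this recipe; the `RpoDecision` class, its field names, and all example values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RpoDecision:
    """The items the recipe says the final document should describe."""
    architecture: str
    rationale: str
    rpo: str
    limitations: List[str] = field(default_factory=list)
    signed_by: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Produce a plain-text summary suitable for a documentation system."""
        lines = [
            "RPO decision record",
            f"Architecture: {self.architecture}",
            f"Why it was chosen: {self.rationale}",
            f"RPO delivered: {self.rpo}",
            "Known limitations:",
        ]
        lines += [f"  - {item}" for item in self.limitations]
        lines.append("Signed by: " + (", ".join(self.signed_by) or "(unsigned)"))
        return "\n".join(lines)


# Placeholder values only; replace with the agreed design and sign-offs.
record = RpoDecision(
    architecture="(name of the chosen cluster design)",
    rationale="(summary of cost and risk trade-offs that were accepted)",
    rpo="ten seconds",
    limitations=["(first known limitation)", "(second known limitation)"],
)
print(record.render())
```

Leaving `signed_by` empty renders the record as unsigned, which is a useful reminder that the decision has not yet been formally accepted.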

There's more...

RPO is considered a vital part of business continuity planning. Entire books have been written on this subject, and what we've presented here is essentially a functional summary. The subject is deep and varied, rich with its own inherent techniques beyond simply architecture and design. It is the language of business and resource management, so it can be a key component when interacting with decision makers.

Learning these concepts in depth can help steer the overall application stack toward a more sustainable long-term structure. We'll cover more of these techniques in this chapter, but don't be afraid to proactively add them to your repertoire.
