Mastering Prometheus

By: Hegedus
3.7 (6)

Overview of this book

With an increased focus on observability and reliability, establishing a scalable and reliable monitoring environment is more important than ever. Over the last decade, Prometheus has emerged as the leading open-source, time-series based monitoring software catering to this demand. This book is your guide to scaling, operating, and extending Prometheus from small on-premises workloads to multi-cloud globally distributed workloads and everything in between. Starting with an introduction to Prometheus and its role in observability, the book provides a walkthrough of its deployment. You’ll explore Prometheus’s query language and TSDB data model, followed by dynamic service discovery for monitoring targets and refining alerting through custom templates and formatting. The book then demonstrates horizontal scaling of Prometheus via sharding and federation, while equipping you with debugging techniques and strategies to fine-tune data ingestion. Advancing through the chapters, you’ll manage Prometheus at scale through CI validations and templating with Jsonnet, and integrate Prometheus with other projects such as OpenTelemetry, Thanos, VictoriaMetrics, and Mimir. By the end of this book, you’ll have practical knowledge of Prometheus and its ecosystem, which will help you discern when, why, and how to scale it to meet your ever-growing needs.
Table of Contents (21 chapters)
1. Part 1: Fundamentals of Prometheus
7. Part 2: Scaling Prometheus
11. Part 3: Extending Prometheus

Making robust alerts

The ability to write more robust alerts is one of the factors that distinguishes Prometheus from traditional, check-based monitoring solutions such as Nagios. An alerting expression can take multiple signals into account at once. For example, rather than alerting on high memory usage alone, you can create an alert that fires only when a server shows both high memory usage and a high rate of major page faults, since that combination is generally a better indicator of a system under memory pressure. The goal is to craft alerts that produce as few false positives as possible, so that they fire only when real, visible impact is occurring. This is part of a larger discussion on the philosophy of alerting on symptoms vs. causes, which is covered comprehensively in Rob Ewaschuk’s excellent document entitled My Philosophy on Alerting (linked at the end of this chapter).
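
As a rough illustration of the memory pressure example, the following alerting rule is a minimal sketch and not a rule taken from the book. It assumes the hosts are scraped through the Node Exporter, which exposes node_memory_MemAvailable_bytes, node_memory_MemTotal_bytes, and node_vmstat_pgmajfault. The group name, alert name, thresholds (90% memory usage, 100 major page faults per second), and the 10m duration are hypothetical values chosen only for demonstration and would need tuning for a real environment.

```yaml
groups:
  - name: memory-pressure-example        # hypothetical group name
    rules:
      - alert: HostMemoryPressure        # hypothetical alert name
        # Fire only when BOTH conditions hold for the same instance:
        #   1. more than 90% of memory is in use (assumed threshold)
        #   2. major page faults exceed 100 per second (assumed threshold)
        expr: |
          (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) > 0.9
          and on (instance)
          rate(node_vmstat_pgmajfault[5m]) > 100
        # Require the condition to persist before firing, to ride out brief spikes.
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Memory pressure on {{ $labels.instance }}"
          description: "High memory usage combined with a high rate of major page faults."
```

Either condition on its own returns nothing from the combined expression. The and operator keeps a series from the left-hand side only if the right-hand side has a matching series for the same instance, so a host with a full page cache but no paging activity does not trigger the alert.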

Use logical/set binary operators

In order to make robust alerts...
