-
Book Overview & Buying
-
Table Of Contents
-
Feedback & Rating

Practical Threat Detection Engineering
By :

The foundation of how we can track and categorize an adversary’s actions allows us to prioritize and understand the scope or coverage of our detections. The following subsection covers common frameworks and models that will be referenced throughout this book. They provide a starting model for framing cyberattacks, their granular sub-components, and how to defend against them.
Cyberattacks tend to follow a predictable pattern that should be understood by defenders. This pattern was initially documented as the now famous Lockheed Martin Cyber Kill Chain. This model has been adapted and modernized over time by multiple vendors. The Unified Kill Chain is a notable modernization of the model. This model defines 18 broad tactics across three generalized goals, which provides defenders with a reasonable framework for designing appropriate defenses according to attackers’ objectives. Let’s look at these goals:
Figure 1.1, based on the Unified Kill Chain whitepaper by Paul Pols, shows the individual tactics in each phase of the kill chain:
Figure 1.1 – The Unified Kill Chain
To better understand how the Unified Kill Chain applies to cyberattacks, let’s look at how it maps to a well-known attack. We are specifically going to look at an Emotet attack campaign. Emotet is a malicious payload often distributed via email and used to deliver additional payloads that will carry out the attacker’s final objectives. The specific campaign we will analyze is one reported on by The DFIR Report in November 2022: https://thedfirreport.com/2022/11/28/emotet-strikes-again-lnk-file-leads-to-domain-wide-ransomware/.
Table 1.1 lists the stages of the attack, as reported in the article, and how they map to the Unified Kill Chain:
Attack Event |
Unified Kill Chain Phase Group |
Unified Kill Chain Phase |
Emotet executed via LNK malspam attachment |
In |
Delivery |
Emotet sends outbound SMTP spam email |
Network propagation |
Pivoting |
Domain enumeration via Cobalt Strike |
Through |
Discovery |
Lateral movement to user workstation |
Through |
Pivoting |
SMB share enumeration |
Through |
Discovery |
Zerologon exploit attempt |
In |
Exploitation |
Remote Management Agent installed |
In |
Command and control/persistence |
Exfiltration via Rclone to Mega |
Out |
Exfiltration |
Ransomware execution |
Out |
Impact |
Table 1.1 – Unified Kill Chain mapping for Emotet attack chain
As can be seen from Table 1.1, not all phases will take place in every attack and may not occur in a linear order.
To read the full Unified Kill Chain whitepaper, visit this link: https://www.unifiedkillchain.com/assets/The-Unified-Kill-Chain.pdf.
While this follows the progression of a typical cyberattack, as the paper outlines and as our example shots show, it is not uncommon for the attacker to execute some tactics outside this expected order. While the Unified Kill Chain provides a model for how threat actors carry out attacks, it does not dive into the detailed techniques that can be used to achieve the goals of each phase in the kill chain. The MITRE ATT&CK framework provides more granular insight into the tactics, techniques, and procedures leveraged by threat actors.
The MITRE ATT&CK framework is a knowledge base developed by the MITRE Corporation. The framework classifies threat actor objectives and catalogs the granular tools and activities related to achieving those objectives.
ATT&CK stands for Adversarial Tactics, Techniques, and Common Knowledge. The MITRE ATT&CK framework groups adversarial techniques into high-level categories called tactics. Each tactic represents a smaller immediate goal within the overall cyberattack. This framework will be referenced frequently throughout this book, providing an effective model for designing and validating detections. The following points detail the high-level tactics included as part of the Enterprise ATT&CK framework:
We encourage you to explore the MITRE ATT&CK framework in full at https://attack.mitre.org/. In this book, we are specifically going to focus on the Enterprise ATT&CK framework, but MITRE also provides frameworks for ICS and mobile-based attacks as well. The ATT&CK Navigator, located at https://mitre-attack.github.io/attack-navigator/, is also extremely useful for defenders to quickly search for and qualify tactics.
Most publications documenting incident response observations typically provide kill chain and MITRE ATT&CK tactics, which help defenders understand how to design detections and other preventative controls.
Another helpful model for defenders to understand is the Pyramid of Pain. This model, developed by David Bianco, visualizes the relationship between the categories of indicators and the impact of defending each. This impact is expressed as the effort required by the threat actor to modify their attack once an effective defense is implemented for a given indicator category. Figure 1.2 shows the concept of the Pyramid of Pain:
Figure 1.2 – David Bianco’s Pyramid of Pain
As we can see, controls designed to operate on static indicators such as domain names, IP addresses, and hash values are trivial for adversaries to evade. For example, modifying the hash of a binary simply involves changing a single bit. It is far more difficult for an adversary to modify their tools, tactics, and procedures (TTPs), which are essentially the foundation of their attack playbook. The gold standard for defensive controls is those that target TTPs. However, these are usually more difficult to implement and require reliable data from protected assets, as well as a deep understanding of the adversary’s tactics and capabilities. Defensive controls designed for static indicators are effective for short-term, tactical defense. You can read David Bianco’s full blog post here: https://detect-respond.blogspot.com/2013/03/the-pyramid-of-pain.html.
Throughout the remainder of this book, we will frequently reference these concepts. In later chapters, we will illustrate how these models can be used to understand cyberattacks, translate high-level business objectives for defense into detections, and measure coverage against known attacks.
Now that we have gained an understanding of the model for framing cyberattacks, let’s look into the most common types of cyberattacks.
To detect cyberattacks, detection engineers need to have a base understanding of the attacks that they will face. Some of the most prevalent attacks at the time of writing are summarized here to provide some introductory insight into the attacks we are trying to defend against.
The FBI reported receiving a total of 19,954 complaints related to Business Email Compromise (BEC) incidents in 2021. They estimate these complaints represent a cumulative loss of 2.4 billion dollars (USD). The full report can be accessed at https://www.ic3.gov/Media/PDF/AnnualReport/2021_IC3Report.pdf.
BEC attacks target users of the most popular and accessible user collaboration tool available – email. The electronic transfer of funds is a normal part of business operations for many organizations. Threat actors research organizations and identify personnel likely to be involved in correspondence related to the exchange of funds. Having identified a target, the threat actor leverages several techniques to gain access to the target’s mailbox (or someone adjacent from a business process perspective). With this access, the threat actor’s objective pivots to observing email exchanges to understand internal processes. During this time, the threat actor needs to understand the communication flows and key players. In ideal cases, they will identify a third-party contractor whom the organization conducts routine business with, the people who typically send correspondence for payments, and the person who approves these payments on behalf of the organization. Once the right opportunity arises, the threat actor can intercept and alter email conversations about payment, changing destination account numbers. If this goes unnoticed, funds may be deposited into the attacker’s account instead of the intended recipient.
Denial of service (DoS) attacks attempt to make services unavailable to legitimate users by overwhelming the service or otherwise impairing the infrastructure the service depends on. There are three main types of DoS attacks: volumetric, protocol, and application attacks.
Volumetric attacks are executed by sending an inordinate volume of traffic to a target system. If the attack persists, it can degrade the service or disrupt it entirely. Protocol attacks focus on the network and transport layer and attempt to deplete the available resources of the networking devices, making the target service available. Application attacks send large volumes of requests to a target service. The service attempts to process each request, which consumes processing power on the underlying systems. Eventually, the available resources are exhausted, and service response times increase to the point where the service becomes unavailable. These types of attacks can be further categorized by their degree of automation and the techniques used.
Increasing the number of systems executing the attack can significantly increase the impact. By making use of compromised systems, threat actors can conduct synchronized DoS attacks against a single target, known as distributed denial of service (DDoS) attacks.
When malicious software, or malware, manages to evade defensive controls, the impact can range broadly, depending on the specific malware family. In low-impact cases, an end user may be bombarded with unsolicited pop-up ads, and in more extreme scenarios, malware can give full control of a system to a remote threat actor. The presence of malware in an enterprise environment usually indicates a possible deficiency in security controls. Seemingly low-impact malware infections can lead to more significant incidents, including full-blown ransomware attacks.
Employees of an organization who perform malicious activity against that organization are known as insider threats. Insider threats can exist at any level of the organization and have various motivations. Malicious insiders can be difficult to defend against since the organization has granted them a degree of trust.
Phishing attacks fall under the category of social engineering, where threat actors design attacks around communication and collaboration tools, such as email, instant messaging apps, SMS text messages, and even regular phone calls. The underlying objective in all cases is to entice users to reveal sensitive information, such as credentials or banking information. BEC attacks typically leverage phishing techniques.
While the threat landscape is full of countless actors, with diverse goals ranging from stealthy cyber espionage to tech-support scams, the most prolific and impactful of these is the modern ransomware attack.
The goal of a ransomware attack is to interrupt critical business operations by taking critical systems offline and demanding payment, or a ransom, from the organization. In exchange for a successful payment, the threat actors claim they will return systems to a normal operating state.
Recently, some ransomware operators have added a separate extortion component to their playbook. During their ransomware attack, they exfiltrate sensitive data from the organization’s environment to attacker-controlled systems. Ransomware operators then threaten to publicize this data unless the ransom is paid. This attack is commonly referred to as the double-extortion ransomware attack.
Successful ransomware operations put businesses in a frightening predicament. Apart from untangling the deep complexities of determining whether to pay the ransom, recovering from a successful cyberattack can take months or sometimes years.
These malicious operations have become increasingly sophisticated and successful over time. According to CrowdStrike, the first instance of modern ransomware was recorded in 2005. Between then and now, the frequency, scale, and sophistication of ransomware attacks have only increased. CrowdStrike’s History of Ransomware article provides a summary of the evolution of ransomware. You can read the full article here: https://www.crowdstrike.com/cybersecurity-101/ransomware/history-of-ransomware/.
Successful breaches can have expensive impacts, requiring thousands of man-hours to remediate. IBM’s 2022 Cost of a Data Breach report found that the average total cost of a data breach amounted to 4.35 million USD. Typically, the earlier a threat is detected, the lower the cost of remediation. For every phase that an attacker advances through the kill chain, the cost of remediation goes up. While a threat hunt allows an organization to search for an adversary already inside its environment, the identification occurs when and if a search is performed. This detection, though, allows an organization to identify malicious behavior when the activity is performed, reducing the mean time to detect. Given that the same IBM Cost of a Data Breach report determined that the average time to identify and contain a breach was 277 days, there is much work to be done in attempting to reduce the time to detection.
To understand how the time to detect an attack greatly determines the impact on the business, let’s consider a scenario where a threat actor can gain initial access to an internet-connected workstation via a successful phishing campaign. This unauthorized access was immediately detected by the organization’s security team. They quickly isolated this workstation and performed a full re-imaging of its contents to a known-good state. They also performed a full reset of the user’s credentials, along with any other user who interacted with that workstation. Administrators identified the phishing email in their enterprise email solution, and all recipients had their workstations re-imaged and their credentials reset.
In this scenario, the steps that were taken by the security team were relatively simple to execute and would likely be sufficient to remove the threat from the environment. In contrast, if the threat actors were able to gain privileged access, exfiltrate data, and then deploy ransomware across all systems, the task becomes significantly more onerous. The security team would be faced with the dual task of understanding what happened while simultaneously advising on the best way to restore the business’s ability to operate safely. The following table summarizes how the number of assets impacted, the investigative requirements, and typical remediation efforts change across the Unified Kill Chain goals:
Initial Foothold |
Network Propagation |
Action on Objectives |
|
Assets impacted |
Low value. Typically, this involves edge devices, public-facing servers, or user workstations. Because of their position in modern architectures, these devices are typically untrusted by default. |
Medium value. Some internal systems. Typically at this phase, the threat actor has access to some member servers within the environment and has a reliable C2 channel established. |
High value. Critical servers such as Active Directory domain controllers, backup servers, or file servers. |
Threat actor’s degree of control |
Low. The threat actor has unreliable access to a system or is attempting to obtain access to a system, typically through phishing or attacking publicly facing services. Typically, this phase is the best opportunity for defenders to remove a threat. |
Medium. The threat actor has enough control to traverse the network, but not enough control to execute objectives. At this point, threat actors typically have some credentials and have a reliable C2 channel established. |
High. The threat actor is fully comfortable operating in the environment. They found all the resources needed to execute their objectives. At this point, they likely have the highest level of privileges available in the environment. |
Data requirement for investigation |
Relatively low. Typically, impact at this phase is limited to a small number of assets. Once identified at this phase, the data required for fully scoping the event is limited to a single host. |
Significant. The capability to traverse the internal network typically indicates the presence of a reliable C2 channel. A higher volume of historic and real-time data is required to identify impacted assets. At this point, incident responders will need to have visibility of all connected assets to fully track lateral movement. |
High. Investigators will require access to historical and real-time data from all connected assets. Additionally, in cases where data exfiltration is an objective, telemetry for the access and movement of data will also be required. This data is difficult to collect and is not typically tracked. |
Effort required to remediate |
Low. Activities at this phase typically occur on edge devices or public-facing assets. The typical posture is to treat these assets as untrusted, so it is common for environments to have capabilities for rapidly isolating these assets. |
Medium. Traversing the network requires more investigative work to identify the individual assets that were accessed, the degree to which they were utilized, and the requirements for remediating. |
High. In nearly every case, this requires rebuilding critical infrastructure. Often, this needs to occur with the added pressure of returning the business to a minimally operational state, to minimize losses. |
Table 1.2 – Generalized asset impact and effort versus kill chain goals
It’s plain to see the importance of finding out about cyberattacks in your environment and, more so, the importance of finding out as early as possible. The right person needs to get the relevant information about cyberattacks in a timely fashion. This is the primary objective of detection engineering.
Quickly identifying, qualifying, and mitigating potential security incidents is a top priority for security teams. Identifying potential security incidents quickly is a fairly complicated problem to solve. In general terms, security personnel need to be able to do the following:
Each of these steps can be difficult to execute within small environments. The complexity increases radically for any increase in the size of a managed environment.
Detection engineering definition
Detection engineering can be defined as a set of processes that enable potential threats to be detected within an environment. These processes encompass the end-to-end life cycle, from collecting detection requirements, aggregating system telemetry, and implementing and maintaining detection logic to validating program effectiveness.
To accomplish these goals, a good detection engineering program typically needs to implement four main processes:
Figure 1.3 – The detection engineering processes
Chapter 2, The Detection Engineering Life Cycle, takes a deeper dive into each of these processes.
Detection engineering can be misunderstood, partly because some processes overlap with other functions within a security organization. We can clarify detection engineering’s position with the following distinctions:
In this section, we examined some basic cyber security concepts that will be useful throughout this book as we dive into the detection engineering process. Furthermore, we established a definition for detection engineering. With this definition in mind, the following section will examine the value that a detection engineering program brings to an organization.
Change the font size
Change margin width
Change background colour