Understanding the difference between interpretability and explainability

Something you've probably noticed when reading the first few pages of this book is that the verbs interpret and explain, as well as the nouns interpretation and explanation, have been used interchangeably. This is not surprising, considering that to interpret is to explain the meaning of something. Despite that, the related terms interpretability and explainability should not be used interchangeably, even though they are often mistaken for synonyms.

What is interpretability?

Interpretability is the extent to which humans, including non-subject-matter experts, can understand the cause and effect, and input and output, of a machine learning model. To say a model has a high level of interpretability means you can describe its inference in a way that humans can understand. In other words, why does an input to a model produce a specific output? What are the requirements and constraints of the input data? What are the confidence bounds of the predictions? Or, why does one variable have a more substantial effect than another? For interpretability, detailing how a model works is only relevant to the extent that it can explain its predictions and justify that it's the right model for the use case.

In this chapter's example, you could explain that there's a linear relationship between human height and weight, so using linear regression rather than a non-linear model makes sense. You can support this statistically because the variables involved don't violate the assumptions of linear regression. Even when statistics are on your side, you still ought to consult the domain knowledge relevant to the use case. In this one, we can rest assured, biologically speaking, because our knowledge of human physiology doesn't contradict the connection between height and weight.
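One way to back up the choice of a linear model is to check whether a more flexible model fits noticeably better. The following is a minimal sketch, not the chapter's own analysis: the height and weight values are synthetic and made up for illustration, and the helper r_squared is a hypothetical name. It fits a straight line and a quadratic to the same data and compares how much variance each explains.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: share of variance explained by the fit."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic, made-up data standing in for a height/weight sample (assumption).
rng = np.random.default_rng(seed=0)
height_cm = rng.uniform(150, 200, size=200)
weight_kg = 0.9 * height_cm - 90 + rng.normal(0, 5, size=200)

# Center the predictor so the quadratic fit is numerically well behaved.
x = height_cm - height_cm.mean()

# Fit a straight line and a quadratic to the same data.
linear_coefs = np.polyfit(x, weight_kg, deg=1)
quadratic_coefs = np.polyfit(x, weight_kg, deg=2)

r2_linear = r_squared(weight_kg, np.polyval(linear_coefs, x))
r2_quadratic = r_squared(weight_kg, np.polyval(quadratic_coefs, x))

print(f"R^2 linear:    {r2_linear:.4f}")
print(f"R^2 quadratic: {r2_quadratic:.4f}")
```

If the quadratic term barely improves the fit, as it should here because the synthetic data was generated from a line, the simpler linear model is the easier one to justify. On real data, this comparison would complement, not replace, the usual checks of the linear regression assumptions.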

Beware of complexity

Many machine learning models are inherently harder to understand simply because of the math involved in their inner workings or their specific architecture. In addition to this, many choices are made that can increase complexity and make the models less interpretable, from dataset selection to feature selection and engineering, to model training and tuning choices. This complexity makes explaining how a model works a challenge.

Machine learning interpretability is a very active area of research, so there's still much debate on its precise definition. The debate includes whether total transparency is needed to qualify a machine learning model as sufficiently interpretable. This book favors the understanding that the definition of interpretability shouldn't necessarily exclude opaque models, which, for the most part, are complex, as long as the choices made don't compromise their trustworthiness. This middle ground is what is generally called post-hoc interpretability.

After all, much as with a complex machine learning model, we can't explain exactly how a human brain makes a choice, yet we often trust its decision because we can ask a human for their reasoning. Post-hoc machine learning interpretation is much the same, except it's a human explaining the reasoning on behalf of the model. Using this particular concept of interpretability is advantageous because we can interpret opaque models and not sacrifice the accuracy of our predictions. We will discuss this in further detail in Chapter 3, Interpretation Challenges.

When does interpretability matter?

Decision-making systems don't always require interpretability. There are two cases that are offered as exceptions in research, outlined here:

When incorrect results have no significant consequences. For instance, what if a machine learning model is trained to find and read the postal code on a package, occasionally misreads it, and sends the package elsewhere? There's little chance of discriminatory bias, and the cost of misclassification is relatively low. It doesn't occur often enough to magnify the cost beyond acceptable thresholds.
When there are consequences, but these have been studied sufficiently and validated enough in the real world to make decisions without human involvement. This is the case with a traffic alert and collision avoidance system (TCAS), which alerts the pilot to another aircraft that poses a threat of a mid-air collision.

On the other hand, interpretability is needed for these systems to have the following attributes:

Minable for scientific knowledge: Meteorologists have much to learn from a climate model, but only if it's easy to interpret.
Reliable and safe: The decisions made by a self-driving vehicle must be debuggable so that its developers can understand points of failure.
Ethical: A translation model might use gender-biased word embeddings that result in discriminatory translations, so you must be able to find these instances easily to correct them. The system must also be designed in such a way that you can be made aware of a problem before it is released to the public.
Conclusive and consistent: Sometimes, machine learning models may have incomplete and mutually exclusive objectives—for instance, a cholesterol-control system may not consider how likely a patient is to adhere to the diet or drug regimen, or there might be a trade-off between one objective and another, such as safety and non-discrimination.

By explaining the decisions of a model, we can cover gaps in our understanding of the problem—its incompleteness. One of the most significant issues is that given the high accuracy of our machine learning solutions, we tend to increase our confidence level to a point where we think we fully understand the problem. Then, we are misled into thinking our solution covers ALL OF IT!

At the beginning of this book, we discussed how leveraging data to produce algorithmic rules is nothing new. However, we used to second-guess these rules, and now we don't. Therefore, a human used to be accountable, and now it's the algorithm. In this case, the algorithm is a machine learning model that is accountable for all of the ethical ramifications this entails. This switch has a lot to do with accuracy. The problem is that although a model may surpass human accuracy in aggregate, machine learning models have yet to interpret their results as a human would. Therefore, a model doesn't second-guess its decisions, so as a solution it lacks a desirable level of completeness, and that's why we need to interpret models so that we can cover at least some of that gap. So, why is machine learning interpretation not already a standard part of the pipeline? In addition to our bias toward focusing on accuracy alone, one of the biggest impediments is the daunting concept of black-box models.

What are black-box models?

This is just another term for opaque models. A black box refers to a system in which only the inputs and outputs are observable, and you cannot see what is transforming the inputs into the outputs. In the case of machine learning, a black-box model can be opened, but its mechanisms are not easily understood.

What are white-box models?

These are the opposite of black-box models (see Figure 1.3). They are also known as transparent because they achieve total or near-total interpretation transparency. We call them intrinsically interpretable in this book, and we cover them in more detail in Chapter 3, Interpretation Challenges.

Have a look at a comparison between the models here:

Figure 1.3 – Visual comparison between white- and black-box models

What is explainability?

Explainability encompasses everything interpretability is. The difference is that it goes deeper on the transparency requirement than interpretability because it demands human-friendly explanations for a model's inner workings and the model training process, and not just model inference. Depending on the application, this requirement might extend to various degrees of model, design, and algorithmic transparency. There are three types of transparency, outlined here:

Model transparency: Being able to explain how a model is trained step by step. In the case of our simple weight prediction model, we can explain how the optimization method called ordinary least squares finds the coefficients that minimize the sum of squared errors in the model.
Design transparency: Being able to explain choices made, such as model architecture and hyperparameters. For instance, we could justify these choices based on the size or nature of the training data. If we were performing a sales forecast and we knew that our sales had a seasonality of 12 months, a seasonal period of 12 could be a sound parameter choice. If we had doubts, we could always use some well-established statistical method to find the right seasonality.
Algorithmic transparency: Being able to explain automated optimizations such as grid search for hyperparameters; but note that the ones that can't be reproduced because of their random nature—such as random search for hyperparameter optimization, early stopping, and stochastic gradient descent—make the algorithm non-transparent.
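To make the idea of model transparency concrete, here is a minimal sketch of how ordinary least squares can be explained step by step for a single predictor. The function name fit_simple_ols and the five height and weight values are hypothetical and chosen only for illustration; they are not the chapter's dataset.

```python
import numpy as np

def fit_simple_ols(x, y):
    """Return (slope, intercept) that minimize the sum of squared residuals."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()

    # Step 1: measure how x and y vary together, and how x varies on its own.
    s_xy = np.sum((x - x_mean) * (y - y_mean))
    s_xx = np.sum((x - x_mean) ** 2)

    # Step 2: the slope is the ratio of those two sums.
    slope = s_xy / s_xx

    # Step 3: the intercept makes the line pass through the point of means.
    intercept = y_mean - slope * x_mean
    return slope, intercept

# Illustrative, made-up values (assumption): heights in cm, weights in kg.
heights = [150, 160, 170, 180, 190]
weights = [52, 58, 67, 74, 79]

slope, intercept = fit_simple_ols(heights, weights)
print(f"weight = {slope:.2f} * height + ({intercept:.2f})")
# prints: weight = 0.70 * height + (-53.00)
```

Every number in the result can be traced back to a sum over the training data, which is what makes this kind of training process easy to explain. The same coefficients come out of a library routine such as np.polyfit(heights, weights, deg=1).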

Opaque models are called opaque simply because they lack model transparency, but for many models this is unavoidable, however justified the model choice might be. In many scenarios, even if you outputted the math involved in—say—training a neural network or a random forest, it would raise more doubts than generate trust. There are at least a few reasons for this, outlined here:

Not "statistically grounded": An opaque model training process maps an input to an optimal output, leaving behind what appears to be an arbitrary trail of parameters. These parameters are optimized to a cost function but are not grounded in statistical theory.
Uncertainty and non-reproducibility: When you fit a transparent model with the same data, you always get the same results. On the other hand, opaque models are not equally reproducible because they use random numbers to initialize their weights or to regularize or optimize their hyperparameters, or make use of stochastic discrimination (as is the case for random forest).
Overfitting and the curse of dimensionality: Many of these models operate in a high-dimensional space. This doesn't elicit trust because it's harder to generalize on a larger number of dimensions. After all, there's more opportunity to overfit a model, the more dimensions you add.
Human cognition and the curse of dimensionality: Transparent models are often used for smaller datasets with fewer dimensions, and even when they aren't, a transparent model never uses more dimensions than necessary. They also tend to not complicate the interactions between these dimensions more than necessary. This lack of unnecessary complexity makes it easier to visualize what the model is doing and its outcomes. Humans are not very good at understanding many dimensions, so using transparent models tends to make this much easier to understand.
Occam's razor: This is what is called the principle of simplicity or parsimony. It states that the simplest solution is usually the right one. Whether true or not, humans also have a bias for simplicity, and transparent models are known, if anything, for their simplicity.
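The reproducibility point in the list above can be sketched in a few lines. This is an illustration under stated assumptions, not a benchmark: the data is synthetic, and fit_closed_form and fit_tiny_network are hypothetical helpers written for this example. The first fits a least-squares line, which involves no random numbers; the second trains a very small neural network whose only source of randomness is its weight initialization.

```python
import numpy as np

# Synthetic, made-up data (assumption): a noisy linear relationship.
data_rng = np.random.default_rng(seed=42)
x = data_rng.normal(size=(100, 1))
y = 0.8 * x + data_rng.normal(scale=0.3, size=(100, 1))

def fit_closed_form(x, y):
    """Least-squares line via np.polyfit; no random numbers are involved."""
    return np.polyfit(x.ravel(), y.ravel(), deg=1)

def fit_tiny_network(x, y, seed, hidden=4, epochs=200, lr=0.05):
    """One-hidden-layer tanh network trained by full-batch gradient descent.

    The seed controls the random weight initialization and nothing else.
    """
    rng = np.random.default_rng(seed)
    w1 = rng.normal(scale=0.5, size=(1, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(scale=0.5, size=(hidden, 1))
    b2 = 0.0
    n = len(x)
    for _ in range(epochs):
        h = np.tanh(x @ w1 + b1)          # hidden activations
        err = (h @ w2 + b2) - y           # prediction error
        # Gradients of the squared error (constant factor folded into lr).
        grad_w2 = h.T @ err / n
        grad_b2 = err.mean()
        grad_h = (err @ w2.T) * (1.0 - h ** 2)
        grad_w1 = x.T @ grad_h / n
        grad_b1 = grad_h.mean(axis=0)
        w1 -= lr * grad_w1
        b1 -= lr * grad_b1
        w2 -= lr * grad_w2
        b2 -= lr * grad_b2
    return w1, w2

# The closed-form fit returns identical coefficients on every run.
closed_form_repeats = np.array_equal(fit_closed_form(x, y), fit_closed_form(x, y))

# The network's learned weights depend on the seed used to initialize them.
w1_seed1, _ = fit_tiny_network(x, y, seed=1)
w1_seed2, _ = fit_tiny_network(x, y, seed=2)
w1_seed1_again, _ = fit_tiny_network(x, y, seed=1)

different_seeds_match = np.allclose(w1_seed1, w1_seed2)
same_seed_matches = np.array_equal(w1_seed1, w1_seed1_again)

print("Closed-form fit repeats exactly:", closed_form_repeats)
print("Different seeds give the same weights:", different_seeds_match)
print("Same seed gives the same weights:", same_seed_matches)
```

Fixing the seed makes a single training run repeatable, but the learned parameters still depend on an arbitrary choice of seed, which is part of why such a trail of parameters is harder to trust than a closed-form solution.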

Why and when does explainability matter?

Trustworthy and ethical decision-making is the main motivation for interpretability. Explainability has additional motivations such as causality, transferability, and informativeness. Therefore, there are many use cases in which total or nearly total transparency is valued, and rightly so. Some of these are outlined here:

Scientific research: Reproducibility is essential to the scientific method. Also, using statistically grounded optimization methods is especially desirable when causality needs to be proven.
Clinical trials: These must also produce reproducible findings and be statistically grounded. In addition to this, given the potential gravity of overfitting, they must use the fewest dimensions possible and models that don't complicate them.
Consumer product safety testing: Much as with clinical trials, when life-and-death safety is a concern, simplicity is preferred whenever possible.
Public policy and law: This is a more nuanced discussion, part of what law scholars call algorithmic governance. They have distinguished between fishbowl transparency and reasoned transparency. The former is closer to the rigor required for consumer product safety testing, and the latter is one where post-hoc interpretability would suffice. One day, the government could be entirely run by algorithms. When that happens, it's hard to tell which policies will align with which form of transparency, but there are many areas of public policy, such as criminal justice, where absolute transparency is necessary. However, whenever total transparency contradicts privacy or security objectives, a less rigorous form of transparency would have to make do.
Criminal investigation and regulatory compliance audits: If something goes wrong, such as an accident at a chemical factory caused by a robot malfunction or a crash by an autonomous vehicle, an investigator needs to trace the decision trail. This is to "facilitate the assignment of accountability and legal liability". Even when no accident has happened, this kind of auditing can be performed when mandated by authorities. Compliance auditing applies to industries that are regulated, such as financial services, utilities, transportation, and healthcare. In many cases, fishbowl transparency is preferred.

Interpretable Machine Learning with Python

By: Serg Masís
