What Does It Mean to Ask for an “Explainable” Algorithm?

By Edward Felten

Posted on June 15, 2017


One of the standard critiques of using algorithms for decision-making about people, and especially for consequential decisions about access to housing, credit, education, and so on, is that the algorithms don’t provide an “explanation” for their results or the results aren’t “interpretable.” This is a serious issue, but discussions of it are often frustrating. The reason, I think, is that different people mean different things when they ask for an explanation of an algorithm’s results.


Before unpacking the different flavors of explainability, let’s stop for a moment to consider that the alternative to algorithmic decisionmaking is human decisionmaking. And let’s consider the drawbacks of relying on the human brain, a mechanism that is notoriously complex, difficult to understand, and prone to bias. Surely an algorithm is more knowable than a brain. After all, with an algorithm it is possible to examine exactly which inputs factored in to a decision, and every detailed step of how these inputs were used to get to a final result. Brains are inherently less transparent, and no less biased. So why might we still complain that the algorithm does not provide an explanation?


We should also dispense with cases where the algorithm is just inaccurate–where a well-informed analyst can understand the algorithm but will see it as producing answers that are wrong. That is a problem, but it is not a problem of explainability.


So what are people asking for when they say they want an explanation? I can think of at least four types of explainability problems.


The first type of explainability problem is a claim of confidentiality. Somebody knows relevant information about how a decision was made, but they choose to withhold it because they claim it is a trade secret, or that disclosing it would undermine security somehow, or that they simply prefer not to reveal it. This is not a problem with the algorithm, it’s an institutional/legal problem.


The second type of explainability problem is complexity. Here everything about the algorithm is known, but somebody feels that the algorithm is so complex that they cannot understand it. It will always be possible to answer what-if questions, such as how the algorithm’s result would have been different had the person been one year older, or had an extra $1000 of annual income, or had one fewer prior misdemeanor conviction, or whatever. So complexity can only be a barrier to big-picture understanding, not to understanding which factors might have changed a particular person’s outcome.


The third type of explainability problem is unreasonableness. Here the workings of the algorithm are clear, and are justified by statistical evidence, but the result doesn’t seem to make sense. For example, imagine that an algorithm for making credit decisions considers the color of a person’s socks, and this is supported by unimpeachable scientific studies showing that sock color correlates with defaulting on credit, even when controlling for other factors. So the decision to factor in sock color may be justified on a rational basis, but many would find it unreasonable, even if it is not discriminatory in any way. Perhaps this is not a complaint about the algorithm but a complaint about the world–the algorithm is using a fact about the world, but nobody understands why the world is that way. What is difficult to explain in this case is not the algorithm, but the world that it is predicting.


The fourth type of explainability problem is injustice. Here the workings of the algorithm are understood but we think they are unfair, unjust, or morally wrong. In this case, when we say we have not received an explanation, what we really mean is that we have not received an adequate justification for the algorithm’s design. The problem is not that nobody has explained how the algorithm works or how it arrived at the result it did. Instead, the problem is that it seems impossible to explain how the algorithm is consistent with law or ethics.


It seems useful, when discussing the explanation problem for algorithms, to distinguish these four cases–and any others that people might come up with–so that we can zero in on what the problem is. In the long run, all of these types of complaints are addressable–so that perhaps explainability is not a unique problem for algorithms but rather a set of commonsense principles that any system, algorithmic or not, must attend to.



The preceding is republished on TAP with permission by its author, Professor Ed Felten, Director of the Center for Information Technology Policy at Princeton University. “What Does It Mean to Ask for an “Explainable” Algorithm?” was originally published May 31, 2017 on Freedom to Tinker.