Psychological foundations of explainability and interpretability in artificial intelligence

Abstract

In this paper, we make the case that interpretability and explainability are distinct requirements for machine learning systems. To make this case, we provide an overview of the literature in experimental psychology pertaining to interpretation (especially of numerical stimuli) and comprehension. We find that interpretation refers to the ability to contextualize a model's output in a manner that relates it to the system's designed functional purpose, and to the goals, values, and preferences of end users. In contrast, explanation refers to the ability to accurately describe the mechanism, or implementation, that led to an algorithm's output, often so that the algorithm can be improved in some way. Beyond these definitions, our review shows that humans differ from one another in systematic ways that affect the extent to which they prefer to make decisions based on detailed explanations versus less precise interpretations. These individual differences, such as personality traits and skills, are associated with their abilities to derive meaningful interpretations from precise explanations of model output. This implies that system output should be tailored to different types of users.

Key Information

Type of organisation: Government

Date published: 12 Apr 2021

Categorisation

Domain: Horizontal
Type: Report
