Explaining explainable artificial intelligence (XAI)

Explainable AI (XAI) is a subfield of Artificial Intelligence (AI) that makes machine learning models more transparent and interpretable to humans. Explainable AI helps clarify how an AI system arrives at a specific output, such as a classification or a detected object, and can answer the basic "wh" questions, shedding light on the why and how behind AI decisions. This explainability, which black-box models do not provide on their own, helps ensure that users can trust the recommendations of AI products.

  1. High-stakes AI applications must be explainable
  2. Managing Adversarial Attacks on Explainability
  3. Trust critical for AI adoption
    1. The Healthcare Opportunity
    2. Delivering Quality Data
  4. Conclusion
  5. FAQ
    1. What are self-explainable AI models?
    2. What are global explainable AI algorithms?
    3. What are per-decision explainable AI algorithms?
    4. What are adversarial attacks on explainability?
  6. References:

High-stakes AI applications must be explainable

Critical applications, such as those affecting defence, financial services, healthcare, law and order, and autonomous vehicles, must be explainable to promote adoption and safety. Several XAI techniques, including self-explainable models, global explainable AI algorithms, and per-decision explainable AI algorithms, have been proposed for such applications.

Managing Adversarial Attacks on Explainability

According to a 2023 survey by Hubert Baniecki and Przemyslaw Biecek, published on the arXiv preprint server, adversarial attacks on explainable AI are a growing concern. These attacks can be used to manipulate, fool, or obfuscate the explanation of a machine learning model’s decision-making process, which can have serious consequences in high-stakes decision-making and knowledge discovery.

The survey provides a comprehensive overview of research on adversarial attacks against explanations of machine learning models, as well as against fairness metrics. It introduces a unified notation and a taxonomy of methods, giving researchers and practitioners from the intersecting fields of adversarial machine learning (AdvML) and explainable AI (XAI) a common ground.

One way to protect a machine learning model from adversarial attacks is adversarial training: the model is trained on adversarial examples in addition to the original training data, so it learns to recognize and resist them. Another is defensive distillation, in which a second model is trained on the softened class probabilities produced by the first, making the distilled model less sensitive to the small perturbations that adversarial attacks rely on.
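As a rough illustration of the adversarial training idea, here is a minimal sketch using PyTorch and the Fast Gradient Sign Method (FGSM); the model, data loader and epsilon value are placeholder assumptions rather than settings from any of the studies cited here.

```python
import torch
import torch.nn as nn

def fgsm_example(model, x, y, loss_fn, epsilon=0.03):
    """Craft an adversarial example with the Fast Gradient Sign Method."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    loss.backward()
    # Step in the direction that increases the loss, then stop tracking gradients.
    return (x_adv + epsilon * x_adv.grad.sign()).detach()

def adversarial_training_epoch(model, loader, loss_fn, optimizer, epsilon=0.03):
    """One epoch of training on a mix of clean and adversarial examples."""
    model.train()
    for x, y in loader:
        x_adv = fgsm_example(model, x, y, loss_fn, epsilon)
        optimizer.zero_grad()
        # Average the loss on the original and the perturbed inputs.
        loss = 0.5 * (loss_fn(model(x), y) + loss_fn(model(x_adv), y))
        loss.backward()
        optimizer.step()
```

Training on a mixture of clean and perturbed inputs in this way typically improves robustness to the attack used during training, usually at some cost in accuracy on clean data.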

It is also important to regularly update your machine learning model with new data and retrain it to ensure that it remains robust against adversarial attacks.

Trust critical for AI adoption

According to a 2019 PwC study, 82% of CEOs believe that AI-based decisions cannot be trusted if they are not explainable. A 2023 Forbes study found that 75% of consumers would not use a product they do not understand. These statistics highlight the importance of XAI in the development of AI products.

The Healthcare Opportunity

One of the most significant applications of XAI is in healthcare. XAI can help doctors and medical professionals interpret the results of machine learning models and make informed decisions. For example, XAI can be used to explain the diagnosis of a particular disease, helping doctors understand the reasoning behind the diagnosis and offer better treatment options. XAI-driven healthcare chatbots can also reduce the load on medical professionals, making healthcare more accessible, cost-effective and convenient.

Delivering Quality Data

XAI also has a critical role to play in delivering quality data. Artificial intelligence and machine learning are increasingly used to reduce the workload of data management tasks such as measuring data quality, maintaining business glossaries, and identifying confidential data, to name just a few. Without explainability, AI algorithms run the risk of creating, rather than reducing, chaos.

Conclusion

In conclusion, XAI is an emerging area of research in the field of AI that aims to make machine learning models more transparent and interpretable to humans. XAI is essential for high-stakes applications, such as those in defence, investment management, healthcare, or law and order, where understanding how a decision was reached is required for trust and transparency.

Where businesses use AI transparently and responsibly, consumers buy in, with 65% indicating that they trust businesses that employ XAI to improve customer experiences.

FAQ

What are self-explainable AI models?

Self-explainable AI models are machine learning models whose decision-making process can be understood directly, without the need for additional explanation tools or techniques.

A well-known example is the decision tree, a machine learning model that can be used for both classification and regression tasks. Because a decision tree is simply a sequence of if/else rules, its decision-making process can be read and understood directly by humans. Another example is linear regression, a simple model that predicts a continuous output variable from one or more input variables; its learned coefficients show how much each input contributes to the prediction.
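As a small illustration of why such models are considered self-explainable, the sketch below fits a shallow decision tree on scikit-learn's built-in iris dataset (chosen here purely as an example) and prints the learned rules:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Fit a shallow tree so the learned rules stay small enough to read.
data = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(data.data, data.target)

# The model explains itself: its decision process is a set of readable if/else rules.
print(export_text(tree, feature_names=list(data.feature_names)))
```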

What are global explainable AI algorithms?

Global explainable AI algorithms are techniques that explain a model’s decision-making process in a way that is understandable to humans. Unlike self-explainable models, global explainable AI algorithms describe the model’s behaviour as a whole, rather than explaining individual decisions.

Common examples include permutation feature importance and global surrogate models, which summarise how a model behaves across an entire dataset. SHAP (SHapley Additive exPlanations) computes Shapley-value attributions for individual predictions, but these are frequently aggregated across a dataset to give a global picture of feature importance. LIME (Local Interpretable Model-Agnostic Explanations), which is often mentioned alongside SHAP, explains individual predictions and is therefore better thought of as a per-decision technique.
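As a minimal sketch of one global technique, permutation feature importance, the example below uses scikit-learn's built-in breast-cancer dataset and a random forest purely as placeholders:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature in turn and measure how much test accuracy drops:
# a global view of which features the model relies on overall.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
ranked = sorted(zip(X.columns, result.importances_mean), key=lambda p: -p[1])
for name, drop in ranked[:5]:
    print(f"{name}: {drop:.3f}")
```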

What are per-decision explainable AI algorithms?

Per-decision explainable AI algorithms aim to provide an explanation for each decision made by the model, which can help users understand how the model works and build trust in the model’s predictions. This type of algorithm is particularly useful in situations where the model’s decision-making process is complex and difficult to understand.

One such example is counterfactual explanations. Counterfactual explanations are generated by identifying the smallest changes to the input data that would result in a different output from the model. This can help users understand why a particular decision was made by the model. Another example is local surrogate models. Local surrogate models are simpler models that are trained to approximate the behaviour of a more complex model. They can be used to provide an explanation of the decision-making process of the more complex model on a per-decision basis.
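As a toy sketch of the counterfactual idea (the greedy one-feature-at-a-time search and its step size are illustrative assumptions, not an established algorithm), assuming a scikit-learn-style classifier with predict and predict_proba:

```python
import numpy as np

def simple_counterfactual(model, x, target_class, step=0.05, max_iters=200):
    """Greedily nudge one feature at a time until the model predicts target_class.

    Returns the counterfactual input, or None if no flip was found.
    """
    x_cf = np.asarray(x, dtype=float).copy()
    for _ in range(max_iters):
        if model.predict(x_cf.reshape(1, -1))[0] == target_class:
            return x_cf  # prediction flipped: x_cf is the counterfactual
        base_prob = model.predict_proba(x_cf.reshape(1, -1))[0][target_class]
        best_move, best_gain = None, 0.0
        # Try a small step up and down on each feature and keep the move
        # that most increases the probability of the target class.
        for i in range(len(x_cf)):
            for direction in (step, -step):
                candidate = x_cf.copy()
                candidate[i] += direction
                gain = model.predict_proba(candidate.reshape(1, -1))[0][target_class] - base_prob
                if gain > best_gain:
                    best_move, best_gain = (i, direction), gain
        if best_move is None:
            return None  # no single-feature step improves the target probability
        x_cf[best_move[0]] += best_move[1]
    return None
```

The explanation shown to the user is then simply the difference between the original input and the returned counterfactual.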

What are adversarial attacks on explainability?

Adversarial attacks on explainability aim to undermine the trustworthiness of the model’s explanations by manipulating the input data in a way that causes the model to produce incorrect or misleading explanations. 

Some examples of adversarial attacks on explainability include adversarial patches, adversarial perturbations, and model inversion attacks. Adversarial patches are image regions that have been crafted to cause a machine learning model to misclassify an input while still appearing normal to humans. Adversarial perturbations are small changes made to an input that can cause an ML model to make incorrect predictions or produce misleading explanations. Model inversion attacks extract sensitive information from an ML model by using its output to infer information about the input data.

These types of attack can be particularly harmful in high-stakes decision-making scenarios, where the model’s predictions can have significant consequences.
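As a minimal sketch of the idea behind an explanation-focused perturbation (a simplified, illustrative version of the attacks the survey catalogues; the model and inputs are placeholders), the input is nudged so that a simple gradient-based saliency explanation drifts away from the original while the perturbation itself stays small:

```python
import torch

def saliency(model, x):
    """Simple gradient-based explanation: |d top-class score / d input|."""
    score = model(x).max(dim=1).values.sum()
    # create_graph=True keeps the explanation differentiable, so we can attack it.
    grad, = torch.autograd.grad(score, x, create_graph=True)
    return grad.abs()

def perturb_explanation(model, x, steps=50, lr=1e-2, budget=0.05):
    """Nudge the input so its saliency map drifts from the original one,
    while keeping the perturbation small."""
    x = x.clone().detach().requires_grad_(True)
    original = saliency(model, x).detach()
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # Push the perturbed input's explanation away from the original explanation.
        loss = -torch.norm(saliency(model, x + delta) - original)
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-budget, budget)  # keep the change to the input small
    return (x + delta).detach()
```

If the model's prediction stays the same while the saliency map changes substantially, the explanation has been manipulated without any visible change in behaviour, which is exactly the kind of failure the survey warns about.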

References:

Explainable AI: current status and future directions, Gohel, Singh, Mohanty, 2021

Bosses want to see explainable AI

Over 75% Of Consumers Are Concerned About Misinformation From Artificial Intelligence, Forbes 2023

Four principles of explainable Artificial Intelligence, National Institute of Standards and Technology, 2020

Adversarial Attacks and Defenses in Explainable Artificial Intelligence: A Survey, Baniecki, Biecek, 2023
