2023-04-19, A1
Assessing the robustness of models is an essential step in developing machine-learning systems. To determine if a model is sound, it often helps to know which and how many input features its output hinges on. This talk introduces the fundamentals of “anchor” explanations that aim to provide that information.
Many data scientists are familiar with algorithms like Integrated Gradients, SHAP, or LIME that determine the importance of input features. But that’s not always the information we need to determine whether a model’s output is sound. Is there a specific feature value that will make or break the decision? Does the outcome solely depend on artifacts in an image? These questions require a different explanation method.
First introduced in 2018, “anchors” are a model-agnostic method to uncover which parts of the input a machine-learning model's output hinges on. They are computed with a search-based approach that can be applied to different modalities such as image, text, and tabular data.
In this talk, to truly grok the concept of anchor explanations, we will implement a basic anchor algorithm from scratch. Starting with nothing but a text document and a machine-learning model, we will create sampling, encoding, and search components and finally compute an anchor.
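To give a flavor of how sampling, encoding, and search fit together, here is a minimal sketch. It is not the implementation built in the talk, and every name in it (toy_model, sample_masks, decode, precision, greedy_anchor) is hypothetical. The published anchor algorithm uses a bandit-based sampling strategy and beam search; this sketch swaps in a plain greedy search with a fixed number of samples, and assumes a toy classifier that predicts 1 whenever the word "great" is present.

```python
import numpy as np


def toy_model(texts):
    """Hypothetical stand-in classifier: 1 if the word 'great' occurs, else 0."""
    return np.array([int("great" in t.split()) for t in texts])


def sample_masks(n_words, anchor, n_samples, rng):
    """Sampling: random binary masks (1 = keep word); anchor words are always kept."""
    masks = rng.integers(0, 2, size=(n_samples, n_words))
    masks[:, np.asarray(anchor, dtype=int)] = 1
    return masks


def decode(words, masks, placeholder="UNK"):
    """Encoding: map each binary mask back to a perturbed text."""
    return [
        " ".join(w if keep else placeholder for w, keep in zip(words, mask))
        for mask in masks
    ]


def precision(model, words, anchor, target, n_samples, rng):
    """Share of perturbed texts (anchor words fixed) that keep the original prediction."""
    masks = sample_masks(len(words), anchor, n_samples, rng)
    preds = model(decode(words, masks))
    return float(np.mean(preds == target))


def greedy_anchor(model, text, threshold=0.95, n_samples=500, seed=0):
    """Search: greedily add the word that raises precision most until the threshold is met."""
    rng = np.random.default_rng(seed)
    words = text.split()
    target = model([text])[0]
    anchor = []
    best_prec = precision(model, words, anchor, target, n_samples, rng)
    while best_prec < threshold and len(anchor) < len(words):
        candidates = [i for i in range(len(words)) if i not in anchor]
        scores = [
            precision(model, words, anchor + [i], target, n_samples, rng)
            for i in candidates
        ]
        best = int(np.argmax(scores))
        anchor.append(candidates[best])
        best_prec = scores[best]
    return [words[i] for i in sorted(anchor)], best_prec


anchor_words, anchor_precision = greedy_anchor(toy_model, "the movie was great fun")
print(anchor_words, anchor_precision)
```

For this toy model, the search settles on the single word "great": as long as that word is kept, the prediction never changes, no matter which other words are masked.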
No knowledge of machine learning is required to follow this talk. Aside from familiarity with the basics of NumPy arrays, all you need is your curiosity.
What makes or breaks a machine-learning model's decision? Let's use anchor explanations to find out!
Expected audience expertise (Domain): Intermediate
Expected audience expertise (Python): Intermediate
My journey into Python started in a physics research lab, where I discovered the merits of loose coupling and adherence to standards the hard way. I like automated testing, concise documentation, and hunting complex bugs.
I completed a PhD on the design of human-AI interactions and now work to use Explainable AI to open up new areas of application for AI systems.