A practical guide towards algorithmic bias and explainability in machine learning
2019-09-05 , Track 1 (Mitxelena)

Undesired bias in machine learning has become a worrying topic due to numerous high-profile incidents. In this talk we demystify machine learning bias through a hands-on example. We'll be tasked with automating the loan approval process for a company.


Undesired bias in machine learning has become a worrying topic due to the numerous high-profile incidents that have been covered by the media. It is certainly a challenging topic: it could even be said that the concept of societal bias is itself inherently biased, as it depends on an individual’s (or group’s) perspective. In this talk we avoid reinventing the wheel. Instead, we use traditional methods to simplify the issue so it can be tackled from a practical perspective.

Content

In this talk we will cover the high-level definitions of bias in machine learning to remove ambiguity, and we will demystify it through a hands-on example. Our objective will be to automate the loan approval process for a company using machine learning. This will allow us to go through the challenge step by step, using key tools and techniques from the latest research to assess and mitigate undesired bias in our machine learning models.

Definitions

We will begin by providing a high-level definition of undesired bias as two constituent parts: “a-priori societal bias” and “a-posteriori statistical bias”. We will provide tangible examples of how undesired bias is introduced at each step. This initial section will also introduce relevant research findings on the topic. Spoiler alert: we will take a pragmatic approach, showing how any non-trivial system will always have an inherent bias, so the objective is not to remove bias, but to make sure 1) you can get as close as possible to your objectives, and 2) your objectives are as close as possible to the “ideal solution”.

Process

In this talk we introduce a pragmatic process to assess bias in machine learning models through three key steps: 1) Data analysis, 2) Inference result analysis, and 3) Production metrics analysis. For each of these three steps we will walk through a real-life example, in which we are tasked with automating a loan approval process. We will show how bias may negatively affect our results, as well as how we can use various techniques to ensure we perform a reasonable analysis. Our objective is not to show how to completely remove bias from a machine learning model, but instead to show which tools and techniques are available, as well as the key touch-points and metrics that ensure the right domain experts are involved.
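As an illustration of the second step (inference result analysis), the sketch below breaks a model's predictions down by group and compares approval rate, precision and recall. It is a minimal example and not the notebook used in the talk: the function name `per_group_metrics`, the group labels "A" and "B", and the toy records are all hypothetical.

```python
from collections import defaultdict


def per_group_metrics(records):
    """Compute approval rate, precision and recall for each group.

    `records` is an iterable of (group, y_true, y_pred) tuples, where
    1 means "should be approved" / "was approved" and 0 means rejected.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0, "tn": 0})
    for group, y_true, y_pred in records:
        if y_pred == 1:
            key = "tp" if y_true == 1 else "fp"
        else:
            key = "fn" if y_true == 1 else "tn"
        counts[group][key] += 1

    metrics = {}
    for group, c in counts.items():
        total = sum(c.values())
        predicted_pos = c["tp"] + c["fp"]
        actual_pos = c["tp"] + c["fn"]
        metrics[group] = {
            "approval_rate": predicted_pos / total,
            # None marks a metric that is undefined for this group.
            "precision": c["tp"] / predicted_pos if predicted_pos else None,
            "recall": c["tp"] / actual_pos if actual_pos else None,
        }
    return metrics


# Hypothetical toy data: (group, true label, predicted label).
records = [
    ("A", 1, 1), ("A", 1, 1), ("A", 0, 1), ("A", 0, 0),
    ("B", 1, 1), ("B", 1, 0), ("B", 0, 0), ("B", 0, 0),
]
metrics = per_group_metrics(records)
for group, values in sorted(metrics.items()):
    print(group, values)
```

A large gap between groups on any of these metrics does not by itself prove undesired bias, but it is the kind of result that would be worth raising with a domain expert.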

Topics covered

We will cover fundamental topics in data science such as feature importance analysis, class imbalance assessment, model evaluation metrics, partial dependence and feature correlation. More importantly, we will cover how these fundamentals can be brought to the right domain experts at different touch-points to ensure undesired bias is identified and documented. Everything will be covered with a hands-on example in a practical Jupyter notebook.
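Two of these fundamentals, class imbalance assessment and feature correlation, can be sketched with the standard library alone. This is an illustrative example under stated assumptions: the helper names `class_balance` and `pearson`, the binary `protected` attribute and the `proxy` feature are hypothetical and are not taken from the talk's notebook.

```python
from collections import Counter
from math import sqrt


def class_balance(labels):
    """Return the fraction of examples that fall into each class."""
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical loan labels: 1 = approved, 0 = rejected.
labels = [1, 1, 1, 1, 1, 1, 0, 0]
balance = class_balance(labels)
print("class balance:", balance)

# Hypothetical columns: a binary protected attribute and another feature
# that may act as a proxy for it even if the attribute itself is dropped.
protected = [0, 0, 0, 0, 1, 1, 1, 1]
proxy = [1, 2, 1, 2, 5, 6, 5, 6]
r = pearson(protected, proxy)
print("correlation with protected attribute:", round(r, 3))
```

A strongly skewed class balance makes plain accuracy misleading, and a feature that correlates strongly with a protected attribute can reintroduce that attribute into the model indirectly. Both are findings to document and discuss with domain experts rather than to resolve automatically.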


Project Homepage / Git

https://github.com/EthicalML/

Project Homepage / Git

https://github.com/EthicalML/XAI

Abstract as a tweet

A practical guide towards algorithmic bias and explainability in machine learning (with TensorFlow)

Python Skill Level

professional

Domain Expertise

some

Domains

Big Data, Data Visualisation, Jupyter, Machine Learning, Open Source, Political and Social Sciences, Scientific data flow and persistence, Statistics

Alejandro currently leads research and development at the Institute for Ethical AI & Machine Learning as its Chief Scientist. With over 10 years of software development experience, Alejandro has held technical leadership positions across hyper-growth scale-ups and tech giants including Eigen Technologies, Bloomberg LP and Hack Partners. Alejandro has a strong track record of building departments of machine learning engineers from scratch, and of leading the delivery of large-scale machine learning systems across the financial, insurance, legal, transport, manufacturing and construction sectors (in Europe, the US and Latin America). Alejandro has given multiple talks at international scientific, technical and business conferences, and has chaired panels and events with government ministers, senior executives and domain experts.