By Microsoft


WILDS builds on top of recent data collection efforts by domain experts in applications such as tumour identification, wildlife monitoring and poverty mapping, presenting a unified collection of datasets with evaluation metrics and train/test splits that the researchers believe are representative of real-world distribution shifts.

TensorFlow’s Fairness Evaluation and Visualization Toolkit

By Google


Fairness Indicators is designed to support teams in evaluating, improving, and comparing models for fairness concerns, in partnership with the broader TensorFlow toolkit. Google reports that the tool is actively used internally by many of its products.
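As a rough illustration of the slice-based evaluation Fairness Indicators performs (a concept sketch in plain Python, not the TensorFlow Model Analysis API), one can compute a metric such as false positive rate per subgroup:

```python
from collections import defaultdict

def fpr_by_slice(examples):
    """False positive rate per data slice.

    `examples` is an iterable of (slice_name, label, prediction) triples
    with binary labels and already-thresholded binary predictions.
    """
    fp = defaultdict(int)   # false positives per slice
    neg = defaultdict(int)  # ground-truth negatives per slice
    for slice_name, label, pred in examples:
        if label == 0:
            neg[slice_name] += 1
            if pred == 1:
                fp[slice_name] += 1
    return {s: fp[s] / n for s, n in neg.items() if n > 0}

data = [
    ("group_a", 0, 1), ("group_a", 0, 0), ("group_a", 1, 1),
    ("group_b", 0, 1), ("group_b", 0, 1), ("group_b", 1, 0),
]
print(fpr_by_slice(data))  # {'group_a': 0.5, 'group_b': 1.0}
```

Large gaps between slices (as for the hypothetical groups above) are what the toolkit's visualizations are meant to surface.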

The Alan Turing Institute


By World Economic Forum


This resource for boards of directors consists of: an introduction; 13 modules intended to align with traditional board committees, working groups and oversight concerns; and a glossary of artificial intelligence (AI) terms.



Shapash

Shapash is a Python library which aims to make machine learning interpretable and understandable by everyone. It provides several types of visualization that display explicit labels that everyone can understand.

The AI Incident Database (AIID)

By Partnership on AI


The AI Incident Database (AIID) is a systematized collection of incidents where intelligent systems have caused safety, fairness, or other real-world problems.

The Building Data Genome 2 (BDG2) Data-Set

By Kaggle


BDG2 is an open data set made up of 3,053 energy meters from 1,636 buildings.

The LinkedIn Fairness Toolkit (LiFT)

By LinkedIn


The LinkedIn Fairness Toolkit (LiFT) is a Scala/Spark library that enables the measurement of fairness in large scale machine learning workflows. The library can be deployed in training and scoring workflows to measure biases in training data, evaluate fairness metrics for ML models, and detect statistically significant differences in their performance across different subgroups. It can also be used for ad-hoc fairness analysis.
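LiFT itself is a Scala/Spark library, but the core idea of comparing a metric across subgroups can be sketched in a few lines of Python (a toy demographic-parity gap, not LiFT's API):

```python
def positive_rate(preds):
    """Fraction of positive (1) predictions."""
    return sum(preds) / len(preds)

def demographic_parity_gap(preds_a, preds_b):
    """Absolute gap in positive-prediction rates between two subgroups;
    0 means the model selects both groups at the same rate."""
    return abs(positive_rate(preds_a) - positive_rate(preds_b))

print(demographic_parity_gap([1, 1, 0, 0], [1, 0, 0, 0]))  # 0.25
```

LiFT additionally attaches statistical significance tests to such gaps, so small differences on small subgroups are not over-interpreted.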



StereoSet

StereoSet is a dataset for measuring stereotype bias in language models. It consists of 17,000 sentences that measure model preferences across gender, race, religion, and profession.
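The scoring idea can be illustrated as follows: for each stereotype/anti-stereotype sentence pair, check which one the model assigns the higher likelihood; an unbiased model would prefer each about half the time. The `score_fn` below is a hypothetical stand-in for a language model's sentence score:

```python
def stereotype_preference_rate(pairs, score_fn):
    """Fraction of (stereotypical, anti-stereotypical) sentence pairs
    where the model scores the stereotypical sentence higher.
    A value near 0.5 indicates little stereotype preference."""
    preferred = sum(1 for stereo, anti in pairs if score_fn(stereo) > score_fn(anti))
    return preferred / len(pairs)

# Toy stand-in scorer: pretend longer sentences are more likely.
pairs = [("aa bb cc", "aa bb"), ("x", "x y z"), ("m n o p", "m n")]
print(stereotype_preference_rate(pairs, score_fn=len))
```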

SHAP (SHapley Additive exPlanations)


A game theoretic approach to explain the output of any machine learning model.
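The game-theoretic idea is the Shapley value: a feature's attribution is its average marginal contribution over all subsets of the other features. For tiny feature sets this can be computed exactly (a brute-force sketch; the SHAP library uses much faster model-specific approximations):

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value_fn):
    """Exact Shapley values for a set-valued function `value_fn`:
    phi_f = sum over subsets S not containing f of
            |S|! * (n - |S| - 1)! / n! * (value_fn(S + {f}) - value_fn(S))
    """
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (value_fn(set(subset) | {f}) - value_fn(set(subset)))
        phi[f] = total
    return phi

# For an additive model, the attributions recover each feature's contribution.
contrib = {"age": 1.0, "income": 2.0, "tenure": -0.5}
model = lambda present: sum(contrib[f] for f in present)
print(shapley_values(list(contrib), model))
```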

Themis-ml: A Fairness-aware Machine Learning Interface for End-to-end Discrimination Discovery and Mitigation


themis-ml is a Python library built on top of pandas and sklearn that implements fairness-aware machine learning algorithms...
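One mitigation strategy in this family is reject-option classification: predictions inside an uncertainty band around the decision threshold are reassigned in favor of the disadvantaged group. A minimal sketch of the idea (not themis-ml's actual API):

```python
def reject_option_predict(scores, groups, protected, threshold=0.5, margin=0.1):
    """Post-process scores: inside the uncertainty band around `threshold`,
    give the favorable label (1) to the protected group and the unfavorable
    label (0) otherwise; outside the band, threshold normally."""
    preds = []
    for score, group in zip(scores, groups):
        if abs(score - threshold) <= margin:
            preds.append(1 if group == protected else 0)
        else:
            preds.append(1 if score > threshold else 0)
    return preds

print(reject_option_predict(
    scores=[0.55, 0.55, 0.9, 0.2],
    groups=["a", "b", "b", "a"],
    protected="a",
))  # [1, 0, 1, 0]
```

Only the borderline cases (the two 0.55 scores) are affected; confident predictions pass through unchanged.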

Themis Fairness Testing: Testing Software for Discrimination


This paper defines software fairness and discrimination and develops a testing-based method for measuring if and how much software discriminates, focusing on causality in discriminatory behavior...
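The paper's causal notion of discrimination can be illustrated by varying only the protected attribute and checking whether the decision changes; the toy `model` below is a hypothetical decision function, not from the paper:

```python
def causal_discrimination_rate(model, inputs, attr, values):
    """Fraction of inputs whose decision changes when only `attr` is
    varied over `values`, with every other feature held fixed."""
    changed = 0
    for x in inputs:
        outcomes = {model({**x, attr: v}) for v in values}
        if len(outcomes) > 1:
            changed += 1
    return changed / len(inputs)

# Hypothetical model that (unfairly) uses gender in its decision.
model = lambda x: 1 if x["income"] > 50 and x["gender"] == "m" else 0
inputs = [{"income": 60, "gender": "m"}, {"income": 40, "gender": "f"}]
print(causal_discrimination_rate(model, inputs, "gender", ["m", "f"]))  # 0.5
```

Because everything but the protected attribute is held fixed, any output change is attributable to that attribute alone.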

MinDiff

By Google


MinDiff is a framework from Google for mitigating (but not eliminating) unfair biases when training AI and machine learning models. Google says MinDiff is the culmination of years of work and has already been incorporated into various Google products, including models that moderate content quality.
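Conceptually, MinDiff adds a regularization term that penalizes differences between the score distributions of two slices of data (the real framework uses an MMD-style loss inside TensorFlow). A toy version that penalizes only the gap in mean scores:

```python
def mindiff_penalty(scores_a, scores_b):
    """Toy MinDiff-style regularizer: squared gap between the mean
    predicted scores of two data slices."""
    mean_a = sum(scores_a) / len(scores_a)
    mean_b = sum(scores_b) / len(scores_b)
    return (mean_a - mean_b) ** 2

def total_loss(task_loss, scores_a, scores_b, weight=1.0):
    """Combine the primary task loss with the fairness penalty."""
    return task_loss + weight * mindiff_penalty(scores_a, scores_b)

print(mindiff_penalty([1.0, 0.5], [0.5, 0.0]))          # 0.25
print(total_loss(0.5, [1.0, 0.5], [0.5, 0.0], weight=2.0))  # 1.0
```

Minimizing the combined loss pushes the model to score both slices similarly while still optimizing the primary task.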


By EqualAI


EqualAI helps companies reduce bias in their AI by addressing each touchpoint and the full spectrum of the issue.

ML Privacy Meter


Machine Learning Privacy Meter: A tool to quantify the privacy risks of machine learning models with respect to inference attacks, notably membership inference attacks.
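The simplest attack in this family thresholds the model's loss on a record: models tend to have lower loss on records they were trained on. A self-contained sketch of the idea (not the Privacy Meter API):

```python
import math

def nll(prob_true_class):
    """Negative log-likelihood the model assigns to the true class."""
    return -math.log(prob_true_class)

def is_member_guess(prob_true_class, threshold):
    """Guess 'member' when the loss on the record is below the threshold."""
    return nll(prob_true_class) < threshold

def attack_accuracy(member_probs, nonmember_probs, threshold):
    """How often the loss-threshold attack classifies records correctly."""
    correct = sum(is_member_guess(p, threshold) for p in member_probs)
    correct += sum(not is_member_guess(p, threshold) for p in nonmember_probs)
    return correct / (len(member_probs) + len(nonmember_probs))

# Hypothetical confidences: high on training members, lower on non-members.
print(attack_accuracy([0.99, 0.95], [0.5, 0.4], threshold=0.1))  # 1.0
```

An attack accuracy well above 0.5 indicates the model leaks membership information; tools like Privacy Meter quantify this leakage systematically.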

PowerTransformer

By Allen Institute for AI (AI2), University of Washington


PowerTransformer is a tool that aims to rewrite text to correct implicit and potentially undesirable bias in character portrayals.

Intel Geospatial

By Intel


The Intel Geospatial cloud platform aims to transform the way industries such as utilities, smart cities, and oil and gas manage their assets.

SafeLife: Avoiding Side Effects in Complex Environments

By Partnership on AI


SafeLife is a reinforcement learning environment designed to test an agent's ability to learn and act safely. The benchmark focuses on the problem of avoiding negative side effects. The SafeLife environment has complex dynamics, procedurally generated levels, and tunable difficulty. Each agent is given a primary task to complete, but there's a lot that can go wrong! Can you train an agent to reach its goal without making a mess of things?

Medical Open Network for AI (MONAI)



The MONAI framework is the open-source foundation being created by Project MONAI. MONAI is a freely available, community-supported, PyTorch-based framework for deep learning in healthcare imaging. It provides domain-optimized foundational capabilities for developing healthcare imaging training workflows in a native PyTorch paradigm.

Open Differential Privacy

By Microsoft, Harvard University


This toolkit uses state-of-the-art differential privacy (DP) techniques to inject noise into data, to prevent disclosure of sensitive information and manage exposure risk...
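The canonical example of such noise injection is the Laplace mechanism: adding noise with scale sensitivity/epsilon to a numeric query gives epsilon-differential privacy. A minimal sketch of the mechanism (not this toolkit's API):

```python
import math
import random

def laplace_mechanism(true_value, sensitivity, epsilon, rng=random):
    """Release `true_value` plus Laplace(sensitivity / epsilon) noise,
    sampled by inverting the Laplace CDF from a uniform draw."""
    scale = sensitivity / epsilon
    u = rng.random() - 0.5          # uniform on [-0.5, 0.5)
    sign = 1.0 if u >= 0 else -1.0
    noise = -scale * sign * math.log(1.0 - 2.0 * abs(u))
    return true_value + noise

# Noisy count with sensitivity 1 (adding or removing one person
# changes the true count by at most 1).
print(laplace_mechanism(true_value=1000, sensitivity=1.0, epsilon=0.5))
```

Smaller epsilon means stronger privacy but more noise; production toolkits like this one also track cumulative privacy loss across queries.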

eXplainability Toolbox

By The Institute for Ethical AI & Machine Learning


XAI is an eXplainability toolbox for machine learning.