Blossom Metevier


I am an M.S./Ph.D. student in the Autonomous Learning Lab at the University of Massachusetts, where Phil Thomas advises me. I study ways to ensure safety and fairness in reinforcement learning and bandits. Specifically, I am interested in designing (practical) machine learning algorithms with guarantees on behavior and performance.


Before attending UMass, I was a member of the Force Projection Sector at the Johns Hopkins University Applied Physics Laboratory (APL). I received my undergraduate degree in Computer Science from the University of Maryland, Baltimore County, where I also competed as a track & field athlete. Throughout my undergraduate career, I was mentored by Marie desJardins and coach David Bobb.


Contact information: bmetevier [at] cs [dot] umass [dot] edu

Recent News

NERDS2020 | Website

Scott Jordan and I are organizing the first Northeast Reinforcement Learning and Decision Making Symposium (NERDS)!

Publications


Fairness Guarantees under Demographic Shift
Stephen Giguere, Blossom Metevier, Yuriy Brun, Philip S. Thomas
In Submission

Abstract

Recent studies have demonstrated that using machine learning for social applications can result in racist, sexist, and otherwise unfair and discriminatory outcomes, which can lead to social injustice. While machine learning algorithms exist that provide assurances that unfair behavior does not take place, these approaches typically assume that the data used for training is representative of what will be encountered once the model is deployed. In particular, if certain subgroups of the population are more or less probable after the model is deployed (a phenomenon we call demographic shift), the guarantees provided by prior algorithms are often violated. In this paper, we consider the impact of demographic shift and present Shifty, a new algorithm that provides high-confidence behavioral guarantees despite this shift. We then evaluate Shifty on a real-world data set of university entrance exams and subsequent student success, and demonstrate that Shifty avoids bias, even when the student distribution changes after training, while existing methods exhibit bias. Our experiments demonstrate that the high-confidence fairness guarantees of our algorithm are valid in practice, and that our algorithm is an effective tool for training models that are fair when demographic shift occurs.


Reinforcement Learning When All Actions are Not Always Available
Yash Chandak, Georgios Theocharous, Blossom Metevier, Philip S. Thomas
Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI 2020)

Abstract | arXiv

The Markov decision process (MDP) formulation used to model many real-world sequential decision making problems does not capture the setting where the set of available decisions (actions) at each time step is stochastic. Recently, the stochastic action set Markov decision process (SAS-MDP) formulation has been proposed, which captures the concept of a stochastic action set. In this paper we argue that existing RL algorithms for SAS-MDPs suffer from divergence issues, and present new algorithms for SAS-MDPs that incorporate variance reduction techniques unique to this setting, and provide conditions for their convergence. We conclude with experiments that demonstrate the practicality of our approaches using several tasks inspired by real-life use cases wherein the action set is stochastic.


Offline Contextual Bandits with High Probability Fairness Guarantees
Blossom Metevier, Stephen Giguere, Sarah Brockman, Ari Kobren, Yuriy Brun, Emma Brunskill, Philip S. Thomas
Advances in Neural Information Processing Systems (NeurIPS 2019)

Abstract

We present RobinHood, an offline contextual bandit algorithm designed to satisfy a broad family of fairness constraints. Unlike previous work, our algorithm accepts multiple fairness definitions and allows users to construct their own unique fairness definitions for the problem at hand. We provide a theoretical analysis of RobinHood, which includes a proof that it will not return an unfair solution with probability greater than a user-specified threshold. We validate our algorithm on three applications: a tutoring system in which we conduct a user study and consider multiple unique fairness definitions; a loan approval setting (using the Statlog German credit data set) in which well-known fairness definitions are applied; and criminal recidivism (using data released by ProPublica). In each setting, our algorithm is able to produce fair policies that achieve performance competitive with other offline and online contextual bandit algorithms.


High-Probability Guarantees for Offline Contextual Bandits
Blossom Metevier, Stephen Giguere, Sarah Brockman, Ari Kobren, Yuriy Brun, Emma Brunskill, Philip S. Thomas
(Poster and Spotlight Presentation) 4th Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM 2019)

Abstract

We present RobinHood, an offline contextual bandit algorithm designed to satisfy a broad family of fairness constraints. Unlike previous work, our algorithm accepts multiple fairness definitions and allows users to construct their own unique fairness definitions for the problem at hand. We provide a theoretical analysis of RobinHood, which includes a proof that it will not return an unfair solution with probability greater than a user-specified threshold. We validate our algorithm on three applications: a tutoring system in which we conduct a user study and consider multiple unique fairness definitions; a loan approval setting (using the Statlog German credit data set) in which well-known fairness definitions are applied; and criminal recidivism (using data released by ProPublica). In each setting, our algorithm is able to produce fair policies that achieve performance competitive with other offline and online contextual bandit algorithms.


Lexicase Selection Beyond Genetic Programming
Blossom Metevier, Anil Saini, Lee Spector
Genetic Programming Theory and Practice XVI (GPTP 2019)

Personal


Apart from the grad student grind, I enjoy running, reading, and badminton. I love Hemingway's theory of omission and Ursula K. Le Guin's alternative worlds. I am also a fan of the DC universe, specifically the Teen Titans, and I follow a number of Japanese comics. I also like playing with my cat and dog!