Emily Bembeneck:
Welcome, everyone. I'm Emily Bembeneck. I lead the Center for Applied AI here at Chicago Booth under the leadership of Sendhil Mullainathan, whom I'll introduce in a moment. To give you a sense of what we do, if you're new to our work, the center supports a variety of research and applied AI projects that focus on how AI can improve human lives and augment human decision-making through policy tools and our research.
Emily Bembeneck:
Over the past year we've been working on an algorithmic bias initiative, primarily focused on bias in healthcare, where we work with governmental organizations, healthcare institutions, and commercial entities to help them identify and then mitigate the bias that arises in their systems. So today I'm really excited to have a panel to talk about bias and AI at a high level, to hear how the panelists have been encountering and addressing issues like this in their own fields, and to learn from them.
Emily Bembeneck:
I'm going to introduce you to Sendhil. Sendhil Mullainathan is our faculty director here at the center. He's the Roman Family University Professor of Computation and Behavioral Science here at Chicago Booth. He's a literal genius, he's written some fantastic work on algorithms in healthcare, poverty, discrimination, and more, and I'm really excited to have him introduce everyone today.
Sendhil Mullainathan:
Thanks, Emily. And thanks to everyone who's joining to listen in, and especially to the panelists. I'm super excited for this conversation because algorithmic bias is a topic that I think is particularly important. You might've heard a lot about it, but I think despite the level of discussion, it's something that merits even more discussion. So I'm really eager to hear what the panelists have to say. Let me just give a quick introduction to everybody.
Sendhil Mullainathan:
Amy Strader has had a really interesting career. She lived through the AI winter in the 1990s at NASA, of all places. At the point when interest in AI was drying up, she moved to Booth, did a degree here, and for the past 15 years has been working at Microsoft, where she's had a front-row seat for seeing all of this stuff develop. Ben has the misfortune of having been a student of mine, but somehow he managed to survive that and actually thrive despite it, and is currently director of product marketing at H2O.ai. He has actually written a book, published by O'Reilly Media just a few months ago, which I hope you'll have a look at.
Sendhil Mullainathan:
So this is pretty exciting. Amy Hsueh heads partnerships for TensorFlow at Google. Prior to Google she worked at Nest and Time Warner Cable, and she graduated from Booth in 2013. Andrew, who's not with us yet, so I think I won't introduce him. Emily, should I? Probably not worth it. Okay.
Emily Bembeneck:
I'd wait until, if he comes in then—
Sendhil Mullainathan:
If he comes in I can do it. And finally Chenhao, whom I've known for quite some time, is currently a faculty member at CU Boulder but will actually be joining us in the CS department at Chicago very soon, and I was very, very happy to recruit him. He's worked on a variety of topics, largely in natural language processing, and has numerous distinctions as an academic, including an NSF CAREER Award. So I will hand it over to you, and I'm looking forward to the discussion.
Chenhao Tan:
Oh, I guess, are we just starting right away? Or is there any homework we need to do? Okay. Cool. Great. Thanks, Sendhil, for the introduction. Sendhil has told us a lot about the impressive work the panelists have done. Let's start by just sharing what you're working on now and how AI is impacting your industry. Why not start with Amy? We have two Amys, so I guess Amy Strader.
Amy Strader:
Okay, I'll go ahead and start. Right now I work at Microsoft. I'm in the sales organization, and my current interest is understanding how we can improve the efficiency of our sellers overall, reinventing our sales process, specifically for our cloud products, to improve the cost of sale but also increase our customer satisfaction. So when I think about the impact of AI on what I do, there are really two places.
Amy Strader:
There's us eating our own dog food, as we say at Microsoft, using our AI technology to improve the efficiency of what our sellers do every day. And then obviously there are the conversations our sellers have with the Microsoft customers who are using our technologies to implement AI within their own businesses.
Amy Hsueh:
And—
Chenhao Tan:
Yep. Amy, sure, you can go next.
Amy Hsueh:
Sure, yeah. Thank you all for having me. This is such an amazing time, I think, to be talking about this issue. Google Research, I think, is one of the pioneering departments that has really helmed AI. I look after the partnerships related to TensorFlow, which was open-sourced five years ago, right? We had a bunch of engineers and distinguished scientists at Google Research, including Jeff Dean, helming that effort.
Amy Hsueh:
And open sourcing was important for Google for a number of reasons, right? So first of all, internally, we apply machine learning to a number of different areas within all of our different apps. For example, if you use Gmail, you might notice some really nifty auto-complete suggestions now, depending on what you want to write. That's been really fun. And Google Cloud is leveraging AI in a number of different diagnostic areas.
Amy Hsueh:
For example, I believe they just announced several healthcare tools that enable basically discerning and capturing loose text and interpreting what that loose text string might imply for a diagnosis. You can see how AI is really front and center for Google. We've always wanted to be an AI-first company, and open-sourcing TensorFlow is really part of that mission. We wanted to democratize and open the toolkit to other practitioners, everyone from researchers all the way to hobbyists who want to play in this field.
Amy Hsueh:
And it's very much with an ecosystem approach that I look after TensorFlow. We are trying both to seed the market by building partnerships with companies like Microsoft and H2O.ai, and also to ensure that internally we're using the best of Google AI to power our own products and services.
Chenhao Tan:
Great, that was a very nice overview. What about you, Ben?
Ben Cox:
So you used a great word, and that's democratize. My company's motto is democratize AI. My title is director of product marketing, but a lot of what I do at H2O is around our machine learning interpretability suite, which is essentially our toolkit for explainable AI. It holds a lot of the post hoc analysis tools a data scientist would use to debug or do discrimination testing or fairness testing, along with explainable AI features. And so a lot of what I end up doing, as Sendhil mentioned, one part was writing a book.
Ben Cox:
So we have our post hoc analysis methods, but our view on bias and fairness in AI is that it's a much bigger picture: there's this whole supply chain of the data science workflow, and post hoc review for discrimination or disparities is just one piece of it. So a lot of what I spend time on is writing this book, and it's around the people, processes, and technologies that go into defining a more fair AI ecosystem. And that starts from the minute you start collecting or creating data all the way to model monitoring and review.
Ben Cox:
So we have one book on machine learning interpretability, which is specifically around what we think are the best methods for debugging or reviewing a model. But then we also have this book, which is around how you build that team, how you structure the projects, how you design that workflow, and how you build the technology that can appropriately nest all of these tasks in order to make better decisions as a data science unit.
Ben Cox:
And then it starts to get into the larger economics: who's impacted, what are the externalities, what are the network effects of your machine learning decisions, right? So that's a lot of what I spend time thinking about, writing about, and doing product strategy to encapsulate.
Chenhao Tan:
Great. That's a very nice segue; it's certainly getting ahead of our theme here. So the next question goes back a little: can you talk about what algorithmic bias looks like in your field? That is the theme today, and I'm sure a lot of the audience is interested in learning more about what you are seeing in practice. And Andrew has joined, I guess. Sendhil, you can maybe give a quick introduction. I guess not. So let's go back, and this time let's start with Ben.
Chenhao Tan:
And I guess Ben was already getting ahead and trying to solve the fairness issue.
Ben Cox:
Sure. So what fairness looks like is a really tough thing, right? We just put out a paper in partnership with BLDS, whose director of AI fairness, Nicholas Schmidt, is also a Booth alum, and with Discover Financial, because one area ... I have a finance background, and one area that I'm particularly passionate about is fair lending, right? We talk about the decisions that AI systems make, and not enough of the conversation goes into what the externalities and impacts of those decisions are, right?
Ben Cox:
And I think access to capital and lending is super, super important, and it gets a bad rap because the idea is just, oh, a credit card, who's going to get it, who's not? But I would assert that disparity in access to capital has created waves of impact and GDP loss over decades in society. That aside, what we spend time doing is defining what fairness is, right? And the things that we use are disparate impact analysis, where there are a couple of metrics, like marginal effect and standardized mean difference (SMD).
Ben Cox:
And you start to look at whether one category of a group, whether it's ethnicity, gender, or age, is being treated disparately, and whether we are seeing more false positives or more false negatives within that class. So that's how we look at fairness and disparity or discrimination at H2O. I should caveat that by saying, one, I'm not a lawyer and, two, I'm not an ethicist, so we try to tread very lightly here: by giving you access to disparate impact modeling methods, we're not saying, okay, your AI system is good to go.
Ben Cox:
This is just one piece in a much larger puzzle of solving, does this machine learning system even make sense? So that's where we're starting, and we think it's a good place to start. And one of the things we were talking about before everybody jumped on is that our hope with this paper with Discover and BLDS is that we get to the kind of summit or symposium where people say, "Okay, we agree that this is a definition of fairness." Because a lot of times we'll see in finance, well, there's no set definition, so we're not going to touch that yet.
Ben Cox:
So it has to start with everybody getting in a room and agreeing on what disparity does or doesn't look like in a machine learning decision. So that's where we start.
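To make the disparate impact testing Ben describes concrete, here is a minimal sketch, assuming a hypothetical table of model decisions, ground-truth labels, and a protected attribute. All column names, the toy data, and the four-fifths threshold are illustrative, not H2O's implementation:

```python
import pandas as pd

# Hypothetical scored population; column names are illustrative.
df = pd.DataFrame({
    "group":     ["A", "A", "A", "A", "B", "B", "B", "B"],
    "label":     [1, 0, 1, 0, 1, 0, 0, 1],   # ground truth (1 = good outcome)
    "predicted": [1, 0, 1, 1, 0, 0, 0, 1],   # model decision (1 = approve)
})

def group_rates(frame):
    """Selection rate plus false positive/negative rates for one group."""
    negatives = frame[frame["label"] == 0]
    positives = frame[frame["label"] == 1]
    return pd.Series({
        "selection_rate": frame["predicted"].mean(),
        "fpr": negatives["predicted"].mean(),        # approved bad outcomes
        "fnr": (1 - positives["predicted"]).mean(),  # denied good outcomes
    })

rates = df.groupby("group")[["label", "predicted"]].apply(group_rates)
print(rates)

# Adverse impact ratio: each group's selection rate relative to the
# most-selected group; below 0.8 is the common four-fifths heuristic flag.
air = rates["selection_rate"] / rates["selection_rate"].max()
print(air[air < 0.8])
```

The four-fifths rule is only a screening heuristic from employment law, not a legal test of discrimination, which is exactly why Ben hedges that these metrics are one piece of a larger puzzle.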
Amy Strader:
Yeah. I think what you said is very true. There's no definition of fairness that will fit all situations, right? Part of it is what's fair in the context of the problem you're trying to solve and the audiences you're trying to serve. Fairness is a concept that philosophers have struggled with for hundreds of years, right? So to say that we're going to have a fairness algorithm that we can apply to the work we're doing is a bit ambitious, or maybe egotistical of us, right?
Amy Strader:
So the way that we look at it at Microsoft is that we want to minimize any harm that could be done by bias in a system, right? And that takes into account the audience as well as what you're trying to accomplish. But it's impossible to minimize bias on every dimension; it's a difficult problem. So we look at it in terms of, as you were just saying, where's the harm, and who are the audiences that we're serving?
Emily Bembeneck:
Hey, Andrew has joined us, so I just want to take a second to welcome him to the conversation. Sendhil, do you want to take a second to introduce him?
Sendhil Mullainathan:
Yeah. When you asked earlier, Chenhao, I didn't have my notes, so I pulled my notes back up. Andrew Zaldivar, sorry, I'm sure I'm saying your name incorrectly, Andrew Zaldivar—
Andrew Zaldivar:
No, you got it.
Sendhil Mullainathan:
I got it? Oh, okay. Next you have to say my name. That's the deal. No, no, no. Andrew is a senior developer relations engineer on the Ethical AI team, so he's going to be ideally suited for this conversation. I'm really glad you could join us, Andrew.
Andrew Zaldivar:
Thank you very much and sorry for the hiccups here. I am experiencing a power outage, but I suppose this will do for now. Thanks again, and I appreciate everyone's patience.
Chenhao Tan:
Thank you for joining. It's great to have you. I guess the question we're on right now is what algorithmic bias looks like in practice in your field or industry. And what we discussed briefly before was what each of you is working on right now. So maybe Amy Hsueh can go on with the current question, and then you can jump in and tell us what you're working on and what algorithmic bias might look like in your industry.
Andrew Zaldivar:
Okay, cool. Cool.
Chenhao Tan:
You can go ahead.
Amy Hsueh:
You can go ahead, Andrew. Please.
Chenhao Tan:
Andrew, go ahead then.
Andrew Zaldivar:
Okay. My apologies. I'm dialing in by phone, not by Zoom, so I can't see everyone's little green box that indicates who's talking. So if I rudely interject over someone, I'm happy to put my mic on mute for a moment and then—
Chenhao Tan:
No worries. Just go ahead and tell us what you're working on and what algorithmic bias may look like in your industry.
Andrew Zaldivar:
Okay. So I caught the tail end of the last speaker, whom I don't know because I can't see, talking about notions of fairness and some of the challenges there, and the same applies in my domain. One thing I will add is that part of that challenge comes from the current state of how AI, and data technologies in general, are being taught to soon-to-be practitioners.
Andrew Zaldivar:
And so there's a large knowledge gap that needs to be closed between practitioners and those who have been studying, investigating, and identifying guidance around how to best navigate this really challenging question: how does one define fairness, and then how does one even go about identifying notions of unfairness in either their data or their trained algorithm?
Andrew Zaldivar:
And so that's the role that I play: I try to sit between the larger community and the various research and product teams at Google that are tackling these challenges, creating these tools and resources, and publishing papers, and then go one step further to responsibly democratize all of these new technologies and all of this new guidance to the broader community, in order for all of us to collectively help AI evolve ethically over time.
Andrew Zaldivar:
Now, the other part of the question was what I'm focusing on recently. I've been involved in some of the transparency efforts going on at Google. I happened to be one of the coauthors on the Model Cards paper, which has since been turned into a resource that is available in TensorFlow, and many other institutions, organizations, and companies outside of Google have adopted Model Cards as well. And basically the way that I would describe it is: think of it as a boundary object.
Andrew Zaldivar:
Just an artifact that can help teams, groups, etc., understand the contextual nuance of the technology they're building, whether it's the model itself or the data accompanying the model. Either way, a lot of these problems with regard to harm and the amplification of very problematic systemic biases in algorithms stem from a lack of context around how the data and the model were created.
Andrew Zaldivar:
And then those data sets and those models get used in contexts for which they're just not well suited. And so we realized that practitioners, non-practitioners, technologists, non-technologists, because everyone's involved in the process, it's not just a sole engineer, we're talking teams, and even those impacted by the outcomes of these algorithms, should all be able to participate. And yet nothing exists to enable them to participate.
Andrew Zaldivar:
So we're hoping that the transparency effort here, by way of Model Cards and similar efforts that are going on, will, as I said earlier, help to close that knowledge gap and tighten the relationships between the various people involved in creating these automated decision-making systems.
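As a rough illustration of what such a boundary object contains, here is a hypothetical Model Card skeleton whose section names follow the headings of the Model Cards paper Andrew mentions; every field value is an invented placeholder, not a real Google artifact:

```python
# Hypothetical Model Card skeleton; section names follow the
# "Model Cards for Model Reporting" paper, values are placeholders.
model_card = {
    "model_details": {
        "name": "toy-text-classifier",
        "version": "0.1",
        "owners": ["ml-team@example.com"],
    },
    "intended_use": {
        "primary_uses": ["routing customer support tickets"],
        "out_of_scope_uses": ["medical or legal decisions"],
    },
    # Groups the evaluation should be disaggregated across.
    "factors": ["language variety", "age band", "gender"],
    "metrics": ["accuracy", "false positive rate per factor"],
    "evaluation_data": "description of held-out data, not the data itself",
    "training_data": "provenance and known gaps in the training corpus",
    "ethical_considerations": "known risks, e.g. dialect misclassification",
    "caveats_and_recommendations": "re-validate before any new use context",
}
```

The point of the artifact is the context it forces the team to write down: if a downstream user's setting is not covered by the intended use and factors, that mismatch is exactly the "wrong context" failure Andrew describes.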
Chenhao Tan:
Great. I'm a big fan of the Model Cards work. Great to know. Amy, what about you? I just realized that both of you are from Google, so maybe you actually have a [inaudible 00:20:16].
Amy Hsueh:
Yes, we're both from Google, but Andrew sits more on the research side and I sit a bit more on the infrastructure side. So I would add, very quickly, that in addition to the Model Cards work from Andrew's team now being open-sourced into TensorFlow, there are other tool sets as well. But taking a step back, what I would say is that what I enjoy about every company's approach to building AI systems is there's a...
Amy Hsueh:
I think it's because of the influence of researchers being involved in the productionization of AI, meaning they're either at companies that are creating AI systems or they're researchers sitting within large companies like Microsoft and Google, that I find there's a real thoughtfulness in how these AI systems are being built.
Amy Hsueh:
And because of this thoughtfulness, we are coming through with actual analytics and tool sets, like Model Cards, Fairness Indicators, and the What-If Tool, for example, to enable practitioners to start implementing and testing whether or not their AI system includes bias. I think what's really fascinating about the machine learning workflow is that, because a human is in the loop at almost every step of the way, bias can be introduced at any level, right?
Amy Hsueh:
So from a partnerships perspective, what I'm thinking about first of all, as I'm building partnerships in the ecosystem, is whether I'm working with companies that have an aware and thoughtful approach to understanding and limiting bias in the solutions they provide. Recently we've gotten really close with three startups in this space. One is called DarwinAI. They do a lot of explainability and interpretability of models.
Amy Hsueh:
So they're thinking about bias in terms of auditing, being able to understand what an algorithm did, and then, the double-click on that, whether or not the algorithm itself is biased. They actually recently published a paper on auditing the ImageNet data set, which, my understanding is, has been frequently used in a number of computer vision tasks. And through their technology and platform they found that 40% of the data set was identified as female.
Amy Hsueh:
Males aged 15 to 29 were something like 27% of the data set. So they were already able to unpack how a data set that's used heavily for computer vision tasks potentially carries bias because it didn't have representative examples. Another company that I work with is called Determined AI, and they basically enable deep learning training to happen on their platform via hyperparameter tuning.
Amy Hsueh:
And one of the cofounders of Determined AI is currently in the machine learning department at CMU. So what they're trying to understand is the overfitting issue that happens within modeling: when you overfit, are you then undertuning something such that it gives rise to bias in the outcome? And the last company I'll speak about is called Labelbox, and they're really focused on attacking bias on the data labeling side of the problem.
Amy Hsueh:
Meaning they're trying to put in flags for anyone using their platform to understand whether or not bias is occurring at the labeling stage. So these are three very different companies, but they're all very aware of how ML bias can occur, and because of that knowledge, and sometimes because of their touch points with researchers, they're able to start implementing checks and putting them in their workflows. So I appreciate seeing how folks are identifying the gaps and are able to stop bias in the process.
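The representation audits Amy describes often begin with nothing fancier than disaggregated counts over dataset metadata. A toy sketch, where the columns and labels are hypothetical and not the actual ImageNet audit:

```python
import pandas as pd

# Hypothetical image metadata; in a real audit these annotations would
# themselves need scrutiny for labeling bias.
meta = pd.DataFrame({
    "perceived_gender": ["female", "male", "male", "female", "male", "male"],
    "age_band":         ["30-44", "15-29", "15-29", "45+", "15-29", "30-44"],
})

# Share of the data set by subgroup; large gaps flag representation
# problems before any model is ever trained on the data.
shares = meta.value_counts(["perceived_gender", "age_band"], normalize=True)
print(shares.sort_values(ascending=False))
```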
Chenhao Tan:
Oh, thanks, Amy. Actually, just a side note about the general Q&A session: attendees can type questions in the chat window, and there are already some questions coming in. Other attendees can also like these questions so that we can have some kind of ranking among them. And what Emily and I discussed earlier was that we'll go through some questions that we prepared for maybe the first 45 minutes, and then we'll do some Q&A from the audience.
Chenhao Tan:
So it would be great if you could like the questions that sum up what you want answered. And one of them actually really connects to the theme everyone's touched on just now. Amy was just talking about different roles, for instance, the role researchers can play in this process, and democratization has been a key theme. The question I had in mind was what role different entities can play in the process of developing fairer algorithms.
Chenhao Tan:
For instance, the government; this is a question that was raised by Setco, I guess I tried my best, [inaudible 00:25:23] in the Q&A window as well. And I was also personally interested in what role you think universities, or academia in general, can play in addition to industry. And I guess this time ... Ben just unmuted himself, so we can start with Ben.
Ben Cox:
Yeah, I'll respond, because one of the projects I'm working on is alongside academia: a group at Carnegie Mellon, Data Science for Social Good, that was originally at the University of Chicago. The project is around predictive fairness in criminal justice decisions, right? So if we say there's disparity in certain parts of criminal justice decisions, how do we start to unwind that with machine learning? How can machine learning intervene and create more equity and fairness across decision-making?
Ben Cox:
Again, what fairness even means is a big portion of the writing that they did. But then also, how can we apply machine learning to reduce recidivism? And one of the reasons I think academia and academics play a big role in this is that they're not interested in creating the model with the most lift for a company, right? I was a consultant in data science before, and a lot of times fairness was not a thing that came up. That's why I got into this space, because our clients would never ask us, "Okay, but where's the bias?"
Ben Cox:
They would just ask, "How many people can I fire?" And that's not the right question. That's probably the worst question. But what I appreciate is that data scientists, and I think there were some comments from Andrew about this before, I'll take it a step further and say data scientist is a pretty bad name for a job, because these aren't trained scientists, right? These are a lot of folks who come from different backgrounds and have fallen into this space.
Ben Cox:
And so there are not many data scientists who observe the scientific method, right? We have people who build machine learning systems without experience in academic or behavioral science, in how you design an experiment from the start. So we end up with a lot of confirmation bias in our projects, because we didn't start from a place of actually designing an experiment that made sense, one that was trying to get to the truth rather than to a result.
Ben Cox:
So academia does this in a good way, right? In Sendhil's class, one of the things we talked about is that the algorithm in a lot of cases doesn't matter; it's how you set up the problem, how you frame it, what the data coming in is, how you structured it. And that's where academia does a really, really good job. It's really exciting, at least for me in industry, to see these grassroots efforts in academia and researchers at places like Google, Facebook, and H2O coming together to work on projects that create things like K-LIME, Shapley values, or disparate impact analysis.
Ben Cox:
Because it's cannibalistic to our business model in general, right? We're saying, ah, this is a little more rigorous than we act like it is in software sales. So it's really awesome, because academics have nothing to prove to a corporate entity, which means they're coming at it from an unbiased perspective. And that's where I think they play a really big role in getting to the heart of what we're trying to solve in algorithmic bias.
Amy Strader:
I think academia clearly has a huge role to play, but as the tools get easier to use and more widely used, awareness of bias and fairness in AI, and of how you can control for it, can't be limited to academics, right? For example, at my company every employee has to take training on fairness in AI, whether you're directly working on it or not, whether you're in marketing or sales or implementing it for people. That awareness and understanding that ...
Amy Strader:
Fairness isn't an ingredient that you add at the end of the system; it's something you have to think about when you define the task, when you define the data set, when you train the model. Every aspect of it is something you think about. So certainly there's a huge role for academia, but I think one of the challenges is how you take that awareness and care and translate it into what people are doing, who, as you say, are data scientists but aren't really scientists.
Chenhao Tan:
Sorry, I guess another side of that question, we were [inaudible 00:29:58], was also interested in the role of government in this process, and potentially, I guess, regulation. But I guess we can go on with Andrew, or maybe this time let's go on with Amy Hsueh, and then Ben and Amy can unpack it if you have anything else on the commerce side.
Andrew Zaldivar:
So if I heard that correctly, did you want me to jump in on that and give you a point about—
Chenhao Tan:
Yeah, go ahead. I guess you have seen a different side of it. Yeah, go ahead.
Andrew Zaldivar:
My apologies again. So yes, you mentioned regulation. I think that's a very important component of all of this. I am of the strong belief that in order to best ensure that data and AI technologies continue to steer in the ethical direction, it has to be bottom-up. And by that I mean we have to look at it from a holistic perspective, because the reality is that we live in a pluralistic society, right?
Andrew Zaldivar:
And even things that we can all come to a consensus on as being "fair" may not apply or make sense in a completely different context. So how, then, do you contend with these different notions of fairness across different societal and cultural norms? And so I think a better way, or maybe not a better way, but certainly a method, is one by which we can get involvement from the bottom up, so that it's not just engineering and data scientists and researchers, but also those in the community who are trying to ensure that these technologies don't propagate erasure or any other form of harm that could arise, whether it's representational harm or allocative harm and things like that.
Andrew Zaldivar:
And this speaks to why I value transparency and education in this space. Researchers play a pivotal role here too, just as everyone else does, in that they can often serve as the champion voice of those respective communities. And that can help us come to the realization that, for example, maybe these algorithms shouldn't even be used in certain contexts, like the criminal justice system, right? Things like that, where ...
Andrew Zaldivar:
It goes even beyond fairness; it's just understanding the landscape and coming to the realization that maybe this technology shouldn't exist in this domain, period. And to come to that consensus, you will need regulation, you will need elected representatives, you will need people's voices, but you also need the data scientists, the researchers, the engineers, product management, and designers especially. All of them should be able to have their take on it, irrespective of even being an ethicist.
Andrew Zaldivar:
Because we all have our personal ethics and our personal morals, but we also have organizational ethics and organizational morals, and then we have societal-level ethics and societal-level norms. And all of that has to interplay. And that's where the real challenge is.
Ben Cox:
I'll jump in, just to add to my original point on the government question. I wouldn't take this stance on a lot of other topics, but I think regulation is very much needed here. We did a big paper about what we see in various global regulatory and oversight guidances. And the reality is that machine learning is technology being used to make a lot of predictions that materially impact people's lives and the people around them, and a lot of these systems are being built without much real expertise involved.
Ben Cox:
Additionally, there's not a good infrastructure for when a machine learning system fails. Finance has SR 11-7, right? If a model breaks or makes predictions haphazardly, there is accountability. There isn't that right now in a lot of machine learning: if your model predicts that somebody shouldn't get a loan and that's a false negative, who's going to be held accountable? There's not really a system for that.
Ben Cox:
And there's not really any way for a government agency to come in and say, "Okay, your system has been designed haphazardly." And the problem is that, similar to nuclear energy or air travel, there are real people who stand to benefit from or be harmed by these systems. There's just not a good system in place right now, and there needs to be a better one. I would argue that certain jurisdictions have made progress: GDPR put a really good framework in place.
Ben Cox:
Singapore's PDPC offers really good oversight guidance on how to start thinking about regulating machine learning. And the reality is that a lot of the proposed guidance ... I like the PDPC's approach a lot, and the FTC has guidance in the States. Things like the right to appeal, the right to explanation, or the right to know if there's a third-party system in the mix are not things that will reduce your top or bottom line by some extreme amount.
Ben Cox:
They're just very basic things that you could implement within an org to really do some CYA in case a regulatory authority comes in and says, "Wow, you have really put a lot of risk on the people that you supposedly serve." That's my two cents.
Chenhao Tan:
Amy Hsueh, just to make sure you have a chance.
Amy Hsueh:
Yeah. I really appreciate what everyone has said, and I would only add that technology companies have a very strong role to play in helping regulators understand what we are developing. I think it's a massive failure that it took this long for legislators to understand the digital advertising model, for example, right? They should not have been learning how digital advertising worked for the first time at a congressional hearing three years ago.
Amy Hsueh:
We should have been ... We collectively as an industry ought to have built that into our conversations with regulators, to help them understand what we're doing so that they have the opportunity to ask their questions ahead of time, right? I think regulatory philosophy in this area is really being driven by the EU currently, and I suspect they will continue to lead the charge in holding industry accountable for what we're developing.
Amy Hsueh:
And it's really on technology companies to bridge the gap and make sure folks understand what is good and what is not, and then to have an open conversation about what's allowable versus not.
Chenhao Tan:
Great. We have a lot of fantastic questions, so I'll jump to the audience questions now. Aden asks: I know you are not lawyers and cannot define what is fair, so it's difficult to personally recommend actions that make models or data fair, but most of the discussion so far has been around how to measure bias, which is a great first step. What are some specific ways that you have seen companies take action to produce more equitable outcomes in their algorithms and data?
Chenhao Tan:
I guess the question is about the specific steps that you or other companies are taking to ensure equitable outcomes.
Ben Cox:
I can jump in here. So there are some good tools, right? There are two things that I really, really like. One is actually a tool that is very heavily used by a Booth alum, Nick Schmidt at BLDS, an economic consulting firm: the Pareto frontier of fairness and accuracy, right? You can build a system where you have all of these models that you've created and there's this leaderboard of accuracy, and you can put accuracy on the X axis and average fairness on the Y axis, right?
Ben Cox:
And so you can plot the frontier of all of the models and see that there is a universe of models that perform right where you want them to while having a certain level of fairness. And then you can decide, okay, what's the trade-off between a model that's more fair on average versus the accuracy trade-off. So given a necessary accuracy level, you can make the most fair choice on average, right? There are tools like that, where you plot the interaction between accuracy and fairness.
Ben Cox:
And then I think there are some really good tools in adversarial de-biasing and adversarial testing, where you can say, obviously our first metric is AUC or whatever your accuracy metric is, but then add a secondary metric indexed on disparate impact across a certain age or ethnicity, so that the disparate impact comes out fair. So there are nice adversarial de-biasing and testing methods, and there's the disparate [inaudible 00:40:02] of fairness and accuracy.
Ben Cox:
And I think those are two really good steps forward. There are also models that boost adversarially, and stuff like that. I'm at a tech shop, so that's what we're looking at and what we build; from a technology perspective, that's where we start.
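A minimal sketch of the frontier idea Ben describes: score candidate models on accuracy and on some higher-is-better fairness summary, keep the Pareto-efficient ones, and choose the fairest model that clears an accuracy floor. The model names and numbers below are invented:

```python
# Each candidate: (name, accuracy, fairness summary), both higher-is-better.
candidates = [
    ("gbm_1", 0.91, 0.62), ("gbm_2", 0.89, 0.78), ("glm_1", 0.84, 0.95),
    ("rf_1",  0.88, 0.70), ("glm_2", 0.86, 0.90), ("gbm_3", 0.88, 0.81),
]

def pareto_frontier(models):
    """Keep models not dominated on both accuracy and fairness."""
    return [m for m in models
            if not any(a >= m[1] and f >= m[2] and (a, f) != (m[1], m[2])
                       for _, a, f in models)]

frontier = pareto_frontier(candidates)
print("frontier:", sorted(frontier, key=lambda m: m[1]))

# Business rule: among frontier models with accuracy >= 0.88, take the
# one with the best fairness summary (here gbm_3, not the most accurate).
eligible = [m for m in frontier if m[1] >= 0.88]
print("chosen:", max(eligible, key=lambda m: m[2]))
```

The design point is that the trade-off becomes an explicit, reviewable choice of operating point rather than an accident of whichever model topped the accuracy leaderboard.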
Andrew Zaldivar:
Those are awesome. If I may, I'd like to add one more type of technology to that mix. This comes from research work, not just within Google but collectively within the ML community, and it's this idea of constrained optimization. Now, I'll try to keep this as high-level as possible, but basically the idea is that all of these machine learning algorithms are optimized to minimize a defined loss function.
Andrew Zaldivar:
So, for example, most of these algorithms by default optimize for the highest level of accuracy. But the problem is that this doesn't give a nuanced look at performance outcomes. Even if your model, for example, has 95% accuracy, that doesn't mean that everyone represented in the distribution experiences that level of performance. You'll have disproportionate outcomes where, depending on certain sensitive categories, it could be greater than 95% or less than 95%.
Andrew Zaldivar:
The 95% is just the average. But with this constrained optimization work, for which there's a package available in TensorFlow, and this comes out of the research work of Maya Gupta, Andrew Cotter, and Harikrishna Narasimhan, you can instead define upfront how you want your model to be optimized. You can say, for example, that I want the false positive rates to be equally low for particular subgroups that are present in the data set and that will likely exist when this model is pushed into production and running in the real world.
Andrew Zaldivar:
And so you can set those parameters up ahead of time and see how well your model achieves that notion of parity, right? And again, that goes back to how one defines fairness, right? But assuming you get to that point, or assuming you come to a consensus on what seems reasonable given the thought exercise of gauging harm, there are tools now where you can have these expressive notions of optimization that better reflect real-world impact, as opposed to just leaning on one particular metric.
Andrew Zaldivar:
Constrained optimization, that's another one on the list, alongside what was just brought up, of tools and technologies that are really helping practitioners think more about notions of fairness.
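Conceptually, the constrained training Andrew describes can be reduced to a Lagrangian game: gradient descent on the model weights while a multiplier rises whenever a per-group constraint is violated. Here is a self-contained NumPy sketch on synthetic data; it only illustrates the idea, and the TensorFlow constrained optimization package he mentions is the real, production-grade implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic data: X features, y labels in {0, 1}, g group membership.
rng = np.random.default_rng(0)
n, d = 1000, 5
X = rng.normal(size=(n, d))
g = rng.integers(0, 2, size=n)
y = (X[:, 0] + 0.5 * g + rng.normal(scale=0.5, size=n) > 0).astype(float)

w = np.zeros(d)
lam = 0.0            # Lagrange multiplier on the fairness constraint
fpr_cap = 0.10       # constraint: smoothed FPR for group 1 <= 10%
lr, lr_lam = 0.1, 0.05
neg1 = (y == 0) & (g == 1)   # true negatives in the protected group

for _ in range(2000):
    p = sigmoid(X @ w)
    # Gradient of the average log loss for logistic regression.
    grad = X.T @ (p - y) / n
    # Smoothed (differentiable) false positive rate for group 1: the mean
    # predicted-positive probability among that group's true negatives.
    fpr_soft = p[neg1].mean()
    # Add the constraint term's gradient, weighted by the multiplier.
    grad += lam * X[neg1].T @ (p[neg1] * (1 - p[neg1])) / neg1.sum()
    w -= lr * grad
    # Gradient ascent on the multiplier: tighten while the constraint is
    # violated, relax back toward zero otherwise.
    lam = max(0.0, lam + lr_lam * (fpr_soft - fpr_cap))

print("smoothed FPR, group 1:", sigmoid(X @ w)[neg1].mean(), "lambda:", lam)
```

This mirrors Andrew's point about expressing the real-world requirement (equalized false positive rates) directly in the training objective instead of hoping an accuracy-only objective happens to deliver it.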
Chenhao Tan:
Amy, I guess, do you have something to add?
Amy Hsueh:
Perhaps there's an element of that question, if I interpret it correctly, that is about how companies are really thinking through and catching bias. I think we've answered on the tool and technology side, but I would just say that, having sat in TensorFlow for about 18 months and in Google Research for about that time as well, what I've observed is that it started really ...
Amy Hsueh:
I think the organization understands that it starts with human design and with creating teams that are reflective and diverse and come from a lot of different backgrounds. So we've got a really fantastic research team. I think, and Andrew, keep me honest here, it's being led by Timnit Gebru, right?
Andrew Zaldivar:
It's Gebru and Meg Mitchell.
Amy Hsueh:
Right, right. And so they're coming at it from the representation lens, right? They themselves are really fantastic researchers and scientists, and they're focused on building a team whose mandate is to understand how bias comes into AI/ML and how you approach designing, I guess, fair systems. So that's, I would say ... everyone has shared the technology side.
Amy Hsueh:
But on the human capital side, if we ourselves are designing these systems [inaudible 00:45:13], we want to make sure that we're getting the right representation in the design-focused meetings in which to have these conversations. I'm a little bit over my skis in terms of discussing how they do that work, but I, as an observer, see that we are moving in that direction at Google. It's very important to ensure representation is at the table when we start thinking about these issues.
Chenhao Tan:
Cool. I think we have time for maybe one or two more questions; let's see how this one goes. And Emily told me that we need some wrap-up time for Sendhil to make some final comments. This question goes back to regulation: how do you think about the trade-off between higher transparency, which would allow greater scrutiny of the algorithms by the public or regulators, versus the proprietary nature of these algorithms, which need to create a profit for individual companies?
Ben Cox:
So my first job was in quantitative finance, or quantitative trading, where the whole business is to come up with an algorithm that generates alpha and then lock it down under 30 layers of hidden code and make sure nobody ever finds it out. I'll say one of the reasons I fell into data science and really appreciated this space is that all of the scientists and researchers democratize these things, right? You can go get TensorFlow algorithms, you can go get H2O algorithms, right?
Ben Cox:
We open source everything. And so the proprietary-nature comment is oftentimes use-case specific, right? The flavors of how you're going to implement that algorithm, code in cost functions, or design it within a larger system will be based on the business objective or business problem you're trying to solve. But if you end up at a company where you've created a new fair boosting algorithm, you're not really going to keep that private.
Ben Cox:
The flavors of how you deploy it and monetize it are what you keep proprietary, and the data is confidential, right? What's great is that these algorithms get published and shared around and improved on. So the innovation curve of AI and machine learning has just been awesome to watch, versus finance, where no progress is ever made because everything is under lock and key.
Ben Cox:
So that's been one of my favorite parts about this space: Google, Microsoft, Facebook, and H2O all have democratization of algorithms and insights as a core culture piece.
Amy Hsueh:
I think recently I heard ... There's a startup called Papers with Code, which I think Facebook has now acquired. And when I think about the trade-offs, what I really enjoy is that the participants right now are self-regulating. We don't necessarily need a ton of government pushing us, because we have really good actors at the table with the right motivations and incentives, for the most part, I think.
Amy Hsueh:
And so for example, with Papers with Code, they've done a partnership, I think with arXiv, where papers are published and then everyone can have access to at least the data, if not the algorithms. So it's really to improve reproducibility among researchers. I think that would be a nice trend to continue pushing on in this area, because we can then essentially, as a community, commit to a standard that says we want to be more transparent with our data sets and our algorithms.
Amy Hsueh:
And so it allows other folks to test it and see whether or not they get the same results, or to start questioning you if there's a bias. A community that self-checks, in my mind, is actually the best kind. And I see that kind of movement currently in this area.
Amy Strader:
I don't see how you can really have fairness without some degree of transparency, right? Because if you don't know what the model is doing, how do you know if it's unbiased? So the two go hand in hand. And from my perspective it's also important for user acceptance, right? If you have a model where you're expecting some human to act on the recommendation you're giving, and there's no transparency behind that, if you can't explain to them why it's being recommended, they're not going to do it.
Amy Strader:
And it doesn't matter; even if you can prove that the model is good, it can be difficult to have people accept the recommendation if it doesn't make intuitive sense to them. So when we said this was about bias and fairness, I said, hey, transparency has to come into the conversation.
Chenhao Tan:
Great. Just a quick follow-up. I'm certainly amazed by the openness of the CS community in terms of algorithms, and I remember, in our conversations with Sendhil, how we make all our lectures public, which seems unimaginable for most economists. But on the data side, going back to Ben's point, how do you see the transparency of data? How would that play out in an auditing setting, or maybe for the ... I know Google and others already have some tools in place for GDPR.
Chenhao Tan:
Maybe you can also add some comments on that to help our audience understand what's there and what you want to see in the near future.
Ben Cox:
I think data privacy and security is a really big component, right? I've only really tackled this from the machine learning side: from a security and privacy perspective, how do you stop someone from gaming a machine learning model into giving up information about the decision thresholds on certain features? How do you stop the model from being hacked? I think GDPR ... And in California, they just voted to enact a new consumer protection act around data privacy.
Ben Cox:
And I think that's a really good first step, right? At the end of the day, in a lot of the problems in machine learning, the fail cases where the AI goes wrong, it's not really going wrong; it's that the data that was fed into it tells a pretty bad story, right? So there's a world where we need to understand our data better and understand where historical biases are creeping into the model, where the model is just picking up on that signal.
Ben Cox:
But there's also a big world of how we lock down consumer rights with respect to data. And I wouldn't even know where to begin with that massive piece of decision making.
Andrew Zaldivar:
Yeah. And to that point, again, I can't emphasize enough the importance of context here, because, as I mentioned earlier, a lot of these incidents that have arisen from the misuse or misclassification of machine learning systems, and that have disproportionately impacted under-represented groups, stem from the fact that, at least in many of the cases I've looked into, the data used to train the algorithm was a data set that was ill-suited for its intents and purposes.
Andrew Zaldivar:
And without that context, one can't even begin to figure out how to assign accountability, how to remediate, how to mitigate any of the skew that, if left there, could increase harm. So with that said, in addition to data stewardship, data handling, and abiding by these strict but necessary privacy guidelines, there is something that can be done to better contextualize the societal and cultural elements of a data set without necessarily putting an individual in that data set at risk.
Andrew Zaldivar:
And so I hope that in the near future there are more conversations around that, so that the next time a researcher is looking to download a data set, they'll have much more information about whether or not this data set is useful in the context they want to explore and pursue, in a way that at the same time doesn't put anyone in harm's way.
Andrew Zaldivar:
If anything, that individual is then better able to make a more informed decision as to whether or not to move forward, and if they do, they'll know exactly where they need to address things. So there's something to be explored in trying to inject context into metadata. Hopefully in the near future we'll have a more nuanced way of looking at data that doesn't compromise individuals.
Chenhao Tan:
Cool. Any final comments from Amy Hsueh or Amy Strader? Great. Thanks. I thoroughly enjoyed this conversation, and there were many interesting references here that, even as a researcher in this area, I didn't know. I think Emily mentioned before that we can collect some of the references and send them to the attendees in a couple of days. We have three minutes left; I guess that's the right amount of time for Sendhil to take over and conclude the session.
Chenhao Tan:
Thank you so much again to the panelists.
Sendhil Mullainathan:
Yeah, thanks to everyone. This was a really, really terrific discussion. I'm not sure I have three minutes' worth of things to say, but I will wrap up by making one comment related to the point about how the ML community, independent of fairness, has this openness, and how that's been a strong suit of it. I want to make a meta comment, which is that it's also noteworthy how open the fairness literature has been.
Sendhil Mullainathan:
A lot of the comments made here have been: we have these models, you can apply these things, Model Cards is open source. And so it's quite satisfying that the fairness tools have gone in an open direction early on. This area is very young, but early on there was actually a move more in the direction of consulting services, which can be powerful, but it led to something that looked like it was going to be much more closed.
Sendhil Mullainathan:
But I think that is not what's happening, so that's promising. The second comment I'll make is just to put this in some temporal perspective. I think it's fair to say that research and action in this area are not really much older than three years, four years max. And if you look at a four-year horizon, how quickly this area has taken off and how much work has been done and is being done, I think that should give us some optimism.
Sendhil Mullainathan:
Because we've very quickly gotten to the point where we are now talking about transparency, about checking models, about what the right metrics of fairness are. And now some of the questions, like the one Louisa asked, are: well, what is the role of government? And on that, let me conclude with one last thing. Independent of what you believe the role of government is, I just want to make one observation that didn't come up, which is that government is already involved in this.
Sendhil Mullainathan:
You cannot avoid it, simply because there are regulations in place already, not about algorithms, but about hiring discrimination, about credit discrimination. Those rules were written because algorithms didn't start this; people started this, and we had rules for what people could and couldn't do. So if we've had four years of a big run, the next thing people will have to confront is this: even if you believe there should be no net new regulation, how should the existing regulations on disparate treatment and disparate impact, written for humans, be applied?
Sendhil Mullainathan:
How should they even be applied to algorithms? Is the algorithm like a human? Should it be treated differently? And the reason I'm putting that out as a call to action is that this is already being decided, and it's being decided right now. New York State is doing some interesting stuff. The Senate has chosen to step into some of these issues. And I think if the panelists, and people like the panelists, don't get involved in those issues, they're not going to be decided in a smart way.
Sendhil Mullainathan:
They're going to be decided in some way that's going to have massive ripple effects. So I hope people start getting involved in those conversations. But this has been a terrific conversation. Thank you all. And I'll hand it back to Emily.
Emily Bembeneck:
Thank you so much, Sendhil, and everyone for participating. I learned a lot today and scribbled down my notes, so hopefully I can make sense of them and send out some references and thoughts to everyone. For the students who attended today, the panelists have all very graciously agreed to be open to communication from you. So please reach out to them on LinkedIn and form a connection. If you'd like to connect with them directly, you can reach out to the center and I'll connect you with the panelists.
Emily Bembeneck:
Thank you everyone again so much for coming and for speaking. I hope you have a wonderful weekend and stay well during this time. Wear your masks, make good decisions, and we'll see you at the next event. Okay. Bye-bye.
Sendhil Mullainathan:
Thanks a lot.
Amy Strader:
Bye. Thank you.
Sendhil Mullainathan:
I hope you get your power back, Andrew.
Andrew Zaldivar:
Yes, I hope so too. Oh dear. Thank you all.
Amy Hsueh:
Thank you.
Chenhao Tan:
Thank you.
Emily Bembeneck:
Thank you. Bye-bye.
Chenhao Tan:
Bye.