How machine learning can amplify or remove gender stereotypes
TLDR: It’s easier to remove gender biases from machine learning algorithms than from people.
In a recent paper, Saligrama, Bolukbasi, Chang, Zou, and I stumbled across some good and bad news about Word Embeddings. Word Embeddings are a wildly popular tool of the trade among AI researchers. A Word Embedding represents each word as a vector of numbers, so that words used in similar contexts end up close together. They can be used to solve analogy puzzles. For instance, for man:king :: woman:x, AI researchers celebrate when the computer outputs x = queen (normal people are surprised that such a seemingly trivial puzzle could challenge a computer). Inspired by our social scientist colleagues (esp. Nancy Baym, Tarleton Gillespie and Mary Gray), we dug a little deeper and wrote a short program that found the “best” he:x :: she:y analogies, where “best” is determined by the embeddings of common words and phrases in the most popular publicly available Word Embedding (trained using word2vec on 100 billion words from Google News articles).
The program output a mixture of x-y pairs ranging from definitional, like brother-sister (i.e. he is to brother as she is to sister), to stereotypical, like blue-pink or guitarist-vocalist, to blatantly sexist, like surgeon-nurse, computer programmer-homemaker, and brilliant-lovely. There were also some humorous ones like he is to kidney stone as she is to pregnancy, sausages-buns, and WTF-OMG. For more analogies and an explanation of the geometry behind them, read more below or see our paper, Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings.
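To make the analogy puzzle concrete, here is a minimal Python sketch of how such puzzles are commonly solved with vector arithmetic: add the offset from man to king onto woman, then pick the nearest remaining word by cosine similarity. The three-dimensional vectors and the `solve_analogy` helper are invented for illustration. They are not the Google News embedding, and this is not the program from our paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def solve_analogy(a, b, c, embedding):
    """Answer 'a is to b as c is to ?' by nearest neighbour to b - a + c."""
    target = [vb - va + vc
              for va, vb, vc in zip(embedding[a], embedding[b], embedding[c])]
    # Exclude the three query words so the answer is a new word.
    candidates = [w for w in embedding if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, embedding[w]))


# Toy embedding (assumption): made-up axes are roughly
# [gender, royalty, everything else].
toy_embedding = {
    "man":   [1.0, 0.0, 0.2],
    "woman": [-1.0, 0.0, 0.2],
    "king":  [1.0, 1.0, 0.1],
    "queen": [-1.0, 1.0, 0.1],
    "apple": [0.0, 0.0, 1.0],
}

answer = solve_analogy("man", "king", "woman", toy_embedding)
print(answer)
```

A real embedding has hundreds of dimensions and millions of words, but the idea is the same: relationships between words show up as directions in the vector space.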
Bad news: the straightforward application of Word Embeddings can inadvertently *amplify* biases. These Word Embeddings are being used in a growing number of applications. Among the countless papers that discuss Word Embeddings for use in searching the web, processing resumes, chatbots, and so on, hundreds mention the king-queen analogy, while none of them notice the blatant sexism present.
Now you might think that we could solve this problem by simply removing names from the embedding. But subtle indirect biases would remain: the term computer programmer, for example, is also closer to baseball than to gymnastics, so removing names alone wouldn’t solve the problem.
Good news: biases can easily be reduced or removed from word embeddings. At the touch of a button, we can remove all gender associations between professions, names, and sports in a word embedding. In fact, the word embedding itself captures these concepts, so you only have to give a few examples of the kinds of associations you want to keep and the kinds you want to remove, and the machine learning algorithms do the rest. Think about how much easier this is for a computer than for a human. Men and women alike have been shown to have implicit gender associations. And the Word Embeddings also surface shocking gender associations implicit in the text on which they were trained.
People can try to ignore these associations when doing things like evaluating candidates for hiring, but it is a constant uphill battle. A computer, on the other hand, can be programmed once to remove associations between different sets of words, and it will then carry on with its work with ease. Of course, we machine learning researchers still need to be careful: depending on the application, biases can creep in other ways. I should also mention that the tools we are providing can be used to define, remove, or negate biases, but possibly also to amplify them, as others choose for their applications.
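As a rough illustration of what removing an association can mean geometrically, the sketch below estimates a gender direction from a single he-she pair and subtracts that component from a profession word. The toy vectors and the `remove_direction` helper are assumptions made up for this example. The procedure in our paper involves more steps, so treat this as a simplified sketch rather than the full method.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def unit(u):
    """Scale a vector to length 1."""
    length = math.sqrt(dot(u, u))
    return [a / length for a in u]


def remove_direction(vec, direction):
    """Subtract from vec its projection onto the given direction."""
    g = unit(direction)
    projection = dot(vec, g)
    return [a - projection * b for a, b in zip(vec, g)]


# Toy vectors (assumption): invented numbers, not a trained embedding.
he = [1.0, 0.0, 0.2]
she = [-1.0, 0.0, 0.2]
programmer = [0.6, 0.3, 0.7]

# Estimate the gender direction as the difference between he and she.
gender_direction = [a - b for a, b in zip(he, she)]
g = unit(gender_direction)

before = dot(programmer, g)  # how far "programmer" leans toward "he"
debiased = remove_direction(programmer, gender_direction)
after = dot(debiased, g)     # gender component after removal

print(before, after)
```

After the projection is removed, the word has no component left along the he-she direction, while its position along the other axes is unchanged.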
As machine learning and AI become ever more ubiquitous, there have been growing public discussions about the social benefits and possible dangers of AI. Our research gives insight into a concrete example where a popular, unsupervised machine learning algorithm, when trained over a large corpus of text, reflects and crystallizes the stereotypes in the data and in our society. Widespread adoption of such algorithms can greatly amplify such stereotypes, with damaging consequences. Our work highlights the importance of quantifying and understanding such biases in machine learning, and also shows how machine learning algorithms may be used to reduce bias.
Future work: This work focused on gender biases, specifically male-female biases, but we are now working on techniques for identifying and removing all sorts of biases such as racial biases from Word Embeddings.