Removing bias from word vectors

I think it’s important to remember that algorithms are not neutral, objective truths. This is especially true when they’re trained on unfiltered public data. So this writeup about removing bias from the ConceptNet Numberbatch word vector dataset is compelling on both a practical and a theoretical level.

On the practical side, they offer word vector data with measurably less gender bias in its analogies. On the theoretical side, they discuss some methods that can be applied to other kinds of machine learning, and link to more research.
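
To make "measurably less gender bias" a little more concrete, here is a minimal sketch of one common way to measure and reduce it: estimate a gender direction from a word pair, check how strongly another word leans along that direction, then subtract that component. The three-dimensional vectors are made up for illustration, and the function names are hypothetical. This is a simple projection approach and not necessarily what the ConceptNet writeup does.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def gender_direction(he, she):
    """Unit vector pointing from 'she' toward 'he' (hypothetical helper)."""
    return normalize([a - b for a, b in zip(he, she)])

def bias_score(word_vec, direction):
    """Cosine similarity between a word vector and the gender direction."""
    return dot(normalize(word_vec), direction)

def remove_component(word_vec, direction):
    """Subtract the part of word_vec that lies along the unit direction."""
    scale = dot(word_vec, direction)
    return [a - scale * d for a, d in zip(word_vec, direction)]

# Toy vectors, invented for this example; real embeddings have hundreds of dimensions.
vectors = {
    "he": [1.0, 0.2, 0.0],
    "she": [-1.0, 0.2, 0.0],
    "engineer": [0.4, 0.9, 0.3],
}

direction = gender_direction(vectors["he"], vectors["she"])
before = bias_score(vectors["engineer"], direction)
debiased = remove_component(vectors["engineer"], direction)
after = bias_score(debiased, direction)

print(f"bias before: {before:.3f}")
print(f"bias after:  {after:.3f}")
```

A score near zero means the word no longer leans toward either end of the gender direction. Real debiasing has to be more careful than this, since words like "mother" or "king" are supposed to carry gender and should be left alone.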

https://blog.conceptnet.io/2017/04/24/conceptnet-numberbatch-17-04-better-less-stereotyped-word-vectors/