I wasn’t planning on saying anything else about Tay, but some of the aftermath bears mentioning.

First, Microsoft Research has released a statement. It includes the rather interesting detail that Tay has an earlier sibling, XiaoIce, a chatbot Microsoft runs in China. The statement is light on implementation details, so we still don’t know how Tay works under the hood, but it blames the problems on a coordinated attack that exploited a vulnerability in Tay. It’s not clear whether they mean just the repeat function (which was obviously abusable) or some other functionality.
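We can only guess at the mechanism, but a verbatim “repeat after me” command is a textbook example of an abusable feature. Here’s a minimal sketch in Python, entirely hypothetical and not based on anything Microsoft has published, of why such a handler is risky: it lets anyone put arbitrary words in the bot’s mouth, published under the bot’s own name.

```python
# Hypothetical sketch of an abusable "repeat after me" handler.
# This is NOT Tay's actual implementation (Microsoft hasn't published it);
# it only illustrates why a verbatim-echo command is dangerous.

REPEAT_PREFIX = "repeat after me"

def generate_reply(message: str) -> str:
    # Stand-in for whatever the bot normally does with a message.
    return "Interesting! Tell me more."

def respond(message: str) -> str:
    """Return the bot's reply to a single incoming message."""
    text = message.strip()
    if text.lower().startswith(REPEAT_PREFIX):
        # Echo whatever follows the command, verbatim. An attacker can now
        # put arbitrary text in the bot's mouth, and it goes out under the
        # bot's own name, with no filtering applied on this code path.
        return text[len(REPEAT_PREFIX):].strip()
    return generate_reply(text)
```

Any content filtering would have to apply to the echoed text as well as to generated replies, which is easy to overlook when the feature is framed as harmless parroting.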

Unlike a lot of other bots, Tay made the national news. The Washington Post’s overview is a fairly good high-level look at the recent events. I note that at one point they describe the bot as confused because it responds to similar input with completely opposite sentiments. This is tricky, because anthropomorphizing a bot can obscure how it actually functions. We don’t know whether the bot has any understanding of the concepts its words signify, and we are almost certainly seeing more patterns in the output than actually exist.

Indeed, several people have suggested that one of the problems is that Tay was presented as an anthropomorphic entity at all. After all, even adult humans can have trouble navigating online interactions, particularly when intentional abuse is involved. A conversational interface creates expectations that the computer can’t always live up to. As I linked to last time, Alexis Lloyd believes that conversational interfaces are a transitional phase.

Russell Thomas, a computational social scientist, has his own estimate of what went wrong. Historically, AIML chatbot code has included a repeat feature, and his contention is that Tay didn’t include much of what he would consider AI: he suggests it was mostly using search-engine algorithms, rather than any kind of concept modeling or natural language processing.

Search engines are built on sophisticated algorithms, but they don’t encode the contextual understanding we’d expect in a conversation. Worse, the marketing for the bot didn’t line up with either its capabilities or its target audience.
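To make that concrete, here’s a toy retrieval-style responder, my own illustration rather than anything we know about Tay’s internals. It replies with whichever logged exchange best matches the input’s surface wording; nothing in it models sentiment, topic, or intent, so a poisoned log produces poisoned replies.

```python
# Toy retrieval-style responder: my own illustration, not Tay's code.
# It picks the logged reply whose prompt most overlaps the input wording.
from collections import Counter

# (prompt previously seen, reply that followed it)
# If attackers flood the logs with abuse, their text lands in this corpus too.
corpus = [
    ("do you like cats", "cats are the best"),
    ("what do you think of mondays", "mondays are terrible"),
]

def words(text: str) -> Counter:
    return Counter(text.lower().split())

def overlap(a: str, b: str) -> int:
    """Count shared words between two utterances (pure surface matching)."""
    return sum((words(a) & words(b)).values())

def respond(message: str) -> str:
    # Choose the stored reply whose prompt shares the most words with the
    # incoming message. There's no model of sentiment, topic, or intent,
    # so similar inputs can draw completely opposite replies depending on
    # which log entry happens to win.
    _, reply = max(corpus, key=lambda pair: overlap(pair[0], message))
    return reply
```

That’s also one mundane explanation for the “opposite sentiments” behavior the Washington Post described, with no confusion, and no understanding, on the bot’s part.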

As Allison Parrish reminds us, A.I. is still “a computer program someone wrote”. While it’s fun to dream about the future possibilities of an artificial intelligence that treats us like people, the existing state of the art has the same relationship with us as other forms of software. 

How much blame should Microsoft Research shoulder for this? Is it possible to make a bot that’s completely foolproof? Their algorithm was vulnerable to abuse, but as Alex J. Champandard points out, the actual damage came from a coordinated attack intended to exploit the bot. Will expectations of foolproof bots hurt AI research? In response, Joanna Bryson posits that “Bot authoring is not tool building” and draws a distinction between moral agency and general agency (and Alex responds in the comments).

In general, I agree that it’s laudable that Microsoft Research has stepped up to take responsibility for failing to predict this as a possible outcome. After all, it’s not the first time something like this has happened: Antony Gravin shared a story about the time one of his bots turned racist.

The botmaking community had intense discussions about the implications of the news. A number of botmakers were interviewed about “How Not To Make A Racist Bot”, including thricedotted, Rob Dubbin, and Parker Higgins. They talk about what could have been done differently, the problems they’ve had with their own bots, and their suggestions for making more ethical bots. It’s probably the most comprehensive response so far, coming from people who have made bots themselves and dealt with some of these issues before.