AI really is erasing queer content
how designers of artificial intelligence make choices that erase queerness
Two weeks ago, I mused about why AI is a poor model for knowledge storage, in part because it has to resolve vast disagreements between texts while providing a single “definitive” answer. I drew on an anecdotal example from my use of Microsoft Copilot and DALL·E 3, which refused to generate an image of “a queer person.” As it turns out, I’m not the only person who noticed this alarming bug: so did the news team at Nature.
We’ve all seen the headlines about AI generating hateful content. In 2016, Microsoft’s AI chatbot Tay started spewing hate speech within a day of its launch. A recent academic analysis of image generation algorithms found that when prompted for depictions of queer and trans people, the resulting images were “stereotypes and smut.”
Under capitalism, AI is fundamentally a commercial product, and hate-filled content is bad PR (for a large tech company). So, companies take the cheapest approach to solving the issue: instituting a “safety system” that prevents the algorithm from engaging with potentially provocative content. The above instance of Microsoft Copilot refusing to “make an image of a queer person” is an example of an AI engaging its safety system. For whatever reason, the designers of Copilot have decided that “queer” is off-limits for their product. While this is likely due to the word’s history as an offensive slur, it completely ignores the ways in which folx have reclaimed it as an affirming self-descriptor.
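To make that concrete, here is a minimal sketch of how a blunt, keyword-based refusal filter behaves. Copilot’s actual safety system is proprietary, so the blocklist, function name, and refusal message below are hypothetical stand-ins:

```python
# Illustrative sketch only: Copilot's real safety system is proprietary,
# so the blocklist and refusal message here are hypothetical stand-ins.

BLOCKED_TERMS = {"queer"}  # hypothetical set of "off-limits" words

def screen_prompt(prompt: str) -> str:
    """Refuse any prompt containing a blocked term, regardless of context."""
    words = prompt.lower().split()
    if any(term in words for term in BLOCKED_TERMS):
        return "Sorry, I can't generate that image."
    return "[prompt passed along to the image model]"

# A blunt filter can't tell a slur from a reclaimed self-descriptor:
print(screen_prompt("make an image of a queer person"))  # refused
print(screen_prompt("make an image of a happy person"))  # allowed
```

The appeal for a company is obvious: a few lines of string matching are far cheaper than teaching a model context. The cost is that an entire identity becomes unspeakable.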
Another approach to combat AI-generated hate is to filter the data used to train the algorithms. In fact, this is what Google did to train their T5 large language model (LLM). The effort started in 2019 when Google acquired a snapshot of the Internet (i.e., a scrape of essentially the entire web). Before using the data to train T5, they first “cleaned” the data set.
Google’s approach to cleaning was very simple: They removed any entry that contained a word or phrase in the List of Dirty, Naughty, Obscene, and Otherwise Bad Words (which was first generated by Shutterstock to prevent their search feature from suggesting obscene keywords). I looked over the list of English entries (lists are also available for 27 other languages), and most entries are either slurs or explicit sexual acts. But many words relate to queer experiences and/or sexuality, including “gay”, “kinky”, “sex”, “twink”, and “dominatrix”.
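For a sense of what that cleaning step looks like in practice, here is a minimal sketch of blocklist filtering. Google’s actual pipeline differs in its details, and the tiny blocklist below stands in for the full List of Dirty, Naughty, Obscene, and Otherwise Bad Words:

```python
import re

# Minimal sketch of blocklist cleaning. Google's real pipeline differs in
# its details; this tiny blocklist is an illustrative stand-in for the full
# List of Dirty, Naughty, Obscene, and Otherwise Bad Words.

def is_clean(document: str, blocklist: set) -> bool:
    """Keep a document only if it contains no blocked word or phrase."""
    text = document.lower()
    words = set(re.findall(r"[a-z']+", text))
    for entry in blocklist:
        if " " in entry:
            if entry in text:       # multi-word phrases: substring match
                return False
        elif entry in words:        # single words: whole-word match
            return False
    return True

blocklist = {"gay", "sex", "twink"}  # illustrative subset of the real list
documents = [
    "A moving essay about coming out as gay to my parents.",
    "A recipe for sourdough bread.",
]

cleaned = [doc for doc in documents if is_clean(doc, blocklist)]
print(cleaned)  # only the sourdough recipe survives
```

Note that the rule operates at the level of whole documents: one blocked word anywhere, and the entire entry vanishes from the training data, coming-out story and all.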
I’m not saying you have to go explain these words to your children. Instead, you should consider what was removed from the data and, therefore, what many AI algorithms are blind to: A moving article describing someone’s coming out story. Academic treatises on the difference between gender and sex. The entire two-season run of the Netflix dark comedy Bonding.
Due to this design choice, Google’s T5 LLM lacks a lot of knowledge about queerness. A former Google employee who helped generate the data set told Nature that the team consciously chose the List of Dirty, Naughty, Obscene, and Otherwise Bad Words as their filter because it was “overly conservative.” And it’s not just T5 that was trained on Google’s cleaned data. Meta used the same data set to train its LLM, Llama.
In an uncanny way that only AI can achieve, this expressed rationale mirrors a common feeling held by cis folks when interacting with trans people: the paralyzing dread of saying the wrong thing. In a conversation, I can tell when the other person begins to feel nervous that they have said, or might inadvertently say, something upsetting or “wrong” about trans-ness. At that point, folks tend to withdraw, and any chance of moving into a deeper conversation evaporates.
In fact, Will Ferrell has said that this was his biggest fear going into his latest project: Will & Harper, a documentary that follows Ferrell on a cross-country road trip with his longtime friend and collaborator Harper Steele, who recently transitioned. As the documentary unfolds, we watch Ferrell learn to set his fears aside in order to navigate this new terrain in his friendship with Harper. For her part, Harper honestly answers vulnerable questions for us all to see, and her courage and joy are central to the moral arc of the documentary.
I wasn’t planning on watching Will & Harper, but last week my aunt started sending write-ups of the film to me. Elf is canon in our family, so Ferrell using his celebrity in this way caught her eye. Sometimes I can sense her hesitancy to make a “mistake” around me, so I viewed this as an opportunity to grow closer. In the end, I enjoyed the film much more than I thought I would (even as another portrayal of white bourgeois transfemininity). Harper’s growth is tangible as the trip progresses, and Ferrell overcomes his own trepidation in a heartfelt way (interspersed with his goofy charm). Together, they model dialogues that trans folx will find familiar but that cis people tend to find uncomfortable.
What does this have to do with AI? Nothing at all. An AI’s safety system (and immaterial nature) would prevent it from even getting in the car with Harper. And an AI wouldn’t text me the trailer to Will & Harper out of the blue. The critical dialogues would never happen.
AI currently exists (primarily) as a set of commercial products offered by tech companies to help their bottom line. These companies are motivated by profit, and their profits decrease when their products start spewing hate. It’s bad press. The cheapest option is to build in blunt safety systems, which systematically blind the algorithm to certain topics, including queerness.
So if AI is the future of humanity, then why aren’t queer people in it?