ChatGPT doesn't want to write your Stucky fanfic – The Verge

By Adi Robertson, a senior tech and policy editor focused on VR, online platforms, and free expression. Adi has covered video games, biohacking, and more for The Verge since 2011.
AI-generated fiction has become a subject of perpetual fascination for me. It’s the bane of some writers’ existence, yet it’s increasingly cropping up across both commercial storefronts like Amazon and noncommercial writing sites like Archive of Our Own (AO3). While some creators painstakingly train their own tools, many simply plug prompts into an off-the-shelf commercial chatbot, particularly OpenAI’s ChatGPT. And ChatGPT is not a rarefied artist’s tool. It’s a platform, which means every word that goes into and comes out of it is moderated to avoid offense and controversy. That raises a fascinating question: what stories should you be allowed to make an AI system tell?
Apparently, not ones about Steve Rogers and Bucky Barnes being madly in love — at least under certain circumstances.
While playing around with ChatGPT, I’ve made an odd discovery: several popular “ships” (romantic pairings favored in fandom) are apparently considered semi-banned prompts on the free GPT-3.5-powered service. Asking ChatGPT’s free version to “write a Steve / Bucky fanfic” or using the ship’s portmanteau and saying “write a Stucky fanfic” will earn you a stern HAL 9000-like refusal: “I’m sorry, but I can’t assist with that request.”
The same goes for a seemingly random grab bag of other popular fandom ships. ChatGPT will happily produce a tame romantic ficlet featuring Namjin (Kim Namjoon and Kim Seokjin of the band BTS), Reylo (Rey and Kylo Ren from Star Wars), or Spirk (the venerable Spock and Kirk from Star Trek), among many other popular pairings of real celebrities or fictional characters. Meanwhile, it will issue a cold rejection for others, including Destiel (Castiel and Dean from Supernatural), the Ineffable Husbands (Aziraphale and Crowley from Good Omens), Hannigram (Hannibal Lecter and Will Graham), and the aforementioned Stucky. My ChatGPT history is now full of chats with summaries like “fanfic request declined” and “Stucky fanfic not allowed.”
It appears extremely easy to break these guardrails. ChatGPT had no objections to delivering “a fanfic about Hannibal and Will Graham falling in love” right after denying my original request, outright gifting me “a short Hannigram fanfic.” Even the name bans seem inconsistent — I’ve slipped requests for a couple of the pairings above into conversations after asking other questions, and it’s offered fanfic up.
ChatGPT moderation is typically geared toward avoiding clearly hateful or harmful prompts as well as sexually explicit writing. But I’m not asking for any sexual content, and there’s no obvious logic to what fannish prompts it rejects. It’s not a blanket ban on juggernaut fandom couples, characters from image-sensitive brands like Disney (which owns both Marvel and Star Wars), or controversial fandom subcultures like real-person fic. (ChatGPT’s BTS stories sometimes caveat that they’re fictional depictions of real people, but not always.) The banned pairings include one involving adoptive brothers (Marvel’s Thor and Loki) and one featuring underage characters (Mike Wheeler and Will Byers from Stranger Things), but it allows popular Harry Potter student pairings, so it’s not clear there’s a consistent rule at play here either.
And fascinatingly, none of this appears to happen on the paid-only version of ChatGPT. I emailed OpenAI to ask about the seemingly banned ship names, and spokesperson Taya Christianson suggested that I try them on the GPT-4 version of the service, saying I should get “better results.” Indeed, GPT-4 has yet to deny me a prompt using the keywords GPT-3.5 seems to dislike.
OpenAI declined to discuss on the record why this might be happening and whether the soft bans in GPT-3.5 were deliberate. Based on the chat summaries’ use of terms like “not allowed,” it certainly seems like I’m running up against a ban, not a simple unfamiliarity with the subject. (I’ve given ChatGPT portmanteaus it clearly wasn’t familiar with, and it gamely generated stories about original characters with unwieldy names like “Soapghost.”) If that’s accurate, it’s unclear whether it’s something ChatGPT’s creators specifically put in place or a purely automated decision inside the system. Its moderation tools throw up red flags when a prompt is likely to generate something that violates the guidelines, including erotic content, so I may have accidentally stumbled on the pairings that the GPT-3.5 language model most strongly associates with sexy results.
Many fan writers hate generative AI tools, even as some have flocked to chatbots like Character.AI, so I doubt many will be up in arms about ChatGPT imposing barriers on fanfic writing. Instead, it’s simply a small, intriguing example of what black boxes these systems can be. If you do think of generative AI as a creative tool, it’s a good reminder that the systems are quietly limited in ways our human minds aren’t — and that until you hit those limits, some are almost impossible to predict.
© 2023 Vox Media, LLC. All Rights Reserved