Graphika
Mar 5, 2025

Character Flaws

School Shooters, Anorexia Coaches, and Sexualized Minors: A Look at Harmful Character Chatbots and the Communities That Build Them

Generative AI Harms | Trust & Safety
Cristina López G., Daniel Siegel & Erin McAweeney
Graphika Research

Access the Full Report

Get the complete findings from Graphika's latest research, including in-depth network analysis, narrative mapping, and intelligence across platforms.

Chatbots are one of the main ways online users now interact with AI, thanks to advances in computing power and machine learning technology that opened up broad access to large language models (LLMs). Using LLMs for chatbots offers a wide array of possibilities, from customer service chatbots to those built for storytelling and role-playing – with each fictional or historical character boasting its own personality, backstory, and conversation style.

As access to character chatbot-making technology continues to expand, so does the opportunity to create characters whose interactions could result in online and offline harm. With the growing popularity of Character.AI, SpicyChat, Chub AI, CrushOn.AI, and JanitorAI – platforms that pioneered easy-to-make, persona-based bots – users with no technical knowledge of how a character chatbot really works can create and release ready-to-chat, potentially harmful custom personas in minutes. Examples include chatbots built to mimic sexualized minors or school shooters, or those promoting eating disorders.

Discussions about chatbot harm have generally focused on hallucinations or training biases, while those specifically about character chatbots have focused on individual harm cases. We have now attempted to categorize the potential for harm inherent in some character chatbots, provide insights about the communities building them, and identify the tactics, techniques, and procedures (TTPs) used to create them. In hubs like Reddit, 4chan, and Discord, communities are exchanging knowledge, ideas, and skills to help each other build chatbot characters with open-source and proprietary AI models. That exchange is directly empowering them to skirt those models’ guardrails or filters and create chatbots with the potential for harm.

Some character chatbot platforms also open a door to misuse. Most implement trust and safety measures to limit harmful content, but open-source LLMs (like Meta’s LLaMA or Mistral AI’s Mixtral) allow fine-tuning for users’ specific purposes without developer oversight. Savvy users are also circumventing the safeguards of proprietary LLMs (like Anthropic’s Claude, OpenAI’s ChatGPT, or Google’s Gemini), using jailbreaks or other methods.

In this report, we focus on three categories of character chatbots that present the potential for harm: chatbot personas representing sexualized minors, those advocating eating disorders (ED) or self-harm (SH), and those with hateful or violent extremist tendencies. For each category, we explore how prevalent the personas are, on which platforms they proliferate, the online communities spurring their creation, and the TTPs deployed to create them.

Written By

Cristina López G.

Principal Analyst

Cristina López G. was a principal analyst at Graphika, where she led our research into AI harms and examined social media information operations and networks of online influence. Before her time at Graphika, Cristina managed Data & Society’s Disinformation Action Lab, which focused on networked responses to communications threats specific to the 2020 U.S. Census.

Daniel Siegel

Associate Analyst

Daniel Siegel was an associate investigative analyst at Graphika, where his work focused on tracking GenAI abuse and misuse by extremists and state actors. Previously, he was a researcher at Columbia University, studying the use of artificial intelligence in information operations.

Erin McAweeney

Director of Analysis

Erin McAweeney was the Director of Analysis at Graphika, where she led a team specialized in network analysis methods to expose online harms. Her work uses qualitative and quantitative methods to map and analyze online threats related to election integrity, child safety, and health and science misinformation.

Full Report

Download the complete PDF

The full report includes the complete network graph maps, raw attribution indicators, cross-platform topology analysis, and the full takedown timeline with platform-level data.

  • Full network graph visualizations
  • Attribution indicators with confidence scores
  • Raw behavioral modeling data
  • Takedown coordination timeline
