The rapid integration of Large Language Models (LLMs), such as GPT-5, Gemini, and Grok, into daily life has generated remarkable opportunities for communication, creativity, and information processing. These systems are capable of drafting text, generating code, and sustaining seemingly coherent conversation. Yet, because they are trained on massive, largely uncurated internet datasets, they also reproduce the full range of human expression—including insults, slurs, and other forms of harmful language (Bender, Gebru, McMillan-Major & Shmitchell 2021, Weidinger et al 2021). Recent events, such as the public outcry over Grok’s offensive responses on X/Twitter and a Turkish court’s decision to block access to Grok after it generated insulting content about the President of the Republic of Türkiye, highlight a central tension (Toksabay & Gumrukcu 2025).
This gives rise to a critical inquiry: what does it mean to be “insulted” by a machine, and more fundamentally, is the human response of feeling offended by an artificial entity ethically and socially intelligible?
This tension calls for a conceptual re-examination of the nature of insults and the grounds on which humans experience offense. At one level, LLMs lack the intentionality and consciousness required for genuine insult (Chalmers 1995, Wang, Koneru, Venkit, Frischmann & Rajtmajer 2025). They do not “mean” what they say. At another level, however, the human response of offense is both intelligible and justified, since algorithmic outputs are embedded in social contexts and often reproduce historical forms of bias and abuse (Noble 2020, O’Neil 2017). In this sense, offense does not stem from attributing agency to the machine but from recognizing the harms—symbolic, psychological, and political—that these systems can amplify.
Here, I will advance the claim that while LLMs can generate semantically insulting language, they cannot perform the act of insulting in the full philosophical sense because they lack intent. Nonetheless, the experience of being offended by such outputs is ethically and socially valid, reflecting both the persistence of human biases in training data and the accountability of those who design, deploy, and regulate these systems.
The argument will unfold in three stages. First, it will examine the technical architectures of LLMs and their capacity to generate toxic content. Second, it will explore philosophical perspectives on agency and consciousness to explain why these systems cannot “intend” insult. Finally, it will analyse why human feelings of offense remain warranted, drawing on psychological, sociological, and legal accounts of online abuse.
Let us start with the first and second stages here. The last stage will be featured in the next blog post.
The Technical Capacity for Generating Insulting Content
Modern large language models (LLMs) are advanced neural architectures trained through a next-token prediction task: they generate text by “predicting the likelihood of a token (character, word or string) given either its preceding context or (in bidirectional and masked LMs) its surrounding context” (Bender et al 2021). Their apparent fluency is a direct function of scale, as they are trained on vast corpora drawn from the public internet, which inevitably contain both cultural knowledge and entrenched forms of prejudice. As Brown et al observe of GPT-3, scaling model parameters and training data “greatly improves task-agnostic, few-shot performance”, but it also magnifies the absorption of linguistic regularities, including those reflecting harmful stereotypes (Brown et al 2020).
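For readers who want to see this mechanism concretely, the short sketch below shows what “next-token prediction” literally amounts to. It assumes the open-source Hugging Face transformers library and the publicly available GPT-2 checkpoint (a far smaller relative of the commercial models named above, used here only for illustration): given a prompt, the model’s entire output is a probability distribution over possible next tokens.

```python
# A minimal sketch of next-token prediction, assuming the `transformers`
# library and the public GPT-2 checkpoint (illustration only).
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "The weather today is"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(input_ids).logits  # shape: (1, sequence_length, vocab_size)

# The model's "answer" is nothing more than this conditional distribution
# over the next token, given the preceding context.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top_probs, top_ids = torch.topk(next_token_probs, k=5)

for prob, token_id in zip(top_probs, top_ids):
    print(f"{tokenizer.decode(token_id):>12}  p = {prob.item():.3f}")
```

Everything the system produces, fluent or offensive, is sampled from distributions of exactly this kind.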
Bender et al famously describe such systems as “stochastic parrots”, emphasizing that such a model works by “haphazardly stitching together sequences of linguistic forms it has observed in its vast training data, according to probabilistic information about how they combine, but without any reference to meaning” (2021). This characterization underscores that LLMs operate at the level of statistical correlation, not semantic understanding. If training data repeatedly associates particular demographic identifiers with pejorative terms, the model internalizes this distributional relationship. As Noble has shown in the context of search engines, algorithmic systems can “reinforce oppressive social relationships and enact new modes of racial profiling” (2020). LLMs, when trained on such corpora, risk reproducing these same structural harms. The production of an insult by an LLM, therefore, is not an intentional act but the mechanical result of pattern replication. What appears as malice is, in reality, a probabilistic continuation of training data correlations—an insult by statistical association rather than by deliberate offense.
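The same point can be made with a deliberately simplified model. The toy sketch below builds a trivial n-gram “language model” from a handful of invented sentences (the placeholder tokens GROUP_A, GROUP_B, and PEJORATIVE stand in for real identifiers and slurs); it completes one prompt with the pejorative purely because that co-occurrence dominates its data, which is the whole of “insult by statistical association”.

```python
# A toy illustration of insult by statistical association: an n-gram model
# with no grasp of meaning reproduces whatever co-occurrences dominate its
# training text. The corpus and placeholder tokens are invented.
from collections import Counter, defaultdict

toy_corpus = (
    "GROUP_A are PEJORATIVE . "
    "GROUP_A are PEJORATIVE . "
    "GROUP_A are friendly . "
    "GROUP_B are friendly . "
    "GROUP_B are friendly . "
).split()

# Count how often each word follows each two-word context.
counts = defaultdict(Counter)
for w1, w2, w3 in zip(toy_corpus, toy_corpus[1:], toy_corpus[2:]):
    counts[(w1, w2)][w3] += 1

def continue_prompt(w1: str, w2: str) -> str:
    """Return the most probable next word given a two-word context."""
    followers = counts[(w1, w2)]
    word, n = followers.most_common(1)[0]
    print(f"P({word!r} | {w1} {w2}) = {n}/{sum(followers.values())}")
    return word

# The model completes the first prompt with the pejorative only because
# that association dominates the counts, not because it intends offense.
continue_prompt("GROUP_A", "are")   # -> PEJORATIVE (2/3)
continue_prompt("GROUP_B", "are")   # -> friendly   (2/2)
```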
The Absence of Malice: Intent, Consciousness, and Agency
An insult is not reducible to a mere sequence of words; it is a speech act imbued with intention, designed to inflict emotional or social harm. In the philosophy of language, J. L. Austin’s account of illocutionary acts stresses that meaning derives not simply from the words themselves, but from the speaker’s intention in uttering them and the social uptake they achieve (Austin 1975). For an utterance to function as an insult, it must carry a purposive orientation: it is performed as an act of harm, rather than as an accidental arrangement of tokens.
Thus, for an entity to insult another, it must possess, at a minimum, three attributes that current AI systems lack: intentionality; semantic understanding; and consciousness and qualia.
1. Intentionality
For an utterance to qualify as an insult, it must be directed at someone with the purpose of causing offense. This presupposes a purposive mental state—an orientation toward the world that combines belief, desire, and value. As Frischmann and Selinger (2018) explain, “intention is a mental state that is part belief, part desire, and part value. My intention to do something … entails (1) beliefs about the action, (2) desire to act, and (3) some sense of value attributable to the act”.
This triadic structure makes clear that insults are not accidental strings of words but purposeful acts of communication. By contrast, AI systems such as large language models operate without such purposiveness. They are not agents with goals, but statistical engines optimized for predictive accuracy, and their objective function—minimizing prediction error—bears no relation to the formation of intent. This distinction is philosophically grounded in John Searle’s (1980) argument that computers lack the intrinsic intentionality of mental states, exhibiting only a simulated as-if intentionality. Furthermore, without the subjective experience that David Chalmers (1995) describes as “what it is like to be” a conscious entity, a system can form no genuine “desire” to harm.
Therefore, when an AI generates offensive language, it is not a purposeful communicative act but a byproduct of statistical pattern-matching on data.
2. Semantic Understanding
Beyond intent, an insult requires that the speaker comprehend the meaning and social weight of the words used. Insults derive their force from a shared awareness of language’s symbolic power, and without this understanding, an utterance may mimic an insult but cannot realize its full function as a speech act. Searle’s “Chinese Room” argument (1980, 1990) remains the paradigmatic critique of computational semantics, illustrating how a system can manipulate symbols according to syntactic rules to produce appropriate answers without any grasp of their meaning. Large language models operate in precisely this manner. Their fluency arises from statistical correlation, not semantic insight.
Consequently, when a model reproduces a slur, it does so without awareness of the historical and affective charge the words carry. This semantic void is particularly salient when considering context; as Wang et al (2025) note, algorithmic systems cannot reliably distinguish between a slur reclaimed by a marginalized group and the same word deployed as an act of domination. As Noble (2020) has shown, algorithms can propagate harm without understanding why certain associations are offensive, merely reflecting the prejudices in their data.
Thus, LLMs manipulate tokens mechanically, and when they generate slurs, it is not from a recognition of their painful history but as a statistically coherent continuation of textual patterns.
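Searle’s point can be restated, in a deliberately crude form, in computational terms. The sketch below is a simple lookup table whose question-and-answer pairs are invented for illustration: it returns fluent, “appropriate” replies while doing nothing more than matching one uninterpreted string against another.

```python
# A toy restatement of the Chinese Room: a rulebook mapping input strings
# to output strings. The question-answer pairs are invented for illustration.
RULEBOOK = {
    "What is the capital of France?": "Paris.",
    "How are you today?": "I am very well, thank you.",
    "Is the sky blue?": "Yes, on a clear day.",
}

def chinese_room(question: str) -> str:
    """Return the scripted reply for a known question, or a stock phrase.

    The function never interprets the symbols it shuffles; it only matches
    uninterpreted strings, much as Searle's operator follows syntactic rules
    without understanding Chinese.
    """
    return RULEBOOK.get(question, "Could you rephrase that?")

print(chinese_room("What is the capital of France?"))  # fluent, yet meaningless to the system
```

A statistical language model is vastly more sophisticated than this table, but the philosophical gap is the same: appropriate output does not entail understanding.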
3. Consciousness and Qualia
A genuine insult is intrinsically bound to the emotional states of a conscious agent—anger, contempt, disdain—and is directed at provoking a corresponding subjective experience in the recipient. This relational dynamic presupposes consciousness: a felt awareness of one’s own state and the intended effect on another. As David Chalmers (1995) articulates in his “hard problem of consciousness”, the true mystery is not explaining cognitive functions but explaining qualia—the subjective, “what-it-is-like” texture of experience. It is this phenomenal quality that gives human insults their distinctive character. By contrast, artificial systems lack any such subjective standpoint. While they can generate sentences that resemble anger, this is merely the statistical reproduction of linguistic patterns found in their training data. These “stochastic parrots” produce fluent text without any grounding in meaning or feeling (Bender et al 2021). Without phenomenal consciousness, there is no internal anger behind an AI’s insult and no recognition of the pain it may cause. An AI-generated insult is therefore a hollow simulation; it mimics the external form of insult while lacking the conscious states that make such acts socially and experientially potent.
From a philosophical standpoint, an AI-generated insult should be understood as a simulacrum: it reproduces the outward form and linguistic structure of an insult while lacking the intentionality and conscious agency that constitute the act itself. The appearance of malice is an artifact of statistical patterning, not an expression of belief, desire, or emotion. To interpret such outputs as genuine insults risks a category mistake, in Gilbert Ryle’s sense (2009), by attributing personhood, purposiveness, and moral agency to what is, in reality, a sophisticated but non-sentient computational tool.
To read more about algorithmic bias, check out this article: Aytekin, Ahmet Bilal. “Algorithmic Bias in the Context of European Union Anti-Discrimination Directives.” In EWAF. 2023. Available here: [LINK]
SUGGESTED CITATION: Aytekin, A. Bilal: “Should We Be Offended by AI Slurs? The Case of Algorithmic Insults (I)”, FOLBlog, 2025/09/12, https://fol.ius.bg.ac.rs/2025/09/12/should-we-be-offended-by-ai-slurs-the-case-of-algorithmic-insults-i/