**Links to**: [[NLP]], [[Semantics]], [[Syntax]], [[Linguistics]], [[NLU]], [[Entropy]], [[Noise]], [[Entropomorfismo]], [[Entropicalia]], [[Energy]], [[Negentropy]], [[Thermodynamics]], [[Reversibility]], [[Evolution]], [[Degeneracy]], [[Redundancy]], [[Kripkenstein]], [[Modulations]], [[Prompt engineering]].

### [[Postulate]]: Meaning is collectively (de)generated by indeterminate, combinatorial processes involving massive amounts of (semantic) noise. These exchanges cannot be thought of as voluntary and/or unpublic under _any_ circumstances. See also: [[Degeneracy]], [[Choice sequences]].

# Semantic Noise and Conceptual Stagnation in Natural Language Processing^[N.B. There exist two published versions of this argument; please find the external links to the articles below: [de Jager, S. “Semantic Noise and Conceptual Stagnation in Natural Language Processing.” _Angelaki_ 28.3 (2023): 111-132.](https://www.tandfonline.com/doi/pdf/10.1080/0969725X.2023.2216555) and [de Jager, S. “Semantic noise in the Winograd Schema Challenge of pronoun disambiguation.” _Nature, HSSC_ 10.1 (2023): 1-10.](https://www.nature.com/articles/s41599-023-01643-9)]

**Abstract**. _Semantic noise_, the effect ensuing from the denotative and thus functional variability exhibited by different terms in different contexts, is a common concern in natural language processing (NLP). While unarguably problematic in specific applications (e.g., certain translation tasks), the main argument of this paper is that failing to observe this linguistic matter of fact as a generative effect rather than as an obstacle leads to _actual_ obstacles in instances where language model outputs are presented as neutral. Given that a common and long-standing challenge in NLP is the interpretation of ambiguous—i.e., semantically noisy—cases, this article focuses on an exemplar ambiguity-resolution task in NLP: the problem of anaphora in _Winograd schemas_.
The main question considered is: to what extent is the standard approach to disambiguation in NLP subject to a stagnant “image of language”? And can a transdisciplinary, dynamic approach combining linguistics and philosophy elucidate new perspectives on these possible conceptual shortcomings? In order to answer these questions we explore the term and concept of _noise_, particularly in its presentation as _semantic noise_. Owing to its definitional plurality, and sometimes even desirable unspecificity, the term _noise_ is thus used as proof of concept for semantic generativity being an inherent characteristic of linguistic representation, and its _concept_ is used to interrogate assumptions admitted in the resolution of Winograd schemas. The argument is speculative and theoretical in method, and the result is an analysis which provides an account of the fundamentally dialogical and necessarily open-ended effects of semantic noise in natural language.

<small>Keywords: noise, semantics, dialogics, NLP, Winograd schemas.</small>

<div class="page-break" style="page-break-before: always;"></div>

### Introduction

Natural language processing (NLP) is a subfield of artificial intelligence (AI) which aims to develop language systems that can process input and generate human-like linguistic behavior. Its results are pervasive: from simple chatbots and large language models (LLMs) to explainable AI and generative tools such as text-to-image or text-to-video. Given their rapid advances and wide range of applications, the ubiquity of these phenomena will only continue to increase (Shanahan). Stemming from the conceptual tendency towards the stabilization, compression, and discretization of representations, “practical” NLP solutions often promote the elimination of semantic noise, treating it as an obstacle impeding “accurate” representation.
In the case of LLMs—the configurations of which are historically based on many earlier NLP solutions—statistically predictable linguistic coherence is _the_ guiding parameter structuring their effects (ibid.). Because of this, conceptually, NLP and related language-formalizing fields are ultimately geared towards the statistical schematization of natural language (NL): the production of a _fini-unlimited_^[“[A] finite number of components yields a practically unlimited diversity of combinations” (Deleuze, 1988, p. 131, cited in Rieder, translation amended by Rieder: “While the common translation of ‘fini-illimite’ as ‘unlimited finity’ may be more elegant than ‘fini-unlimited,’ this amounts to a rather drastic change in emphasis” (31)).] combinatorial universe of linguistic structures through the amassment of sufficient examples. This tendency not only ignores foundational observations in linguistics and the philosophy of language, such as the fundamental social plasticity of language (which we will refer to as _dialogical_, following no particular thinker), but it also promotes conceptual stagnation because of the _closure_ of NL it proposes. Diagnostically, this stagnant closure suffers the fate of the one-dimensionality which characterized much of Herbert Marcuse’s critique of operational and functional visions of linguistic representation. For Marcuse, the concreteness of “[t]his language, which constantly imposes images, militates against the development and expression of concepts. In its immediacy and directness, it impedes conceptual thinking; thus, it impedes thinking” (2013 (1964), p. 98). We will not charge NLP with the impediment of thought altogether, but it is necessary to point out that the “image of language”^[Juan Luis Gastaldi following Deleuze’s “image of thought.”] that it circulates is certainly one where meanings are statistically given, and not debatable and dynamic (i.e., _thought_).
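The footnoted “fini-unlimited” idea can be made concrete with a minimal sketch (ours, not drawn from the cited texts): a finite vocabulary yields a space of token sequences that grows exponentially with length, so no amassment of examples can enumerate it.

```python
# Toy illustration of "fini-unlimited": a finite number of components
# yields a practically unlimited diversity of combinations.

def sequence_space_size(vocab_size: int, max_length: int) -> int:
    """Number of distinct token sequences of length 1..max_length
    over a vocabulary of vocab_size items (order matters)."""
    return sum(vocab_size ** n for n in range(1, max_length + 1))

# Even a tiny vocabulary of 50 words admits ~10^17 strings of up to
# ten words -- far more than any corpus could ever attest.
print(sequence_space_size(50, 10))
```

The point the paragraph makes then follows: any corpus-driven schematization of this space is necessarily a statistical compression, never an exhaustive closure.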
As will be argued later, in very general terms: statistical closure (e.g., frequentist retrieval from the past) is a different phenomenon than probabilistic (and possibilistic) consideration (e.g., hypothesis-based speculation). As proposed by Prado Casanova: “in the resolution of uncertainty that information entails, there must be the chance of a result completely ‘perturbed’ by noise” (2023, p. 8). While NLP can be further subdivided into various domains, we will focus on the specific NLP task of pronoun ambiguity resolution in Winograd schemas (WSs). An example of a WS would be: “the nurse helped the patient even though she was upset,” where “she” is anaphoric and thus ambiguous. Terry Winograd himself recognized the semantic plurality emerging from indefinite cases, but he also believed that a human subject “filters out all but the most reasonable interpretations” when presented with ambiguity (1972, p. 31). This _common-sense bias_, assumed to apply invariably across human subjects, pervades the main body of NLP literature on WSs, and can be charged with Marcuse’s one-dimensionality, as it is unanimously proposed that these anaphoric cases are unambiguous to human readers. This ambiguity is commonly presented—yet hardly _conceptually_ addressed—as a problem of _semantic noise_ (SN) (e.g., Luo et al., 2022). SN problematizes the calculability of Shannon and Weaver’s non-semantic, mathematical formulation of noise, as it implies the variable interpretations of semantic information, and its capacity to affect conduct, which extends SN to the realm of the possibly incalculable. Unsurprisingly, the concept of SN admits multiple definitions in NLP, but was originally defined by Warren Weaver as: “the perturbations or distortions of meaning which are not intended by the source but which inescapably affect the destination” (p. 26).
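In Shannon’s terms, uncertainty is measurable as entropy, and an ambiguous anaphor can be read as a message with more than one bit of interpretive play. The toy calculation below (interpretation probabilities invented for illustration) shows how ambiguity registers as higher entropy, i.e., more “degrees of freedom.”

```python
# Illustrative only: Shannon entropy over possible readings of an
# ambiguous pronoun. The probability distributions are invented; the
# point is that ambiguity means more uncertainty, hence more freedom.
import math

def entropy(p: list[float]) -> float:
    """Shannon entropy in bits: H(p) = -sum(p_i * log2(p_i))."""
    return -sum(q * math.log2(q) for q in p if q > 0)

nearly_settled = [0.99, 0.01]  # "she" read one way by almost everyone (invented)
genuinely_open = [0.5, 0.5]    # two readings held equally plausible

print(round(entropy(nearly_settled), 3))  # ~0.081 bits
print(round(entropy(genuinely_open), 3))  # 1.0 bit: maximal for two readings
```

On this picture, Weaver’s “distortions of meaning” do not merely degrade a fixed signal; they raise the entropy of what a destination may take the message to be.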
While issues pertaining to ambiguity present NL with said distortions, this situation can hardly be said to merely require statistical closure as a “solution.” As we will observe, the interpretation of SN as a problem to be overcome is pervasive in the technical literature and owes much of its contemporary influence to developments in information theory (IT). The promise that IT would set the ground “for a real theory of meaning” (Weaver 1949, p. 27) appears in Shannon and Weaver’s seminal account but is never delivered: semantics figures only in the form of questions or open-ended proposals. It is crucial to consider that this meaning-avoidance in early IT is a root cause of the semantics problems that NLP currently faces. These problems have often been framed as pertaining to _bias_, for example, by prominent AI scholars such as Timnit Gebru, Emily Bender, and Kate Crawford, or classically and more generally as problems of _formalization_ (Rieder 2020, p. 28), and these criticisms have mostly emerged as calls to rethink large-scale systems with claims to universality. LLMs have also been notoriously criticized by Luciano Floridi and Massimo Chiriatti for their mathematical, semantic, and ethical shortcomings, albeit, again, against the background of an elusive common sense supposedly possessed by humans. In the context of a failed semantic test, the authors argue that “[c]onfused people who misuse GPT-3 to understand or interpret the meaning and context of a text would be better off relying on their common sense” (Floridi and Chiriatti 2020, p. 689). As we will see, a variety of problems emerge from such proposals, ranging from issues of spatiotemporal reasoning all the way up to political sentiments.
This article proposes that SN is what renders concepts interesting to scientific and speculative inquiry, something which has been argued at length for the concept of “noise” in general by philosopher Cécile Malaspina, and even for the conceptual variations on the notion of a “black hole” by philosopher and physicist Erik Curiel. In our terms: SN is what renders concepts _generative_. Generativity is understood here in two interrelated ways: (1) straightforwardly, as the excess generated by how a model follows its logic and installs patterns (e.g., the claim that “all swans are white” includes its excess: the possibility of non-white swans); and (2) in a more amplifying tone, as the adaptation or reconceptualization of a model on the basis of (the recognition of) SN (how the concept or model of “swan” changes once black swans are discovered). In observing SN as generative, we find that it grants the possibility of speculative dialogue and probabilistic-possibilistic reasoning around terms. Conversely, discarding SN as incidental ignores the outside it generates, and thus ignores alternatives, possibilities, negativities, etc. In later sections we will explore how the generativity that ensues from SN results from the permanent reconsideration of what is assumed to be _common_, based on difference/likeness estimations as the basic patterns which organize perception. It is in favor of this generativity that the present proposal considers the standardization of Winograd schema challenge (WSC) solutions as conceptually stagnant and politically problematic, because they assume a normalized or “one-dimensional” interpreter.
The argumentation takes issue with what can already be observed in one of the earliest considerations of modern mechanical translation, by Warren Weaver: “There is no need to do more than mention the _obvious_ fact that a multiplicity of language _impedes_ cultural interchange between the peoples of the earth, and is a serious deterrent to international understanding” (1949, p. 1; my emphasis). While this may have been intended and received as alluding to peaceful communication back then, today it certainly inspires questions: how do technical solutions come to “normalize” diverse cultural behavior? How do these normalizing proposals of apparent communicative coherence result, in fact, in incommunication? And, when difference-affirming concepts such as “creativity” and “diversity” dominate the discourse around AI (ethics) today, why do its technical applications continue to be geared towards a monological monopoly? In order to consider these questions, it is argumentatively crucial to propose that formalization need not mean normalization or closure: the most modest form of this argument would be that the concept of _hypothesis_, as a provisional conjecture, be done scientific justice in NLP. It is instances of friction between pattern (e.g., formalized meanings in statistics) and noise (e.g., challenges to said statistics, or novel hypotheses) which set conceptual activities in motion. We tend to find the opposite tendency in LLM formalizations, given the (commercial) demand for “practical” applicability, where the results exhibit the conceptualization of dynamic multiplicity as a problem, and the (mostly unintended, we hope) proposal of univocity as a solution. The market-driven solutionism that promotes their fast adoption is what currently impedes academic reflection on said matters and dangerously misleads the general public with regard to the “optimality” of their performance (Floridi and Chiriatti 2020; Bender et al. 2021; Shanahan 2023).
&emsp;

### What humans (still) can’t do: Winograd schemas

The most famous instances of ambiguity in NLP are WSs: sentences (in English) in which demonstrative pronouns like “it” or “they” can be related to multiple predicates; their analysis was popularized by Terry Winograd in his doctoral dissertation of 1972.^[Though this has been an issue for logic and philosophy since Frege, see, for example, Donald Davidson’s essay “Truth and Meaning” for an overview of positions.] WSs are often alluded to as something “machines (still) can’t do” (Dreyfus 1972), and have even been proposed by renowned AI researcher Hector Levesque as an alternative to the Turing test, because they are “designed so that the correct answer is obvious to the human reader, but cannot easily be found using selectional restrictions or statistical techniques over text corpora” (Levesque et al. 2012, p. 1). According to the developers of the 2020 GPT language model, WSs are “a classical task in NLP that involves determining which word a pronoun refers to, when the pronoun is grammatically ambiguous but semantically unambiguous to a human” (Brown et al., 2020, p. 16).^[The paper version accessed was with page numbers 1–25 in the version which is made accessible online by the authors: https://arxiv.org/abs/2005.14165.] We will analyze WSs from the perspective of SN in order to question what is generally assumed about the “typical” human interpreters who are putatively capable of resolving these challenges effortlessly. In these assumptions we find that conceptually creative differences are effaced in favor of the establishment of statistical normalcy. SN is certainly a pressing issue arising from any attempt to design an inferential model the purpose of which is providing “objective” answers to questions regarding translation, interpretation, etc.
Beyond NLP this issue is also exemplified by the _frame_ or the _steering_ problems in AI, where the design of autonomous agents is presented with the problem of _relevance_ as well.^[See, for example, Bacchus et al. for an early proposed solution to the frame problem, where relevance is approached in terms of Bayesian belief updating under noisy conditions.] But one could make the claim that when _humans_ deal with SN (due to the effects of cognitive biases, sociocultural partiality, etc.) they are presented with the same (ethical, belief-updating) challenges. Let us explore why this is so with the original, and most famous, of all WSs:^[We will focus on just two; the reader can find an analysis of many more in de Jager, S. “Semantic noise in the Winograd Schema Challenge of pronoun disambiguation.” _Nature, HSSC_ 10.1 (2023): 1-10.]

&emsp;

>The city councilmen refused the demonstrators a permit because they feared violence.

Or:

>The city councilmen refused the demonstrators a permit because they advocated violence.

&emsp;

The idea should be that in order to make an instantaneous assessment of who “they” are, a human reader reasons by elimination that “fear” is most likely experienced by the city councilmen, whereas the advocacy of violence most likely relates to the demonstrators. However, something that has yet to be remarked upon with serious attention in the NLP literature is that one might very well be partial to arguing for the _exact opposite_. This alternative meaning is relatively more “far-fetched” but certainly possible given the appropriate context. The first sentence could occur in a context in which the demonstrators fear violent retaliation, and the city councilmen refuse them a permit to avoid said violence. Or it could occur in a context in which the demonstrators fear violence and the city councilmen _expect_ them not to, which is why their permit is refused.
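It is worth making explicit how statistical NLP typically attempts such a resolution: a common strategy substitutes each candidate antecedent for the pronoun and prefers whichever resulting sentence a language model scores as more probable. The toy sketch below mimics this with hand-coded association weights (all numbers invented for illustration, not from any corpus); it makes the article’s point visible in miniature: the “answer” merely mirrors the statistics baked in.

```python
# Toy sketch of candidate-substitution scoring for a Winograd schema.
# Real systems score whole substituted sentences with a language model;
# here invented weights stand in for corpus statistics, to show that
# the resolution encodes those statistics rather than a neutral fact.

ASSOCIATIONS = {
    # (candidate antecedent, predicate) -> invented association weight
    ("councilmen", "feared violence"): 0.7,
    ("demonstrators", "feared violence"): 0.3,
    ("councilmen", "advocated violence"): 0.2,
    ("demonstrators", "advocated violence"): 0.8,
}

def resolve(predicate: str, candidates: list[str]) -> str:
    """Return the candidate with the highest (invented) weight."""
    return max(candidates, key=lambda c: ASSOCIATIONS.get((c, predicate), 0.0))

print(resolve("feared violence", ["councilmen", "demonstrators"]))     # councilmen
print(resolve("advocated violence", ["councilmen", "demonstrators"]))  # demonstrators
# Flip the invented weights and the "common-sense" answer flips with them.
```

Nothing in this procedure distinguishes the canonical reading from the “far-fetched” alternatives just discussed; it simply retrieves whichever association dominates the training distribution.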
The second sentence can be interpreted as having the meaning that the city councilmen themselves advocate violence by limiting citizens’ freedoms, and therefore refuse the demonstrators a permit. There have been many proposals for disambiguation towards _one_ particular instance of meaning in this and other WSs, a quest that is still ongoing, despite claims from influential AI scholars such as Kocijan, Davis, Lukasiewicz, Marcus, and Morgenstern, who propose that the WSC has been conceptually defeated as a yardstick for “common sense” (2022, pp. 31–33). In the present article, we are in full agreement with Kocijan et al. that the WSC is not a proper benchmark for common-sense reasoning. However, we would like to point out that while Kocijan et al. suggest that (common-sense) knowledge about “stereotypical attitudes toward (non-state-sanctioned) violence no doubt also plays a role in the disambiguation” (p. 2) in the example of the above-mentioned WS, they also quote Levesque’s proposed criteria for WSs, and devote special focus to the criterion that “Both sentences must seem natural and must be easily understood by a human listener or reader; ideally, so much so that, coming across the sentence in some context, the reader would not even notice the potential ambiguity” (p. 3). The authors fail to provide a reason for why the “natural interpretation” is, in fact, supposed to be natural. While they rightly acknowledge that the “commonsense reasoning problem remains” given that the WSC is not an adequate test of “common” sense (pp. 31–33), and point to the work of linguists who study the complexity of pronoun disambiguation and its sometimes impossible resolution (p. 25), they also conclude their paper with the proposal that AI language systems seem more prone to spatiotemporal common-sense reasoning errors than errors of a higher degree of abstraction (p. 33).
However, as we will see in the following sections, it is not that “AI tends to stumble over basic concrete realities much more than over abstractions” (ibid.); it is that human readers interpret abstractions generatively, as words denoting concepts with potentially vast semantic possibility spaces. This inclines human readers of language-system outputs to read them with a certain degree of charity, which ensues from the human capacity to handle SN by making _dialogical_ inferences, guided by likeness and difference estimations, as we will continue to explore. Without contextual information, which is much too often hastily disregarded as SN, the disambiguation of the WSs above can in fact be said to be unsolvable. As Rahman and Ng suggest in their 2012 paper: “when given these sentences, the best that the existing [NLP] resolvers can do to resolve the pronouns is guessing” (p. 2). But human interpreters resolve ambiguous schemas in a similar way: by contextualizing and estimating, which cannot be said to be anything other than an inference to the best explanation, i.e., an inference based on _preferred_ possible guesses. Rahman and Ng (2012) address different types of schemas in their paper, and the ones that are easily resolved involve object-language situations such as “Lions eat zebras because they are predators” or “The knife sliced through the flesh because it was sharp” (ibid.). Kocijan et al. also mention that of all the WSs proposed, “The trophy doesn’t fit in the brown suitcase because it’s too [small/large]” has become the most prominent, standard example (p. 4). However, whenever abstract concepts are involved (“popular,” “beautiful,” “angry,” etc.), the resolution of the schemas becomes increasingly difficult. The argument for the appreciation rather than the dismissal of SN here is that these unstable, context-dependent concepts are negotiated as they are applied.
This occurs just as much in scientific literature as it does in conversation, and thus exhibits a dialogical, non-individual character. Other proposed solutions to the automation of interpretation in WSs exist, most of which involve adding the “correct” categorical associations to concepts. In “Interpreting Winograd Schemas Via the SP Theory of Intelligence and Its Realisation in the SP Computer Model,” Wolff suggests labelling terms with associations such as “peace-loving” for “city councilmen,” in order to supplement the language system with the “common-sense” knowledge necessary to arrive at unambiguous interpretations. This proposal is a very clear example of the much-criticized unattentiveness to _bias_ (Gebru et al., 2021), as it smuggles in a specific, one-dimensional sociopolitical opinion under the guise of a neutral position. Bias, while ultimately unavoidable—as meaning-agnostic relevance is impossible—is usually discernible in examples in which attributes such as “neutrality” or “common sense” are employed in favor of a certain normalcy.^[It is also noteworthy to mention that in his thesis, Winograd originally used the word “women” instead of “demonstrators” (Davis). In a similar vein, John McCarthy (responsible for coining the term _artificial intelligence_) decided to—playfully, but seriously formalized in modal logic—refer to _common-sense knowledge_ as that which “any fool” knows (McCarthy et al.). No utterance is innocent. Writing this at the time that my home university is violently removing peaceful student protests from campus with the help of riot police further urges the need to produce any and all cautionary efforts to promote a nonviolent image of demonstrators, including this footnote.] Another example of an effort to resolve the WS challenge (by means of semantic graphs) is presented in a 2019 paper by Arpit Sharma, which observes the following schema as unambiguous to human readers:

&emsp;

>The man couldn’t lift his son because he was so heavy.
Or:

>The man couldn’t lift his son because he was so weak.

&emsp;

Again, without context, we cannot truly say that we are unambiguously able to distinguish between possible meanings. The context could be that the man himself was too heavy and could thus not lift his son, or the reverse (this is the implied “common-sense” meaning attributed to the sentence). Similarly, in the second sentence, albeit requiring some far-fetching, it is not impossible to say that the man could not lift his fragile son because his son was, indeed, too weak. What we would like to draw attention to by presenting these instances is that the SN inherent in the inferences that human agents make about other agents is more interesting in the quest for common sense than the desire to automate “objective” common-sense reasoning. This is because current NLP proposals for the automation of said common sense all imply a highly limited version of these inferences (e.g., demonstrators are inherently violent), without even addressing them _as_ inferences, thus leading to conceptual stagnation. In a similar vein to the “Stochastic parrots” argument (Bender et al., 2021), where the authors observe that linguistic coherence is not all that matters in NLP results, Elazar et al. (2021) suggest that the majority of Winograd disambiguation attempts aimed at discerning the possibility of _the learning of common sense_ should in fact disentangle the concept of “actual” common-sense reasoning from the _learned_ common sense presented in supposed WS resolutions. The common sense purportedly presented is not only a probabilistic assessment based on a limited corpus, but it is especially concerning if phrased as “common-sense reasoning” when—as in the case of GPT-3, for example—the training data includes WS challenge materials: meaning that coherence is to be expected, if the corpus already contains the variety of common sense its makers expect to find (ibid.).
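The contamination worry just raised is, technically, a matter of overlap between benchmark items and training text; large-LM reports typically check this with n-gram overlap. A minimal version of such a check might look like the sketch below (the window size, normalization, and function names are our illustrative choices, not any paper’s exact procedure).

```python
# Minimal sketch of an n-gram overlap contamination check, in the spirit
# of the train/test overlap analyses reported for large language models.
# Window size and whitespace tokenization are illustrative assumptions.

def ngrams(text: str, n: int = 8) -> set[tuple[str, ...]]:
    """All n-grams of whitespace tokens, lowercased."""
    toks = text.lower().split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def is_contaminated(test_item: str, training_text: str, n: int = 8) -> bool:
    """True if any n-gram of the test item also occurs in the training text."""
    return bool(ngrams(test_item, n) & ngrams(training_text, n))

schema = ("The city councilmen refused the demonstrators a permit "
          "because they feared violence.")
leaky_corpus = ("... scraped benchmark dump: the city councilmen refused "
                "the demonstrators a permit because they feared violence ...")
print(is_contaminated(schema, leaky_corpus))  # True
```

When such a check fires, apparent “common-sense resolution” of a schema is better described as retrieval: the coherence was already in the corpus.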
As is hopefully clear by now, we can observe serious problems emerging from pronoun disambiguation in NLP. The drive to reduce semantic uncertainty installs objectivist projects which unavoidably limit the complexity of the objects at hand. Shannon and Weaver present the idea that an increase in uncertainty can represent an increase in the “degrees of freedom” of a message, in terms of its information (p. 16, p. 27). That is, the less that is known with exactitude, the more that is possible. While, for them, this doesn’t always signify an increase in “meaning,” we can certainly understand the case of SN as meaning-generative in this way, as it produces the conditions for possible experience: when the word “demonstrator” is employed, it can be “noiselessly” taken for granted (as a naively sketched concept of a violent agitator, assumed to be tacitly shared) or questioned, reconsidered, misunderstood, thus granting new perspectives on an unavoidably incomplete concept. For reasons of length we have presented only two examples of WSs; the reader is referred to Ernest Davis’s 2011 collection for an overview of many more.^[Many examples from this collection are analyzed by the author in the author’s aforementioned 2023 article, see: https://www.nature.com/articles/s41599-023-01643-9.] We will now move on to the speculative, theoretical argumentation of why the notion of SN can help us understand these NLP considerations better.

&emsp;

### Early conceptual problems in information theory: monologue vs. dialogue

Weaver’s interpretation of noise in _The Mathematical Theory of Communication_ (_TMTC_, 1964) is hierarchized: noise at the “general” and mathematically describable level of interference, and noise at the level of semantics and its ensuing behavior. In this context, he makes a passing reference to the “conventional” meanings of terms, and reflects on what “thinking” may be:

&emsp;

>...
>the ideas developed in this work connect so closely with the problem of the logical design of great computers […] that Shannon has just written a paper on […] And it is of further direct pertinence to the present contention that this paper closes with the remark that either one must say that such a computer “thinks,” or one must _substantially modify the conventional implication of the verb_ “to think.”
>
>Weaver 25–26 (our emphasis in italics).

&emsp;

In our context, this passing comment is a fundamental reference, as it points to the implications of the modest observation that concepts undergo permanent, dynamic transformation: dialogical conceptual behavior determines how something as fundamental as “thought” itself is to be considered. In other words: if thought is as thought does, divorcing dialogical linguistic representation from “thought” as two separate realms (as “encoding/decoding” (Rieder 28)) is akin to saying that thought can view itself from the “outside” in its entirety, and moreover, that it can only happen within the confines of the human skull: rendering incommunication between agents. Thought, a social phenomenon, happens as agents estimate, expect, express, and negotiate possible meanings. It is in light of this permanent change—which we could speculate is precisely what dialogical, linguistic thought _is_—that the understanding of “units” of meaning (whether these be symbols, words, sentences, paragraphs) as generally unambiguous and nonpolysemous must give way to a revised analysis of the inherent bias towards the semantic stability of terms in the WSC (and beyond). As Weaver points out:

&emsp;

>Here again a general theory at all levels [the mathematical, the semantic, and the behavioral] will surely have to take into account not only the capacity of the channel but also (even the words are right!) the capacity of the audience.
>
>Shannon and Weaver 1964, p. 27.
&emsp;

This raises the question of the capacity of the _sender_, as well as the relationship between sender(s) and receiver(s). As pointed out by Karen Spärck Jones, a speaker cannot be assumed to produce “the right words,” even to the best of their knowledge (2004, p. 8). The fact that “the words are right” does not imply the original message is “correct,” given that a message assumes an interpreting receiver, whose capacity to discern the semantic noise-to-signal ratio is just as relevant as the capacity of the message sender to _accurately_ formulate what they want to say, in light of the variability of contexts which elicit behavior. The following quote in _TMTC_ is worth analyzing in full as well, for the sake of historical context:

&emsp;

>An engineering communication theory is just like a very proper and discreet girl accepting your telegram. She pays no attention to the meaning, whether it be sad, or joyous, or embarrassing. But she must be prepared to deal with all that come to her desk.
>
>Shannon and Weaver 1964, p. 27.

&emsp;

Again, to remark on the notion of meaning: a “very proper and discreet girl” will surely apply some degree of semantic interpretation if the telegram needs to be summarized, for example. That so many NLP tasks—sentiment analysis, translation, summarization—depend on exactly such semantic interpretation should make contemporary researchers particularly wary. They continue:

&emsp;

>This idea that a communication system ought to try to deal with all possible messages, and that the intelligent way to try is to base design on the statistical character of the source, is surely not without significance for communication in general. Language must be designed (or developed) with a view to the totality of things that man may wish to say; but not being able to accomplish everything, it too should do as well as possible as often as possible. That is to say, it too should deal with its task statistically.
>
>(ibid.)
&emsp;

This quote represents, perhaps, the main monological bias influencing contemporary NLP that this article wishes to underline. To reduce the idea of language design (or development) to something capable of envisioning “the totality of things that man may wish to say” is to render a stagnant image of language. Secondly, while a statistical understanding of meanings is useful in many regards, it cannot be proposed, at least in the case of pronoun disambiguation, as closed. The statistical imperative has often been criticized as the art of difference-effacing generalization, and in the context of pronoun disambiguation, where the aim is specificity, relying on the statistical representation of sentiments leads to situations with obvious detrimental effects on, e.g., the representation of minorities, challenges to authority, and the possibility of new meanings, as we will continue to see below.

&emsp;

### Current conceptual problems in NLP: idealizing speakers and interpreters

Generative, unstable denotative plurality poses a problem to NLP, which has often—at least conceptually—required that words be finite categories: easily translatable, unambiguous _units_. As Spärck Jones observed (1972), in the context of early information retrieval or language modelling for the purpose of summarizing text, a user cannot even be guaranteed to be able to express their needs and/or use adequate expressions, even if they have a goal in mind. This is a frame-type problem, where it becomes very complicated to determine where the boundaries of relevance and coherence lie. In the recent, highly influential OpenAI paper “Language Models are Few-Shot Learners” (Brown et al., 2020, p. 4), the authors present the following view:

&emsp;

>...
humans do not require large supervised datasets to learn most language tasks—a brief directive in natural language […] is often sufficient to enable a human to perform a new task to at least a reasonable degree of competence […] To be broadly useful, we would someday like our NLP systems to have this same fluidity and generality. &emsp; But where do these assumptions rest? The different concepts employed here (_useful, reasonable, competence, fluidity, generality_) are each deserving of in-depth analysis; here it should suffice to say that not only do they imply vague notions which would drastically differ in definition when interpreted by even the most self-similar niche of interpreters, but they also suspend a much-needed conceptual analysis which unavoidably implies the metaphorical (e.g., how exactly is a human “fluidly” capable of something an LM is not?). The “Stochastic parrots” criticism presented by Bender et al. suggests that “an LM is a system for haphazardly stitching together sequences of linguistic forms it has observed in its vast training data, according to probabilistic information about how they combine, but without any reference to meaning: a stochastic parrot” (Bender et al., 2021, p. 617). While partly in agreement with this statement, we also observe the following: the “reference to meaning” is semantically noisy and thus not entirely straightforward for humans either. The side-stepping of SN, by the hand of IT, through major conceptual advances such as distributional semantics (Harris 1970) or, more recently, _word2vec_ (Mikolov et al., 2013), has resulted not only in stochastic parrots, but also in the persistent yet vague idea of “meaning” as something objectively, unambiguously present somewhere.
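To make the vectorial conception concrete: distributional models such as word2vec represent words as points in a vector space, so that "meaning" operations become arithmetic over coordinates. The sketch below uses hand-assigned 3-dimensional vectors, built so that the famous analogy holds; real embeddings are learned and have hundreds of dimensions, so this illustrates the technique, not any actual model.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

# Hand-assigned toy "embeddings"; the dimensions carry no linguistic labels.
vec = {
    "king":  (0.9, 0.8, 0.1),
    "man":   (0.1, 0.9, 0.1),
    "woman": (0.1, 0.1, 0.9),
    "queen": (0.9, 0.0, 0.9),
}

# The classic analogy (Mikolov et al., 2013): king - man + woman lands
# nearest to "queen" among the remaining vocabulary.
target = tuple(k - m + w for k, m, w in zip(vec["king"], vec["man"], vec["woman"]))
nearest = max((w for w in vec if w != "king"), key=lambda w: cosine(target, vec[w]))
```

Note that the analogy holds here only because we built the vectors to make it hold; in learned embeddings it holds only statistically, which is one face of the semantic noise such models compress away.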
Contrary to this conception, we argue that it is often the lack of accuracy in meaning (exemplified by how concepts, such as _noise_, are notoriously negotiated) which renders complex terms conceptually _efficient_, and this dialogical phenomenon should play a more prominent role in discussions where the purportedly unique capacities of human agents are compared to those of artificial ones. This is particularly poignant as it becomes clear that one’s own agility with wording, linguistic creativity, and interrogative prowess are what grant interesting exchanges with systems such as ChatGPT. Generally speaking, human users do not expect “expectable” statistical outputs, but surprising dialogue and novel results. In order to model words conveniently one might opt for self-updating distributions of probabilities, but the SN that most complex terms exhibit in dialogue remains an essential aspect of how they function, not an impediment to their “ultimate” meaning. Additionally, as we will explore in the following section, a challenge to the conceptual limits of vectorial representation is that linguistic devices of a complex analogical character—operations which relate object **x** to object **y** in order to underline a quality—abound in NL, while in NLP they are usually downgraded to simple analogies. Metaphors, examples, identity statements, comparisons, etc., can all be said to be analogically complex; they can be arbitrarily, infinitely combinatorial in that the relationships they propose are almost always novel (at least at first, before they become “statistically conventional”). This places generative SN at the heart of communication and, as will be observed, at the eternally postponed genesis of concepts. Concepts need to be understood as fundamentally _unfinished_ in order to function within language as we use it.
As Gastaldi (2021) notes: “Any attempt to extend the operations of word embeddings outside local, specific, and well-controlled conditions is likely to encounter innumerable obstacles, and hence can barely be relied upon in its present state for concrete real-world applications” (p. 161). These one-dimensional visions impede the unlocking of latent conceptual potentials in NLP technologies: if SN represents an obstacle standing between the LM and its supposed correspondence to the NL world, the solution should not be geared towards the reduction of noise but towards careful attention to the points where it generates relevant frictions. Conversely, pursuits of automation have so far mostly been framed as challenges pertaining to scope and data, rather than as problems of conceptual underpinnings. This can be observed in the so-called “brute force” sizing up of language corpora in NLP: the data grow but the conceptual model stays the same. If the standard assumption continues to be that, in lacking the heuristic capacities of a human being, an LM cannot be expected to determine the “real meaning” of NL formulations, then this rests on an overly simplistic qualification of the function(s) of NL. It proposes not only the singularity of specific (sometimes quite trivial) vectors of meaning, but also an “image of language” (Gastaldi 2021, after Deleuze) simplified as a picture-like correspondence between “the world” and utterances. Additionally, these assumptions about the possibility of semantic objectivity also present a specific ideological backdrop with regard to what counts and what does not count as _reasoning_. Kocijan et al.
appeal to the psychology of Kahneman when they propose that: “in a sentence from a well-designed schema, human readers carry out the inference automatically[:] ‘System 1’ (Kahneman, 2011),” and even though they mention that “this inference seems to require commonsense reasoning of some depth and complexity” (Kocijan et al. 4), they fail to address that however “automatic” it may seem, there is a high degree of complexity and depth in the _variability_ of _possible_ inferences. The reasoning problem remains: NL is upheld as the reliable “outside” NLP aspires to, and human readers as the “bounded yet rational” (i.e., noisy) users of language. However, neither of these two assumptions can be said to rest on much more than unfounded sentiments about the capacities of human beings. Kocijan et al. even take notice of noise as an impediment: “Given the nature of the [WS] challenge, even the slightest _noise_ in the feature collection [which composes the common-sense model of any given system] can make the problem unsolvable” (p. 23; my emphasis). Noise certainly _can_ be characterized as an impediment to robust message relaying (Shannon and Weaver), and SN _can_ present a further impediment if and only if we conceive of communication as requiring that utterances represent unambiguous, stable positions. The issue underlined here is not that the characterization of noise as “unwanted impediment” is problematic, but rather that promoting an image of language in which most communication takes place _unhindered_ by noise, and is otherwise semantically stable and generally coherent, amounts to an unquestioned appeal to a supposed normalcy of language. Semantic coherence in NL is fundamentally noisy because it is dialogical: it functions _between_ agents and is not _of_ any one of them. The (social, political, etc.)
relevance installed by any particular pattern (e.g., “all demonstrators are violent”), is and should always be open to semantic change, a process which involves a great deal of SN. A recent influential attempt contemplating this supposed problem beyond NLP is Daniel Kahneman et al.’s approach to a “smooth” ethics unimpeded by the perils of noise. They state: “Wherever you look at human judgments, you are likely to find noise. To improve the quality of our judgments, we need to overcome noise as well as bias” (Kahneman et al. 6). This conceptualization is subject to the same one-dimensionalities we have seen: not only a “naïve” view of (S)N, but also an exemplar case of common-sense bias. In this account of noise, crudely put: communication should improve if we remove it because “[n]oise is the unwanted variability of judgments, and there is too much of it” (ibid.). Surely if our examples are credit scores and algorithmic gender bias, we could be tempted to remove certain practical frictions from our decision-making. But if speaking of human relations at large, which the authors do, what is hereby missed is that the very conditions of being (a) situated (i.e., biased) and (b) relatively uncertain about what is relevant (i.e., affected by (S)N) are unavoidable parameters which often _positively constrain_ cognitive agents, as they are obliged to make inferences and communicate. Motivated by similar noise-reduction ambitions, the notion of ambiguity continues to be simplified as a solvable challenge to computer-mediated communication, a challenge humans are supposedly capable of resolving “noiselessly” (Levesque; Levesque et al.; Morgenstern et al.; Speer et al.; Mitchell et al.; Wolff; Brown et al.; Xie et al.; Kocijan et al.; Luo et al.). Below we explore a few causes underlying this situation, before moving on to an exposition of the pervasiveness of SN in NL.
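The disambiguation setting at issue can be sketched minimally (a construction of ours, not any cited system's). In the classic schema, "The city councilmen refused the demonstrators a permit because they [feared/advocated] violence," a substitution-based resolver scores each candidate antecedent for plausibility; the hand-made score table below stands in for a model's learned statistics, and it is exactly where contestable patterns of the kind just discussed (e.g., "all demonstrators are violent") become encoded.

```python
# Hand-assigned plausibility scores: a stand-in for an LM's statistics.
# Every number here is a contestable, situated judgment, not a neutral fact.
PLAUSIBILITY = {
    ("councilmen", "feared"):       0.9,
    ("demonstrators", "feared"):    0.2,
    ("councilmen", "advocated"):    0.1,
    ("demonstrators", "advocated"): 0.8,
}

def resolve(candidates, verb):
    """Resolve the pronoun to the candidate whose substitution scores highest."""
    return max(candidates, key=lambda c: PLAUSIBILITY[(c, verb)])

candidates = ["councilmen", "demonstrators"]
antecedent_feared = resolve(candidates, "feared")
antecedent_advocated = resolve(candidates, "advocated")
```

Flipping a single verb flips the resolution while the syntax stays identical: the "answer" lives entirely in the score table, that is, in the statistical image of language the system inherits, which is why treating such scores as neutral is precisely the problem this article targets.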
&emsp; ### Current conceptual problems: the lack of theoretical reflection in language modelling In an article titled “Explainable AI: Beware of Inmates Running the Asylum or: How I Learnt to Stop Worrying and Love the Social and Behavioural Sciences,” Miller et al. survey the influence of philosophical, social, and behavioural theoretical considerations in the realm of explainable AI—a field closely related to NLP—and the authors find little to no influence of the former on the latter. They consider this unsustainable, given that what software engineers end up designing are tools which—as has been pointed out _ad nauseam_ in (new) media or science and technology studies—work well in restricted domains, closed systems, narrow applications, etc., but are hardly pertinent as _one-size-fits-all_ solutions. As we have argued: this is not necessarily an avoidable problem: no model can capture things in their “totality” (as Weaver would have it), and even if it were possible, such a model might not be very useful.^[To briefly mention an influential account of this effect, we could think of the 1:1 scale geographical map presented in Jorge Luis Borges’ “On Exactitude in Science,” the title of which is perhaps insufficiently translated as _exactitude,_ given the Spanish original speaks of _rigor,_ denoting a certain harshness in the discipline of modelling.] Different instances of noise—in our case: SN—which are purposefully removed, ignored, or fail to be considered, are in fact what determines the structural and functional “edges” of a model, and a model is often useful _because_ it is a simplification; an abstraction. The real problem is objectivising abstract aspects, which leads to an over-reliance on the model’s capacity to accommodate reality. As Malaspina phrases it: “nowhere in the empirical world is a closed system realized in absolute terms” (p. 48).
Idealizing a model through the bias of the possibility of absolute containment, of totality, is a dangerous overestimation of its capacity to deal with complexity. When we think about our WSC examples and the fact that NLP applications include scenarios such as complex, extensively predictive text,^[E.g., language systems which respond to personal e-mails or assist patients: both already a reality.] we need to admit that their users will be presented with situations where the making of a choice is a rather distant effect in a system, as opposed to a process which, traditionally, was considered as residing within the individual. Of course, it can be argued that this “extended mind” scenario (Clark and Chalmers) was always a reality: we have always been cyborgs (Haraway). What we point to here are the complex consequences of the current NLP scenario, where, e.g., the political “consensus” seems to be that of unambiguously idealizing demonstrators as violent, to point back to our leading example. As Haraway rightly notes: “grammar is politics by other means,” and the search for a “common language” applicable to all is always biased towards the instalment of a dominating politics. What is irrelevant to the model is certainly relevant to that which becomes modelled: restricting the semantic space where NL users make choices is restricting thought, politics, material praxis, and beyond. Arguably, again, NL users already exist within a highly restrictive sociocultural network: that of NL itself. However, it is NL’s fundamental capacity for semantic change that, strangely, fails to become the focal point of systems attempting to somehow capture this noisy processual effect.
In much of linguistics, semantics, and philosophy this issue is a common target of analysis, but, as Gastaldi notes: “epistemological and philosophical reflections are scarce, at best, in the literature of [NLP].” Pursuing this, we will now move on to focus on specific conceptual constraints pertaining to these issues beyond the context of WSs and into an analysis of the concept of noise. &emsp; ### Noise and its generative relation to semantics Cécile Malaspina points us to Simondon for thinking about the (semantic) potential of information in the context of IT: &emsp; >What Simondon finds lacking in the quantitative definition of information is the notion of potential, the tension that polarizes and thus gives a sense, if not a signification to information. Simondon rejects the idea that information, measured in bits, could in any way encompass what we must understand by the quality of information, which is what characterizes not only the capacity to inform or regulate reality, but, crucially, its capacity “to illuminate new domains.” > >Simondon 2005, p. 549 in Malaspina 2018, p. 47. &emsp; Some concepts are more readily accepted as lacking a clearly defined boundary than others. A concept such as _noise_ is notorious for its multifarious definitions across domains and epochs, and it is in lacking a common definition across fields that “fuzzy,” “potential” concepts such as noise “illuminate new domains” (ibid.) and motivate their own semantic exploration (Malaspina; Curiel). Malaspina’s _An Epistemology of Noise_ (2018) focuses on this variability, and in particular on how the conceptual multiplicity offered by the ambiguities we find in _noise_ rests on a simple yet conceptually dizzying premise: unpredictability and ignorance (examples of phenomena often considered under the guise of _noise_) are the definitional constraints of predictability and knowledge (e.g., _information_) (ibid.).
In other words: when something is modeled, it is extracted from what is considered impertinent, yet as Malaspina notes, and as mentioned earlier: it is this impertinence which provides delineation to the model. The excessive and sometimes inaccessible contextual information which determines a representational state of affairs (i.e., are the demonstrators violent or the city councilmen?) is situational and dialogical. This means that agents need to employ a variety of cognitive, linguistic strategies (involving notions of difference and likeness), which unavoidably generalize and reduce complex realities. This requires that issues pertaining to SN in NLP, such as pronoun disambiguation, be approached from this perspective, and not from the perspective which sees SN as an obstacle to be overcome, as the latter will significantly restrict possibilities in the generative space that is the target of their model: NL. Arguing in the same direction, in the context of the circular dependence between contexts and terms in NLP, Gastaldi states: “this is not a defect of the model but a property of language itself.” Tautologies, for example, may present us (and computers) with logical challenges, but they are fundamentally useful, not only in everyday parlance,^[Though the argument could be made that it is not _exactly_ a tautology, perhaps we could also consider: “it is what it is” to be a colloquial tautology of sorts, which can be analyzed to a great degree of semantic depth, and in many different contexts.] but also in the formation of concepts: the tautology of a triangle being a three-sided figure contains much more than just an identity statement. Among other things: it informs one’s understanding of geometry, as it contains the excesses of other possible figures; the concept of sides, etc.
Thinking about this epistemic interplay between noise (e.g., the implicative excess of a term like “triangle”) and pattern (e.g., the concept of triangle), between that which becomes ignored and that which appears to be relevant, is revealing of how knowledge and learning function dialogically through language. If human agents recognize each other to exist in similar conditions, and interact in relative uncertainty by inferring and making use of similar terms and concepts,^[With regard to this observation, one might try to reason about why this is the case. In their work _Relevance: Communication and Cognition_, Sperber and Wilson pose a similar speculation: “Is the ability to understand speakers’ meanings rooted in a more general human ability to understand other minds?”] then what image of language can we foreground, other than one where no single agent owns the “key” to the “total” meaning of their interactions? Neither is any agent capable of assessing “complete information”—in the context of NL—given the fact that NL dynamically evolves beyond the singular agent. These complex, processual realities render high degrees of unpredictability, which should bring into question any model with high predictability claims, such as LLMs. Owing to its semantic complexity, the concept of noise offers the possibility to analyze the question of the production of knowledge under uncertain conditions at several levels, and in a transdisciplinary fashion. Firstly, by way of its _conceptual_ exploration: looking into the various histories, etymologies, and trajectories which compose the concept. Secondly, by questioning its colloquial or scientific usage as a _case_ or _model_: drawing out certain aspects of its perceived functionality and suggesting how these can be reconsidered for the ramification of new descriptions of existing phenomena. 
Thirdly—though not lastly, these aspects are not exhaustive—by engaging the first two levels towards new interpretations of phenomena previously deemed to be either unaffected by noise or configured under rigid formulations of it. This tripartite structure reveals how a concept and its symbol contain much more than just a concept and its symbol.^[Similarly, a single word requires an entire language model, or lexicon, for acquiring its “sense,” resulting in the contradictory (yet generative) condition that it ultimately is, somehow, its own frame of reference.] Conversely, the general assumption in NLP seems to be that NL is a much less dynamic frame of reference. It is important to consider notions of _difference_ and _likeness_ in the epistemic interplay between noise and pattern that we mentioned above. In thinking about Levesque et al.’s reference to linguistic choice-making in the WSC, where an agent needs “to link which predicate coincides with which subject _best_” (my emphasis), we may want to interpret the comparative conceptual “work” a language thinker does when looking for a predicate which “best” coincides with a subject. A rough procedural sketch of this could be constructed as: (a) recognizing that the combination of subject and predicate signifies two separate, differing phenomena; (b) at the same time, also recognizing that while they differ, one is trying to find a “best” fit; thus admitting (c) an inference about two similar yet differing things, made so that the expression one is looking to make is meaningful. Phrased in more abstract terms, this can be thought of as a conceptual (probabilistic) choice about where the contrast between sameness and difference can be “best” observed, as well as about which “condition” becomes foregrounded: that of difference, or that of sameness (or, relatedly: that of complexity or that of simplicity).
In choosing whether it is the trophy or the suitcase that is too big, in the WS referred to earlier, qualities that relate the similarities and differences between one’s concepts of trophies and suitcases (and fitting, size, etc.) are cognitively constructed in order to make this choice. In the realm of trophies and suitcases, we may find likeness and difference to be relatively straightforward, but in the context of highly complex, processual affairs such as human identities (e.g., city councilmen or protesters), it is crucial that the possibility space of likeness and difference remains under permanent negotiation, revaluation, and questioning. This openness to SN can be likened to another point Malaspina presents, about Simondon’s notion of the pre-individual state as “a positive ground for differentiation, in other words, as ground for the emergence of form and its transformation” (2018, p. 48). As we will see in the following section, this generative, virtual condition can be argued to exist at the borders of the permanent, transformational individuation of _all_ concepts, not just the “noisy” ones. &emsp; ### Semantic noise in natural language: the case of metaphors Phenomena like denotational undecidability (what does “NY” represent on a blurry map: the city, or the state of New York?), semantic complexity (what “noise” can mean in different contexts), or ambiguity (e.g., anaphora, certain cases of undecidability) all exhibit—considering the questions we can ask about them—an open-endedness which drives reasoning about their epistemic status. 
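The "NY" undecidability just mentioned can be given a toy probabilistic form (our construction, for illustration only): the same token receives different most-likely readings under different hand-assigned contextual priors.

```python
# P(reading | context) for the token "NY", hand-assigned for illustration.
PRIORS = {
    "subway map":  {"New York City": 0.90, "New York State": 0.10},
    "weather map": {"New York City": 0.25, "New York State": 0.75},
}

def interpret(context):
    """Return the most probable reading of 'NY' in the given context."""
    dist = PRIORS[context]
    return max(dist, key=dist.get)

city_reading = interpret("subway map")
state_reading = interpret("weather map")
```

The priors themselves are where an interpreter's history and situation enter; no table of this kind could be fixed once and for all, which is the open-endedness at issue.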
The arbitrariness of symbols is unavoidable, which renders them partially open to interpretation: sometimes narrowly, sometimes broadly, depending on the contextual constraints.^[Gregory Benford’s explorations in “Comporting Ourselves to the Future: Of Time, Communication, and Nuclear Waste” deal precisely with this question: how to make sure an unambiguous message can be reliably communicated across large spans of time? The answer is expectable: it cannot.] Placing an X or an arrow on a map (as examples of fairly straightforward symbols) means something different than placing them on the graph of a function, on someone’s hand, on a to-do list, etc. The final interpretative _decision_ about any of these affairs will vary greatly depending on the given contexts they present themselves in, and the (semantic) noise they are subject to. Interpretation, generally speaking, can be said to be _probabilistic_ (and _possibilistic_), as it is given by (sub)conscious estimations determined by previous encounters and their interplay with contextual affordances. Prado Casanova, citing earlier work on the subject, refers to probability as the “epistemic condition of an agent that reasons about a system and not a property of such a system” (2023, p. 63); indeed, any assessment of future states is the entertainment of a possibility rather than the reflection of a property of reality. To return to our guiding leitmotif, a pervasive example of this variational open-endedness in semantic estimation is that of the _metaphoric_. As Malaspina (2018, p. 7) suggests: &emsp; >Metaphor, of course, remains a dirty word in the context of scientific and even of much philosophical discourse. Yet it is necessary to acknowledge that metaphors are used abundantly, if artlessly, in scientific and philosophical discourse […] Being in denial about the critical role of metaphors […] is to submit uncritically to their rhetorical power.
&emsp; Metaphors form the basis of the vast majority of our linguistic constructions (Nietzsche; Derrida; Davidson; Lakoff and Johnson; Hofstadter and Sander). Observe for example how, in the previous sentence, “basis” points towards a sort of grounding for the following metaphor of “construction.”^[And observe the metaphors as well that ensue from that explanation.] Classically, for Aristotle, “To produce a good metaphor is to see a likeness” (_Poetics_, cited in Derrida 1974, p. 37). But considering our presentation of difference and likeness as comparative inferential parameters, as basic patterns of phenomenal orientation, we could escalate his statement to another level of abstraction. Not only is the metaphoric operation just as much about seeing _difference_ as it is about seeing _likeness_, but it is precisely between likeness and difference, between identifiable pattern and the uncertain horizon of noise, that we find the generative. From Plato and Aristotle to Lakoff and Johnson: a vast array of thinkers have contemplated the power of analogies (e.g., metaphors) as a fundamental characteristic of thought. More recently, Hofstadter and Sander have proposed analogies—in our terms: inferences about the possible likeness/difference relationships between two particulars—as “the core of cognition,” promoting the thesis that without analogies concepts simply cannot exist (Hofstadter and Sander 2013). Indeed, an awareness as well as a capacity to react to variation and sameness—or difference and repetition—is an ability which could be considered the fundamental drive behind all _intentional_ activity.^[Intentional both in the phenomenological sense of experience having an “aboutness” and in the sense of “acting deliberately,” being goal-oriented: choosing.] 
This act of generalizing, of exploring how more than one thing can be said to exist within a singular category, could be considered one of the most elemental capacities of linguistic-cognitive agents (Hofstadter and Sander 2013). However, despite much attention to its constitutional role in conceptual cognition, the metaphoric continues to be relegated to the imprecise; the unnecessarily ungrounded; the “merely” poetic or figurative. An immediate example of how the metaphoric shapes the conceptual is the very argument composed here: this argument “sheds light” on how two elements (“the argument” and “light”) can be contrasted, challenging the reader to compare—employing difference and likeness as parameters—their own concepts of “argument” and “light,” hereby “_illuminating new domains_” (Simondon in Malaspina 47). Moreover, in terms of the generativity we can observe in the metaphorical: an argument does not literally “shed light,” yet these words urge an interpreter towards the allocation of the intended sense.^[This _sense_ has been theorized in terms of linguistic, dialogical competence as _speaker meaning_ by Grice, or _first meaning_ by Davidson, for example. In our argument, where concepts exist in permanent dialogical negotiation and are thus subject to SN, we find the most interesting interpretation of this process in the _outside text_ of Derrida (“Il n’y a pas de hors-texte”): where the “outside” refers, among other things, to the idea that in any theory of legibility and meaning, one could be tempted to imagine meaning as existing objectively, yet in looking for it one discovers how it always points elsewhere. This could be in terms of circularity: as in the case of a lexicon; temporally: as in the concept of “artificial intelligence,” given it is a concept/phenomenon in current development; or logically: as in the case of tautologies, to name but a few possibilities.]
Concepts can almost always be etymologically traced back to their metaphoric origins, which certainly signals their processual, semantically noisy composition. Take a word such as “concept,” and observe that it stems from the Latin _con-_, the prefix for “together, with,” and _capere_, “to take” or “to grasp.” The result is _concipere_: to conceive, which also means to “take in” or “become pregnant.”^[In considering this last connotation we cannot avoid mention of how Weaver closes his introduction to TMTC, as he quotes Arthur Eddington on the subject of the classification of words depending on their concept, where Eddington makes reference to the conceptual, denotative consequence of classifying entropy alongside beauty or melody as “a pregnant thought” (28). For him, this consequence would imply that something known to be measurable and arithmetically discernible suddenly reveals itself to speak the language of the aesthetic. Weaver agrees by stressing the fact that the _Mathematical Theory of Communication_ gets entropy one step closer to semantics.] Another interesting etymological example, in speaking of noise and patterns, is the word _example_ itself: derived from _exemplum_, which denotes “a sample, specimen; image, portrait; pattern, model, precedent.”^[An additional implication is noted to be that of “a warning example, one that serves as a warning,” the word _exemplum_ itself stemming from _eximere_: to remove, take out, take away, reflecting the _modelling_ impetus we have been pointing to, as well.] The discussion on the origins of words in the Western context goes as far back as Plato’s _Cratylus_, where Socrates, Hermogenes, and Cratylus discuss topics ranging from the practice of etymology, to the question of indexicality, to conventionalist or naturalistic approaches to language, etc.
The _Cratylus_ presents us with an indication of the fact that the _map_–_territory_ conundrum can be found at the very nascence of the Western canon of philosophy, signaling the metastable reflexive tendency in the interplay between SN and variable semantic pattern we are after in our analysis. Moreover, the analogical mechanism operating behind metaphors reveals how they give rise to conceptual understanding in dialogue: if a speaker presents **x** alongside **y**, an interpreter—who in our reality is necessarily bound to some degree of causal inference—will attempt to draw a connection between **x** and **y**, even if this connection is simply the observation of the arbitrary fact that a speaker _chose_ to connect **x** and **y** (a commonly found generative occurrence in, e.g., poetry). As Bender et al. (2021, p. 616) suggest: &emsp; >[H]uman communication relies on the interpretation of implicit meaning conveyed between individuals […] even when we don’t know the person who generated the language we are interpreting, we build a partial model of who they are and what common ground we think they share with us, and use this in interpreting their words. &emsp; An agent’s ability to observe and/or construct in order to attend to and/or deploy a sign in a semiotic environment (i.e., an environment where other agents may or may not decode deployed signals) is a necessarily bidirectional, open-ended, generative affair. By extension, the metaphorical—and thus the conceptual—falls within the same domain: it is bidirectional because its interpretation depends on the different degrees to which the agent can affect its internal (causal) models about reality, and therefore (re)generative because of how these inferences in turn alter the (semiotic) environment. All these operations trade in different modalities of likeness and difference (or simplicity and complexity), affected by varying degrees of (semantic) noise.
Extending this to the realm of reasoning in general, we may speculate: an interpreter may or may not be able to infer why a poet or a mathematician draws a connection between **x** and **y**, but the existence of systems such as poetry and mathematics is observably robust enough in the interpreter’s causal world-model for this to provide sufficient evidence that their existence must have _some_ reason. This leads one to the _intuition_ of meaning,^[The word _intuition_ is used here in a Kantian sense, to imply a sensation; an impression, but also a sensorial hypothesis; the postponement or suspension of meaning.] however intractable, circular, absent, paradoxical, etc. one may find it to be. In this way, one could say that the interpreter represents _noise_ to the systems of, e.g., mathematics or poetry, and vice versa: these systems are semantically noisy to the interpreter, resulting in varying degrees of bidirectional transformations. Either they—the system to the interpreter or the interpreter to the system—become part of an ignored, irrelevant background (which defines that which is _not_ them), or they find a way to account for each other’s existence, thereby including each other within their particular frameworks. As an example, one could say that mathematics requiring a pedagogy—i.e., the fact that it needs to be taught and learned—means that it, as a system, is attempting to include interpreters within it, or, conversely: a learner of mathematics is attempting to account for mathematics within their own semantic framework. This interaction is crucial because, as we learned from Weaver, its determination results in the coordination of behavior. The metaphorical is thus cognitively demanding, given that two elements are paired in order to express a third, without making explicit reference to the third.
This effect is what renders all metaphors, and the ensuing complexes of concepts, such as _noise_, as generative conceptual objects: their meaning cannot be indexically given; they demand that an agent resolve them by way of dialogue, i.e., by the inferential reshuffling of world-models. Why is Juliet “the Sun”? Why is _truth_ “a mobile army of metaphors”? Interpretations will vary, depending on the particular biases in the world-models of an interpreter. Given that an interpreter needs to exert cognitive effort in order to allocate the semantic relevance of a sign, it could be argued that the meaning of these complex conceptual objects always lies, in fact, in the _future_ (Derrida), given their status as _unresolved_ performative utterances; as “expression seeking” (Spärck Jones 1972); as “inventive construal[s]” on the part of the speaker as much as of the interpreter (Davidson 1978). Their actualization is always a _promise_, dependent on a constantly changing _intuition_ of a world-model.^[This continually processual metastability is not restricted to the realm of the linguistic alone, the observation can also be made in other modes of perception, such as the effects at play in illusions of vision and audition, as demonstrated in [[04 Concepts as pre-dictions]].] The development of knowledge is conditioned by the unknown, and the composition of metaphors is one of the most common ways in which NL sustains this knowledge–uncertainty relation, through difference and likeness operations. Douglas Hofstadter has referred, via Lakoff and Johnson’s foundational work on the subject, to the fundamental quality of the metaphoric as existing at the core of cognition (Hofstadter and Sander 2013). He is quoted on the subject of the WSC in Kocijan et al. (2022, p. 7) saying: &emsp; >I don’t think that anyone who will be moved to tackle this particular challenge is likely to take up the deeper and more general challenge of what language understanding really is.
People are daunted by that, as well they should be, and no one is going to be motivated by a prize to suddenly tackle that gigantic challenge. Instead, very smart engineering types are going to be motivated to seek clever tricks that will allow computers to solve this very narrow type of linguistic disambiguation problem with a high degree of accuracy. &emsp; That _gigantic challenge_ is, in our view, unsolvable if we consider language as thriving on SN. On the other hand, perhaps similarly to problems pertaining to operations in chess, the WSC presents us with a closed world that is easy to dissect and that disregards SN. However, this is precisely where noise plays a crucial role: an open system is constantly subject to input variability. If primacy is given to stability over variability, i.e., if NLP wishes to resolve the “problem” of semantic noise, then it is bound to an eternal game of catch-up with the linguistic landscape “outside” it, and its essential function will never exceed that of a dictionary, however responsive, dynamic, and structurally complex. This is not to make a case for human language users as essentially different from artificial language users, quite the contrary: human language users should be considered just as limited and bound to “error” as the artificial systems they propose. What _is_ different, however, is that SN is not avoided by NL speakers; it is often sought after and created. Erik Curiel argues, for example, that it is an investigative virtue rather than a problem for astronomy, physics, and philosophy to admit variable definitions of the concept of a black hole, for the sake of furthering research. SN is both structurally constraining and functionally exploitable: in the outlining of conceptual borders, as already mentioned, or, for example, in exploring emotional compatibility through humor (jokes, double entendres, puns, etc.).
But _also_ in advancing a particular form of (sociopolitical) life by revealing ideological biases through collaborative discussion, as in the case of our leading example, where politically debatable terms such as “demonstrators” and “city councilmen” play a central role. As we have argued, attention to the SN ensuing from WSs reveals the processual, dialogical character of language functions. Within the deflationary (NLP) discourse that surrounds the concept of SN, we find a preference for the _literal_ or concrete: that which refers to the effect this sentence has when it _states_ that it is a series of words composed of letters, while the _figurative_ refers to the effect that this sentence has when it _says_ that it flows like a stream. This is because, put in very simplistic terms, the meaning that ensues from a metaphoric comparison requires interpretation of what open-ended likenesses/differences can be identified in an expression, whereas the literal effect of a statement necessitates that it be taken at “likeness face value,” without reference to a differentiating, noisy comparative plane. Structurally, however, it can be argued that metaphors are analogical statements which offer comparisons between objects, and the comparative operation that an interpreter engages in to infer meaning when decoding a metaphor is not necessarily unlike the “literal” interpretation of any sentence or word. Not only because most words are originally metaphoric at root, but also because the meaning of almost _any_ word which appears straightforward at face value can be found to be markedly ambiguous, contextual, and generative, as clearly exemplified by the concept, term, and phenomenon of _noise_. Noise not only escapes a specific transdisciplinary definition, but in most cases, as a concept, it also attempts to define precisely the effect that something has when it, itself, is markedly ambiguous or undefined (Malaspina).
If the ultimate goal of synthetic NL understanding is for it to possess the “fluidity and generality” of a human interpreter (Brown et al.) then NLP needs to engage with SN at a serious conceptual level, rather than dismissing its function and potential as incidentally ornamental, e.g., in the case of metaphors. In relation to this, we follow Gastaldi in proposing a shift in attention in NLP research, suggesting a move towards the exploration of new ways in which language modelling can reveal something about the nature of dialogical activities in general: &emsp; >[I]f we want to disclose the image of language animating the entire series of those [NLP] models, we need to consider their success as something more than a purely technical feat with respect to specific aspects of language, and redirect that question to the _nature of language itself_. In other terms, to the question “why can computers understand natural language?” we should direct our attention to natural language rather than to computers, and ask: _what must natural language be for the specific procedures of MMs and word embedding models to succeed in revealing some of its most essential aspects_? > >(p. 173; emphasis in original). &emsp; While we agree that LMs reveal a great deal about the functions of NL, our analysis of the WSC in NLP still raises some additional questions, the ones guiding our investigation: what linguistic functionality is lost when SN is excluded from conceptual considerations in NLP? And can we arrive at a different interpretation of the semantic capacity of noisy linguistic phenomena if we recognize them as generative rather than unsolvably problematic? If we recognize them as generative (i.e., requiring cognitive effort to make an assertion _beyond_ what they _seem_ to propose indexically or “commonsensically”), this acknowledgement can certainly shed light on the socially distributed production of knowledge.
Among other things, it forces us to observe the—necessary but conceptually insufficient—critique of biases in a different light: biases are not problematic “glitches” to be avoided, incidental noise nuisances to be removed, but are actually fundamental to perception and (collective) meaning-making. It is the dynamic individuation of these biases that drives NL. The effects of _noise_ as an unstable concept, which itself tends to denote the unstable, are comparable in many ways to the analogical-metaphoric effect which pairs two elements in order to underline a third, without making explicit reference to it. The intended meaning is supposed to be _understood_ by virtue of the analogy being (more or less) successful at illuminating the possibility space of a concept. As with _noise_, the third element—the likeness and/or difference—is grasped by a generative cognitive attempt, an inference which can be said to be left unfinished: **x** is like **y**, but it doesn’t (always) need to be specified _how_. Even if specified, the demonstration as presented in the _x is like y_ formulation suffices to represent both the rule and an example of the rule. In the case of a metaphor like “Juliet is the Sun,” once it becomes fully specified how Juliet is like the Sun _exactly_, its generative possibilities change (and if exhaustively listed: become rather limited), since an indexical reference is created, a specific computational output. Similarly, in the case of some specific instances of noise: the identification of something _as_ noise means a categorical distinction is being made within a certain corpus, thereby installing an indication of _order_ or at least of limited scope.
Calling the uncertain to order makes it unavoidably certain, and again: if we base this order on statistical closure, then we unavoidably cancel the probabilistic (and possibilistic) approaches to knowledge which underpin fundamental scientific research in hypothesis-testing, and even perception at large, if we understand the perceptual dealing with uncertainty as following the principles of active inference (Friston et al.). This is what we can refer to, following the proposals made by Malaspina, as the argument _from noise_: the intractable and not-yet-actualized is not marginal but in fact central to the plastic renewability of language and thus collective conceptual cognition. This interpretation is opposed to the realist or “common-sense” interpretation of language as simply an ever-vaster repository of evermore-accurate concepts, a view which, unfortunately, seems to be the dominating perspective in NLP today. In the case of computation, where _noise_ was traditionally considered in terms of mere disturbance to the functionality of a system, many of the contemporary computational uses of noise are actually conceptually attuned to its functional qualities, as a constraint which _can_ yield interesting results when a system is exposed to it. Instances of this include the identification of patterns at various scales with the use of stochastic gradient descent. Outside traditional computation, but still computing, we observe a variety of noise instances which become “patterned”: the analysis of originally seemingly erratic behavior (e.g., the dance-like behavior of bees); the sectioning off of previously opaque surfaces for closer inspection (e.g., quadrant sampling, the Hubble Deep Field, the CMBR); or even the execution of noise _music_: classically considered an oxymoron in itself (is it still “noise” once it’s become a repeatable practice?).
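The point about stochastic gradient descent can be made concrete. The following is a minimal, illustrative sketch (not drawn from the article itself, and using invented data): single-sample SGD recovers a stable pattern, here the line y = 2x + 1, from observations corrupted by noise, precisely because its noisy per-sample gradients still average out to the true gradient. Noise is not removed; it is the very medium the optimizer works through.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical noisy observations of the pattern y = 2x + 1.
data = [(x / 10, 2 * (x / 10) + 1 + random.gauss(0, 0.1))
        for x in range(-50, 50)]

def sgd_fit(samples, lr=0.01, epochs=200):
    """Single-sample stochastic gradient descent on squared error.
    Each per-sample gradient is itself a noisy estimate of the true
    gradient; the stochasticity comes from the random sample order."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        random.shuffle(samples)          # the 'stochastic' in SGD
        for x, y in samples:
            err = (w * x + b) - y
            w -= lr * err * x            # gradient of 0.5*err^2 w.r.t. w
            b -= lr * err                # gradient of 0.5*err^2 w.r.t. b
    return w, b

w, b = sgd_fit(data)
print(round(w, 2), round(b, 2))          # close to the generating 2 and 1
```

The pattern is identified _through_ the noise, not despite it: removing the stochasticity (e.g., full-batch descent) changes the dynamics but not the lesson that the noisy signal carries the structure.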
In these specific instances of noise, as soon as something becomes recognizable (i.e., a pattern is identified) in what was previously deemed noisy, concepts are created and/or rules are discovered. We can further reflect on this effect in NL by thinking about the following analogical comparison: “_X is like Y_ like _Juliet is the Sun_.” How so? Well, the mere fact that this formulation is presented ought to inspire the reader to interpret in what ways the two are alike or differ (e.g., both are similes, both are examples, both are abstract, both differ in _x_, _y_, _z_, etc.). In analogies, which make up the very fabric of language, we allude to a third element that needs to be interpreted; we point to a semantic horizon _beyond_ the two elements that compose the analogy, most often without actualizing them specifically. In the case of the computability of complex terms, much research in NLP dismisses as irrelevant much of what can theoretically be observed as fundamental to NL. This is not to take away from the fact that research such as OpenAI’s GPT is able to actually bring these questions to light, which is the reason for the generation of the current article. However, it seems like a missed opportunity to only reflect on the possible misuses of language-generation tools (such as plagiarism, redundancy, imitation, etc., e.g., Floridi and Chiriatti 2020, p. 681), when the fact is that not only are these problems already pervasively at play in the case of human beings, but also that aspiring to imitate “human-level” generativity, while at the same time acknowledging the bias inherent in it _and_ pretending that it is possible to mitigate it, is a fundamental contradiction. Ignoring this not only impedes the possibility of said generativity, but also promotes a highly reductive image of language and all it can afford.
In order to insist on that last note, perhaps we can conclude this section by speculating about the role of _artificial intelligence_ itself as complex metaphor. Considering how analogical reasoning drives the intuition of meaning existing beyond the sum of parts, that is, considering the imaginative effect analogies elicit at the core of conceptual thought, we could say that it is this non-deterministic (because so vastly combinatorial) characteristic that makes concepts creative, speculative devices. Importantly, this comparative dimension connecting the elements which make up an analogy is considered to be _tacitly implied_. What is thus left unsaid when employing an analogy could be understood as the driving motor behind (meta)linguistic intuition, a central pillar of what we consider “valuable” in intelligence. Considering AI, then, as the construction of the most elaborate metaphor humanity has ever employed perhaps provides an interesting alternative for the conceptualization of the tacit semantic horizon which underlies the comparison between “artificial” and “human-level” intelligence. What is left unsaid in the employment of concepts such as “common sense”? As we have observed, quite a lot, if not _everything_. This “analogical” route of the differentially negative can, perhaps, help both philosophy and AI understand concepts such as _artifactuality_ or _intelligence_ in novel ways. Its methodology implies, among other things, that analysis begins from a perspective where both concepts co-determine each other dynamically, instead of where one (e.g., human intelligence as common sense) is assumed to already be socially understood.
Understanding meaning as (semi-)static ignores the dialogical, conceptual virtuality of language, and thus also unnecessarily mystifies concepts such as _creativity_ and _imagination_, two chief aspects of “intelligence,” often framed within the _meaning-making_ NLP discourse as things “machines can’t do.” The importance of the effects of current developments in NLP is not to be underestimated. Given its wide range of applications, it is not an exaggeration to observe that we are witnessing the future (of language) to come: as NL prompts will come to determine the creation of anything from movie scripts all the way to legal frameworks and _actual_ virtual realities, bringing in a variety of ethical problems (of representation) in the process (Bansal et al., 2022). Cautionary observations about the potential problems resulting from the uncritical production of LLMs range from issues of behavioral contamination (i.e., the more human agents interface with models, which on a surface level appear to communicate intentionally, the more humans learn to communicate in a way which works _for the system_, and vice versa) to the homogenization of diversity in dialogical exchanges (i.e., the lack of minority representation, the loss of many dialects, etc.) and beyond. Current NLP interpretations of the function of language cloud the _generative_ and _social_ aspects of communication. However, the arguments presented here should not be read as a plea for (1) the abandonment of current NLP enterprises, or (2) the idolization of human language as superior or irreproducible. We simply present these arguments to underline the problems ensuing from a reductive image of language in the trajectory of NLP thus far, and to propose the possibility of conceptually improving it by paying closer attention to these issues. &emsp; ### Conclusion In this chapter we have focused on the concept of SN as presented in IT, and proposed it as a generative function present in all of NL.
We have made a case for promoting the acknowledgement of the dialogical engagement with the unstable conceptual variation of terms, which we repeatedly presented against the objectivist desire to procure stable definitions and disambiguated meanings in NLP research. The object under analysis was that of WSs, anaphoric cases which are considered as challenges in NLP due to the interpretative ambiguity, or SN, of pronouns in simple sentences, e.g., “The nurse helped the patient even though she was upset.” The standard assumption in NLP is that, lacking the “common-sense” capacities that a human being possesses, a language model is not able to determine the “meaning” of many of these frequently occurring syntactic formulations (even in cases with enough contextualizing evidence). Ignoring how human agents can be said to struggle with the exact same disambiguation issues presents many problems. We have provided arguments for why ignoring these problems is a naive interpretation of (semantic) noise, one which proposes a specific “normalcy” of language as well as presents a specific ideological backdrop with regard to what (common-sense) reasoning is and how it functions through NL users. The problem with the representation of knowledge in the WSC examples we have seen is the prominent reliance on the idea of meaning as _fixed_. As we observed, indeterminacy in the structures of NL is abundant. We observe it not only in the basic comparative mechanism of analogical propositions such as metaphors, but also at large in colloquial expressions such as: “more or less,” “kind of,” “it is what it is,” and myriad other terms which are exemplary of the fundamentally ambivalent noise of communication, serving the purpose of underlining and making certain indeterminacies explicit _and_ helpful.
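To make the mechanics of the WS example concrete: a common move in computational treatments of the challenge is to substitute each candidate antecedent for the ambiguous pronoun and ask which resulting sentence is more “plausible.” The sketch below is purely illustrative (the helper name and the enumeration step are our own, not a standard API); it simply enumerates the two readings whose resolution the schema leaves underdetermined, which is precisely where the SN resides.

```python
def candidate_readings(sentence, pronoun, antecedents):
    """Enumerate the readings of an ambiguous pronoun by substituting
    each candidate antecedent in its place (first occurrence only).
    Nothing syntactic in the schema decides between the results."""
    return [sentence.replace(pronoun, a, 1) for a in antecedents]

schema = "The nurse helped the patient even though she was upset."
readings = candidate_readings(schema, "she", ["the nurse", "the patient"])
for r in readings:
    print(r)
```

An LM-scoring approach would then rank these substituted sentences; the article's point is that treating one ranking as the neutral “answer” erases the interpretative, dialogical work the ambiguity demands.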
These expressions sometimes display (a useful or questionable) vacuity, and oftentimes they are signs of the search for mutual ground; coherent, communicative collaboration; solidarity, among other things. The pervasiveness of SN in NL as a dialogical process seems to make its own case for its relevance. The argument we have been after, however, is in no way a glorification of instability or ambiguity, but simply its acknowledgement. Moving beyond a critical analysis of NLP, the theoretical intention of this article was also to make a case for noise at several levels: by observing the potential of the _concept_ itself, by observing it as a _case_, and by questioning its _functionality_ in NL. The material analyzed, an exposition of a leading problem in NLP, focused primarily on issues arising from the interpretation of WSs (linguistic anaphora), in order to elucidate how the “noisy” generative processes proposed are often overshadowed by an ideological technical impetus which simplifies language as an indexical; denotative; propositional “mirror of reality.” Guided by the example of _noise_ as concept, term, and phenomenon, we concluded by demonstrating how an epistemic shift in NLP, a shift which takes the generative potential of language more seriously, can inform the design of different models, as well as shed light on alternative purposes for language models. Echoing Bender et al.: “we call on the [NLP] field to recognize that applications that aim to believably mimic humans bring risk of extreme harms” (2021, p. 619). These harms range from something as banal as the inability to determine “who was upset” in a specific utterance, to the needless perpetuation of rampant discrimination across a wide range of increasingly pervasive NLP applications, to the active promotion of _mass one-dimensional forgetting_ of how NL renders social agents. <div class="page-break" style="page-break-before: always;"></div> ### Footnotes