**Links to**: [[Concept]], [[05 Prediction]], [[Active inference]], [[Hyperprior]], [[Negation]], [[Paradox]], [[Tautology]], [[Illusion]].
### Concepts as pre-_dictions_: the necessarily paradoxical (linguistic) experience of _negation_
 
>Pure mathematics consists entirely of assertions to the effect that, if such and such a proposition is true of _anything_, then such and such another proposition is true of that thing. It is essential not to discuss whether the first proposition is really true, and not to mention what the anything is, of which it is supposed to be true. Both these points would belong to applied mathematics. We start, in pure mathematics, from certain rules of inference, by which we can infer that _if_ one proposition is true, then so is some other proposition. These rules of inference constitute the major part of the principles of formal logic. We then take any hypothesis that seems amusing, and deduce its consequences. _If_ our hypothesis is about _anything_, and not about some one or more particular things, then our deductions constitute mathematics. Thus mathematics may be defined as the subject in which **we never know** what we are talking about, **nor whether what we are saying is true**. People who have been puzzled by the beginnings of mathematics will, I hope, find comfort in this definition, and will probably agree that it is accurate.
>
>Bertrand Russell, _Mysticism and Logic and Other Essays_, first published 1910, version 1917, p. 46, own emphasis in bold.
 
>I want to see LLMs say “no” or “but” like children might.
>
>François Chollet, _MLST_ 2024 interview.
 
>The word “not” seems like a poor expedient to designate all that escapes my understanding [...].
>
>Rosmarie Waldrop, _Lawn of Excluded Middle_, 1993, nr. 15.
 
<div class="page-break" style="page-break-before: always;"></div>
**Abstract**. This chapter presents an intervention which schematizes the concept of ‘negation’ in the broadest sense, to explore its predictive dimensions. By combining questions in philosophy of language(s) (what is ‘negation’?) with those of perceptual phenomenology and psychology (can we experience ‘nothing’?), the main proposal advanced is that ‘negation’ can be understood as a **hyperprior** (a basal cognitive assessment) at the level of the generative structures of natural language—whose functions have implications for our cognitive extensions into formal languages—and that its coordination of perception and communication, which sometimes results in paradox, is comparable to experiences of a paradoxical nature in non-linguistic phenomena such as sensory illusions.
<small>Keywords: Negation, Active Inference, Predictive Processing, Natural Language, Paradox, Illusion.</small>
<div class="page-break" style="page-break-before: always;"></div>
### Introduction
 
>[T]he privileging of concepts reveals a will to nothingness.
>
>Kofman, _Nietzsche and Metaphor_, p. 19.^[Following our interest in seeking the (predictive) functions in concepts, the extended citation reveals slightly more: “metaphorical style is the sign of a plenitude of life, just as “demonstrative” style indicates a poverty of life. The deliberate use of metaphors affirms life, just as the privileging of concepts reveals a will to nothingness, an adherence to the ascetic ideal” (Kofman, _Nietzsche and Metaphor_, p. 19).]
 
We begin by following Wilfrid Sellars’ proposal that it is the task of philosophy to produce widely-encompassing categorial abstractions: “The aim of philosophy, abstractly formulated, is to understand how things in the broadest possible sense of the term hang together in the broadest possible sense of the term. Under ‘things in the broadest possible sense’ I include such radically different items as not only ‘cabbages and kings’, but numbers and duties, possibilities and finger snaps, aesthetic experience and death.” (Sellars 1962, p. 1). This proposal could be said to have multiple frictions with, but can also be fruitfully compared to, Gilles Deleuze and Félix Guattari’s suggestion that philosophy is that which generates concepts, perhaps by highlighting a dialetheic dimension of remaining with paradox: “The philosopher is expert in concepts and in the *lack* of them.” (Deleuze & Guattari 1994, p. 3, emphasis added). Or: “Sense is always produced as a function of nonsense” (Deleuze 1993, p. 72) and: “Paradox is initially that which destroys good sense as the only direction, but it is also that which destroys common sense as the assignation of fixed identities.” (ibid.).^[Deleuze (1969/1993, p. 3), cited in Kusters 2023.] Not to forget: “[Paradoxical] circularities were called vicious circles; ... the epitome of what had to be shunned. [T]hey [should] be called virtuous and creative circles.” (Varela 1984, p. 309). Setting the tone through these incursions, this paper presents an intervention which schematizes^[Fredric Jameson calls the schematism, in Kant, “the X-ray of a concept” (2023, p. 31). We take from this metaphor the reductive yet functional image of the X-ray as that which reveals an underlying, general structure behind something more dynamic and fleshy.] the concept of ‘negation’—which we see as fundamenting _para_-doxes—in the broadest sense, in order to observe some of its predictive functionality in perception and dialogical coordination.
Taking active inference (AIF)/predictive processing (PP) as our explanatory avenue, concepts—as the structures of (linguistic) cognition—^[In a broad sense, as Kantian categories, and where ‘moving’; ‘Sun’; ‘which’; ‘the’; ‘number’; or anything else that is **expressible**, is a concept.] are to be conceived of as **predictive structures of thought**.^[It is important that we mention that the definitions presented here of terms like ‘concept’, ‘negation’, and ‘prediction’, should be the ones assumed to hold for the arguments. We are not presuming background knowledge of e.g., Kant and Hegel on negation or infinite judgments, nor specific insights from Gestalt psychology on illusions, or generativity in linguistics. Whatever is argued here relies on the definitions and explanations provided.] Following related proposals in psycholinguistics, and adhering to an interpretation of conceptual scaffolding as inferential and enactive (Friston 2011, Ramstead et al., 2019), and thus of “conceptual structure” not as strictly linguistic but as fundamentally perceptual and cognitive (Jackendoff 2002, p. 123, Lupyan and Clark 2015, Vasil et al., 2020, Bouizegarene et al., 2024),^[Do note that Jackendoff (2002) highlights the advantages of Chomsky’s Generative Grammar, but also criticizes its fundamental problem: that of syntactocentrism. Please also note that our use of the term “generative” signifies no allegiance to Chomsky, but refers to the generativity inherent in perception, more on this in relation to PP’s **generative models.**] we give an account of the concept of ‘negation’ as a **hyperprior** or ‘protoconcept’, drastically generalized here to include ‘nothing’, ‘non-(being)’, ‘absence’, ‘lack’, and other related negative notions.
The interest of our argument lies in the possibility of combining questions about negation in philosophy of language(s) with those of phenomenology and psychology, towards a naturalization of the experience of paradox as **contra**-diction or cognitive irresolution—that is: all inherently negative concepts—in both natural language (NL) and non-linguistic experience. Following—misosophous—Negarestani in the proposal that philosophy is “an ascetic program” (2018, p. 414) where negation sits center-stage and is the necessary condition for cognition (ibid., p. 422), we present negation and some of its implications when parsing it through its predictive dimension.
We begin by providing an elucidation of predictive processing towards an understanding of language-use and concepts, and after that we will explain our interpretation of ‘negation’ as a hyperprior at the level of the generative structures of natural language. In the context of active inference, hyperpriors are high-level, abstract beliefs that temper how a cognitive system filters information at other levels (e.g., hyperpriors can be: object permanence, the expectation that light often comes from above, gravitational expectations, etc.). Deeply embedded hyperprior beliefs function as organizing principles that constrain lower-level inferences, thereby setting fundamental expectations about how reality is to be tracked. Negation as a hyperprior can be understood to operate at a functional system level, making it foundational and more resistant to updating than more flexible beliefs, and in fact, perhaps being precisely that which enables belief flexibility. After this, we will trace an account which compares the linguistic-philosophical experience of negation with that of illusory or ambiguous physiological experiences. The suggestion made here will be that the coordination of perception and communication by the general function of negation, which sometimes results in paradox, is comparable to experiences of a paradoxical nature in non-linguistic phenomena such as sensory illusions. In order to ground our suggestions we will provide a series of examples of linguistic paradoxes (where negation is observed as the protoconcept or hyperprior sustaining them) and physiological paradoxes (where conflicting expectations sustain ambiguities or illusions).
After these expositions we will make a few remarks about how the functions of negation in NL have implications for the construction and conceptualization of formal languages. The conclusions reached, guided by our interpretation of active inference as a framework which proposes uncertainty-reduction as a hierarchical process distributed across different scales (cellular, somatic, conceptual) and systems (individual, collective, infrastructural), will be that paradoxes, in light of their inherent negation of themselves, are better interpreted in a *dialetheic* fashion (Priest), given that this interpretation reflects how they are capable of **containing multiple possibilities** which can be resolved in different ways. This interpretation emphasizes the open-ended dynamicity (and perspectival sociality) of thought: concepts, as predictive-constructive structures, are effectuated by a collective and need not necessarily reach a final form(alization) in any one agent.^[This is not to say that they need not reach formalizations at all. Here we take a pragmatic approach to formalizations and suggest that collectives can agree on temporary formalizations in order to structure certain aspects of conceptual thought. The suggestion is that the resolution of paradoxes makes no sense if these are to be conceived as resolving themselves for a single agent.]
 
### Predictive processing and language use
PP suggests that cognition can be understood most parsimoniously as a process which predictively produces experience by projecting expectations and adjusting these on the basis of what it encounters (Clark 2023, pp. 10-1). The dynamics of perception-cognition-action, understood as a continuum, are framed as a hierarchical process of cascading estimations which are dynamically contrasted with incoming sensory evidence (Clark 2023, Friston 2011, 2012, 2013). The emphasis on _hierarchical_-processing implies not only a notion of scales—that is, everything coupling, e.g., the skin and the nervous system—but also that some priors are more ‘adjustable’ than others: while we may be capable of modifying some of our concepts or aesthetic interests, it is harder to imagine that we may become capable of modifying our interest in staying within a certain temperature range.
The evolutionary emergence and use of language is framed by the leading voices of PP as an uncertainty-reduction structure capable of coordinating human collaboration (Vasil et al., 2020, Bouizegarene et al., 2024). If the tendency of organic systems is to maintain themselves in a restricted set of states in the face of uncertainty while relying on a partially observed environment (Friston 2011, 2012, 2013), then language-development and use can be understood as a collective strategy towards the production of a more predictable environment. Simply put: indexicality, and having names for things, coordinates collective behavior in usefully structured ways: we need not run into uncertainty time and time again, if we have a name for a returning regularity which can be shared in a group. The corollary of this coordination is that we can share mental states via language structures, as well as continually adjust them in the face of experience’s returning contingencies. The production of the cultural niche that is NL reveals how organisms such as ourselves effectuate-construct a generative model of the world: naming things such as objects in our environment drastically reduces the need to explain or interpret them anew each time. In the context of much of the environment currently being language itself, where the generative models of language-users—i.e., their (linguistic) way of modeling the world—become committed to “a (reliably) shared narrative” (Vasil et al., 2020, p. 2), concepts can be understood as having evolved to a high degree of abstraction **as well as** concreteness because they reflect the need not only for aligning our mental states (Vasil et al., 2020) but also specifying detailed aspects of experience (Butz et al., 2021), in the context of an ever-growing ‘shared action space’ (Pezzulo et al., 2013) composed of necessarily diverse perspectives.^[Perhaps all-too-speculatively naturalized: philosophy as the practice of constructive conceptual attempts at the broadest margins (Sellars 1962, Deleuze and Guattari 1991) reveals possibilistic explorations in a search space towards ever-generative organic evolution.]
In order to adapt to the future, to be able to at least minimally _pre_-sense what the patterns in a projected _countercurrent_ (i.e., not currently the case) will feel like, what next (and next, and next) situation will ensue, a **generative model** is possessed by a projective agent. It is the “filter” (Bergson)^[In relation to future lightcones (Levin 2019), we can also relate Bergson’s take on projective memory and its tangent with the real, in the image of his famous memory cone, which is an inverted version of the projection of a future lightcone.] which “processes signals that track _divergences_ between expected and actual sensory data.” (Bouizegarene et al., 2024, p. 3, our emphasis), it is the basic “difference engine” possessed by complex adaptive systems. It is **generative** because the agent _expects_ and thus _produces_ a specific reality. In a dialectical fashion, it is both a **model** and therefore defines what an agent is, to itself, while at the same time its very processing of divergence tracks everything that the agent is **not**.^[In active inference, this active co-creation of reality between agent and environment is dictated by the minimization of free energy (see [[Free energy principle]]). Because of the complex temporal dynamics that come with being able to future-project, AIF distinguishes between two types of free energy: _variational_ and _expected_ free energy, see expanded explanation in [[Generative model]].]
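As a hedged aside for readers who want the quantity named in the note above spelled out: in the standard variational formulation, variational free energy is commonly written as below. The notation is our assumption and does not appear elsewhere in this chapter: $o$ stands for observations, $s$ for hidden states, $q(s)$ for the agent's approximate posterior belief, and $p(o, s)$ for its generative model.

$$
F[q] \;=\; \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o, s)\big] \;=\; D_{\mathrm{KL}}\big[\,q(s)\,\|\,p(s \mid o)\,\big] \;-\; \ln p(o)
$$

Because the KL divergence is never negative, $F$ is an upper bound on surprise, $-\ln p(o)$. Lowering $F$ by adjusting $q$ therefore brings the agent's belief closer to the exact posterior $p(s \mid o)$, which is one way of reading the claim that the model "tracks divergences" between expected and actual sensory data.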
The generative model of any (organic) agent is challenged with maintaining coherence in the face of dissipation _now_: keeping within a certain temperature or humidity range through movement or nutrition, for example, as well as (in the case of agents with significant projective capacities) keeping itself together in the (distant) future, based on what it learns about surviving _now_. How does an agent’s generative model, which is flexible enough to endure a lot of contingency (from our mortal perspective), navigate the possibly rather vast seas of contingent variations? It estimates. In AIF, this estimation is understood as approximate Bayesian inference because, minimally, at least, we need to say something has _priors_, based on characteristics which incline it: it has ways it _is_ and ways it has learned, which lead to the formation of _posteriors_: estimated results that will come to tune future estimations. All of this ensues from the evolutionary dynamics of niche-organism. This adaptive condition characterizes all organismic determinations, ways in which organisms perceive and couple to, or co-produce, their context.
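The loop from priors to posteriors described above can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions, not the scheme used in the cited AIF literature: it performs exact Bayesian updating over two discrete hypotheses, whereas AIF speaks of approximate inference, and the names `bayes_update`, `prior` and `likelihoods`, as well as all the numbers, are hypothetical.

```python
def bayes_update(prior, likelihood):
    """Return the posterior over hypotheses.

    prior:      dict mapping hypothesis -> P(hypothesis)
    likelihood: dict mapping hypothesis -> P(observation | hypothesis)
    """
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalized.values())  # P(observation)
    return {h: p / evidence for h, p in unnormalized.items()}


# Hypothetical starting belief about a hidden state of the world.
prior = {"rain": 0.5, "dry": 0.5}

# Hypothetical P(observation | state) for two possible observations.
likelihoods = {
    "dark_clouds": {"rain": 0.8, "dry": 0.2},
    "wet_ground": {"rain": 0.9, "dry": 0.3},
}

belief = prior
for observation in ["dark_clouds", "wet_ground"]:
    # Each posterior becomes the prior for the next estimation.
    belief = bayes_update(belief, likelihoods[observation])
    print(observation, belief)
```

The only design point worth noting is the reassignment of `belief` inside the loop: the estimated result of one step tunes the next estimation, which is the sense in which posteriors "come to tune future estimations".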
In this light, ‘negation’—again, drastically generalized, encompassing **counter**factual reasoning—can be understood as one of the most fundamental ways in which cognitive agents can deal with ever-novel presentations of uncertainty, with divergences from expectation. Considering the social need for collaborative alignment, as well as the need for challenging sedimented understandings, negation is a fundamental function for evolving the cultural niche that is natural language. This point is crucial, as: “communicative systems should evolve toward a balance between usability and learnability (simplicity) on the one hand, and on the other, increasingly arbitrary, hierarchically deeper (complex) action sequences” (Vasil et al., 2020, p. 16): without this dialectic they would not evolve, nor become capable of accommodating novel actions.
Concepts are not simply “ways to communicate our preexisting thoughts but highly flexible (and metabolically cheap) sources of priors throughout the neural hierarchy” (Lupyan & Clark 2015, section 1). One way in which negation can be understood as ‘metabolically cheap’ is that it is easier to think “**not** going that way” than to consider all the other possible directions we may take, or: to simply **disagree** with someone rather than attempt to model their entire argumentative structure (or life) in one’s mind.^[This effect, and other comparable ones such as ‘confirmation bias’, can be understood as resulting from priors, or preferences, which are sometimes rather difficult to modulate.] Another way in which concepts function as uncertainty/surprise/free energy-reducers is that they exist in a **shared** environment, which implies that even though one agent might not have encountered or understood a particular word in the past, they may still parse and employ it, because of the prior expectation that this word has meaning for others (i.e., an argument amounting to *externalism*).^[Otherwise they would not have encountered it to begin with. Here, of course, we speak of an encounter with an already-existing concept and not an invented one, although in the context of the creation of terms we could imagine that given that an agent understands words have uncertainty-reduction-potential—here this is also understood as highly generative meaning-generating potential: agents seek novelty to reduce uncertainty, too (see: Schwartenbeck et al. 2013)—they can create new structures in NL, expecting that these will generate meaning for others.]
Understanding language under the guise of prediction also enables us to understand its functions, as a whole, as an abstract counterfactual structure itself, where reality can be ‘probed’ before being ‘encountered.’ In other words: it allows NL “to act as an “artificial context,” helping constrain what representations are recruited and what impact they have on reasoning and inference.” (Lupyan & Clark 2015, ibid.).
A consequence of this is that we produce representations “at multiple levels of abstraction” and that “the higher-level (more abstract) representations formed in the goal of minimizing prediction error in the present enable better predictions in the future.” (ibid.). This interpretation goes in line with what will follow on negation as a _hyperprior_, but for now let us reframe this in the context of the abstract character of philosophy with a remark by Sellars (1962, p. 34, italics in original):
 
>Thus our concept of ‘what thoughts are’ might, like our concept of what a castling is in chess, be abstract in the sense that it does not concern itself with the _intrinsic_ character of thoughts, _save as items which can occur in patterns of relationships which are analogous to the way in which sentences are related to one another and_ to the contexts in which they are used.
 
However, in our account, we have both: we are looking at something both as intrinsic and as a variable pattern. Considering the PP exposition above: deriving the most abstract aspects of experience does mean, perhaps, revealing aspects of (unchangeable) priors in our thinking structures, as well as the possibility to modulate them (once we become aware of the abstraction(s) we are subject to). If cognition tracks experience by producing predictions about incoming sensory input and updates these predictions based on what it receives as input, then _negative_ concepts can be understood as predictive counterfactual modulators. If experience is directed by physiological set-up and by learned patterns of behavior (adaptive priors: from walking to talking), then in order to adaptively navigate contingency, (linguistic) negation might reflect a more fundamental structure of the processes which allow for adaptive _learning_, that is: for the tuning of generative models and their ensuing hypotheses about states of the world to come, and how these are challenged and (when successful) change accordingly.
Cognitive agents are collectives of hierarchically structured levels (these levels being, e.g., cellular, somatic, conceptual, social) seeking—sometimes conflicting—approaches to uncertainty reduction. High-level priors such as negation can be understood as predicting more abstract and generalized aspects (‘**not** going that way’), while lower-level ones handle shorter-term sensory details (as we will see with ambiguity in illusions). Experiences of negation—at the conceptual and the sensorial level—can be understood as _necessarily_ paradoxical because we must observe them as a process which adjusts hypotheses about incoming sensory data. In considering _possibilities_, it is a **requirement** that multiple options hold at the same time, because they are being considered towards one or another predictive dimension. It is in this sense that the title refers to concepts as pre-**dictions**: before they are pronounced and socialized, they are speculative (proto-)counterfactuals in the structures of thought as it casts itself into the future.
If _nothingness_ is to be understood as “the absence of everything” (Priest 2014, Casati & Fujikawa 2019, Costantini 2020), then we can, perhaps, parsimoniously refer to it as representing a hyperprior which allows for foundational re-alignment of certain predictive parameters. For example, when we contemplate nothingness we set **everything** into question, which is a major cognitive task. Paradoxes expose this effect in cognition by allowing the mind to go back and forth between possible predictions. If nothing is the absence of everything, and this presents us with a situation in which there is nothing to **pre**-dicate over, then our predictive account brings some naturalizing clarification as to why this is experienced as generatively confusing, as paradoxical. There is literally _nothing_ to pre-dict, as our expectations are being negatively challenged. This way we can frame dialetheism, allowing for statements to be **both** true and false, as a tempered, adaptively predictive approach which keeps options _open_: not exploiting one specific dimension—one particular way of minimizing surprise—but accepting that most predictive proposals considered in the linguistic experience are open-ended; novelty-seeking; explorative. As Friston puts it (2011, p. 90):
 
>... if I am a model of my environment and my environment includes me, then I model myself as existing. But I will only exist iff (sic) I am a veridical model of my environment. Put even more simply; “I think therefore I am, iff I am what I think”. This tautology is at the heart of the free-energy principle and celebrates the circular causality that underpins much of embodied cognition.^[More in [[Free energy principle]].]
 
A dialetheia is, then, most certainly a two-way truth: it can be understood as representing a cognitive state in which options are considered. This back and forth is tempered by the predictive conditional that underpins it: modeling the world (and our existence within it) unavoidably **as if**, because organic perspectives are necessarily partial observations (Friston 2012, 2013). This is why the _Dao Teh King_ is and is not: it has (not) to be, for the sake of an evolving generative model which deals with eternally-new presentations of contingency. _Absolute_ negations are either impossible, or fascist and/or deadly.
 
### The concept of negation as a high-level prior
The tenet of PP/AIF is that organisms are compelled to model their world in accordance with their generative models: these ensue from evolved adaptive priors. Adaptive priors are “evolutionarily endowed, heritable beliefs that guide characteristic patterns of cognition and behavior in conspecifics” (Badcock et al., 2019, p. 1329). Shaped by cycles of selection, these priors depend upon (epi)genetic inheritance, which includes the formation of structures such as language. Crucially, in NL, their function is to “constrain the space of prior beliefs learned during ontogeny to enable adaptive action in local cultural niches” (Vasil et al., 2020, p. 2). Given the fact that we are (dialogical) collectives, even within highly homogenous conditions, agents still **differ** in perspective: constructs such as negation function as modulators for learning and modifying expectations towards collaborative construction (where not all is positive, collaboration thus necessitates functional doubt, dissent, discrepancy, divergence, and other related negative notions/effects). It is in this way that species-typical patterns of cognition-behavior “can be explained in terms of _adaptive priors_: inherited expectations ... that have been shaped by selection to guide action-perception cycles towards unsurprising states (e.g., “I will keep moving **until** I am rewarded”)” (Badcock et al., 2019, p. 1329, our emphasis in bold for stressing negation).
Negation, under this consideration, includes cognition-behavior patterns such as “stop”, “enough”, or, indeed, **paradoxes**: where the predictive directives of language provide conflicting options. “This sentence is not true” first orients the mind to predict that a sentence is taking place, that the mind should follow a linguistic pattern we call ‘sentence’—which is a description of a state of affairs (Priest 2022) which, as we frame it: serves as an indicator of further predictions—but it is then stopped by the negation “not true,” which in most contexts indicates that a prediction should be reconsidered.^[“The real discovery is the one that makes me capable of stopping doing philosophy when I want to.—The one that gives philosophy peace, so that it is no longer tormented by questions which bring itself in question.” (Wittgenstein 1958, p. 51), cited in Kusters 2023.] The mind can thus go back and forth, parsing the predictive impetus of the sentence recursively, as it reinforces its status as a sentence and as something which, in its very pronouncement, cancels the anticipatory resolutions we ‘usually’ expect of sentences. Formulating this in terms of the evolution of adaptive priors as constrained by strategies which seek to effectuate a generative model, we can interpret this foundationally-negative paradoxical effect as ensuing from cognition only registering salience (predictive efficacy). Sensory states are only registered (i.e., learned) as “valuable (i.e., unsurprising) if [they] lead to another valuable state” (Badcock et al., 2019, p. 1329): in the context of the liar’s paradox and comparable phenomena, multiple states are valuable in the experience, hence the contradiction.
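The back-and-forth described above can be caricatured in a few lines of Python. This is only a toy sketch under our own assumptions: the function name `liar_revision` is hypothetical, and treating each re-reading of the sentence as a flip of a provisional truth value is a simplification for illustration, not a model of cognition nor a claim about how the cited authors formalize the paradox.

```python
def liar_revision(initial_guess, steps):
    """Repeatedly re-evaluate "This sentence is not true".

    Each pass takes the current provisional truth value and applies what
    the sentence says about itself: if it is taken as true, it must be
    not true; if it is taken as not true, what it says holds after all.
    Returns the full sequence of provisional values.
    """
    values = [initial_guess]
    for _ in range(steps):
        values.append(not values[-1])  # the sentence negates its own status
    return values


print(liar_revision(True, 4))  # [True, False, True, False, True]
```

The sequence never settles on a single value, whichever guess it starts from. In the terms used above, neither provisional reading is ever discarded as unhelpful, which is one way of picturing why both states remain "valuable" in the experience.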
The concept of negation, generalized as a hyperprior (governing lower-level priors) to include counterfactuals; notions of ‘not’; ‘nothing’; ‘against’; ‘lack’ and many others, can thus be viewed as stemming from functionally-adaptive effects which sit at the core of agents’ generative models. Negation is a _possibilistic_^[We employ this term for now to avoid ‘probabilistic’, as there exist criticisms of PP’s Bayesian inference-ladenness, see Kwisthout & van Rooij 2020.] overarching expectation: a proto-counterfactual. In order to stress this point, we present a few directions in which arguments may be taken, but in the interest of time we present them only in brief.
Negation can involve the ability to predict the possible **absence** of something. This ability can be understood, as mentioned earlier, as a way by which other predictions may be interrupted. Otherwise: cognition would have no capacity to deal with contingency, it would entrench itself in highly specified predictive dimensions and fail to encounter or produce novelty.^[Again, something like confirmation bias can be understood as such an effect.] Abstract understandings of **non-being**, falsity or nothingness can modify expectations in practical _and_ in highly abstract situations. Practically: “I am not hungry,” or: to reinterpret sensory input whenever a prediction needs to change: “the light _seemed_ to be on, but it was _not_.” Or in highly abstract situations: “the name that can be named is not the enduring and unchanging name” or “this lie is a sentence.”^[A formulation the author prefers for the sake of a refreshing presentation, and is surprised not to encounter more often: as far as Google can tell (last check May 30th 2024), it only returns one result: https://www.reddit.com/r/AskReddit/comments/ajfwh5/how_often_do_you_lie/, first accessed circa February 2022.] Negation can be understood as that which allows for the tuning of expectations and handles non-being across various temporal scales (short: e.g., practical, long: e.g., abstract, or undefined: e.g., mathematical); sensory modalities and different contexts (social, physical, logical, and others).
As explained earlier in reference to Lupyan & Clark’s (2015) remark that concepts are ‘metabolically cheap’: negation can be understood to enhance predictive **efficiency**. By reducing the space of predictive possibilities, a high-level ‘negation’ prior allows for the processing of conflicting hypotheses or evidence (in experience it is always a combination of these, of course). This reduces cognitive load: if we know it will **not** rain on our way to work we drastically reduce our uncertainty about clothing, temperature, mode of travel. At the same time negation also supports abstract reasoning: the ability to consider what is ‘not’, ‘marginal’ or ‘excluded’ is crucial in scientific hypothesis-testing and model-building. These variable effects of negation reveal learning through experience and interaction in ever-diversifying contexts: hence the evolution of many different **types** of negative concepts. Lastly, framed by the FEP (Friston 2011), active inference upholds that the most basal description **and** effect of an entity which persists in spacetime is its very existence in **distinction** from its environment (Friston 2011, 2012, 2013).^[More on this in [[Markov blanket]] and [[10 Bias, or Falling into Place]].] Without this basic dialectic: we cannot have an entity, perspective begins at situated difference. Thus, even at the most foundational aspect of PP: the being versus non-being paradox is precisely what holds everything in place.
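The rain example earlier in this paragraph can be given a rough numerical reading. The sketch below is illustrative only and rests on our own assumptions: the four weather hypotheses, their equal probabilities, and the helper names `entropy_bits` and `negate` are all hypothetical, and Shannon entropy is used here simply as one convenient measure of uncertainty.

```python
import math


def entropy_bits(dist):
    """Shannon entropy (in bits) of a discrete belief distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def negate(dist, excluded):
    """Rule out one hypothesis ("not X") and renormalize the remainder."""
    kept = {h: p for h, p in dist.items() if h != excluded}
    total = sum(kept.values())
    return {h: p / total for h, p in kept.items()}


# Hypothetical belief about the weather on the way to work.
weather = {"rain": 0.25, "sun": 0.25, "cloud": 0.25, "snow": 0.25}
after_negation = negate(weather, "rain")

print(entropy_bits(weather))         # uncertainty before "it will not rain"
print(entropy_bits(after_negation))  # uncertainty after
```

With these numbers, a single negation lowers the uncertainty from 2 bits to log2(3) bits without the agent having to evaluate any of the remaining options in detail. The drop is guaranteed here only because the toy belief is uniform; with strongly skewed beliefs, ruling out the dominant hypothesis can leave the remaining options more evenly balanced than before.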
Under PP/AIF considerations, and in our interest to provide as ample an interpretation as possible, negation thus can be seen to function as a hyperprior: a fundamental, multimodal cognitive principle that modulates the formation and adjustment of further priors in perception-cognition-action. Below we move on to some considerations which compare the linguistic-philosophical experience of negation with that of illusory or ambiguous physiological experience. The suggestion made here will be that the coordination of perception and communication by the hyperprior of negation, which sometimes results in paradox, is comparable to experiences of a paradoxical nature in non-linguistic phenomena such as sensory illusions. In order to ground our suggestions we provide examples of paradoxes in linguistic phenomena (again, where negation is observed as the protoconcept or hyperprior sustaining them) and physiological experience (where conflicting expectations sustain ambiguities or illusions).
 
### Paradoxes as philosophical-physiological experiences
A paradox is generally defined as a **contra**-diction, as something running **counter** or **against**, as that which is in **opposition** to expectation. It is, therefore, an inherently negative concept: we foreground negativity because we are considering the ‘positive’ as that which results in **no surprise** (learning/adapting); we can understand positivity as everything which runs predictively smoothly in the background.^[Understanding ‘the background’ as anything from how we (mostly) do not need to ‘think’ about digestion or heartbeats, but how sometimes we do about breathing, and how we always do about challenging spatiotemporal problem-solving, especially in unison with others.] In this way, we can understand Priest’s nothing(ness) as the ultimate contradiction (Priest & Gabriel 2022) and as the “ground of reality” (Ralón in Priest & Gabriel 2022, p. 2): its very effect, as a “limit-concept,” (ibid., p. 10) perhaps grants us a “natural” window into the learning, free-energy-minimizing structures of the mind. Paradoxes, both in their linguistic/formal and in their perceptual presentations, are understood in our context as insights into the very functioning of perception as it adjusts-effectuates its predictive models.
When processing incoming sensory data, priors (expectations) are dynamically effectuated, which, in light of the permanent changes which any organism is challenged with, also **permanently** results in conflicts between expectations and sensory input. An “illusion” (a barber’s pole, Shepard’s tone; but also a rainbow, or simply: perspective) can be understood as a perceptual experience revealing the very _mechanics_ of the experience, rather than as a conflict of stable ‘truths.’ This, at a highly abstract (hyperprior) level, and like any other process which leads an agent to question its own predictive dimensions _as_ predictive dimensions (i.e., abstraction-contemplation occurs whenever deeply seated cognitive habits are _negated_; questioned), is a rather disconcerting and (generatively) ungrounding experience. Most often, in the interest of stabilizing certain patterns over others, cognition (at the unconscious, conscious and collectively-conscious level) resolves contradictions towards _specific_ types of predictive consistency (or positivity). These adaptive preferences result in **contra**-diction whenever they collide with each other, as we saw with “this lie is a sentence.”
Following PP then, in the paradoxical experience there is an expectation, and then this expectation is challenged: paradox, as **contra**-diction or irresolution, houses negation within it. What one observes as the abstractive aspect is a generative model’s consideration of possibilities: A or B or A or B or A or B, possibly _ad infinitum_. Fortunately, even when presented with two equal stacks of hay, human users of natural language are capable of resolving paradoxes, not necessarily by ‘choosing’ the ‘correct’ expectation but, most often, by letting it exist as a possible bifurcation of options. This is because, in the cultural niche that is NL, the context (other conspecifics, the linguistic environment) reduces uncertainty when the predictive capacities of single individuals find no ‘definitive’ resolution. It is precisely in this way that “cooperative communication becomes a self-fulfilling prophecy: ... by gathering evidence for their adaptive priors, the low-level dynamics of interactants appear to create and maintain ... the observable coherence of a contextualizing scale of (cultural) organization; namely, a communicative system.” (Vasil et al., 2020, p. 18).
Paradoxical (visual) experiences, an exemplar phenomenon in the PP literature, are often presented to emphasize that vision does not “veridically reflect” sensory inputs but is informed by predictive priors (Lupyan & Clark 2015): the experience of visual perception emerges as top-down prior expectations are combined with bottom-up sensations, resulting in a representation that best fits the combination of both.^[E.g., when we see a shimmering light/a shadow/other and interpret x, y or z: we are often challenged by how we misinterpret when we learn that what we understood as the input was something completely different.] Most somatic-sensorial predictions are unconscious (Helmholtz, more on this in [[10 Bias, or Falling into Place]]), as opposed to philosophically-generative _conundra_ such as paradoxes. However, in visual (and other sensorial) experiences, we know of countless examples where we witness experience *as* predictive learning revealing its inner mechanisms (this is also how we can frame Priest’s nothing(ness)). For example, all manner of optical illusions, where we know there is, e.g., no motion, yet we **see** it:
 
![[rotating snakes.png|300]]
<small>Fig. 1. Rotating snakes illusion, Akiyoshi Kitaoka, 2003. I see these types of illusions as excellent examples of Anscombe’s “I do what happens”, at a very fundamental level. Watanabe, Kitaoka et al. revealed how neural networks trained on motion-detection also predicted motion in the rotating snakes, see Watanabe et al., 2018.</small>
 
Otero-Millan et al. explain the rotating snakes illusion as a result of “transient oculomotor events such as microsaccades, saccades, and blinks” (2012, p. 6043). Essentially: the sensorimotor mechanisms which enable the visual registration of motion are primed to interpret specific types of patterns (gradients, high contrast) as salient, and eye movements themselves can trigger an interpretation of motion as they ‘scan’ (especially at the microsaccadic level) for differences (i.e., surprise). We can extrapolate to similar effects in other ‘illusory’ phenomena by framing things in predictive terms: we observe a paradoxical dialectic of “it’s there, no it isn’t, it’s there, no it isn’t...” This effect is perceptually salient, as perception involves expectations, and conflicts in these result in the need to contemplate options. Margaret Masterman, NLP scholar with a strong constructivist take on the open-endedness of language’s expansive _swellings_ (see, e.g., chapter 2 in Masterman 2005), understood concepts as operating the same way as gestalt phenomena, too. For her, a concept is a phenomenally ‘creative’ “composite, like a gestalt figure. For it too is an aggregate that appears to have different aspects, according as you pick it up by different synonyms.” (2005, p. 68). Her understanding, like ours, allows for a naturalizing approach to concepts.
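The idea that a percept is the representation which “best fits” the combination of top-down expectation and bottom-up sensation can be illustrated with a textbook precision-weighted combination of two Gaussian estimates. This is a minimal sketch under strong assumptions: a single scalar “motion” variable, Gaussian prior and likelihood, and invented numbers. It is not the oculomotor account given by Otero-Millan et al., and the function name `combine` is hypothetical.

```python
def combine(prior_mean, prior_var, sense_mean, sense_var):
    """Precision-weighted fusion of a Gaussian prior and a Gaussian sensory estimate.

    Returns the mean and variance of the resulting Gaussian posterior.
    """
    prior_precision = 1.0 / prior_var
    sense_precision = 1.0 / sense_var
    post_var = 1.0 / (prior_precision + sense_precision)
    post_mean = post_var * (prior_precision * prior_mean
                            + sense_precision * sense_mean)
    return post_mean, post_var


# Illustrative numbers only: a confident expectation of motion (mean 1.0,
# low variance) meets a noisy sensory signal that reports no motion (mean 0.0).
percept_mean, percept_var = combine(prior_mean=1.0, prior_var=0.1,
                                    sense_mean=0.0, sense_var=1.0)
print(f"perceived motion: {percept_mean:.2f} (variance {percept_var:.2f})")

# With the confidence reversed, the percept follows the senses instead.
reversed_mean, _ = combine(prior_mean=1.0, prior_var=1.0,
                           sense_mean=0.0, sense_var=0.1)
print(f"perceived motion with a weak prior: {reversed_mean:.2f}")
```

In the first case the estimate stays close to “motion” although the input reports none, which is one schematic way of reading “we know there is no motion, yet we see it”; which side dominates depends entirely on the assumed precisions.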
Philosophically, and laconically, phrased: **the paradox of negation is that we resolve it by the negation of paradox**. That is, negation is a paradox, as it reflects the learning, interpreting and effectuating of difference in the dynamics of perception, which necessarily involve a dialectic that needs to be abandoned in order not to get stuck in a spinning void. The consideration of possibilities (A or B or A or B), in the context of language as a massively distributed dialogical system, does not resolve by ‘choosing’ the ‘correct’ expectation, but by **letting it exist**, as we can expect the context to take care of uncertainty reduction when we find no definitive resolution. This context is other users of language, and our (constructed and given) niches beyond that. We treat further aspects of this in [[Mortal computation]] and [[10 Bias, or Falling into Place]].
 
### Formal languages and negation
 
>I believe that at the end of the [20th] century the use of words and general educated opinion will have altered so much that one will be able to speak of machines thinking **without expecting to be contradicted**.
>
>Turing, 1950, p. 442, our emphasis.
 
>‘Language is like a net’ (I might have said, making a very long story very short) because the set of uses of any word is like a fan. Masterman 2005, p. 55.
>
>[T]he units of language as we want it have got to be _concepts_. (Ibid., p. 59).
>
>And the idea of a language as consisting of a set of interacting concepts was precisely that which Russell, following Moore, found that he had to abandon, before he could start framing the propositional calculus. Only one form of contrast, negation (which is, for practical purposes, far too few) and, at most, three connectives (which is quite often, for the scientists’ purpose, two too many) and a set of disconnected gaps to be filled not with concepts, but with ‘expressions’, isolated sentences, that is, squeezed of all semantic juice, this is all that is left when you have so depleted language as to make it able to turn into propositional calculus. And language thus conceived is far too weak, too little information-bearing to be of interest to science: no scientists mind about whether they have got their propositional connectives right in an argument anyway. So it is clear that propositional logic is not what we want here. (ibid.).
 
Negation is more than a formal matter. As cited in this project’s introduction, François Chollet would like LLMs to “say no like children might” (2024). His indication here is that negation, being an unmovable mover, is a defining trait of intelligence. L. Horn’s (1989) _Natural history of negation_ opens: “All human systems of communication contain a representation of negation.”^[Horn’s quote continues: “No animal communication system includes negative utterances, and consequently, none possesses a means for assigning truth value, for lying, for irony, or for coping with false or contradictory statements.” , 1. But binary negation of the kind “Stop! / No! / Enough!” that humans employ very often can certainly be compared to functionally-similar growls, barks and bites in other animals, domestic dogs and cats being an easy example.] We include formal languages and, by extension, computation here: as a specific kind of formalized communication system.^[As stressed in this project, much can be understood as computation. See: [[02 Introduction to the Poltergeist]] and [[Computation]].] To specify the generalized abstraction that we refer to as a computer here (abstracting _beyond_ a capacity to effectuate logical/arithmetic operations): it is anything which arrives at a definite state after the possibility of formal bifurcation. Presented as such, computers, as **difference engines**, are classical negators: based on this simplistic binary logic, one could hardly imagine something more negative than a computer. As mentioned earlier, we can think about all niche-formation, including computers, as strategies of uncertainty-reduction (Vasil et al., 2020). This would include, fundamentally, their capacity to settle on a state when a bifurcation presents itself.
Formal languages, as the chunking and parsing of highly specialized realities, also reveal computation and information-compression as the making predictable of matter: we can allow for silicon dust to settle on a very specific state, and know that we may return to it in exactly that state.^[Of course, never with 100% certainty: cosmic rays disturb microchips all the time.] In the interest of seeing languages (both natural and formal) as “artificial contexts” (Lupyan & Clark 2015) through which we probe reality, we can think of (formal) languages (and their effectuations in computation) as the most reliable and dynamic prediction-modulations currently available.
The view of negation as the effect of a predictive high-level prior has implications for our understanding of formal languages. As we saw, it is _natural_ that in natural language we may entertain paradoxical experiences. Their back-and-forth dialectics can be compared with those of illusions, where we observe that our senses are dealing with incoming data in an efficient way: energy is not ‘wasted’ on certain structures because we can physiologically assume them to (almost! _Never say never_) always^[The always/never negation joke is rather abstract here, please try to read in a predictive way and observe what ensues.] behave in the ‘same’ (comparable, adaptive) way. In formal languages such as propositional or Boolean logic, the absolute symmetry that holds between something being the case and **not** being the case works differently than in NL, because in NL the computations are open-ended (i.e., they can but do not need to arrive at a definite state), whereas formal languages are constructions which reduce uncertainty in more restricted ways, where a definite state (_staticity_) is a prerequisite for their functional stability. Open-ended means that enactive, extended, ecological, embodied, embedded (5E) systems such as mortal human collectives are able to resolve their uncertainty by relying on contextual 5E factors: the environment (such as housing, clothing for reducing weather/temperature-related uncertainty); other people (division of labor); animals and plants (domestications of all sorts), as well as **formal languages** themselves (in notation, in computers). The latter do not possess organic types of generative models like those possessed by human animals. They are built to explore-exploit uncertainty as an extension of the purposes of organic generative models, but their structures do not reflect free-energy-minimization in the way that organic beings do.^[More on this distinction in Ororbia & Friston 2023, [[Mortal computation]].]
In our current dialogical landscape, the formation of this predictive stratum can only be analyzed at the conceptual level, which is why it has been in the interest of our arguments to remain as abstract as possible. The abstractive (negative: effacing of differences) capacities of languages (both natural and formal) are the most dynamic prediction-modulation capacities we know of right now. What a communication system such as NL can effectuate and sustain, in contrast to a FL, is predictive suspension: (linguistic) minds are able to entertain paradox because, as we argued, negation means intervention in an action or prediction which results in uncertainty-reduction occurring **beyond** the locus of action or prediction: if I say “no” to something it means I am hoping to reduce uncertainty for the sake of my purposes, but for something necessarily beyond me. Not having the prerequisite of landing at definite states is what allows humans to wonder at the concept of nothingness. The compelling consequence is that this makes thought inevitably **social** and **distributed**. Formal systems and natural language exist on a spectrum rather than as opposites. Software or anything else we may consider substrate-independent (i.e., most formal systems), despite the _appearance_ of precision, still contains all manner of ambiguities, tacit assumptions, and contextual dependencies, just like natural language. The difference is that formal systems are structured to minimize uncertainty towards _very_ specific, narrow goals. This is what makes them precise, but brittle.
 
### Conclusion
Concepts, in this view, are dynamic constructs that help reduce uncertainty by modulating vast, diverse, distributed scales of sensory in/outputs, thereby guiding adaptive behavior. All concepts are abstractive, which in our argument has meant that all concepts are therefore **negative**. The presentation of negation as hyperprior or protoconcept allows us to understand how cognition explores possibilistic reasoning. The suggestion, therefore, has been that “by gathering evidence for their adaptive priors, the low-level dynamics of interactants appear to create and maintain, at least for some period, the observable coherence of a contextualizing scale of (cultural) organization; namely, a communicative system” (Vasil et al., 2020, p. 18). Emphasis on “appear to create and maintain, at least for some period” is in line with our suggestion that concepts have effects with a necessarily **indeterminate**, explorative logic. This view offers a socially-comprehensive and functionally-adaptable framework for exploring communicative processes, because such an interpretation emphasizes the open-ended sociality of thought: concepts, as predictive-constructive structures, are effectuated by a collective and need not necessarily reach a final form(alization) in any one agent.^[This is not to say that they need not reach formalizations at all. Here we take a pragmatic approach to formalizations and suggest that collectives can agree on temporary formalizations in order to structure certain aspects of conceptual thought. The suggestion is that the resolution of paradoxes makes no sense if these are to be conceived as resolving themselves for a single agent.]
So, in the interest of full abstraction and the creation of conceptual X-rays (Jameson 2023): in the assigning of names to presumed patterns which apply at different spatiotemporal scales (e.g., the concept of “linearity” or, relatedly: “history”), we are bound to complex evolutionary interference: where concepts, inevitably housed by different agents and communities, track different predictive perspectives. Interference between the assumed predictability of scales is perhaps the most salient aspect of philosophical disagreement. Disagreements about these and any other concepts are, in our argument, interesting to analyze in terms of predictively adaptive evolutionary strategies. This is not to reduce their function, but to expand it: it is this very metacognitive capacity to **see as** (and see seeing as as as... _ad infinitum_) which generates fantastic recursions in (non-)linguistic experience. Some things can thus be clarified or simplified by our predictive account of concepts, and in particular that of negation. An example could be the Fregean distinction between sense (_Sinn_) and reference (_Bedeutung_): sense can be understood as more explorative, possibilistic, imaginative and predictively risky (all of which highlight the same process: free-energy minimization), whereas reference is much more exploitative, probabilistic, indexical and predictively safe. This understanding legitimates the necessarily paradoxical functions behind tautologies; terms without reference; and other phenomena. Non-being thus certainly plays a parsimoniously explanatory role in philosophy, when exposed under a predictive light.
One line we have not explored but one which is certainly relevant here is Goodman’s work on projectibility, which directly ties to the predictive understanding of concepts we have presented. Goodman famously introduced the _grue paradox_ to illustrate some of the challenges with priors, with induction and its possible projectibility. A pre-dicate (a concept as a pre-diction) like “grue” (defined by Goodman as green until time _t_, then blue) is a demonstration of the predictive aspects that concepts house. A “definite state” predicate such as “green”,^[A color which also puzzles Sellars, more on this in [[06 Principle of Sufficient Interest]].] which should apply indefinitely, is more “entrenched” (Goodman 1955) than “grue”, which keeps a certain predictive, dialethic dimension open. Goodman’s _grue_ can therefore be read as suggestive of the observation that predicates are constructive: they are functionally related to practices and purposes. The projectibility of predicates _literally_ shapes which aspects of reality become salient, and how future experiences are to be classified/modulated. Another interesting line to explore would be Quine^[A nice note on Quine on recursivity and paradoxes by Varela: ““Yields falsehood when appended to its own quotation” yields falsehood when appended to its own quotation. This Quine koan is a colorful way of presenting a pervasive knot that has been present in the study of language and mathematics for a long time. ... what seem[s] complex but understandable at the molecular domain [co-constitution, interaction between parts and emergent complexity] acquires a sense of paradox in th[e] linguistic domain. It is harder to leap out of the need to stay at a given level of meaning and simply look at the whole sentence as a unity. Paradox is exactly that: that which cannot be understood unless we examine it by leaping beyond both levels tangled in the structure of the paradox.” (Varela 1984, p. 311-2).] on nothingness in relation to prediction and “desert landscapes”, something treated in [[Free energy principle]].
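Returning to Goodman’s “grue”: the sense in which a predicate works as a pre-diction can be sketched in a few lines. The sketch follows the simplified definition used in this chapter (green until time _t_, then blue), not Goodman’s original formulation, which is indexed to the time of examination; the switch time `T`, the observation list and the function names are illustrative assumptions.

```python
T = 10  # hypothetical switch time t (arbitrary units)


def predict_green(time):
    """'Green' as a prediction: the same colour at every time."""
    return "green"


def predict_grue(time, t=T):
    """'Grue' (simplified): green before time t, blue from t onwards."""
    return "green" if time < t else "blue"


def fits(predict, observations):
    """True if the predicate's predictions match every (time, colour) pair."""
    return all(predict(time) == colour for time, colour in observations)


# Every emerald examined so far (all before t) has looked green.
observations = [(time, "green") for time in range(T)]

print(fits(predict_green, observations))      # both predicates fit the past
print(fits(predict_grue, observations))
print(predict_green(T + 1), predict_grue(T + 1))  # but they diverge after t
```

Both predicates are equally supported by everything observed before _t_ and differ only in what they project forward, which is why entrenchment, rather than the evidence alone, has to carry the preference for “green”.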
The predictive condition will always be locally-constrained and preferential (selective, limited: see Friston 2012, 2013); this makes it conceptually-analyzable in terms of **situated** uses/teloi/narratives/histories/legitimacies. Because of this, advantages and disadvantages for all the perspectives involved at different scales in these strategies can be better elucidated, criticized and discussed.
It is _no wonder_^[This is a joke about negation and philosophy.] that shaking one’s head sideways is found across cultures, worldwide, to imply negation, a phenomenon speculated to have emerged from the refusal of food.
<div class="page-break" style="page-break-before: always;"></div>
### Footnotes
%%
[[Wolfram irreducibility and interconcept space]]
[[Concepts as pre-dictions, brief summary, author, intent to submit]]
[[Concepts as pre-dictions presentation]] HAS ADDITIONAL NOTES YOU SHOULD ADD
Add this to intro/presentation: [[Self-reference]]
Formalize the contradictory as conflicting models, or the model brushing up with the Wheeler zeroth border [[Lingua]] talks about the zeroth border, too.
>“want to see LLMs say “no” or “but” like children might.” François Chollet https://youtu.be/JTU8Ha4Jyfc?t=9098
because they don’t do that it seems they are not intelligent nor conscious, to him
so: negation as a foundation for cognition? Negintelligibility?
Kofman in memory and relief:
“Comedians is what philosophers are” (in yt presentation but is it Kofman or the guy?)
shaking head horizontally seems to be quite common across cultures, speculated to be possibly related to the negation of food.
>“From the day Lobatchewsky dialectised the notion of parallel, he invited the human mind to dialectically complete the fundamental notions. An essential mobility, a psychic effervescence, a spiritual joy found itself associated with the activity of reason.“ Le surrationalisme, Inquisitions, G. Bachelard — by negating (the axioms of existing geometry), you can do new geometry. It’s not just about “creating the new” but also negating what exists, and this is what _philosophie du non_ is. (see picture):
![[bachelard philosophie du non, hannes.png]]
There is an incoming insistence on Platonism across domains. From the metaphoric understanding of the soul in light of functionalism (Cartesian _realism_), to literally “carving nature at the joints”, to a combination of both: Michael Levin’s lab and his platonic impetus.^[His most recent take on how “agentic materials” explore latent space is that they exploit mathematic invariances which must exist as patterns in the fabric of reality for life to latch onto them. According to him (2025): “This provides a new perspective on the organicist/mechanist debate by explaining why traditional computationalist views of life and mind are insufficient, while at the same time erasing artificial distinctions between life and machine, since both are in-formed by diverse patterns from the latent space.”]