**Links to**: [[Language-modulating]], [[Question]], [[Semantic attractor]], [[NLP]], [[Semantics]], [[Syntax]], [[Linguistics]], [[NLU]], [[Entropy]], [[Noise]], [[Probability]], [[Vector]], [[Vector space model]], [[Calculation]], [[Neural nets]], [[Evolution]], [[Degeneracy]], [[Redundancy]], [[Kripkenstein]], [[Modulations]].

### _All language-models are wrong, but some are useful._

# ❝𝐋𝐚𝐧𝐠𝐮𝐚𝐠𝐞 𝐦𝐨𝐝𝐞𝐥𝐬❞

Language models are probabilistic string-organizers. They only _learn_ when they are updated. They are notoriously anti-generative, because much of their inner logic is structured around the reduction of entropy. What astonished the world when they first rolled out to the public was a simple chatbot interface. Indeed, I often say “please” and “thank you” to them.

>“If reasoning problems could be solved with nothing more than a single step of deductive inference, then an LLM’s ability to answer questions such as this might be sufficient. But non-trivial reasoning problems require multiple inference steps. LLMs can be effectively applied to multi-step reasoning, without further training, thanks to clever prompt engineering. In chain-of-thought prompting, for example, a prompt prefix is submitted to the model, before the user’s query, containing a few examples of multi-step reasoning, with all the intermediate steps explicitly spelled out (Nye et al., 2021; Wei et al., 2022). Including a prompt prefix in the chain-of-thought style encourages the model to generate follow-on sequences in the same style, which is to say comprising a series of explicit reasoning steps that lead to the final answer. As usual, the question really being posed to the model is of the form “Given the statistical distribution of words in the public corpus, what words are likely to follow the sequence S”, where in this case the sequence S is the chain-of-thought prompt prefix plus the user’s query. The sequences of tokens that are most likely to follow S will have a similar form to sequences found in the prompt prefix, which is to say they will include multiple steps of reasoning, so these are what the model generates.”
>
>Murray Shanahan, “Talking about large language models”, 2023, p. 8.

See: [[03 Semantic noise]] for some thoughts on 2023 LLMs.

![[kollektiv 1973.png|600]]
<small>Fig. 1. Album cover detail, Kollektiv, self-titled, 1973.</small>

%%

### Based on [[ErasmusX meeting]] presentation, March 2024: notes on LLMs and learning

If, like the library society debate mapping tool, LLMs gave options, particularly if they showed options for “really random” stuff, they might be seen in a different light in education. Right now, something like a thesaurus is definitely not a banned instrument, nor is it frowned upon; it is in fact very useful. What is someone doing when they use a thesaurus, or even when they decide whether to base their style-analysis essay on Chinese or Malaysian novels? They are choosing, from what came before, from what already exists, in the hope of reflecting towards something “new” or somehow insightful. At least in the current paradigm. So: how could language modelling help rather than hinder this? By working like a thesaurus of sorts, but with much more agility. For the sake of what? For someone else “enjoying” or being persuaded by what they read? Are we back to old-school rhetorics?

![[token strength visualization claude golden gate.webp]]

1. What, exactly, is learning? I would like to ask you to reflect for a few seconds.
If you tried to learn anything about learning just now, you might have realized that learning is, at the bare minimum, **change**. You start with some given knowledge, and you want to change that knowledge, for whatever reason. **NEW NOTE: Crucially, learning is confrontation w/ nothingness, with contradiction, with something NOT BEING THERE, and the ensuing search for something that will fill that gap (PP).**
2. In the context of machine learning, there are a lot of learning problems. Researchers who get sidelined and fired from major places like Google, etc. keep presenting these problems, but nobody seems to be learning. READ OUT SLIDE
3. I want to show you a specific study that I wrote about, one of the very first baby steps of contemporary language-modelling. This is a Winograd schema, a natural-language-processing test for machines which has been compared to the Turing test, that is: something we consider important for machine learning to accomplish. (A minimal scoring sketch follows after these notes.)
	1. The idea here is that humans understand the difference between the two sentences, while natural-language processors such as LLMs could find this challenging, because there is an ambiguous pronoun there. READ SLIDE
	2. However: if we understand learning to be the capacity to change, then it actually becomes very difficult to say that humans are capable of disambiguating these schemas! Why? Because times change, ideas change, and the context is always highly variable when it comes to communication. Demonstrators, for example at Occupy sessions here at Erasmus, are not violent. In that case, it was the university that was violent, as we know.
	3. Unfortunately, much of machine learning has not learned this, and this is why LLMs are often very, very uncreative and disappointing.
4. The authors of the groundbreaking paper which led to things like the language models we use now also fail to understand this, when they say that “humans do not require large supervised datasets to be competent, fluid, etc.”: this is absolutely wrong. Humans exist in learning networks of sociality, such as the home, the neighborhood, the institution and their histories, and these are MASSIVE datasets, which require a lot of supervision.
5. This is where it becomes interesting to think about bias, and I would like to already pass this question on to Joao (whose work I very much appreciate and strongly support), because it is impossible to be unbiased when every statement we make, every move inside or outside language, is always situated and has unavoidable preferences, conscious or unconscious, known or unknown. How do we make sure we work towards acknowledging this, rather than being fooled by the promise of an impossible objectivity or neutrality?
6. The joke remains the same: why do we want to increase scale and speed for bad decisions? This is the main question behind language-modeling today. Do we need scaling up, or do we actually need to rethink what, exactly, we are scaling?

______

Some problems w/ NLP history (fools, unfaithful wives, but there are countless examples).

Highlights from sustainability: impact bigger than the airline industry (van Wynsberghe, Crawford, other one), and [[Highlights for impact of genAI]].

Things to check in advance:
- Major problems w/ VC initiatives, and especially OpenAI.
- Take notes from email to Gijs and Liesbeth about AI in the classroom.
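To make the Winograd point concrete, here is a minimal sketch, under stated assumptions, of one common way such a schema is scored by a causal language model: resolve the ambiguous pronoun each way and keep whichever resolved sentence the model assigns the higher likelihood. The schema wording is the classic Levesque example; the model name (`gpt2`) and the likelihood-comparison method are illustrative assumptions, not the exact setup from the slide or the paper I discuss.

```python
# A minimal sketch (illustrative assumptions): resolving a Winograd schema
# by comparing sequence likelihoods under a causal language model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def log_likelihood(text: str) -> float:
    """Total log-probability the model assigns to the token sequence."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)  # loss = mean negative log-likelihood per predicted token
    return -out.loss.item() * (ids.shape[1] - 1)

# The classic schema: swapping one verb flips the "natural" referent of the pronoun.
referents = ["the city councilmen", "the demonstrators"]
for verb in ("feared", "advocated"):
    scores = {
        ref: log_likelihood(
            f"The city councilmen refused the demonstrators a permit because {ref} {verb} violence."
        )
        for ref in referents
    }
    print(verb, "->", max(scores, key=scores.get))
```

Whichever referent wins, it wins only because of the statistics of the training corpus; the score says nothing about who was actually violent at any particular protest, which is exactly the objection raised in the list above.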
https://www.erasmusx.io/project/chatgpt-in-higher-education

Check also the other projects: https://www.erasmusx.io/projects

Chatbots (I will call them this because this is what they are: the presentation of statistics as dialogue; whether humans are any different from this is an unanswerable question at this point, and the only hint I will give here, going further into it in the depths of the research, is that humans are statistical machines that halt because of sleep, wear and tear, etc., and that also run at a different speed than machines: this is why we build machines) are useful “sparring partners”. This is the conclusion, and the main thing I want to say today.

Capital-driven initiatives that speculate and gamble on the future are very volatile and thus dangerous (Durand ref). OpenAI is a problem. So is Eurudite. Moving fast and breaking things is what malfunction essentially is: [the motto “move fast and break things” is often associated with Facebook, as it was one of the company’s core values until 2014](https://www.snopes.com/fact-check/move-fast-break-things-facebook-motto/), and [it reflects the idea that innovation requires taking risks and experimenting, even if it means making mistakes or disrupting the status quo](https://hbr.org/2019/12/why-move-fast-and-break-things-doesnt-work-anymore). This shameless experimentalism, which is far from scientific, is the ruling ethos behind most globe-engulfing AI research today. It should not become the template for universities. Unfortunately, Erasmus, a highly capitalist enterprise, a place where student protests bring in the armed forces who drag student bodies out of buildings, and where you cannot even eat your own food in the campus’s central canteen, continues to try to convince us that this is not a place of study but a place of business: a place where you are first an obeying, paying customer, and even then there is no guarantee that you get what you pay for (examples).

Prompt-engineering (what I said 10 years ago), essentially understanding how a machine works so that we can ask it the right kinds of questions, is our only way around the current crisis in the pedagogical employment of LLMs (see the sketch below). Lady Lovelace already said as much, quoted by Turing: a machine can only perform what we know how to ask it to perform. The rest is all gambling. As Miriyam Aouragh explains: the paradigms of e.g. white privilege need to be countered w…

One of the knee-jerk fears of generative AI in the classroom is the classic new-technology fear: “who will be doing the thinking?!” This is a fair criticism: the automation of everything would mean we are done existing and life has become irrelevant. However, in the realm of education, it is incredibly useful to be able to interact with a dynamic, intellectual sparring partner, so long as this is done collectively, outside the paradigm of testing individual student capability. Precisely because no actor acts singularly, and because we exist in an interdependent, solidary condition as humans, we should not fear becoming “overly reliant” on somebody or something else’s efforts. We should already worry that we exist in a situation in which dramatically underpaid people support our daily activities here on campus, cleaning and keeping everything running while we lounge and distractedly consider the future of humanity. It is rather sad.
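As a concrete illustration of the prompt-engineering point (and of Shanahan’s description of chain-of-thought prompting quoted above), here is a minimal sketch under stated assumptions: the worked examples, the model name (`gpt2`) and the decoding settings are placeholders I made up, not a recommended setup. The only thing it shows is that the “question” actually posed to the model is “what is likely to follow the sequence S”, where S is the example-filled prefix plus the user’s query.

```python
# A sketch of chain-of-thought-style prompting: the model is only ever asked
# "what is likely to follow the sequence S", where S = worked examples + query.
# The examples, model name and decoding settings are illustrative assumptions.
from transformers import pipeline

COT_PREFIX = """\
Q: A library holds 120 books and lends out 45. How many remain?
A: It starts with 120 books. 45 are lent out, so 120 - 45 = 75. The answer is 75.

Q: A seminar has 3 groups of 8 students. How many students are there in total?
A: There are 3 groups. Each group has 8 students, so 3 * 8 = 24. The answer is 24.

"""

def build_prompt(user_query: str) -> str:
    # Sequence S: a prefix of explicit, step-by-step examples plus the new query.
    return COT_PREFIX + f"Q: {user_query}\nA:"

generator = pipeline("text-generation", model="gpt2")
prompt = build_prompt("A course has 4 classes of 25 students. How many students are there in total?")
print(generator(prompt, max_new_tokens=60, do_sample=False)[0]["generated_text"])
```

Whether a small model actually continues in the same step-by-step style is another matter; the point is only that the prefix shapes what counts as “likely to follow”, which is all that prompt-engineering can ever do.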
Students can be critical, and should be critical, and this criticality is what needs to be taught in order to develop the capacity to discern interesting from not-so-interesting AI-generated content.

**Take highlights from impact of genAI, education section** and the note on the fact-checking recommendation and comment, in order to end with:

- Would you recommend that the university invest in a calculator that only _sometimes_ gives you the right answer?
- Why, then, would we recommend a language model that not only contains plenty of inaccurate information, but is moreover a minefield of racist, sexist, and other unavoidably baked-in biases?

It is this question that I would like to discuss with you all.

- Paviljoen: you can’t eat your own food there.
- People walking around leaving trash behind, letting the cleaners do the work.
- Not leaving clean toilets behind, etc.: what is wrong with you?

https://www.wired.com/story/women-in-tech-openai-board/

https://www.erasmusmagazine.nl/en/2023/10/11/erasmus-has-its-own-chatgpt/ --> super sympathetic to Joao’s work, but we need a better understanding of BIAS:

>“Another advantage is that answers should be less biased. ‘Academic research is the only input. The ELM is also less America-centric. For example, ChatGPT will sometimes give American answers to Dutch legal questions.’ However, Gonçalves can’t guarantee that the ELM will never use ‘racist’ language, as happened in a presentation of a Google language model. ‘EUR researchers sometimes conduct research involving old documents, which can be racist, so that language could be reproduced by the ELM.’ At the same time, the ELM is less heavily censored than ChatGPT. Gonçalves adds, ‘With ChatGPT, for example, texts containing hate speech are not permitted in answers. We want researchers to be able to research everything, also hate speech. So we look for a balance between that academic freedom and preventing the spread of hatred.’”

LLMs are way overhyped (Emily Bender, a.k.a. _Stochastic Parrots_ PI). ZF: synthetic text-production is very useful when you need a string of characters that resembles something that came before. But science is all about that which is _different_ from what came before. And taking a bag of words, shaking it up, and adding a chatbot interface that gives a semblance of coherence is such a scientific _lie_. Let’s not be dragged along by the hype, and let’s be a little more critical: following Bender, let’s stop using the word AI, which is only hype. Talk about probabilistic media synthesis, using x, y or z axioms. Explain what you are talking about; stop contributing to the bad, capitalist hype.

One size does not fit all (Karen Hao): we cannot build a system that is based on the statistics of the past. There is no such thing as “good” prompt-engineering when the probabilities being queried are based on very problematic histories (racism, sexism, etc.). There is no such thing as “unbiased”. (Ahmed) (Gebru)

PhD, TU Eindhoven: Can and should philosophers employ large language models or other artificial intelligence tools in the course of doing ethics? Eindhoven University of Technology (TU/e), in collaboration with the inter-university research consortium “Ethics of Socially Disruptive Technologies (ESDiT)”, seeks to hire a PhD student for a four-year project on whether (and, if so, how) philosophers can use AI technologies to improve ethics methodology.
**Job Description**

In the last few decades, philosophers have speculated about how AI systems could support or enhance individual moral reasoning, deliberation, and other moral functioning. Following recent developments in AI research, philosophers may well ask what role AI can play within ethics as a discipline. Can AI technologies be used to improve the methods of ethics, including the methods that ethicists of technology use for responding to socially disruptive technologies? Socially disruptive technologies pose new types of ethical challenges. AI could conceivably help us respond to those challenges by facilitating better ethical theorizing and decision-making. For instance, philosophers might attempt to incorporate AI technologies into processes of conceptual analysis, conceptual engineering, reflective equilibrium, development of new ethical principles and theories, generation of possible arguments and objections, generation of counterexamples to theories, preliminary identification of morally significant risks and benefits in scenarios, and so on. One way to approach such an ambition would be to use finetuning and special prompting to create large language model-based systems that perform or assist with some of these tasks.

Philosophers might also try using AI systems to support exploratory anticipation and prospection activities, developing AI systems that help generate technomoral scenarios with a range of salient features, for human discussion and reflection. To facilitate group deliberation, philosophers might incorporate AI into systems for more frequently and efficiently eliciting, analyzing, and aggregating beliefs and preferences within groups, identifying and characterizing points of overlap and disagreement, and helping humans communicate with other humans about what their values are, what norms they endorse, their reasons for their moral views, etc. Insofar as decisions about technology design, implementation, maintenance, modification, etc., should be informed by the values and preferences of stakeholders, researchers might use AI systems to more effectively request and synthesize inputs from ordinary people about how various designs are falling short in ethically significant dimensions. Alternatively, philosophers may want to develop AI agents for engaging in dialogue or negotiating on behalf of individuals or interest groups.

The project is not committed from the outset to the idea that AI systems _should_ be incorporated into ethics methodology any time soon, or even ever. There are many potential objections to incorporating AI into ethics: for example, that human individuals bear special duties to perform certain aspects of ethical reasoning or discernment for themselves, that the use of AI within certain ethics tasks would reduce the value of those tasks, or that the opaqueness of the AI systems involved means that humans cannot rely on AI systems for certain purposes.

In this PhD project, the student will characterize some ways in which AI might conceivably be used to improve ethics methodology, and they will develop and defend a position on whether humans should or should not attempt to incorporate AI into ethical methodology in those ways.