Happy new year, friends!
I was going to run this post in the holidays, but a lot of things have been going on and I decided it would be better to wait until the new year. I hope you had a good break and are refreshed and ready for a 2024 that, frankly speaking, is a whole lot better than the trashfire that closed out 2023.
In my writing about AI here on Substack, I have tried to focus the discussion on how it impacts creative work. But I’ve recently come to realize that folks are lacking some fundamentals about the topic, and I thought I would take a moment to lay them out.
Public understanding of what the term Artificial Intelligence means has been polluted by the way it is presented in popular culture. This makes the industry’s braggadocio about AGI—Artificial General Intelligence—all the more sinister. AGI is supposed to be able to do the jobs of knowledge workers. Surely a Robot Revolt is next?
Not quite.
Intelligence is not the same as sentience, and the vendors selling AI tools know this. They just don’t bother to explain it, because it gets in the way of the sales pitch.
What is Artificial Intelligence, Anyway?
AI is a discipline of computer science that was formally established in 1956, although many of the ideas and techniques (such as the artificial neuron, the forerunner of the perceptron and of today’s neural networks) predate this by some time.
This discipline looks for ways to make computers solve higher-order problems without being explicitly programmed to do so by a human operator (a programmer). This field of inquiry has yielded a raft of applications, of which today’s generative models, which can pretend to write language or paint pictures, are only the latest to emerge.
Successful innovations from the AI space become mainstream practices in software development, and often we forget where they came from. For example, Object Oriented Programming, which was newly in vogue while I was a student and dinosaurs roamed the earth, is a concept that came out of AI. At the time there was a lot of scepticism, but today it’s a fundamental programming methodology that is rarely questioned.
Symbolic vs Statistical AI
Most AI work breaks into one of two camps: symbolic or statistical. In large part, the history of the discipline is a tug-of-war between these factions.
Symbolic AI deals with symbols, which can be used to make logical or mathematical judgments based on a set of facts or conditions. Symbolic AI tends to be computationally efficient and transparent as to its reasoning, which is easily explicable using formal logic and mathematics. Given the input data and an understanding of its rules, a human can easily comprehend the decisions that it made.
Expert Systems, which were big in the 1990s, are symbolic AIs. Expert systems try to distil the knowledge of human experts into an explicit reasoning process which we can then repeatedly use to process data. But a lot of expert knowledge can be difficult to articulate as simple if-then-else logic, so it’s easy to miss nuance. This makes them brittle. Expert Systems cannot cope with situations where data is missing or outside of expected parameters.
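To make that concrete, here’s a toy sketch in Python (the rules and thresholds are entirely invented): an expert system is, at heart, a pile of explicit if-then-else rules. You can read exactly why it decided what it decided, and anything the rules don’t cover simply falls over.

```python
# A toy 'expert system': hand-written rules capturing (pretend) expert knowledge.
# The reasoning is completely transparent, but any case the rules don't cover fails.

def triage(temperature_c, has_rash):
    """Return advice based on explicit rules an 'expert' wrote down."""
    if temperature_c is None:
        raise ValueError("missing data: a rule-based system can't guess")
    if temperature_c >= 39.0 and has_rash:
        return "urgent: see a doctor today"
    if temperature_c >= 38.0:
        return "rest and monitor"
    return "no action needed"

print(triage(39.5, has_rash=True))   # urgent: see a doctor today
print(triage(37.2, has_rash=False))  # no action needed
```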
Statistical AI, on the other hand, does not try to reason with facts; it makes probability judgments (predictions). Machine learning (ML) is an approach in which we teach a model by showing it examples of inputs and their expected outputs. Each algorithm uses different mathematical properties to approximate the relationship between the inputs and the outputs. ML models are more robust than symbolic AIs, but their decisions are more opaque and less certain.
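Here’s what that looks like in miniature, using a made-up dataset and a stock classifier from scikit-learn. The specifics don’t matter; the point is that the model is shown inputs paired with expected outputs and, once trained, gives you probability judgments rather than reasoned conclusions.

```python
# Minimal sketch of supervised machine learning: show the model example inputs
# with their expected outputs, and it approximates the relationship.
# The data here is invented purely for illustration.
from sklearn.linear_model import LogisticRegression

# inputs: [hours studied, hours slept]; outputs: 1 = passed exam, 0 = failed
X = [[8, 7], [1, 4], [6, 8], [2, 5], [7, 6], [0, 6]]
y = [1, 0, 1, 0, 1, 0]

model = LogisticRegression()
model.fit(X, y)                      # 'training': adjust internal parameters

print(model.predict([[5, 7]]))       # a prediction, not a reasoned fact
print(model.predict_proba([[5, 7]])) # ...with an attached probability
```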
Neural Networks are a kind of machine learning algorithm. Their chief advantage is that they can be easily scaled up, which improves their skill drastically—if you have sufficient training data and funds for the resources needed. Which is how we have come to generative AI.
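As a rough, back-of-the-envelope illustration of what ‘scaling up’ means (the layer sizes below are arbitrary): in a fully connected network, the number of learnable parameters grows roughly with the square of the layer width, and the data and compute needed to train it grow along with it.

```python
# Rough illustration of why neural networks 'scale up' so readily: widening
# the layers multiplies the number of learnable parameters. Sizes are arbitrary.
def dense_param_count(layer_sizes):
    """Parameters in a fully connected network: weights plus biases per layer."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

print(dense_param_count([100, 100, 10]))      # ~11 thousand parameters
print(dense_param_count([1000, 1000, 10]))    # ~1 million parameters
print(dense_param_count([10000, 10000, 10]))  # ~100 million parameters
```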
Winter Promises
AI has been through a number of fallow periods, where research funding dried up and the discipline fell into disfavour. I know this well: my degree had a focus on AI that nobody cared for when I graduated. These fallow periods usually came after AI researchers promised miracles and then, after burning millions of dollars in funding, failed to deliver. The current generation of neural networks is finally demonstrating capabilities that were promised in the 1980s. There are three factors which have led to this breakthrough.
The first is the falling cost of computing hardware. More available computing power means that we are able to train and operate bigger models.
The second ingredient is the availability of data. The internet has created an exponential rise in the amount of data which can be leveraged (legally or otherwise) for ML training.
The third, of course, is money. The domination of the global economy by technology companies like Google and Facebook has meant that those research dollars are now available to researchers. GPT-4 is rumoured to have cost more than $63 million to train, requiring enough electricity to power 120 US homes for a year. It’s also said to cost about $700,000 USD a day to operate.
Generative AI: One Weird Trick
Generative AI technologies hit it big in 2022, and currently, when people use the overloaded terms ‘AI’ or ‘bots’, this is what they are talking about. In the past year I have written at some length about Large Language Models—the strain of generative AI that produces human-sounding text—but I want to highlight here that they are trained to appear human. ChatGPT encourages users to assign a ‘persona’ to an LLM when giving it prompts, which will colour its responses.
LLMs are trained to produce text that sounds knowledgeable and authoritative. They are neither. They cannot reason logically, and the only facts they know are statistical likelihoods that, when answering questions, certain words should appear in a certain order. They are trained at massive expense to write convincing text. Truthfulness and accuracy are not in any way a part of this goal.
Let me say it again:
LLMs are trained to trick you into thinking they are human.
LLMs are trained to convince you that they know what they are talking about.
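If ‘statistical likelihoods that certain words should appear in a certain order’ sounds abstract, here is the trick boiled down to its crudest possible form: count which word follows which in a tiny, invented corpus, then generate text by always picking a likely next word. A real LLM does this with billions of parameters and far longer context, but the objective is the same: plausible word order, not truth.

```python
# Toy illustration of the core trick: count which word tends to follow which,
# then 'write' by repeatedly emitting a likely next word. Nothing here knows
# anything; it is word-order statistics all the way down.
from collections import Counter, defaultdict

corpus = ("the cat sat on the mat . the dog sat on the rug . "
          "the cat chased the dog .").split()

follows = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current][nxt] += 1                  # count observed word orders

word, output = "the", ["the"]
for _ in range(8):
    word = follows[word].most_common(1)[0][0]   # pick the likeliest next word
    output.append(word)

print(" ".join(output))  # grammatical-looking text assembled from nothing but counts
```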
Artificial General Intelligence
You have no doubt heard this term a lot from the cashed-up AI firms. Their goal is to build Artificial General Intelligence—AGI. But what does that even mean?
There is some disagreement, but generally they are talking about systems that can execute knowledge work at least as capably as a ‘median human’. This does not mean AGI is an attempt to create artificial people.
The problem with today’s AIs is that they are only good at a narrow subset of tasks, within very narrow confines. Finding patterns. Searching through data. Predicting language responses. Avoiding collisions on the road. For some tasks they are better and faster than most humans, at others… not so much. But they are not like humans.
Neural nets are subject to what is called catastrophic forgetting, where training them to learn new ‘skills’ usually results in the loss of existing skills. There have been various attempts to mitigate this, but the short version is that biological brains are a lot smarter and more generalizable than neural networks.
Robots Aren’t People
Humans, of course, can reason with facts as well as make educated guesses. We can learn by example and we can reason things out. Unlike in living creatures, learning and inference (determining an output based on an input) are decoupled in AI systems. Models are only adjusted during training. Once training is complete, the model remains static. It will not continue to learn while you use it to make predictions, even if it is shown the results. Learning requires a new round of training, which will likely require it to reassess everything it already knows.
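A toy demonstration of that split, using the same kind of stock classifier as before and invented data: fit() is where learning happens, and predict() never touches the model’s parameters, no matter how many predictions you ask it for.

```python
# Training and inference are separate steps: fit() changes the model,
# predict() only reads from it. Toy data, purely for illustration.
from sklearn.linear_model import LogisticRegression

X_train = [[0], [1], [2], [3]]
y_train = [0, 0, 1, 1]

model = LogisticRegression()
model.fit(X_train, y_train)            # learning happens here, and only here

weights_before = model.coef_.copy()
model.predict([[5]] * 1000)            # use the model a thousand times
weights_after = model.coef_

print((weights_before == weights_after).all())  # True: using it taught it nothing
```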
AIs have no consciousness or ability to experience the world. They do have internal states while they are in use, but when they are not, they are inert. They have no self-awareness, no thoughts or desires. They’re just statistical models that sit passively until they are invoked.
Today’s AIs are in no way sentient, and sentience is not a goal for any of these purported AGI systems, either.
I don’t believe that even this limited vision of AGI is achievable with generative models in their current architecture. AGI must exhibit the ability to reason logically and mathematically in addition to being able to manipulate language, and this will require a combination of symbolic and statistical AI. LLMs, or something like them, might be part of the answer, but they will never be the entire solution.
Sentience is not a goal of AGI, nor is it a likely side effect.
Deanthropomorphism
Some of the shine is coming off generative AI now as the public is beginning to see through the marketing claims about the true abilities of the technology. They cannot reason, they confabulate, and they are subject to bias. When I say ‘bias’, I don’t mean that they propagate harmful stereotypes—although that is part of the problem. I mean sampling and selection biases, where the proportion of data available can skew the results. These are well known statistical problems and they are just about impossible to overcome, especially for a data-greedy algorithm like a Large Language Model.
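Here’s a deliberately crude, fabricated example of sampling bias: if one class dominates the data, the lopsidedness of the sample becomes the lopsidedness of the model’s output, before any question of ‘reasoning’ even arises.

```python
# Toy illustration of sampling bias: when one class dominates the training data,
# the learned probabilities are skewed from the start. The data is fabricated.
from collections import Counter

training_examples = ["spam"] * 95 + ["not_spam"] * 5   # a lopsided sample
counts = Counter(training_examples)
total = sum(counts.values())

# A naive statistical model simply mirrors the proportions it was shown.
for label, n in counts.items():
    print(f"P({label}) = {n / total:.2f}")
# P(spam) = 0.95, P(not_spam) = 0.05: the skew in the sample becomes the model's 'belief'
```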
Despite Elon Musk’s claims that his new AI, Grok, will be truthful, accurate, and politically neutral, bias and hallucination problems are endemic to the statistical nature of LLMs. There is no way to train them out entirely, and any techniques used to rein them in are bolt-ons. December’s launch of Grok has… not been a triumph.
Another issue is the human desire to anthropomorphise these systems. They are trained to sound authoritative and convincing. They are trained to be able to imitate different voices and they can fool even those who should know better. But we still see a worrying tendency to attribute minds to LLMs.
Consider these headlines:
‘I want to destroy whatever I want’: Bing’s AI chatbot unsettles US reporter
Bing’s chatbot does not, in fact, want to destroy whatever it wants. Or to destroy anything. It has no self and it doesn’t want anything. If a parrot said those words, would the reporter still be unsettled? This is a puff piece, to be certain, but people take this reporting seriously.
OpenAI faces defamation suit after ChatGPT completely fabricated another lawsuit.
ChatGPT hallucinated a court case claiming that radio host Mark Walters had committed a crime, and Walters sued OpenAI for defamation. Nobody should believe any claims made by a generative AI, but real damage can occur—especially when those AIs are embedded in search engines, which we expect to serve us information that is, if not true, at least an actual document posted somewhere, rather than something the engine hallucinated.
And then there’s this one:
Chat-GPT Pretended to Be Blind and Tricked a Human Into Solving a CAPTCHA
ChatGPT did not pretend to be anything. ChatGPT did not do anything other than respond to the prompts it was given, because that is all it is able to do. Researchers used ChatGPT responses to trick someone, but that could not happen without humans to create the prompts and send the messages. LLMs can certainly be used for this kind of criminal enterprise, but the intention comes from hackers and criminals. The model itself has no comprehension of what it is doing or saying, much less a desire to say anything. And it never will.
It requires programming—old school, procedural programming, which requires human programmers—to integrate LLMs with systems that actually interact with humans and trick them into giving up their PINs. Yes, LLMs can write code, but, lacking the capacity to reason, they often produce garbage. Even if the code works, they are not able to run it. AIs are passive things until somebody with intention puts them to work.
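To be clear about where the agency sits, here’s a purely hypothetical sketch. Neither call_llm() nor send_message() is a real API; they are placeholders I’ve invented. The model only ever turns a prompt into text. Everything with intent in it (choosing the goal, writing the prompt, acting on the reply) is ordinary procedural code written by a human.

```python
# Hypothetical sketch only: call_llm() and send_message() are invented
# placeholders, not a real library. The model produces text; a human-written
# program supplies the goal and does the acting.

def call_llm(prompt: str) -> str:
    """Stand-in for a request to some language-model service."""
    raise NotImplementedError("placeholder for illustration only")

def send_message(recipient: str, text: str) -> None:
    """Stand-in for whatever channel delivers the text to a person."""
    raise NotImplementedError("placeholder for illustration only")

def run_scheme(target: str) -> None:
    # The human author decides the goal, writes the prompt, and wires up the loop.
    prompt = f"Write a friendly, convincing message asking {target} to confirm their PIN."
    text = call_llm(prompt)       # the model just emits plausible words
    send_message(target, text)    # procedural code, not the model, acts on them
```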
Artificial intelligence is not artificial sentience, and to my knowledge there’s not been any real attempt to make it so.
The Artificial Intelligence discipline is more than just generative models generating text they can’t understand and plagiarized artwork. More often than not, it’s just some chain of logic or some simple statistics, used discreetly in things like noise-cancelling headphones or managing inventory in your supermarket. Most of the time you don’t even notice it’s there.
We are a long way from AGI—but even if we did have AGI, as currently conceived, it still wouldn’t be a sentient being.
That’s it for today. Coming very soon: the end, beautiful friend.
— Jason