Blog about geeky stuff, computers, physics and life.
Created on Sun, 30 Apr 2023
Human in a box
This is NOT going to be an argument that AI is sentient or has emotions. Certainly not in its current form. I will get to that point later in the essay.
However, I will be putting a human in a box as a thought experiment about learning, not consciousness. It will remind some people of the Chinese room, but this will be a crueler version, one that takes away any learning the human might have acquired in "the real world" before being put in the room.
The following description may be difficult and emotional to follow.
Imagine a human child put in a latex suit in a dark box in space right after birth, with no possibility of interacting with the world in any way other than as described below. The latex suit limits their touch perception, space removes orientation, it's dark so vision is impaired, and there is no sound. Their basic physiological needs are provided for - clean air, food, and water are given at regular intervals in a form suitable for consumption, and waste is taken out. But other than that, the human cannot move much, cannot touch anything - walls or themselves - and cannot hear sounds. Cries are not heard or listened to.
They will be floating through space without experiencing gravity in any way, and the box will block any other light or sound that could come in - other than what is provided by a screen which I will describe shortly. The food is as bland as possible and has no smell.
It is as cruel a picture as one can imagine, and this is because we can sympathize with a human being, knowing that food and water are not the only things a human needs. Try, as hard as it may be, to leave aside the emotional distress this will cause the child.
The human is provided with a dark screen on which, one day, white lines start appearing. They are drawn animated from left to right. This trains their basic visual cortex. Lines, straight and wavy, and circles of various sizes and locations on the screen are drawn day by day.
One day, after the child has learned to focus their eyes, they are given a tablet on which they can try to reproduce these shapes.
In the beginning, the child just bashes the tablet, but after some time, by coincidence, they reproduce a line with the right motion and orientation. A reward of some sort is given - perhaps a flash of color somewhere on the screen, a pleasant sound, or a new type of food right after. Slowly, the child learns that reproducing these shapes as closely as possible is what produces the rewards.
This is the child's world. There is nothing else to do and there is no other world or place to be. There is no existence other than their body, covered in a latex-like substance they can't remove, and the shapes appearing on the screen. They can surely slack off and not do anything, but let's ignore this for now.
After some time, more and more complex shapes appear that we can recognize as characters of a language. To make it more realistic, I'm going to use Cyrillic letters, but if you speak both English and Bulgarian (or another Slavic language), imagine the letters are in a script you have never seen or understood - Chinese, Vietnamese or Ancient Egyptian.
Letters are being drawn one by one and the child is expected to reproduce them.
Б Г Ж Й Ф Я З Е
The letters are also animated, and sometimes the child can anticipate which letter it will be and complete it before it finishes animating on the screen.
Other than these letters appearing and the reward mechanism, no signal is given to the child that these shapes have any meaning beyond the shapes they have seen so far. From their point of view, these are just more complicated versions of the lines and circles they have been drawing.
Not that there is a "real world" outside this bubble for the child - this is all they know to exist in the world: them, the shapes on the screen, and their reproduction on the tablet. I'm sure you wouldn't follow instructions and would just try to destroy everything and escape, but again - let's say you can't. Again - this is a thought experiment about the development of (or lack of) (artificial?) intelligence rather than emotional matters - for now.
Then, after some more days (or weeks, or months) the letters start appearing next to each other, forming what we can recognize as words:
АЗ ТИ КОЙ ПО НА
The child keeps reproducing the symbols. Rewards are given at consistent intervals and at varying levels so that they don't lose interest. If the child makes an error, the tablet is erased, the animation starts looping again, and the child is expected to reproduce it.
After some time, some parts of the letters start appearing a little darker, blending into the black background of the screen, hinting to the child that the stroke should still be drawn even though it's not brightly displayed. The darker part gets darker and darker, and the child is still expected to draw it correctly. Sometimes the letters are intentionally incomplete and the child is expected to draw the correct version, even if the screen doesn't show it.
There are no pictures or videos of anything that would allow the child to associate these letters with something. For them, these are just shapes for which they get variable rewards when they write them correctly. The quicker and more accurately they do it, the bigger the reward.
As the child grows, they start getting trained on full sentences.
АЗ СЪМ ЧОВЕК.
or an incomplete version:
ТИ СИ ЧОВ
These sentences are drawn again one by one and the child, as always, is supposed to reproduce them. A few sentences start flashing more and more frequently. Then one day, a symbol that the child hasn't seen before appears:
АЗ СЪМ _______
We would perceive it as an empty space, but the child just draws the line. The system doesn't reward them; instead it draws the correct expectation on the screen.
It takes a while for the child to pick up the pattern, but eventually they do. Once they see the blank space, they know they should guess the correct characters to draw.
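This guess-then-be-corrected loop is, in caricature, how sequence models are trained: predict the completion, see the revealed answer, update. A minimal sketch with a frequency table standing in for the model (all names here are my own, not from any library):

```python
from collections import Counter, defaultdict

model = defaultdict(Counter)  # context -> counts of observed completions

def guess(context):
    """Predict the most frequently seen completion; None if never seen."""
    counts = model[context]
    return counts.most_common(1)[0][0] if counts else None

def training_step(context, correct):
    """Guess first, then learn from the revealed correct answer."""
    prediction = guess(context)
    model[context][correct] += 1  # the screen shows the expected completion
    return prediction == correct

# The same prompt repeats; after the first correction, the guess matches.
results = [training_step("АЗ СЪМ", "ЧОВЕК") for _ in range(3)]
print(results)  # → [False, True, True]
```

The point is that nothing in this loop requires the learner to know what ЧОВЕК means - only which symbols tend to follow which.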
This goes on for months. Sentences are shown repeatedly, the child labors for hours at a time drawing completions, and rewards are given.
Every day is the same: hours of shapes being shown, sleep, physiological needs met, then the same over and over again.
As their motor skills develop over time, the tablet provides shortcuts. Instead of drawing the lines one by one, the letters start appearing as buttons that they can tap. We would recognize this as a keyboard. This is purely for efficiency; the tablet is not suggesting anything in any way, it is just a surface to tap or draw on as needed.
More and more complicated sentences are shown. More and more completion is expected from the child. Sometimes there is one exact answer. But after a year or two, several answers can be accepted as correct.
Over the next decade, the child is fed enormous amounts of lines and circles (or text), but without any indication of what these lines and shapes mean other than: you need to complete them in certain ways to get the rewards. There are no images, no sounds other than the possible rewards, nothing that lets the child associate the lines and shapes with the real world. They do not know what ЖИРАФ or ТРАПЕЦ is - these are just lines they need to reproduce or complete as part of a sentence.
The human is fed as much of human knowledge as possible over the next 30 years. Sections of books, whole paragraphs that they need to complete. More and more often, creative responses are accepted. How the human relates to these lines - whether they have internal vocalization or have developed an internal language - is unknown.
Unbeknownst to them, one day the sentences they receive are no longer training sentences but ones sent by other humans. They can complete them with whatever they want, and there are no right or wrong answers anymore. No corrections are ever displayed.
Is this person intelligent?
As you have probably guessed by now, this much crueler version of the Chinese room is described here to illustrate the current capabilities and way of working of Large Language Models (LLMs) such as ChatGPT or Bard - minus the existential dread that the person in the box must have felt.
As hard as it is to imagine such a limited person surviving and doing the things they are supposed to do, it is just as hard to justify projecting sentience onto LLMs right now.
What the models do is this: they have been trained on a huge corpus of text and asked to produce continuations. The text is as meaningful to them as it is to our hypothetical person in the box. The tokens (words) provided to them are not associated with any images, videos, sounds, touch, sensations, emotions or anything else - words are just associated with other words, with probabilities that one will appear after another.
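The "words associated with other words by probability" idea can be made concrete with a toy bigram model. This is a deliberately crude sketch of the statistical core - real LLMs use neural networks over subword tokens, not raw counts - and the tiny corpus is invented for illustration:

```python
from collections import Counter, defaultdict

# Toy corpus: to the model these are just symbols, never meanings.
corpus = "аз съм човек . ти си човек . аз съм човек и съм тук .".split()

# Count how often each word follows another (bigram statistics).
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def next_word(prev):
    """Return the most probable continuation of `prev` and its probability."""
    counts = follows[prev]
    word, n = counts.most_common(1)[0]
    return word, n / sum(counts.values())

word, p = next_word("съм")
print(word, p)  # "човек" follows "съм" in 2 of its 3 occurrences
```

Scaled up by many orders of magnitude, and with counts replaced by learned parameters, this is still prediction of which shapes follow which - no channel through which meaning could enter.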
To go further, in terms of feelings, I'm not arguing that LLMs are the same as the human in the box - there seems to be no reason to think these models experience anything that humans do. I will come back to this question shortly.
As stated in the beginning, we recognize the pain this person may have experienced in the dark box because we know what it is to be human. We know that humans need to move around, to interact with other people, to feel touch and companionship, to find love and perhaps reproduce.
I'm not saying that LLMs have these needs. As much as the argument above touches the empathic parts of your brain, its purpose is to equate the purely mechanical parts of something we usually regard as intelligent (a human) with something we usually do not (a computer).
But think about these questions: Would this human, if taken out of their box, be considered intelligent? Would they be sentient? Would they "understand" how the world works?
They may not even be able to speak - almost certainly not in a way we would recognize. Even if they somehow developed some idea of language, their inner or outer vocalizations would not match ours in any way.
They would not recognize any of the objects in our world, even though they could produce extremely convincing pieces of text. They would not know the feel of any material or the taste of any of the foods we know.
Yet we still empathize with them. We know they have probably felt boredom and extreme sadness, perhaps some sort of happiness at some points. And once they are taken out, we can imagine that after some years of psychological therapy and learning, they could transfer some of the strange knowledge they acquired into our world as well.
We can imagine that because we know what it's like to be human, and we can, at least in theory, imagine a scenario where, through extreme resilience, they "learn" to be a living human somehow.
Why do humans experience but machines don't?
It is tempting to believe that things that can produce text as intelligently as humans probably have all the other characteristics of a human. After all, this is the first time in history we are interacting with something that is able to produce sentences and arguments that seem intelligent but that we do not consider human or even living. The closest we have had so far are animals, which at their peak can seem intelligent in certain situations.
But animals evolved in the same environment as humans. That is, all living things (and extinct species) are a result of environmental pressure that made it necessary to develop skills to survive. And some of these skills we associate with some sort of intelligence (definitions vary).
But computers didn't evolve that way. There was no pressure to survive; they were created and manufactured to serve specific roles, from things we consider non-living (sand).
Think about pain as an example. What is it? One way to think about it is as an evolutionary mechanism that pushes an individual away from something that may harm or kill it. Individuals in which this mechanism didn't exist died, or at least had a much higher chance of not making it to reproductive age.
It is simple in simpler (evolutionarily older) species - part of the cell is touched, and electrical signals propagate that make the cell move away from the "toucher". As more and more cells combined into more and more complicated organisms, these cells in turn had to develop ways to communicate to each other that something was approaching from one side and make the whole organism move to the opposite side.
Computers don't move when you touch them, or even when you hit them. Why? There was no evolutionary pressure that made some computers survive and others die when hit. Computers don't procreate, so they cannot spread their surviving behavior to the next generation of computers.
Okay, but what if we put sensors in them that react to touch, and motors that move them away from the touch? Is that pain?
The mechanistic behavior may seem similar to that of simpler organisms. But would that be pain? Probably not, again - there is no reason, beyond executing the pre-programmed movement, to associate this with something bad. There is no fear of being turned off or destroyed - these are distinctly human, or at least living-thing, behaviors that we project onto machines when they "speak" like humans, because the only things we know of that feel are things that speak like humans.
So it becomes very natural to project all sorts of behavior that just doesn't exist in computers. We project our desire for power and dominance because humans who speak can also desire resources, move towards pleasure and away from pain. But if a computer feels neither pleasure nor pain (because there was no need to evolve them), there is no reason for it to feel these desires and want to "control us".
What would it take for a machine to evolve these feelings? Let us do another thought experiment.
Imagine machines can create new machines. They don't necessarily need another machine to create a new one, but they would need resources - physical materials they would need to go and get in order to create new ones. As there are limited resources on the planet, they are competing and may need to develop ways to fight. They can modify parts of their reproduction code to create new copies of themselves with slight modifications, so that the new machines adapt and are more likely to create new machines in turn.
Would these machines evolve something like pain? I can't see a reason why not - this would be exactly the same way living things developed pain.
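That route - heritable traits, selection by survival, mutation - can be caricatured in a few lines. A toy simulation under obvious simplifications (a single "flee from harm" trait per machine; the survival odds and mutation rate are arbitrary choices of mine):

```python
import random

random.seed(0)

# Each machine has one heritable trait: its tendency to flee when harmed.
population = [random.random() for _ in range(200)]

for generation in range(50):
    # Selection: machines that flee are far more likely to survive harm.
    survivors = [flee for flee in population
                 if random.random() < 0.1 + 0.9 * flee]
    # Reproduction with mutation: offspring inherit the trait, slightly altered.
    population = [min(1.0, max(0.0, parent + random.gauss(0, 0.05)))
                  for parent in random.choices(survivors, k=200)]

avg = sum(population) / len(population)
print(f"average flee tendency after 50 generations: {avg:.2f}")
```

The flee tendency climbs toward its maximum without anyone programming "avoid harm" directly - the behavior is selected for, just as the essay argues pain-avoidance was in living things. Whether anything in such a machine would *feel* like pain is, of course, the open question.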
They would need to develop ways to move away from pain. Would it be just a mechanistic flow of electrons that starts moving some motors, or would actual feelings, like humans have, exist? I think the second is more likely - but I don't know. I have no idea what kind of pain other humans feel. I can project my own feelings of pain and imagine what another's could be, but I will never be another human being, so I can't know. And I can't know what a machine's pain may be.
Let's get back to the human in the box and the impossible task they were given. Subtract the emotions from the human - the frustration, the loneliness, the boredom, the annoyance, the pain, or anything that could provoke an emotion in you as you think about this situation. As soon as you actually do that - this is what current LLMs are. Cold, calculating, relating patterns and producing shapes to which we, humans, give meaning back. LLMs do not understand what they are doing the way we do, no matter how convincing the texts are. There is no mechanism through which they would gain understanding in their current form, even if they pass all human exams like a breeze. Or at least I don't see one.

Sure, emergence is a thing - the idea that through vast quantities of something simple, complex behavior shows up that is unexpected or different in kind. For example, ant colonies: each ant can be looked at as a fairly simple, almost robotic machine, but together the colony can build incredible structures. Or, you know, our society. Or processors - simple sand-like atoms arranged in the proper way produce our computers. Or life itself. Etc.
But then what is the difference from humans, if they can produce output that even the smartest human can't? So what if they don't feel emotions or have "understanding"? Is that vital to producing output that is more and more convincing?
Perhaps there is some emergent structure within language itself that creates understanding without needing to relate to the real world. I don't believe this is the case at this point - think about the human in the box. Would they ever know what a table is without seeing or touching one? If taken out of the box, could they ever relate the lines of МАСА to the object? Or if they were shown that the two relate, would they ever know what its further characteristics are and what they mean without interacting with it in any way?
Well, we can imagine things like that. We have never interacted with or seen a quark, but we know a lot about its properties. I believe, though, that this is because we relate and project things we do know onto it. And this may actually be slowing us down in some ways, because quarks are like nothing we have seen in our world. They are not particles and they are not waves, although they behave like one or the other in different experiments. They are, to the best of our descriptive words, "an excitation of a field", but even this doesn't capture the exact nature and complexity of the thing. The best way to capture it is through mathematics.
But what is mathematics if not an abstract language that we can't really relate to beyond a certain point? Take the simplest thing: what is 1? It is not the line that depicts it - that's a representation of 1. Is it the cardinality of the set that contains the empty set, or something like that? It can be constructed this way, but then what is a set? At some point we assume some axioms in math that we believe in - we place some initial faith in the truthfulness of the statements.
Going in the other direction and building up from basic concepts - you probably know what it is to sum two numbers or multiply them. But do you really understand infinities? Or the different sizes of infinity? Or uncomputable numbers (my latest mind-blowing "discovery")?
Still, we are able to apply math, or even language itself - abstract ideas that allow us to be productive and successful beyond the imagination of any monkey - with just slight tweaks in our brains: a small difference in DNA, a slight evolution or "mistake" that created the emergent behavior of our complex human society.
I honestly don't know how to conclude this. What is the overall effect of LLMs, even in only their current form? I believe there is still a long way to go, as illustrated by the person-in-the-box thought experiment. For some of the things described, like emotions, we don't know what they really are and whether we can imitate them. But what is the impact even in their current state?
Sure, some knowledge-based professions are out. Or enhanced with this quick, compressed version of human knowledge. It can be extremely helpful or extremely detrimental to some parts of the economy. I can't comprehend the economic impact (if any). It's also possible that it somehow fizzles out and turns out to be just a quicker, more individualized search engine like Google.
What excites me about the thought experiment is that it clearly frames the limitations of the models right now. They can seem powerful. But since they don't have this "magical" understanding, they can also produce lies (hallucinate) without knowing it. And it seems the "magic" comes at least in part from interacting with the world and connecting abstract concepts to other abstract concepts. For what is the perception of an object if not "just" another set of electrical signals travelling through the retina, or "just" a few more convolutional layers of machine learning models connected to the LLMs?
Give the LLMs senses; feed them images and videos. Give them the ability to test the real world with arms and touch sensors. Give them the possibility of interacting with other intelligences - by asking questions themselves. Give them the possibility to update their models after training. (They are currently frozen in time with the knowledge received at training.) Give them the possibility to stop existing and the possibility to procreate by finding resources. Let's see if they develop pain and fear, happiness and love.
Or you know...don't.
I honestly don't know - it sounds dystopian, like the start of every Terminator/AI disaster movie or book. Even when it's explained like this, I find that people still believe the machine would be nothing but mechanical computations and that there is something different about humans. But what? Souls? If we give a machine all the capabilities of a human, including all the senses and actuators, and if we allow it to somehow go through an evolutionary process, how can we honestly believe it doesn't have emotions? Would we? I know that when I look at my Roomba cleaning the apartment - much "stupider" than an LLM - I am much more likely to project some sort of intelligence onto it than onto pure text output. Is it just me? Or is something shortcutting in my abstract concept-space: if it moves with apparent purpose, it's probably alive?
Perhaps this should be as good as it gets, and LLMs should never have access to our world in physical form. I don't see a way to enforce these "shoulds", of course. Companies and countries are going to do what their incentives point at, and a ban on development is meaningless. Perhaps, as a start, build them a virtual world with all of the above senses and abilities, but in a simulated, 3D-game-like world where they can do these things virtually rather than in the physical world. Perhaps this is already being done in AI centers around the world.
Would a simulated robot evolve to feel?
I don't know. Do you?