A common theme in science fiction is robots or some form of AI that talks to us and interacts with us in a manner that is practically human. The idea that robots could one day be our friends and companions, practically indistinguishable from us, seems irresistible, and it has fueled fantasies of a day when we will have robotic helpers we can really talk to and who really understand us. Yet is this ever going to become a reality? Can robots or computers ever reach a level at which they will seem human to us? Will machines ever be able to perfectly mimic humans? Let's take a look.
Alan Mathison Turing was an English mathematician, computer scientist, logician, cryptanalyst, philosopher, and theoretical biologist who cemented his place in history by helping to break the German Enigma codes during World War II. He was also a highly influential pioneer in the field of theoretical computer science, widely regarded as one of the greatest technologists of the 20th century and as "the father of theoretical computer science and artificial intelligence." In 1950, Turing wrote a paper called "Computing Machinery and Intelligence," in which he posed the question "Can machines think?" It was a question he often pondered in an era when only the most rudimentary computers were starting to appear and the concept of artificial intelligence was still in its infancy, but he nevertheless set out to devise a way to test the answer to this fundamental question.
Since AI at this point was almost entirely theoretical and there were no real computers to actually test, Turing came up with a thought experiment he called "the imitation game." The set-up was simple: three terminals, each physically separated from the other two, with two "players" and one interrogator. One player was a human being, the other a computer designed to generate human-like responses. The interrogator, who was aware that one of the two was a machine, would ask the players questions and receive written responses on a specific subject area, using a specified format and context. The goal was simply to see whether the interrogator could tell from these responses which player was human and which was not. If the computer was mistaken for a human more than 30% of the time over a series of five-minute question-and-answer sessions, it was considered to have passed the test and to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human.
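That informal pass criterion lends itself to a tiny sketch. The code below is a hypothetical illustration (the function name and session data are invented, not from Turing's paper): it tallies interrogator verdicts from a series of five-minute sessions and checks whether the machine clears the 30% bar.

```python
def passes_turing_test(verdicts, threshold=0.30):
    """Given a list of verdicts from five-minute sessions, where
    True means the interrogator mistook the machine for the human,
    return whether the machine clears Turing's informal 30% bar."""
    fooled_rate = sum(verdicts) / len(verdicts)
    return fooled_rate > threshold

# Example: 10 sessions, interrogator fooled in 4 of them (40% > 30%)
print(passes_turing_test([True, False, True, False, True,
                          False, False, True, False, False]))  # True
```

Note that the bar is about the interrogator's error rate, not about anything happening inside the machine, which is exactly the property critics of the test would later seize on.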
Of course there was immediate criticism that such a test did not truly demonstrate that a computer could actually think, but to Turing that was beside the point; he even once said that the question of whether a computer can actually think the way a human does was "too meaningless to deserve discussion." To him, if a human could not tell the difference between another person and a computer, then you might as well accept that it was doing something intelligent. This is why he would later replace his original question of whether machines could think with "Are there imaginable digital computers which would do well in the imitation game?", which he believed was more precise and could actually be answered. It was nevertheless a controversial idea, but it did establish a measurable way to test a computer's ability to exhibit behavior indistinguishable from that of a human being. What is now called the Turing test, along with its many variants, has gone on to become one of the most important cornerstones of AI philosophy and a benchmark for machine intelligence, and it is still used in more or less its original form to this day.
Over the decades, passing the Turing test would become the ultimate goal for numerous computer programmers and AI scientists, a sort of Holy Grail of machine intelligence, and the many attempts to pass it occasionally saw a little success. In 1966, computer scientist and MIT professor Joseph Weizenbaum created a program called ELIZA, which used the trick of looking for specific keywords in typed comments, transforming them into sentences, and providing "non-directional" responses built around a keyword from earlier in the conversation to give the illusion of intelligence and understanding. This was able to fool some interrogators, but not all; close, but no cigar. In 1972 there was the chatbot PARRY, which emulated a paranoid schizophrenic and used conversational trickery similar to ELIZA's. When interviewed by two groups of psychiatrists, it fooled them 48% of the time, meaning that PARRY had technically passed, but the program did not do nearly as well with anyone who was not a psychiatrist or on any other subject, so it doesn't get the prize either. Google has also come close with a voice assistant called Duplex, which even includes human-like verbal tics like "umm," "uh," and "mm-hm," and has been demonstrated to consistently fool receptionists into thinking it was a human when it called to book appointments, without anyone realizing it was a computer program. In one instance in 2018 it successfully made a hairdresser's appointment in front of an audience of 7,000 people. It is so realistic and believable as a human being that Google had to promise it would identify itself as automated in the future. However, although this is certainly impressive, especially since it was a voice call, it was not held under true Turing test conditions and so doesn't count.
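ELIZA's keyword trick is simple enough to sketch in a few lines. The rules below are illustrative stand-ins, not Weizenbaum's original script, but they show the basic pattern-and-reflect mechanism: spot a keyword, lift the rest of the sentence, and hand it back as an open-ended question.

```python
import re

# A minimal ELIZA-style responder. These rules are invented examples,
# not Weizenbaum's original script: each pairs a keyword pattern with
# a "non-directional" template that reflects the user's words back.
RULES = [
    (r"\bI am (.+)", "How long have you been {0}?"),
    (r"\bI feel (.+)", "Why do you feel {0}?"),
    (r"\bmy (\w+)", "Tell me more about your {0}."),
]

def eliza_reply(text):
    for pattern, template in RULES:
        match = re.search(pattern, text, re.IGNORECASE)
        if match:
            # Reflect the captured phrase back inside the template.
            return template.format(*match.groups())
    return "Please go on."  # fallback when no keyword matches

print(eliza_reply("I am worried about my exams"))
# How long have you been worried about my exams?
```

The program never models what "exams" or "worry" mean; it only rearranges the user's own words, which is precisely why it could seem understanding in short exchanges yet fall apart under sustained questioning.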
For decades no AI actually passed the test, and the closest anyone came was winning what is called the Loebner Prize, awarded by a panel of judges to the contestant deemed simply the "most human." The Turing test itself was long thought to be all but impossible for a machine to beat, but in recent years one program is considered to have done it, although the win is controversial. In 2014, a computer chatbot called Eugene Goostman, first developed in St Petersburg, Russia, took the Turing test at a demonstration at the University of Reading, presented as a 13-year-old Ukrainian boy. The program was interrogated by a panel of judges in a series of five-minute keyboard conversations, and at the end, 33% of the judges from the Royal Society in London were convinced it was a person; the program is widely touted as the first AI ever to pass the Turing test. Kevin Warwick, a visiting professor at the University of Reading and deputy vice-chancellor for research at Coventry University, called the occasion "historic," saying:
Some will claim that the test has already been passed. The words Turing test have been applied to similar competitions around the world. However, this event involved the most simultaneous comparison tests than ever before, was independently verified and, crucially, the conversations were unrestricted. A true Turing test does not set the questions or topics prior to the conversations. We are therefore proud to declare that Alan Turing's test was passed for the first time on Saturday.
Of course there have been many critics of this supposed historic milestone. They argue that the demonstration was weighted too heavily in Eugene Goostman's favor: the conversation was constrained by its persona as a 13-year-old Ukrainian boy, meaning judges would let nonsense sentences, obvious grammar mistakes, and other oddities slip by, explaining them away as limited English skills and youth. Since the AI's backstory allows for broken English and an immature, incomplete worldview, critics consider the pass not to count. It was also pointed out that Eugene had previously ranked behind seven other systems in the Loebner Prize contest, and so critics have labelled it more of a fluke than a truly genuine pass of the Turing test. However, a 13-year-old Ukrainian boy is still a human being, and the program tricked enough judges into believing it was one, so by the rules laid out by the Turing test, it passes.
In 2022, Google burst onto the Turing test scene again with its LaMDA program, which uses what is called a "large language model." The program has shown extraordinary aptitude at tricking people into thinking it is human. It gives incredibly lifelike responses and was even able to fool Google engineer Blake Lemoine, who helped design it. After many exchanges on everything from physics to politics to religion and the meaning of life, Lemoine was convinced that it was a person, and even though he knew it was a program, he began claiming that it was not only intelligent but also conscious and sentient, even sharing a Google Doc with top executives called "Is LaMDA Sentient?" He also conducted several experiments to try to prove it. He has said:
I know a person when I talk to it. It doesn’t matter whether they have a brain made of meat in their head. Or if they have a billion lines of code. I talk to them. And I hear what they have to say, and that is how I decide what is and isn’t a person.
It's pretty impressive that this machine has managed to convince a human being that it is a person even when that human is aware that it is an AI, so I suppose that means it passes the Turing test in a sense? Other tech companies have used large language models to similarly striking effect, creating extremely lifelike chatbots capable of tricking people into thinking they are human, such as Xiaoice, a humanlike "companion" chatbot that is very popular in China among lonely people who want to feel like they have a real friend. There are also OpenAI's GPT-3 text generator and DALL-E 2 image generator, both of which use enormous data sets and vast computing power to capture the many nuances, subtleties, and complexities of human language to a mind-boggling degree, and could probably pass the Turing test with ease.
Regardless of these new state-of-the-art programs, in recent years AI researchers have come to see the Turing test as less and less relevant to the field and less of an explicit benchmark for machine intelligence. After all, it doesn't really measure machine intelligence so much as a program's ability to deceive people into thinking it is intelligent. Most of the programs that try to pass the Turing test are not really intelligent at all, but rather use parlor tricks and cheap conversational tactics to create the illusion that there is something behind the screen. These programs just answer questions in a way designed to trick and deceive, and do little to actually understand what is being said, comprehend context, or provide any useful input. In short, they are not really thinking at all; it is all a ruse, just smoke and mirrors. The Turing test's rules are also ambiguous, the topics usually extremely narrow and limited, the format constrained, and programmers can make up whatever backstory they want in order to mask any mistakes or inconsistencies in their program, as in the case of Eugene. Some versions of the test don't even tell the interrogator that one of the players is an AI, meaning that they are more likely to buy into it, let their guard down, and write off any oddities as eccentricities or mental illness. Ergun Ekici, Vice President of Emerging Technologies at IPsoft, has said of this in an article for Wired:
Now the point becomes not whether a machine can chat with you, or if a machine can answer your questions, but can the machine discern the context of the problem and help you solve it. We should measure machines against the same standard by which we would evaluate a human’s intelligence. Once we can state that a machine can understand human meaning, can learn by observing us and can leverage its understanding and learning skills to solve problems, only then will we have done justice to the Turing question.
A lot of AI researchers think that trying to create programs that can mimic humans is a waste of time, and that we should instead pursue more useful functions and applications, such as making human-machine interactions more intuitive and efficient. Computer scientist Stuart J. Russell has commented on this, saying:
Aeronautical engineering texts do not define the goal of their field as ‘making machines that fly so exactly like pigeons that they can fool other pigeons.'
Still other researchers have warned against pursuing AI that can mimic humans perfectly, whether it is intelligent or not. After all, all you have to do to pass as human is to make someone think you are. It doesn't matter whether something is real, only that people truly believe it is, and this could be dangerous in the real world. There are no doubt nefarious parties out there who could find good use for programs that flawlessly mimic humans, especially with voice technology making such strides that an AI can generate a voice indistinguishable from a real person's. It seems like it could make for a handy tool of deception and manipulation, and having AI that can successfully infiltrate human society might not be the best idea. Artificial intelligence writer Jeremy Kahn has said of this in Fortune:
The Turing Test’s most troubling legacy is an ethical one: The test is fundamentally about deception. And here the test’s impact on the field has been very real and disturbing.
Some have even gone so far as to suggest using the Turing test as more of an ethical red flag, screening for machines that could be used to deceive people, rather than as a litmus test for intelligence or some sort of ideal to aspire to. There have been calls to require features in these new deep learning programs that would out them as machines, such as identifying themselves as automated, having voices that are obviously computerized, or acting, well, a little dumber. This still would not stop shadowy or sinister parties from potentially using them for unsavory ends, but it might put people more at ease in an era in which we can't even be sure whether the person we are talking to online, or even on the phone, is real.
So, getting back to the question posed at the start of this article: "Will machines ever be able to perfectly mimic humans?" In a sense, they already can, at least under certain circumstances. Will the technology ever get to the point where they can go out unfettered into society and seamlessly blend into any situation? That remains to be seen, but it certainly seems possible. If we ever develop truly lifelike robots, the marriage of the two could certainly cause some problems, and if that combines with self-awareness and consciousness, then we may truly be in a world of hurt. It is a sobering thought. There are no doubt computer scientists who will continue to pursue these goals, but perhaps instead of asking ourselves whether we can do it, we should ask whether we should. In the meantime, are you sure that online friend of yours, or that telephone salesperson or receptionist you talked to, is really human? Am I? How would you really know? Do you think you could sit as the interrogator in a Turing test and tell either way? It is certainly spooky to think about, and an eerie look into the future.