Over the summer of 1956, an enterprising group of scientists, engineers, and mathematicians held a workshop at Dartmouth College to explore the idea of a new area of scientific research: artificial intelligence. The proposal for the workshop, penned a year earlier and now famous in the field, reads:
The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it. An attempt will be made to find how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves.
One of the group’s luminaries, the economist Herbert Simon, prophesied some years later that “machines will be capable, within twenty years, of doing any work that a man can do.” And another, the computer scientist Marvin Minsky, declared that “within a generation, I am convinced, few compartments of intellect will remain outside the machine’s realm — the problem of creating ‘artificial intelligence’ will be substantially solved.”
At the birth of the discipline of AI we find these two core ideas. First, all aspects of human learning and intelligence — including language, problem-solving, and abstract thought — can in principle be rigorously codified and reproduced by machines. Second, given the accelerating pace of technological progress, the development of truly intelligent machines is inevitable; it’s only a question of time.
The prophecies have not come true within a generation, nor within three. Instead, each generation has had its own prophets of the imminent arrival of human-level AI — and each its skeptics. The latest broadside comes from computer scientist and AI entrepreneur Erik J. Larson in the provocative new book The Myth of Artificial Intelligence: Why Computers Can’t Think the Way We Do.
The book is many things: a fascinating history of AI’s roots in the work of Alan Turing and his contemporaries, a powerful and wide-ranging exposé of the limitations of today’s statistics-based “deep learning,” a constructive account of what aspects of the human mind AI still cannot capture, and a warning about the harm that AI hype is inflicting on science and the culture of innovation.
To understand the real prospects of human-level AI, says Larson, “we need to better appreciate the only true intelligence we know — our own.” This focus is spot on. With an innovative theory about a kind of human intelligence that is not only beyond the abilities of AI but largely beyond the notice of researchers, Larson offers what seems like an attempt to defend the human against the mechanical.
Yet while the critique of AI hype points us in the right direction, it is not radical enough. For Larson is fixated on intelligence’s logical aspects. And while he insightfully argues that even human logic goes beyond what today’s “deep learning” systems can do, in defending the human in this way he misses the broader picture. Our intelligent human capacities to identify meaning and practical significance in the realm of human action, to respond flexibly to social context and cultural nuance, to read literature and tell stories with understanding, to converse and reason with others, and to decide what to do and how to live — all this goes far beyond the domain of formal logic. To make sense of these capacities, one must look to something else.
Humans have long imagined the possibility of living and thinking machines. We find the term “automata” in Homer’s Iliad, in reference to the automated machinery of the god of fire and craft, Hephaestus. The golem of Jewish lore and Mary Shelley’s Frankenstein are familiar tales. But the dream of engineering intelligent life seemed to be on the cusp of materializing in the twentieth century, when computer pioneer Alan Turing offered a precise mathematical definition of a digital, programmable “computing machine” and proposed that intelligence itself could be understood in terms of computation: processes of transforming “inputs” into “outputs” on the basis of well-defined algorithms.
In his seminal 1950 paper, “Computing Machinery and Intelligence,” Turing argued that the question of whether a machine is intelligent was too vague to be considered directly, and so he replaced that question with a test of whether the machine could convincingly imitate human behavior, now called the Turing Test. The setup is perhaps familiar: A human judge engages in a teletype conversation with both another human and the computer, not knowing which is which. If the judge cannot tell the difference, then, says Turing, we have no good reason to deny that the machine is intelligent.
This move was hugely influential. Activities of the mind — consciousness, thought, creativity, will, imagination, and agency — that were once considered to be explainable only through philosophical reflection were now reformulated as precisely defined cognitive functions. And these functions could in turn be measured and compared between humans and machines. Turing was optimistic that computers would eventually pass his test, arguing that by the year 2000 “one will be able to speak of machines thinking without expecting to be contradicted.”
The processing power of computer hardware, and the capabilities of algorithms, have expanded vertiginously since the days of Turing and the Dartmouth workshop. The dominant approach to engineering AI systems, too, has changed. Classical AI systems, which were dominant from the 1960s to the early 2000s, directly followed the Dartmouth statement: They aimed not only to produce the same results as human intelligence, but to replicate human thought processes. These classical systems used formal rules hand-coded by human experts, defining what the system should do in every relevant circumstance. Classical AI was governed by deduction. It was good at certain tasks, like proving theorems in math and logic, but proved ineffective in most other domains.
More recent “machine learning” systems, on the other hand, are based on the logic of statistical and probabilistic inference, or induction. Machine-learning systems are given a well-defined task and large amounts of training data, based on which they form their own models and algorithms for executing the task, by looking for and analyzing patterns in the dataset. Hence, the programs are said to “learn.”
“Deep learning” is a further advance in AI that uses somewhat more complex statistical techniques. The term “learning” is bolstered by the fact that these systems use “artificial neural networks.” This is a fancy term for a rather basic statistical feedback technique that is loosely inspired by the functioning of real neurons. When expanded to a large enough scale, backed by enough computing power, and in some cases carefully fine-tuned — as they recently have been — neural nets can achieve remarkable feats of pattern-matching.
Deep learning is now all the rage in AI. For example, DeepMind’s AlphaGo system beat the world champion Lee Sedol at the exceedingly complex game Go in 2016. GPT-3, a language system released just last year by the OpenAI lab, has garnered murmurs of being another significant leap forward, with philosopher David Chalmers describing it as having “impressive conversational and writing abilities” and being “instantly one of the most interesting and important AI systems ever produced.”
AI has also become big business. Google bought DeepMind in 2014 for around half a billion dollars. Governments are following the lead of venture capital and investing huge sums in AI research. And the increasing reach of AI into commercial, public, and private life has become a staple of tech reporting: high-speed algorithmic stock trading, search engines, targeted ads, music and movie recommendation systems, dating apps, decision-support software for doctors and criminal judges, surveillance systems, facial-recognition and voice-identification technology for authoritarian governments — the list goes on.
Most of these AI systems are not intended to simulate human-level intelligence, but only to perform highly specific tasks in highly specific domains — what some call “narrow” AI. The holy grail is still human-level AI, also called “general” AI: truly intelligent machines with the flexibility, depth, creativity, and unconstrained range of application of the conscious human mind.
High hopes for the imminent arrival of general AI have been there from the very beginning, and the successes of machine learning keep them high. DeepMind cofounder Demis Hassabis once described the company’s goal as “solving intelligence, and then using that to solve everything else.” Former Google CEO and chairman Eric Schmidt has claimed that AI will solve all the big problems, from cancer to climate change, education to population growth. Singularity guru Ray Kurzweil, now a Google director of engineering, has predicted for years that we will have AI with human-level general intelligence by 2029. Once achieved, he believes, computers will quickly surpass human intelligence by analyzing all the world’s texts, scientific documents, and other information, and exponentially improving themselves.
More anxious than hopeful, Bill Gates, Elon Musk, Henry Kissinger, the late Stephen Hawking, and many others have been worried that “super-intelligent” AI may become humanity’s biggest existential risk.
Erik Larson doesn’t quite buy it. The first target of his critique is what he calls the “AI myth,” which claims that the discovery of the holy grail is inevitable, that “we have already embarked on the path that will lead to human-level AI, and then superintelligence.” “We have not,” says Larson. Despite its impressive successes in narrow domains, AI is nowhere near human-level, flexible, general intelligence capable of understanding meaning and context. We may one day achieve it — that’s a “scientific unknown” — but we need fundamental scientific breakthroughs before we can make real progress in that direction.
Larson challenges the AI myth by exposing its dependence upon a more basic claim, the “intelligence error,” which says that human intelligence can be reduced, without remainder, to calculation and problem-solving. The error goes back to the birth of AI — to Turing’s work and the Dartmouth workshop, reflected in the idea that “the human mind is an information processing system.”
Philosophers call this the “computational theory of mind.” It goes like this: The human brain is a “meat machine,” as Marvin Minsky put it, and the mind is a set of algorithms that run on that machine, processing inputs into outputs. Put another way, the mind is a form of software that runs in the brain, which is “wetware,” akin to computer hardware. A human agent, on this view, takes in bits of information from the physical world through the sense organs, processes this information, formulates a model of the world, and uses the model to make a calculation of how best to achieve a goal, and then acts on the result of the calculations.
Larson rightly emphasizes that “equating a mind with a computer is not scientific, it’s philosophical.” Similar ideas go back to early modern empiricists like Thomas Hobbes, who wrote that reasoning was “nothing but ‘reckoning’” — “adding and subtracting, of the consequences of general names agreed upon for the ‘marking’ and ‘signifying’ of our thoughts.” So, when a man reasons, “he does nothing else but conceive a sum total.” Or, more succinctly: “To reason … is the same as to add or to subtract.” Tempted as we are to picture ourselves in the likeness of our latest machinery, Hobbes’s model for the human mind was a calculating machine; ours is a digital computer manipulating symbols and turning stimuli into behavior according to algorithms.
To be even remotely plausible, a computer model of the human mind must warp — and restrict — our commonsense definitions of intelligence, knowledge, understanding, and action. Larson discusses the renowned computer scientist Stuart Russell, who defines intelligence as simply the efficient pursuit of objectives, based on inputs from the environment. An “intelligent agent,” says Russell, is just a physical “process” whereby a “stream of perceptual inputs is turned into a stream of actions” to achieve a predefined “objective.” As Larson wryly notes, this definition “covers everything from Einstein ‘achieving’ his ‘objective’ when he reimagined physics as relativity, to a daisy turning its face toward the sun.” It places intelligent human activity on the same spectrum as Venus flytraps and shrimp; the difference is merely a matter of complexity.
Among other things, this ignores the reflective aspect of human intelligence — how we discover, imagine, question, and commit to our objectives in the first place, the judgments we make about which objectives really matter in life, and which are trivialities, distractions, irrational cravings. The constricted definition of intelligence also ignores activities with no objective, forms of human mental life that we do for their own sake, like free-ranging conversation, writing poetry or a journal, forming a friendship, or contemplating this mortal coil.
Prominent work in AI pre-defines intelligence in terms of calculated processes toward designated ends, terms that make intelligence fit more easily with automated systems — and thereby invite us to see ourselves in kind.
To understand how the AI myth fails to account for human intelligence, what is most important for Larson is a form of thinking called “abduction.” Larson takes this idea from the nineteenth- and early-twentieth-century American philosopher Charles Sanders Peirce, who mapped the varieties of logical inference — the ways we draw conclusions on the basis of prior beliefs and observations. The familiar forms are deduction, the kind of strict, truth-preserving entailments found in standard logic, mathematical proofs, and classical AI; and induction, the kind of generalizations made by statistics, probability theory, and machine-learning AI. Abduction, says Peirce, is a third form of logical thinking, a kind of insightful hypothesizing that seeks the best explanation of some particular event or phenomenon. A thinker makes an abductive inference when he or she goes beyond applying logical rules (deduction), or spotting correlations in a dataset (induction), and “leaps” to a previously unforeseen explanation or hypothesis.
Larson offers numerous historical and literary examples of abduction. One is Nicolaus Copernicus, who was studying astronomy in the context of the well-established Ptolemaic model of geocentric planetary motion, with Earth at the center of the cosmos and the celestial bodies moving in perfect circles around it. Plenty of data supported the paradigm, and complex models had been developed to try to explain any data that seemed not to fit. What Copernicus did was to make a conjectural leap from the available evidence to an entirely new paradigm, the heliocentric model, with the Sun at the center of the solar system and Earth and the planets encircling it.
This was not a deduction from given logical rules, nor was it an induction from data gleaned through observation. Instead, Copernicus provided a radically new way of seeing the cosmos, and thus of interpreting the data. Initially, there was not even a great deal of data or predictive power to back up his revolutionary idea. Only later, with further observation and theorizing (like Kepler’s idea of elliptical planetary motion), and testing with new technology (like Galileo’s telescope), did it become clear that Copernicus’s model was correct. For Larson, this kind of thinking cannot be reduced to deduction or induction, nor dismissed as random luck — it was a brilliant and creative work of the human intellect.
Larson’s most colorful example of abduction is drawn from Edgar Allan Poe’s masterpiece of crime fiction “The Murders in the Rue Morgue.” In the story, detective Monsieur Dupin confronts a baffling mystery. There has been a double murder in a house; strange voices and shrieks were heard that night; in the house was found a bloody razor and tufts of human hair; plenty of money and valuable property were left in plain sight; one body was found head down, stuffed well up into the chimney, the other outside, badly maimed; bystanders thought they heard someone gruffly speaking a strange foreign language. Dupin surveys these peculiar details, and where the police are simply perplexed, he makes an abductive inference to a possible explanation. He conjectures that the murders were not committed by a human at all, but by a large animal, an orangutan. Dupin’s thinking involves highly subtle and intelligent hypothesizing that reorganizes all of the data in a fresh way — and he solves the case.
But abduction is not only the province of brilliant scientists and detectives. Larson follows Peirce in claiming that it actually goes on constantly in human life. It is involved in identifying interesting problems and offering possible solutions; in understanding language, with all its ambiguities of meaning that depend on context; in understanding a complex story; and in being able to fruitfully converse with other human beings. Abduction, claim Peirce and Larson, is even involved at the core of ordinary visual perception: seeing an azalea as an azalea is really an abductive guess from the raw or uninterpreted sense data. “Abduction,” writes Larson, “is inference that sits at the center of all intelligence.” How do we do it? When we make an abduction, he explains, “we guess, out of a background of effectively infinite possibilities, which hypotheses seem likely or plausible.”
Machine-learning AI, rooted in induction, portrays intelligence as “the detection of regularity,” and thus excels at things like finding strategies for games with fixed rules, properly labeling images based on regular patterns in pixel data, filtering spam email, and flagging possible fraud in credit data. But abduction goes beyond regularity, and we simply do not know how to codify it into a formal set of instructions or a statistical model that can run on a machine. A major scientific breakthrough would be needed, Larson argues, to understand how to do it.
Artificial intelligence researchers have long treated the ability to play games as an important proxy for intelligence. Narrow AI has proved masterful at games of a certain kind, like tic-tac-toe, checkers, chess, and Go. These are games with pre-defined objectives, governed by fixed rules, and perfect information, where each player can see all the pieces on the board at all times. As Larson explains of the world of games, they “are closed by rules and they are regular — it’s a kind of bell-curve world where the best moves are the most frequent ones leading to wins.” But, he goes on, “this isn’t the real world that artificial general intelligence must master, which sits outside human-engineered games and research facilities. The difference means everything.” Acting intelligently within the fluid situations humans confront in the everyday world — think of a busy airport terminal, supermarket, or city block — is an activity of a different order entirely, not a more complex version of the same kind as games.
Unlike game worlds, the real world lacks fixed rules and a single, pre-defined objective: “Manhattan isn’t Atari or Go — and it’s not a scaled-up version of it, either,” writes Larson. The real world is ever-changing, open-ended, unpredictable, and inexhaustible in its meaningful details. It is rich in linguistic and cultural meanings, including social roles and settings, gestures, speech acts, texts, stories, traditions, and so forth. How many potentially important details does a buzzing street corner have? Infinitely many.
Making sense of the real world, as we humans do, requires a flexible and mysterious capacity to grasp meaning and interpret what is relevant in unfolding situations, a kind of know-how that researchers gesture at with the vague umbrella term “common sense.” The problem of how to produce human-level or general AI is, in large part, the problem of how to formalize and program the capacity for commonsense knowledge and judgment. This problem has long bedeviled the field of AI, and deep learning, as Larson points out, has not made much progress here.
Consider an example. Next to my laptop sits my well-worn coffee mug. A deep-learning object-recognition program may be said to “recognize” the coffee mug (or is it a “cup”?) from a photo and tag it “coffee cup.” But the system does not know what a coffee cup is, in the ways we do. One approach in AI is to build a huge database of all the bits of commonsense knowledge that may be necessary for the system to aptly deal with the real world — essentially, all the knowledge we have from our physical engagement with things and the whole fabric of our culture, codified into discrete items of information, together with algorithms for applying the knowledge in context. This is extremely difficult.
How many “facts” or “truths” do I know about my coffee mug? I know that it has a hard surface, that it will break if I drop it from my window (but not onto the carpet), that it will hold enough liquid to satisfy my craving for hot coffee (but not so much that the coffee will be cold after I’ve had enough), that it serves just as well for drinking tea or hot cocoa (but not champagne), that it weighs less than the planet Saturn (but more than a speck of dust), that it will hold a handful of AAA batteries, that spilling it puts my laptop at risk, that it would not taste good if I tried to bite it, that my wife would not appreciate my giving it to her as a gift on her birthday, that it cannot be used for playing baseball, that it cannot be sad or frustrated, that it was not involved in the French Revolution, and on and on forever. But I do not actually “know” these things, in the sense of having explicitly formulated them as thoughts and stored them on my “meat machine.” Most of them never crossed my mind until now, for the purpose of writing this essay, while some had always been on my mind, but in the background, without becoming explicit thoughts. But I know them all tacitly, in the sense that I could easily draw upon them and countless other facts, if necessary in a given context.
And that’s just my coffee mug. How many things does an average adult know, as a matter of common sense? How many a three-year-old? Larson quotes Yehoshua Bar-Hillel, a pioneer in machine translation, who first noted the problem of common sense for AI natural-language interpretation: “The number of facts we human beings know is, in a certain very pregnant sense, infinite.”
The problem of how to turn common sense into a computable algorithm is not just a matter of the total quantity of discrete facts a human being knows, whether explicitly or tacitly. The problem is deeper. Common sense is primarily about how we tell what is relevant, what matters, in a given situation. Return to my coffee mug: A draft of wind blows through my open window — and suddenly none of the facts I considered earlier about my mug is relevant, only the fact that it can serve as a paperweight (whereas the crumbs on the table cannot, and I don’t even consider them). Common sense is the ability to navigate this terrain, and spot what makes sense and what matters in real time, often without even noticing. It is our highly flexible ability, first, to perceive and focus on what is most relevant in a changing situation (the gust of wind and ruffling papers); second, to ignore what is trivial (the crumbs); and third, to draw upon an infinite background of worldly knowledge and bodily skill in handling the demands of the moment (the mug can serve as a paperweight). This is why common sense is ubiquitous and essential in intelligent human practice, though it is characteristically invisible to us. As Ludwig Wittgenstein said, “The aspects of things that are most important for us are hidden because of their simplicity and familiarity. (One is unable to notice something — because it is always before one’s eyes).”
Larson claims that abductive inference is central to commonsense knowledge, including our ability to understand and speak a language:
Common sense is itself mysterious, precisely because it doesn’t fit into logical frameworks like deduction or induction. Abduction captures the insight that much of our everyday reasoning is a kind of detective work, where we see facts (data) as clues to help us make sense of things.
To borrow a classic example from the philosopher John Haugeland, imagine that I say to you, “I left my raincoat in the bathtub, because it was still wet.” Many people don’t even notice that the sentence is ambiguous, that “it” — that which is wet — could be the raincoat or the bathtub. But, given our commonsense understanding of raincoats, bathtubs, water, and generally intelligible human motivations, we immediately interpret the sentence to mean I left the raincoat in the bathtub, because the raincoat was wet. We don’t put dry raincoats in bathtubs that are wet. That’s common sense, and you knew it without thinking about it. For AI, this is a real problem.
Abduction, argues Larson, is how we manage everyday challenges of language, like parsing ambiguous meanings, distinguishing literal and figurative speech, making sense of metaphors, grasping irony and sarcasm, understanding narratives, and even just engaging in genuine, natural conversation, all things that remain exceedingly difficult for AI.
Larson’s critique of the limits of classical and machine-learning AI is informative and often persuasive. But his primary focus on abduction as the “core mystery of human intelligence” is a serious limitation. Abduction is surely an important part of common sense, but not necessarily the core. Abduction is a form of inference. It is detective work, aiming to solve a problem, read a situation rightly in order to make sense of it, or understand ambiguous speech. Even if these activities seem beyond the ken of today’s AI, they still have what programmers would call “well-defined” objectives, pre-defined outcomes that we could imagine serving as the target for training an algorithm on the basis of some dataset.
Much of our commonsense experience is not like that. Take, for example, writing a journal, reading a novel, starting up a friendship, listening to a new style of music, playing “pretend” with one’s children, or, to take the paradigmatic task for testing human-level AI, having a conversation.
Why do humans converse? What’s the objective? We converse to share information, to share a laugh, to be polite and make someone feel welcome, to be rude and put others in their place, to bribe, to threaten, to seduce, to spy, to gossip, to philosophize, to plan, to celebrate, to change someone’s mind about religion or morality, to sell someone drugs or a new refrigerator, and many other things besides. Many objectives might be involved in any given conversation, including objectives that arise while talking and that could not be spelled out at the outset. You and I start a conversation, at first just to pass the time in the elevator, and then we discover a mutual interest and, walking down the hall, our conversation opens up into fresh and unanticipated territory.
The open-ended and potentially transformative nature of human conversation has made it a serious challenge for AI. As Larson rightly points out, systems tend to do well on Turing-style tests only by fooling human participants with trickery and evasion: repeating the content of the person’s statements in the form of a question, changing the subject, being evasive instead of flexibly responsive, or otherwise ensuring that humans cannot go “off script.”
In an infamous case at a Royal Society event in 2014, a chatbot called Eugene Goostman succeeded in tricking a third of the judges into thinking it was human. The BBC hailed this as a “historic” event: “Computer passes AI Turing test in ‘world first.’” But when you look more closely, the sensation quickly dissipates. The test was only five minutes long, and Goostman “succeeded,” in part, by claiming to be an unruly Ukrainian teenager, a non-native English speaker. (One of its engineers was Ukrainian-born.) It succeeded through misdirection and non sequiturs, not genuine comprehension of natural language. The judges were far less exacting than they might have been had Goostman pretended to be a native English speaker. Of course, two-thirds of the judges were not fooled, and few serious analysts now think this was any kind of breakthrough. It reveals more about the infectious hype surrounding AI. Goostman, says Larson, “is a fraud.”
But Larson doesn’t go far enough in exploring why the trickery was advantageous for Goostman. True, it’s partly because current AI systems are incapable of venturing successful guesses about the meaning of what’s being said based on real-world context. But that is because in human conversation, as much as in complex texts, in-depth understanding requires grasping cultural and moral concepts that matter to us because we are living beings.
When we interpret the world around us, we do so with the help of an expansive range of concepts, rich in emotions and values: love, trust, betrayal, longing, hope, grief, remorse, shame, passion, abandonment, commitment, deception, guilt, generosity, brutality, humor, bravery, selfishness, wisdom, and countless others.
Philosophers call these “anthropocentric” or “thick” concepts, because they combine thought, feeling, and value in particularly human-centered ways. These concepts are elements of our commonsense skills for grasping the meaning or sense of a person’s actions, of speech, of a conversation, of a story, or a literary text.
Consider Shakespeare’s Othello. In order to understand the character Othello and his central place in the story, we need to understand that he loves, or takes himself to love, Desdemona; we need to understand Iago’s actions as lies and psychological manipulations driven by bitterness and jealousy; we need to make sense of Othello’s mounting anxiety and anger; we need to understand that he is going mad with paranoia; and, in the tragic and bloody culmination of the play, we need to see the point of his grief, shame, and sense of loss of something irreplaceable. This makes his suicide intelligible within the larger context of the work. So the human-centered concepts of love, guilt, jealousy, grief, betrayal, madness, and many others are essential to understanding the meaning of the characters’ actions and their dialogue, and, thus, of the story as a whole.
Champions of deep-learning AI may be inclined to treat a text like Othello simply as a collection of “information,” like a weather report, that could in principle be extracted by a system with no knowledge of the meaning of love, jealousy, anger, grief, betrayal, and so forth. This is hopelessly naïve. Without the background of commonsense understanding and a grasp of these concepts, we wouldn’t find an intelligible story of purposive human actions at all, just sequences of words and events, one damn thing after another.
The point is not that AI needs to have Shakespeare-level intelligence, of course. There aren’t many human Shakespeares, alas. But humans the world over can understand his plays and the actions, emotions, and motivations of their characters, can be moved by these works and find resonances between them and their own life experience, and can return to them time and again and appreciate further levels of meaning. This is human-level intelligence par excellence.
Could we somehow program human-level knowledge of concepts like love, guilt, and betrayal? Not given the current data-focused nature of AI. Understanding such concepts is not simply a matter of being able to properly sort examples of human speech and action into pre-defined categories, with such-and-such degree of likelihood, on the basis of prior instances — akin to how “affective computing” tries to use facial-recognition AI to sort human faces into pre-fab emotional categories such as happy, sad, and angry. Our understanding of concepts like love, guilt, and betrayal cannot be strictly described and classified in terms of correct performances of any task or function. Why not?
First, concepts like love, guilt, and betrayal are meaningful to us humans because of how they impact our own lives — we know why they matter. If we had no sense of their emotional character and of how they affect our lives, we would not be able fully to learn their meaning or to make sense of the actions and motivations they generate. We know them from the “inside,” so to speak, not just as observable behaviors.
Second, concepts like love, guilt, and betrayal are multi-layered, capable of being understood at varying degrees of depth. To understand a Shakespeare play, or even just a good mystery novel, one must have a suitable grasp of love, guilt, and betrayal and of how they matter in our lives. But this also means that one must be able to ask questions like: Does Othello really love Desdemona, or is he merely infatuated or obsessed? Why does his belief that Desdemona betrayed him drive him mad? Is it even possible to have the absolute certainty about another person’s inner life that Othello, in his paranoia, craves? Questions like these open up the possibility that, for example, what looks like love may actually be something else, or that it is merely superficial, or that it is indeed more profound than we at first imagined.
These are not questions about data that could be answered by seeking generalizations in statistics on human motivation and behavior (as in the machine-learning approach). They are also not questions about the correct meaning of an ambiguous text, which could be answered by offering a hypothesis based on context (as in Larson’s abduction). They are questions inviting open-ended interpretive thought, with no definitive answer. We engage in this kind of exercise of understanding not just with “high” art or literature. Children’s fables and stories, too, are full of concepts like love, guilt, grief, longing, homecoming, and many others — all of which can open up to deeper, more probing thought and more refined understanding with greater life experience.
Together, these human-centric concepts form a framework through which we “take in” and interpret the human world. We acquire this set of ideas and interpretive skills through enculturation, the learning of language, life experience, and the reflection on our experience, all of which are rooted in our physical, emotional, and biological nature. These concepts are indispensable for any non-superficial grasp of human life — which means they are also fundamental for human intelligence. There can be no “human-level” general AI, one worthy of the name, that does not adequately imitate this level of human thought.
Does this set the bar too high for general AI? No, it brings into view what human intelligence is, what that holy grail of AI research would be. As John Haugeland wrote in a classic and still pressing 1979 paper:
There is no reason whatsoever to believe that there is a difference in kind between understanding “everyday English” and appreciating literature. Apart from a few highly restricted domains, like playing chess, analyzing mass spectra, or making airline reservations, the most ordinary conversations are fraught with life and all its meanings.
At present, we do not know how to program a deepening understanding of anthropocentric concepts in computer systems. As Larson ably demonstrates, the most advanced AI programs for reading and grasping the meaning of texts are severely limited: “The state of the art for reading comprehension by machines is pitiful.” But the missing piece is not only abductive inference, as Larson seems to think. Haugeland was closer to the mark in arguing that what AI lacks is an “existential” and “holistic” grasp of what things mean, which requires caring about things, being concerned with our own existence, and seeing the world as mattering. Thus Haugeland’s memorable pronouncement: “The trouble with Artificial Intelligence is that computers don’t give a damn.”
A proper, non-simplified understanding of what human-level intelligence really is requires that we recognize all that our mental life is, over and above its logical capacities. Our mental life involves concepts like love, guilt, and betrayal, which allow both a shallow and a deep grasp of human speech and actions. In our mental life we experience the world as mattering — even in the negative or depressive instance of seeing the world as meaningless, as mere “sound and fury.” An algorithm doesn’t even “give a damn” enough to experience things as empty and vain. And in our mental life we recognize that we have a life to lead, a life for which the world matters, and without which it doesn’t. Our mental life is an enigmatic nexus of all these things at once. As Haugeland writes in his 2000 book Having Thought, “No system lacking a sense of itself as ‘somebody’ with a complete life of its own (and about which it particularly cares) can possibly be adequate as a model of human understanding.” This is a picture of human thought and intelligence that goes beyond systematic calculation of means to ends, spotting correlations in data, or even abductive inferences to hypotheses and explanations.
Which brings us back to the Turing Test and why, properly done, it remains so challenging for current AI. Human conversation is a basic element of our nature as social, meaning-seeking animals, something that cannot be reduced to the exchange of symbols or to the mechanical production of any particular pre-defined outcome. Conversation ranges over potentially anything. It has indefinitely many aims. It draws upon deep wells of shared commonsense understanding of natural language and the world. Conversation — from daily chit-chat to political debate to romantic seduction to therapy — reflects the fact that human intelligence is wrapped up in our emotional involvement with the world. Conversation can yield flashes of mutual insight and deepen a person’s perspective on a topic; it can even transform one’s worldview or outlook on life.
Consider the mind-bending (though admittedly scripted) conversations between Socrates and his interlocutors, who in the course of talking become convinced of positions they formerly opposed. Or consider how a careless word from a stranger can sow in us the worry that we might be unlikeable, maybe leading us to interpret with some suspicion what our friends say to us. Or consider how someone, through a conversation, may come to perceive another person not as an enemy but as a potential friend, and so to hear what she says as possible gestures of care rather than disdain. In each of these cases, human intelligence displays the ability to shift the meaning of speech based on how it matters to us. We do this, literally, all the time.
Properly understood, mimicking human conversation is indeed a tall order for computer science.
Why does it matter, really, that people are hyping up the looming arrival of human-level AI? Larson argues that the “AI myth” and the “intelligence error” are fundamentally “bad for science,” because they undermine “a healthy culture for innovation” that should explore “unknowns,” such as the mysteries of the conscious human mind.
The failures of the current approach are expensive. The publicly funded BRAIN initiative (Brain Research through Advancing Innovative Neurotechnologies), launched by President Obama in 2013, has thus far disbursed over $1.3 billion to researchers. And the European Union’s Human Brain Project, launched the same year, was funded with a ten-year grant of the same amount to produce, as Larson writes, a “complete computer simulation of the entire brain” with the aid of AI.
Larson describes how the AI myth has fueled the rise of Big Data science, where big money is thrown at massive projects based on supercomputers and deep-learning techniques in the hope that new discoveries will somehow emerge from the data. This diverts resources from the ingenuity, passion, and brilliance of human scientists, who make explanatory leaps and form new ideas. AI as a research project thus devalues the human potential and excellence of scientists, threatening to turn them into a “hive” or “swarm” of computer technicians who lack a proper taste for innovation and genuine explanation.
Larson makes a plausible and provocative case. But to my mind, the most salient risks lie elsewhere. We are rapidly ceding power and authority to narrow AI systems, which in numerous areas of human life operate behind opaque layers of code and trade secrecy protections. We do this because of AI’s supposedly superior degree of accuracy, consistency, efficiency, and fairness relative to the limited information processing power of the “meat machines” in our skulls. AI systems are emerging in courtrooms, children’s toys, intelligence agencies, hospitals and retirement homes, and in one way or another in nearly every area of work and commerce.
Even granting for the sake of argument that there are many genuine benefits to all this, it still means that we gullibly attribute intelligence to computers. And so we are becoming increasingly dependent upon, and biased in favor of, algorithmic decision-making in our professional and even personal lives. The catch should now be clear: We are doing this even though narrow AI systems lack common sense and any real understanding of human experience, motivations, concepts, and values.
We must grapple seriously with the prospect of dehumanization — the loss of human agency, knowledge, and control — as we replace human action and responsibility with AI systems that lack true comprehension of human affairs. This is the most salient risk of the AI myth. To guard against that risk, we must foster a readiness to challenge algorithmic biases, to recognize and respect the uniqueness and depth of human intelligence, to celebrate rather than downgrade as unscientific the kind of understanding that can arise through literature and humanistic inquiry, and to identify what areas of life we wish to leave up to our own agency — and to that of our children, whose future we are building. We must discipline our Promethean longings to ensure that our future with artificial systems strengthens, rather than weakens, our humanity.