Readers of this blog may be familiar with the concept of “Friendly AI,” the project of making sure that artificial intelligences will do what we say without harming us (or, at the least, that they will not rise up and kill us all). In a recent issue of The New Atlantis, the authors of this blog have explored this idea at some length.

First, Charles T. Rubin, in his essay “Machine Morality and Human Responsibility,” uses Karel Čapek’s 1921 play R.U.R., which introduced the word “robot,” to explore the different things people mean when they describe “Friendly AI,” and the conflicting motivations people have for wanting to create it. And he shows why it is that the play actually evinces a much deeper understanding of the meaning and stakes of engineering morality than can be found in the work of today’s Friendly AI researchers:

By design, the moral machine is a safe slave, doing what we want to have done and would rather not do for ourselves. Mastery over slaves is notoriously bad for the moral character of the masters, but all the worse, one might think, when their mastery becomes increasingly nominal…. The robot rebellion in the play just makes obvious what would have been true about the hierarchy between men and robots even if the design for robots had worked out exactly as their creators had hoped. The possibility that we are developing our “new robot overlords” is a joke with an edge to it precisely to the extent that there is unease about the question of what will be left for humans to do as we make it possible for ourselves to do less and less.

Professor Rubin’s essay also probes and challenges the work of contemporary machine-morality writers Wendell Wallach and Colin Allen, as well as Eliezer Yudkowsky.

In “The Problem with ‘Friendly’ Artificial Intelligence,” a response to Professor Rubin’s essay, Adam Keiper and I further explore the motivations behind creating Friendly AI. We also delve into Mr. Yudkowsky’s specific proposal for how we are supposed to create Friendly AI, and we argue that a being that is sentient and autonomous but guaranteed to act “friendly” is a technical impossibility:

To state the problem in terms that Friendly AI researchers might concede, a utilitarian calculus is all well and good, but only when one has not only great powers of prediction about the likelihood of myriad possible outcomes, but certainty and consensus on how one values the different outcomes. Yet it is precisely the debate over just what those valuations should be that is the stuff of moral inquiry. And this is even more the case when all of the possible outcomes in a situation are bad, or when several are good but cannot all be had at once. Simply picking certain outcomes — like pain, death, bodily alteration, and violation of personal environment — and asserting them as absolute moral wrongs does nothing to resolve the difficulty of ethical dilemmas in which they are pitted against each other (as, fully understood, they usually are). Friendly AI theorists seem to believe that they have found a way to bypass all of the difficult questions of philosophy and ethics, but in fact they have just closed their eyes to them.

These are just short extracts from long essays with multi-pronged arguments — we might run longer excerpts here on Futurisms at some point, and as always, we welcome your feedback.


  1. We have encountered this problem before. Let's look at the original speculation about artificial persons from Leviathan by Thomas Hobbes:

    "NATURE (the art whereby God hath made and governs the world) is by the art of man, as in many other things, so in this also imitated, that it can make an artificial animal. For seeing life is but a motion of limbs, the beginning whereof is in some principal part within, why may we not say that all automata (engines that move themselves by springs and wheels as doth a watch) have an artificial life? For what is the heart, but a spring; and the nerves, but so many strings; and the joints, but so many wheels, giving motion to the whole body, such as was intended by the Artificer?"

    This looks like it's leading up to a description of either a robot or Frankenstein's monster, but Hobbes is actually comparing the State to an artificial person. The problem of ensuring that a government remains friendly is still unsolved.

    So if you ever wondered whether Lenin was actually a Mad Scientist, the answer is, of course, yes. Lenin even looked like a Mad Scientist, from the Lex Luthor hair style to the Beard of Evil …

  2. Mr. Yudkowsky is good at outlining potential problems (we must grant that his imagination is fertile), but he never unveils specific approaches to those problems. One gets the idea that the ultimate solution to the inevitable arrival of unfriendly AI will be found by running a Ouija board over a carving of Bayes's Theorem.
