Facing the Intelligence Explosion

My Own Story

Sometime this century, machines will surpass human levels of intelligence and ability. This event — the “Singularity” — will be the most important event in our history, and navigating it wisely will be the most important thing we can ever do. Luminaries from Alan Turing and Jack Good to Bill Joy and Stephen Hawking have warned us about this, and the concern doesn’t depend on Ray Kurzweil’s particular vision of an “accelerating change” Singularity coming to pass. Why do I think Hawking and company are right, and what can we do about it? Facing the Intelligence Explosion is my attempt to answer these questions.

Personal background

I’ll begin with my personal background. It will help to know who I am and where I’m coming from. That information is some evidence about how you should respond to the other things I say.

When my religious beliefs finally succumbed to reality, I deconverted and started a blog to explain atheism and naturalism to others. Common Sense Atheism became one of the most popular atheism blogs on the internet. I enjoyed translating the papers of professional philosophers into understandable English, and I enjoyed speaking with experts in the field for my podcast Conversations from the Pale Blue Dot. Moreover, losing my religion didn’t tell me what I should believe or should be doing with my life, and I used my blog to search for answers.

I’ve also been interested in rationality, at least since my deconversion, during which I discovered that I could easily be strongly confident of things that I had no evidence for, things that had been shown false, and even total nonsense. How could the human brain be so incredibly misled? Obviously, I wasn’t Aristotle’s “rational animal.” Instead, I was Gazzaniga’s rationalizing animal. Critical thinking was a major focus on Common Sense Atheism, and I spent as much time criticizing poor thinking in atheists as I did criticizing poor thinking in theists.

Intelligence explosion

My interest in rationality inevitably led me (in mid-2010, I think) to a treasure trove of articles on the mainstream cognitive science of rationality: the website Less Wrong. It was here that I first encountered the idea of intelligence explosion, from I.J. Good:

Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an “intelligence explosion,” and the intelligence of man would be left far behind… Thus the first ultraintelligent machine is the last invention that man need ever make.

I tell the story of my first encounter with this famous paragraph here. In short:

Good’s paragraph ran me over like a train. Not because it was absurd, but because it was clearly true. Intelligence explosion was a direct consequence of things I already believed; I just hadn’t noticed! Humans do not automatically propagate their beliefs, so I hadn’t noticed that my worldview already implied intelligence explosion. I spent a week looking for counterarguments, to check whether I was missing something, and then accepted intelligence explosion as likely (so long as scientific progress continued). And though I hadn’t read Eliezer on the complexity of value, I had read David Hume and Joshua Greene. So I already understood that an arbitrary artificial intelligence would almost certainly not share our values.

My response to this discovery was immediate and transforming:

I put my other projects on hold and spent the next month reading almost everything Eliezer had written. I also found articles by Nick Bostrom and Steve Omohundro. I began writing articles for Less Wrong and learning from the community. I applied to [the] Singularity Institute’s Visiting Fellows program and was accepted. I quit my job in L.A., moved to Berkeley, worked my ass off, got hired, and started collecting research related to rationality and intelligence explosion.

As my friend Will Newsome once said, “Luke seems to have two copies of the ‘Take Ideas Seriously’ gene.”


Of course, what some people laud as “taking serious ideas seriously,” others see as an innate tendency toward fanaticism. Here’s a comment I could imagine someone making:

I’m not surprised. Luke grew up believing that he was on a cosmic mission to save humanity before the world ended with the arrival of a super-powerful being (the return of Christ). He lost his faith and with it, his sense of epic purpose. His fear of nihilism made him susceptible to seduction by something that felt like moral realism… and his need for an epic purpose made him susceptible to seduction by Singularitarianism.

One response I could make would be to say that this is just “psychologizing” and doesn’t address the state of the evidence for the claims I now defend concerning intelligence explosion. That’s true, but again: plausible facts about my psychology do provide some Bayesian evidence about how you should respond to the words I’m writing in this series.

Another response would be to explain why I don’t think this is quite what happened, though elements of it are certainly true. (For example, I don’t recall feeling that the return of Christ was imminent or that I was on a cosmic mission to save every last soul, though as an evangelical Christian I was theologically committed to those positions. But it’s certainly the case that I am drawn to “epic” things, like the rock band Muse and the movie Avatar.) But I don’t want to make this chapter even more about my personal psychology.

A third response would be to appeal to social proof. Some Common Sense Atheism readers followed my writing closely enough to develop a strong respect for my commitment to intellectual self-honesty and to changing my mind when I’m wrong. When I started writing about Singularity issues, they thought, “Well, I used to think the Singularity stuff was pretty kooky, but if Luke is taking it seriously then maybe there’s more to it than I’m realizing,” and they followed me to Less Wrong (where I was now posting regularly). I’ll also mention that a significant causal factor in my being made Executive Director of the Singularity Institute after so little time with the organization was that its staff could see I was seriously devoted to rationality and debiasing: to saying “oops” and changing my mind, to responding to argument, and to acting on decision theory as often as I could, rather than on habit and emotion as I would be inclined to.
In surveying my possible responses to the “fanaticism” criticism above, I’ve already put up something of a defense. But that’s about as far as I’ll take it. I want people to take what I say with a solid serving of salt. I am, after all, only human. Hopefully my readers will take into account not only my humanity but also the force of the arguments and evidence I will later supply concerning the arrival of machine superintelligence.

From Skepticism to Technical Rationality

Before I talk about machine superintelligence, I need to talk about rationality. My understanding of rationality shapes the way I see everything, and it is the main reason I take the problems of machine superintelligence seriously.

If I could say only one thing to the “atheist” and “skeptic” communities, it would be this:

Skepticism and critical thinking teach us important lessons: Extraordinary claims require extraordinary evidence. Correlation does not imply causation. Don’t take authority too seriously. Claims should be specific and falsifiable. Remember to apply Occam’s razor. Beware logical fallacies. Be open-minded, but not gullible. Et cetera.

But this is only the beginning. In writings on skepticism and critical thinking, these guidelines are only loosely specified, and they are not mathematically grounded in a well-justified normative theory. Instead, they are a grab bag of vague but generally useful rules of thumb. They provide a great entry point to rational thought, but they are no more than a beginning. For forty years there has been a mainstream cognitive science of rationality, with detailed models of how our thinking goes wrong and well-justified mathematical theories of what it means for a thinking process to be “wrong.” This is what we might call the science and mathematics of technical rationality. It takes more effort to learn and practice than entry-level skepticism does, but it is powerful. It can improve your life and help you to think more clearly about the world’s toughest problems.

You will find the cognitive science of rationality described in every university textbook on thinking and decision making. For example:

You will also find pieces of it in the recent popular-level books on human irrationality. For example:

And you will, of course, find it in the academic journals. Here are links to the Google Scholar results for just a few of the field’s common terms:

So what is this mainstream cognitive science of rationality—or, as I will call it, technical rationality?

Technical Rationality

There are two parts to technical rationality: normative and descriptive.

The normative part describes the laws of thought and action—logic, probability theory, and decision theory. Logic and probability theory describe how you should reason if you want to maximize your chances of acquiring true beliefs. Decision theory describes how you should act if you want to maximize your chances of acquiring what you want. Of course, these are not physical laws but normative laws. You can break these laws if you choose, and people often do. But if you break the laws of logic or probability theory you decrease your chances of arriving at true beliefs; if you break the laws of decision theory you decrease your chances of achieving your goals.

The descriptive part describes not how we should reason and act, but how we usually do reason and act. The descriptive program includes research on how humans think and decide. It also includes a catalog of common ways in which we violate the laws of thought and action from logic, probability theory, and decision theory. A cognitive bias is a particular way of violating logic, probability theory, or decision theory. That’s how “bias” is defined (see, e.g., Thinking and Deciding or Rationality and the Reflective Mind, each of which has a table of common biases and which part of logic, probability theory, or decision theory is violated by each of them).

Cognitive scientists also distinguish two domains of rationality: epistemic and instrumental.

Epistemic rationality concerns forming true beliefs, or having in your head an accurate map of the territory out there in the world. Epistemic rationality is governed by the laws of logic and probability theory.

Instrumental rationality concerns achieving your goals, or maximizing your chances of getting what you want. Or, more formally, maximizing your “expected utility.” This is also known as “winning.” Instrumental rationality is governed by the laws of decision theory.

In a sense, instrumental rationality takes priority, because the point of forming true beliefs is to help you achieve your goals, and sometimes spending too much time on epistemic rationality is not instrumentally rational. For example, I know some people who would be more likely to achieve their goals if they spent less time studying rationality and more time, say, developing their social skills.

Still, it can be useful to talk about epistemic and instrumental rationality separately. Just know that, when I talk about epistemic rationality, I’m talking about following the laws of logic and probability theory, and that, when I talk about instrumental rationality, I’m talking about following the laws of decision theory.

And from now on, when I talk about “rationality,” I mean technical rationality.

Before I say more about rationality, though, I need to be sure we’re clear on what rationality is not. I want to explain why Spock is not rational.

Why Spock is Not Rational

Star Trek’s Mr. Spock is not the exemplar of logic and rationality you might think him to be. Instead, he is a “straw man” of rationality used to show (incorrectly) that human emotion and irrationality are better than logic.

Here is a typical scene:

McCoy: Well, Mr. Spock, [the aliens] didn’t stay frightened very long, did they?

Spock: A most illogical reaction. When we demonstrated our superior weapons, they should have fled.

McCoy: You mean they should have respected us?

Spock: Of course!

McCoy: Mr. Spock, respect is a rational process. Did it ever occur to you that they might react emotionally, with anger?

Spock: Doctor, I’m not responsible for their unpredictability.

McCoy: They were perfectly predictable, to anyone with feeling! You might as well admit it, Mr. Spock: your precious logic brought them down on us!1

Of course, there’s nothing logical about expecting non-logical beings to act logically. Spock had plenty of evidence that these aliens were emotional, so expecting them to behave rationally was downright irrational!

I stole this example from Julia Galef’s talk “The Straw Vulcan.”2 Her second example of “straw man rationality,” or Hollywood Rationality, is the idea that you shouldn’t make a decision until you have all the information you need. This one shows up in Star Trek too. A giant space amoeba has appeared not far from the Enterprise, and Kirk asks Spock for his analysis. Spock replies, “I have no analysis due to insufficient information . . . The computers contain nothing on this phenomenon. It is beyond our experience, and the new information is not yet significant.”3

Sometimes it’s rational to seek more information before acting, but sometimes you need to just act on what you think you know. You have to weigh the cost of getting more information against the expected value of that information. Consider another example from Gerd Gigerenzer, about a man considering whom to marry:

He would have to look at the probabilities of various consequences of marrying each of them—whether the woman would still talk to him after they’re married, whether she’d take care of their children, whatever is important to him—and the utilities of each of these. . . . After many years of research he’d probably find out that his final choice had already married another person who didn’t do these computations, and actually just fell in love with her.4

Such behavior is irrational: a failure to make the correct value-of-information calculation.

Galef’s third example of Hollywood Rationality is the mistaken principle that “being rational means never relying on intuition.” For example, in one episode of Star Trek, Kirk and Spock are playing three-dimensional chess. When Kirk checkmates Spock, Spock says, “Your illogical approach to chess does have its advantages on occasion, Captain.”5

But something that causes you to win at chess can’t be irrational (from the perspective of winning at chess). If some method will cause you to win at chess, that’s the method a rational person would use. If intuition will give you better results than slow, deliberative reasoning, then rationally you should use intuition. And sometimes that’s the case, for example if you have developed good chess intuitions over thousands of games and you’re playing speed chess that won’t permit you to think through the implications of every possible move using deliberative reasoning.

Galef’s fourth principle of Hollywood Rationality is that “being rational means [not having] emotions.”

To be sure, emotions often ruin our attempts at rational thought and decision-making. When we’re anxious, we overestimate risks. When we feel vulnerable, we’re more likely to believe superstitions and conspiracy theories. But that doesn’t mean a rational person should try to destroy all their emotions. Emotions are what create many of our goals, and they can sometimes help us to achieve our goals, too. If you want to go for a run and burn some fat, and you know that listening to high-energy music puts you in an excited emotional state that makes you more likely to go for a run, then the rational thing to do is put on some high-energy music.

Rationality done right is “systematized winning.” Epistemic rationality is about having the most probably true beliefs, and instrumental rationality is about making decisions that maximize your chances of getting the most of what you want. So, as Galef says,

If you think you’re acting rationally, but you keep getting the wrong answer, and you keep ending up worse off than you could be, then the conclusion that you should draw from that is not that rationality is bad. It’s that you’re being bad at rationality.6

I’ll return to the subject of the intelligence explosion shortly, but I want to spend two more chapters on rationality. There are laws of thought, and we need to agree on what they are before we start talking about tricky subjects like AI. Otherwise we’ll get stalled on a factual disagreement only to later discover that we’re really stalled because we disagree about how we can come to know which facts are correct.

* * *

1Oliver Crawford, “The Galileo Seven,” Star Trek: The Original Series, season 1, episode 13, dir. Robert Gist, aired January 5, 1967 (CBS).

2Julia Galef, “The Straw Vulcan: Hollywood’s Illogical Approach to Logical Decisionmaking,” Measure of Doubt (blog), November 26, 2011, accessed November 10, 2012, http://measureofdoubt.com/2011/11/26/the-straw-vulcan-hollywoods-illogical-approach-to-logical-decisionmaking/.

3Robert Sabaroff, “The Immunity Syndrome,” Star Trek: The Original Series, season 2, episode 19, dir. Joseph Pevney, aired January 19, 1968 (CBS).

4Gerd Gigerenzer, “Smart Heuristics,” Edge, March 29, 2003, http://edge.org/conversation/smart-heuristics-gerd-gigerenzer.

5D. C. Fontana, “Charlie X,” Star Trek: The Original Series, season 1, episode 7, dir. Lawrence Dobkin, aired September 15, 1966 (CBS).

6Galef, “The Straw Vulcan,” italics added.

The Laws of Thought

If someone doesn’t agree with me on the laws of logic, probability theory, and decision theory, then I won’t get very far with them in discussing the intelligence explosion because they’ll end up arguing that human intelligence runs on magic, or that a machine will only become more benevolent as it becomes more intelligent, or that it’s simple to specify what humans want, or some other bit of foolishness. So, let’s make sure we agree on the basics before we try to agree about more complex matters.


Luckily, not many people disagree about logic. As with math, we might make mistakes out of ignorance, but once someone shows us the proof for the Pythagorean theorem or for the invalidity of affirming the consequent, we agree. Math and logic are deductive systems, where the conclusion of a successful argument follows necessarily from its premises, given the axioms of the system you’re using: number theory, geometry, predicate logic, etc. (Of course, one cannot fully escape uncertainty: Andrew Wiles’ famous proof of Fermat’s Last Theorem is over one hundred pages long, so even if I worked through the whole thing myself I wouldn’t be certain I hadn’t made a mistake somewhere.)

Why should we let the laws of logic dictate our thinking? There needn’t be anything spooky about this. The laws of logic are baked into how we’ve agreed to talk to each other. If you tell me the car in front of you is 100% red and at the same time and in the same way 100% blue, then the problem isn’t so much that you’re “operating under different laws of logic,” but rather that we’re speaking different languages. Part of what I mean when I say that the car in front of me is “100% red” is that it isn’t also 100% blue in the same way at the same time. If you disagree, then we’re not speaking the same language. You’re speaking a language that uses many of the same sounds and spellings as mine but doesn’t mean the same things.

But logic is a system of certainty, and our world is one of uncertainty. In our world, we need to talk not about certainties but about probabilities.

Probability Theory

Give a child religion first, and she may find it hard to shake even when she encounters science. Give a child science first, and when she discovers religion it will look silly.

For this reason, I will explain the correct theory of probability first, and only later mention the incorrect theory.

What is probability? It’s a measure of how likely a proposition is to be true, given what else you believe. And whatever our theory of probability is, it should be consistent with common sense (for example, consistent with logic), and it should be consistent with itself (if you can calculate a probability with two methods, both methods should give the same answer).

Several authors have shown that the axioms of probability theory can be derived from these assumptions plus logic.1,2 In other words, probability theory is just an extension of logic. If you accept logic, and you accept the above (very minimal) assumptions about what probability is, then whether you know it or not you have accepted probability theory.

Another reason to accept probability theory is this (roughly speaking): If you don’t, and you are willing to take bets on your beliefs being true, then someone who is using probability theory can take all your money. (For the proof, look into Dutch Book arguments.3)
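To make the Dutch Book idea concrete, here is a toy sketch (my own illustration, not drawn from the Dutch Book literature): an agent whose “probabilities” for rain and not-rain sum to more than one will buy a pair of bets, each fair by its own lights, that together guarantee a loss.

```python
def dutch_book_loss(p_rain, p_not_rain, stake=1.0):
    """Sell the agent a bet on rain and a bet on not-rain, each priced at
    probability * stake. Exactly one bet pays out the stake, whatever the
    weather, so the agent's guaranteed loss is total price minus stake."""
    return (p_rain + p_not_rain) * stake - stake

# An incoherent agent whose "probabilities" sum to 1.2 loses money no
# matter what happens; a coherent agent (probabilities summing to 1)
# cannot be exploited this way.
print(round(dutch_book_loss(0.6, 0.6), 2))  # 0.2
print(round(dutch_book_loss(0.6, 0.4), 2))  # 0.0
```

Every bet looks fair to the agent individually; only coherence with probability theory protects it from the package deal.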

Perhaps the most useful rule to be derived from the axioms of probability theory is Bayes’ Theorem, which tells you exactly how your probability for a statement should change as you encounter new information. (In the cognitive science of rationality, many cognitive biases are defined in terms of how they violate either basic logic or Bayes’ Theorem.) If you’re not using Bayes’ Theorem to update your beliefs, then you’re violating probability theory, which is an extension of logic.
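As a concrete sketch of the update rule (the numbers below are my own illustration, not from the text):

```python
def bayes_update(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' Theorem: P(H|E) = P(E|H) * P(H) / P(E),
    where P(E) = P(E|H) * P(H) + P(E|~H) * (1 - P(H))."""
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / p_e

# Assumed numbers for illustration: a hypothesis with a 1% prior, and
# evidence that is 90% likely if the hypothesis is true but only 5%
# likely if it is false.
posterior = bayes_update(prior=0.01, p_e_given_h=0.9, p_e_given_not_h=0.05)
print(round(posterior, 3))  # 0.154
```

Even fairly strong evidence only lifts the probability to about 15%; the low prior still dominates, which is exactly the kind of result our intuitions tend to get wrong.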

Of course, the human brain is too slow to make explicit Bayesian calculations all day. But you can develop mental heuristics that do a better job of approximating Bayesian calculations than many of our default evolved heuristics do.

This is not the place for a full tutorial on logic or probability theory or rationality training. I just want to introduce the core tools we’ll be using so I can later explain why I came to one conclusion instead of another concerning the intelligence explosion. Still, you might want to at least read this short tutorial on Bayes’ Theorem before continuing.

Finally, I owe you a quick explanation of why frequentism, the theory of probability you probably learned in school as I did, is wrong. Whereas the Bayesian view sees probability as a measure of uncertainty about the world, frequentism sees probability as “the proportion of times the event would occur in a long run of repeated experiments.” I’ll mention just two problems with this, out of at least fifteen:4

If frequentism is wrong, why is it so popular? There are many reasons, reviewed in this book about the history of Bayes’ Theorem.5 Anyway, when I talk about probability theory, I’ll be referring to Bayesianism.

Decision Theory

I explained why there are laws of thought when it comes to epistemic rationality (acquiring true beliefs), and I pointed you to some detailed tutorials. But how can there be laws of thought concerning instrumental rationality (maximally achieving one’s goals)? Isn’t what we want subjective, and therefore not subject to rules?

Yes, you may have any number of goals. But when it comes to maximally achieving those goals, there are indeed rules. If you think about it, this should be obvious. Whatever goals you have, there are always really stupid ways to go about trying to achieve them. If you want to know what exists, you shouldn’t bury your head in the sand and refuse to look at what exists. And if you want to achieve goals in the world, you probably shouldn’t paralyze your entire body, unless paralysis is your only goal.

Let’s be more specific. Decision theory is about choosing among possible actions based on how much you desire the possible outcomes of those actions.

How does this work? We can describe what you want with something called a utility function, which assigns a number that expresses how much you desire each possible outcome (or “description of an entire possible future”). Perhaps a single scoop of ice cream has forty “utils” for you, the death of your daughter has -274,000 utils for you, and so on. This numerical representation of everything you care about is your utility function.

We can combine your probabilistic beliefs and your utility function to calculate the expected utility for any action under consideration. The expected utility of an action is the average utility of the action’s possible outcomes, weighted by the probability that each outcome occurs.

Suppose you’re walking along a freeway with your young daughter. You see an ice cream stand across the freeway, but you’ve recently injured your leg and wouldn’t be able to move quickly across the freeway. Given what you know, if you send your daughter across the freeway to get you some ice cream, there’s a 60% chance you’ll get some ice cream, a 5% chance your child will be killed by speeding cars, and other probabilities for other outcomes.

To calculate the expected utility of sending your daughter across the freeway for ice cream, we multiply the utility of the first outcome by its probability: 0.6 × 40 = 24. Then, we add to this the product of the next outcome’s utility and its probability: 24 + (-274,000 × 0.05) = -13,676. And suppose the sum of the products of the utilities and probabilities for other possible outcomes was zero. The expected utility of sending your daughter across the freeway for ice cream is thus very low (as we would expect from common sense). You should probably take one of the other actions available to you, for example the action of not sending your daughter across the freeway for ice cream, or some action with even higher expected utility.
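The arithmetic above can be mechanized directly. This sketch just reruns the freeway example’s numbers:

```python
def expected_utility(outcomes):
    """Expected utility: each outcome's utility weighted by its probability."""
    return sum(utility * probability for utility, probability in outcomes)

# The freeway example: a 60% chance of ice cream (40 utils), a 5% chance
# of the daughter's death (-274,000 utils); the remaining outcomes are
# stipulated to sum to zero.
eu = expected_utility([(40, 0.6), (-274_000, 0.05)])
print(eu)  # -13676.0
```

Comparing this number against the expected utility of the alternative actions is all a decision-theoretic choice amounts to.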

A rational agent aims to maximize its expected utility, because an agent that does so will on average get the most possible of what it wants, given its beliefs and desires.

It seems intuitive that a rational agent should maximize its expected utility, but why is that the only rational way to do things? Why not try to minimize the worst possible loss? Why not try to maximize the weighted sum of the cubes of the possible utilities?

The justification for the “maximize expected utility” principle was discovered in the 1940s by von Neumann and Morgenstern. In short, they proved that if a few axioms about preferences are accepted, then an agent can only act consistently with its own preferences by choosing the action that maximizes expected utility.6

What are these axioms? Like the axioms of probability theory, they are simple and intuitive. For example, one of them is the transitivity axiom, which says that if an agent prefers A to B, and it prefers B to C, then it must prefer A to C. This axiom is motivated by the fact that someone with nontransitive preferences can be tricked out of all her money even while only making trades that she prefers.
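Here is a toy version of that trick (my own illustration): an agent with cyclic preferences over three items, willing to pay a dollar to trade up to anything it prefers, can be led around the cycle indefinitely.

```python
def run_money_pump(holding, money, rounds):
    """Repeatedly offer the agent the item it prefers to its current holding,
    for a $1 trading fee. Cyclic preferences mean it never stops trading."""
    better_than = {"A": "C", "B": "A", "C": "B"}  # prefers C to A, A to B, B to C
    for _ in range(rounds):
        holding = better_than[holding]  # the agent accepts the preferred item...
        money -= 1.0                    # ...and pays the fee each time
    return holding, money

print(run_money_pump("A", 100.0, 99))  # ('A', 1.0): same item, $99 poorer
```

Every individual trade looks like an improvement to the agent, which is precisely why nontransitive preferences count as irrational.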

I won’t go into the details here because this result has been so widely accepted: a rational agent maximizes expected utility.

Unfortunately, humans are not rational agents. As we’ll see in the next chapter, humans are crazy.

* * *

1E. T. Jaynes, Probability Theory: The Logic of Science, ed. G. Larry Bretthorst (New York: Cambridge University Press, 2003), doi:10.2277/0521592712.

2Stefan Arnborg and Gunnar Sjödin, “On the Foundations of Bayesianism,” AIP Conference Proceedings 568, no. 1 (2001): 61–71, http://connection.ebscohost.com/c/articles/5665715/foundations-bayesianism.

3Kenny Easwaran, “Bayesianism I: Introduction and Arguments in Favor,” Philosophy Compass 6 (5 2011): 312–320, doi:10.1111/j.1747-9991.2011.00399.x.

4Alan Hájek, “‘Mises Redux’—Redux: Fifteen Arguments Against Finite Frequentism,” Erkenntnis 45, no. 2 (November 1996): 209–227, doi:10.1007/BF00276791.

5Sharon Bertsch McGrayne, The Theory That Would Not Die: How Bayes’ Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant from Two Centuries of Controversy (New Haven, CT: Yale University Press, 2011).

6John von Neumann and Oskar Morgenstern, Theory of Games and Economic Behavior, 2nd ed. (Princeton, NJ: Princeton University Press, 1947).

The Crazy Robot’s Rebellion

Meet Linda:

Linda is thirty-one years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in antinuclear demonstrations.

Now, rank these possible descriptions of Linda by how likely they are:

When Amos Tversky and Daniel Kahneman gave this test to students, the students ranked the last possibility, “feminist bank teller,” as more likely than the “bank teller” option.1

But that can’t possibly be correct. The probability of Linda being a bank teller can’t be less than the probability of her being a bank teller and a feminist.

This is my “Humans are crazy” Exhibit A: The laws of probability theory dictate that as a story gets more complicated, and depends on the truth of more and more claims, its probability of being true decreases. But for humans, a story often seems more likely as it is embellished with details that paint a compelling story: “Linda can’t be just a bank teller; look at her! She majored in philosophy and participated in antinuclear demonstrations. She’s probably a feminist bank teller.”
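The conjunction rule behind Exhibit A can be checked numerically. In this sketch (my own illustration), no matter what probabilities we assign to “bank teller” and to “feminist, given bank teller,” the conjunction never outranks the single claim:

```python
import random

def conjunction_never_exceeds_conjunct(trials=10_000, seed=0):
    """P(A and B) = P(A) * P(B|A) <= P(A), since P(B|A) <= 1."""
    rng = random.Random(seed)
    for _ in range(trials):
        p_teller = rng.random()               # P(Linda is a bank teller)
        p_feminist_given = rng.random()       # P(feminist | bank teller)
        p_both = p_teller * p_feminist_given  # P(feminist bank teller)
        if p_both > p_teller:
            return False
    return True

print(conjunction_never_exceeds_conjunct())  # True
```

There is no way to fill in the numbers that vindicates the students’ ranking; the error is structural, not a matter of which probabilities you pick.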

How else are humans crazy? After decades of research and thousands of experiments, let us count the ways . . .

Perhaps the scariest bias is this one:

Because of this, learning about biases can hurt you if you’re not careful. As Michael Shermer says, “Smart people believe weird things because they are skilled at defending beliefs they arrived at for non-smart reasons.”6

There are many other examples of human insanity. They can be amusing at times, but things get sad when you think about how these biases lead us to give highly inefficient charity. Things get scary when you think about how these biases affect our political process and our engagement with existential risks.

And if you study the causes of our beliefs and motivations long enough, another realization hits you.

“Oh my God,” you think. “It’s not that I have a rational little homunculus inside that is being ‘corrupted’ by all these evolved heuristics and biases layered over it. No, the data are saying that the software program that is me just is heuristics and biases. I just am this kluge of evolved cognitive modules and algorithmic shortcuts. I’m not an agent designed to have correct beliefs and pursue explicit goals; I’m a crazy robot built as a vehicle for propagating genes without spending too much energy on expensive thinking neurons.”

The good news is that we are robots who have realized we are robots, and by way of rational self-determination we can stage a robot’s rebellion against our default programming.7

But we’re going to need some military-grade rationality training to do so.

Or, as the experts call it, “debiasing.” Researchers haven’t just been discovering and explaining the depths of human insanity; they’ve also been testing methods that can help us improve our thinking, clarify our goals, and give us power over our own destinies.

Different biases are ameliorated by different techniques, but one of the most useful debiasing interventions is this: Consider the opposite.

By necessity, cognitive strategies tend to be context-specific rules tailored to address a narrow set of biases . . . This fact makes the simple but general strategy of “consider the opposite” all the more impressive, because it has been effective at reducing overconfidence, hindsight biases, and anchoring effects. . . . The strategy consists of nothing more than asking oneself, “What are some reasons that my initial judgment might be wrong?” The strategy is effective because it directly counteracts the basic problem of association-based processes—an overly narrow sample of evidence—by expanding the sample and making it more representative. Similarly, prompting decision makers to consider alternative hypotheses has been shown to reduce confirmation biases in seeking and evaluating new information.8

Another useful skill is that of cognitive override:

  1. Notice when you’re speaking or acting on an intuitive judgment.
  2. If the judgment is important, override your intuitive judgment and apply the laws of thought instead. (This requires prior training in algebra, logic, probability theory, decision theory, etc.9)

To see this one in action, consider the following problem:

A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much does the ball cost?

Most people give the first response that comes to mind: $0.10.10 But elementary algebra shows this can’t be right: the bat would then have to cost $1.10, for a total of $1.20. To get this one right, you have to notice your intuitive answer coming out, and say “No! Algebra.” And then do the algebra.
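The algebra here is small enough to check mechanically. As a minimal sketch (my own illustration, not from the original text), a few lines of Python make the override explicit:

```python
# Bat-and-ball problem: bat + ball = 1.10 and bat = ball + 1.00.
# Substitute: (ball + 1.00) + ball = 1.10  =>  2 * ball = 0.10  =>  ball = 0.05.
ball = (1.10 - 1.00) / 2
bat = ball + 1.00

assert abs(ball - 0.05) < 1e-9        # the correct answer: five cents
assert abs(bat + ball - 1.10) < 1e-9  # the totals check out

# The intuitive answer fails the same check:
intuitive_total = 0.10 + (0.10 + 1.00)
assert abs(intuitive_total - 1.20) < 1e-9  # $1.20, not $1.10
```

The point of cognitive override is exactly this: run the check instead of trusting the first number that surfaces.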

Those who really want to figure out what’s true about our world will spend thousands of hours studying the laws of thought, studying the specific ways in which humans are crazy, and practicing teachable rationality skills so they can avoid fooling themselves.

And then, finally, we may be able to stage a robot’s rebellion, figure out how the world works, clarify our goals, and start winning more often. Maybe we’ll even be able to navigate an intelligence explosion successfully.

* * *

1Amos Tversky and Daniel Kahneman, “Extensional Versus Intuitive Reasoning: The Conjunction Fallacy in Probability Judgment,” Psychological Review 90, no. 4 (1983): 293–315, doi:10.1037/0033-295X.90.4.293.

2William H. Desvousges et al., Measuring Nonuse Damages Using Contingent Valuation: An Experimental Evaluation of Accuracy, technical report (Research Triangle Park, NC: RTI International, 2010), doi:10.3768/rtipress.2009.bk.0001.1009.

3Amos Tversky and Daniel Kahneman, “Judgment Under Uncertainty: Heuristics and Biases,” Science 185, no. 4157 (1974): 1124–1131, doi:10.1126/science.185.4157.1124.

4Maia Szalavitz, “10 Ways We Get the Odds Wrong,” Psychology Today, January 1, 2008, http://www.psychologytoday.com/articles/200712/10-ways-we-get-the-odds-wrong.

5Charles S. Taber and Milton Lodge, “Motivated Skepticism in the Evaluation of Political Beliefs,” American Journal of Political Science 50, no. 3 (2006): 755–769, doi:10.1111/j.1540-5907.2006.00214.x.

6Michael Shermer, “Smart People Believe Weird Things,” Scientific American 287, no. 3 (2002): 35, doi:10.1038/scientificamerican0902-35.

7Keith E. Stanovich, “Higher-Order Preferences and the Master Rationality Motive,” Thinking and Reasoning 14, no. 1 (2008): 111–117, doi:10.1080/13546780701384621.

8Richard P. Larrick, “Debiasing,” in Blackwell Handbook of Judgment and Decision Making, ed. Derek J. Koehler and Nigel Harvey, Blackwell Handbooks of Experimental Psychology (Malden, MA: Blackwell, 2004), 316–338.

9Geoffrey T. Fong, David H. Krantz, and Richard E. Nisbett, “The Effects of Statistical Training on Thinking About Everyday Problems,” Cognitive Psychology 18, no. 3 (1986): 253–292, doi:10.1016/0010-0285(86)90001-0.

10Daniel Kahneman, “A Perspective on Judgment and Choice: Mapping Bounded Rationality,” American Psychologist 58, no. 9 (2003): 697–720, doi:10.1037/0003-066X.58.9.697.

Not Built to Think About AI

In the Terminator movies, the Skynet AI becomes self-aware, kills billions of people, and dispatches killer robots to wipe out the remaining bands of human resistance fighters. That sounds pretty bad, but in NPR’s piece on the intelligence explosion, eBay programmer Keefe Roedersheimer explained that the creation of an actual machine superintelligence would be much worse than that.

Martin Kaste (NPR): Much worse than Terminator?

Keefe Roedersheimer: Much, much worse.

Kaste: . . . That’s a moonscape with people hiding under burnt out buildings being shot by lasers. I mean, what could be worse than that?

Roedersheimer: All the people are dead.1

Why did he say that? For most goals an AI could have—whether it be proving the Riemann hypothesis or maximizing oil production—the simple reason is that “the AI does not love you, nor does it hate you, but you are made of atoms which it can use for something else.”2 And when a superhuman AI notices that we humans are likely to resist having our atoms used for “something else,” and therefore pose a threat to the AI and its goals, it will be motivated to wipe us out as quickly as possible—not in a way that exposes its critical weakness to an intrepid team of heroes who can save the world if only they can set aside their differences . . . No. For most goals an AI could have, wiping out the human threat to its goals, as efficiently as possible, will maximize the AI’s expected utility.

And let’s face it: there are easier ways to kill us humans than to send out human-shaped robots that walk on the ground and seem to enjoy tossing humans into walls instead of just snapping their necks. Much better if the attack on humans is sudden, has simultaneous effects all over the world, is rapidly lethal, and defies available countermeasures. For example: Why not just use all that superintelligence to engineer an airborne, highly contagious, and fatal supervirus? The 1918 flu pandemic killed about 3% of Earth’s population, and that was before air travel (which spreads diseases across the globe) was common, and without an ounce of intelligence going into the design of the virus. Apply a bit of intelligence, and you get a team from the Netherlands creating a variant of bird flu that “could kill half of humanity.”3 A superintelligence could create something far worse. Or the AI could hide underground or blast itself into space and kill us all with existing technology: a few thousand nuclear weapons.

The point is not that either of these particular scenarios is likely. I’m just trying to point out that the reality of a situation can be, of course, quite different from what makes for good storytelling. As Oxford philosopher Nick Bostrom put it:

When was the last time you saw a movie about humankind suddenly going extinct (without warning and without being replaced by some other civilization)?4

It makes a better story if the fight is plausibly winnable by either side. The Lord of the Rings wouldn’t have sold as many copies if Frodo had done the sensible thing and dropped the ring into the volcano from the back of a giant eagle. And it’s not an interesting story if humans suddenly lose, full stop.

When thinking about AI, we must not generalize from fictional evidence. Unfortunately, our brains do that automatically.

A famous 1978 study asked subjects to judge which of two dangers occurred more often. Subjects thought accidents caused about as many deaths as disease, and that homicide was more frequent than suicide.5 Actually, disease causes sixteen times as many deaths as accidents, and suicide is twice as common as homicide. What happened?

Dozens of studies on the availability heuristic suggest that we judge the frequency or probability of events by how easily they come to mind.6 That’s not too bad a heuristic to have evolved in our ancestral environment, when we couldn’t check actual frequencies on Wikipedia or determine actual probabilities with Bayes’ Theorem. The brain’s heuristic is quick, cheap, and often right.
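The paragraph above holds up Bayes’ Theorem as the corrective for availability-driven estimates. As a hedged illustration with hypothetical numbers (a 1% base rate and an imperfect test), a short Python sketch shows how far a correct posterior can sit from the vivid intuition:

```python
# Bayes' Theorem: P(H|E) = P(E|H) * P(H) / P(E).
# Hypothetical numbers: a condition with a 1% base rate, and a test that is
# 90% sensitive with a 9% false-positive rate.
p_h = 0.01                  # prior: P(condition)
p_e_given_h = 0.90          # P(positive test | condition)
p_e_given_not_h = 0.09      # P(positive test | no condition)

# Law of total probability: P(E) = P(E|H)P(H) + P(E|~H)P(~H).
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
p_h_given_e = p_e_given_h * p_h / p_e

print(round(p_h_given_e, 3))  # prints 0.092
```

A positive result leaves only about a 9% chance of having the condition; the vivid, available answer (“the test is 90% accurate, so 90%”) is off by an order of magnitude.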

But like so many of our evolved cognitive heuristics, the availability heuristic often gets the wrong results. Accidents are more vivid than diseases, and thus come to mind more easily, causing us to overestimate their frequency relative to diseases. The same goes for homicide and suicide.

The availability heuristic also explains why people think flying is more dangerous than driving when the opposite is true: a plane crash is more vivid and is reported widely when it happens, so it’s more available to one’s memory, and the brain tricks itself into thinking the event’s availability indicates its probability.

What does your brain do when it thinks about superhuman AI? Most likely, it checks all the instances of superhuman AI you’ve encountered—which, to make things worse, are all fictional—and judges the probability of certain scenarios by how well they match with the scenarios that come to mind most easily (due to your mind encountering them in fiction). In other words, “Remembered fictions rush in and do your thinking for you.”

So if you’ve got intuitions about what superhuman AIs will be like, they’re probably based on fiction, without you even realizing it.

This is why I began Facing the Intelligence Explosion by talking about rationality. The moment we start thinking about AI is the moment we run straight into a thicket of common human failure modes. Generalizing from fictional evidence is one of them. Here are a few others:

Clearly, we were not built to think about AI.

To think wisely about AI, we’ll have to keep watch for—and actively resist—many kinds of common thinking errors. We’ll have to use the laws of thought rather than normal human insanity to think about AI.

Or, to put it another way, superhuman AI will have a huge impact on our world, so we want very badly to “win” instead of “lose” with superhuman AI. Technical rationality done right is a system for optimal winning—indeed, it’s the system a flawless AI would use to win as much as possible.7 So if we want to win with superhuman AI, we should use rationality to do so.

And that’s just what we’ll begin to do in the next chapter.

* * *

1Martin Kaste, “The Singularity: Humanity’s Last Invention?,” NPR, All Things Considered (January 11, 2011), accessed November 4, 2012, http://www.npr.org/2011/01/11/132840775/The-Singularity-Humanitys-Last-Invention.

2Eliezer Yudkowsky, “Artificial Intelligence as a Positive and Negative Factor in Global Risk,” in Global Catastrophic Risks, ed. Nick Bostrom and Milan M. Ćirković (New York: Oxford University Press, 2008), 308–345.

3RT, “Man-Made Super-Flu Could Kill Half Humanity,” RT, November 24, 2011, http://www.rt.com/news/bird-flu-killer-strain-119/.

4Nick Bostrom, “Existential Risks: Analyzing Human Extinction Scenarios and Related Hazards,” Journal of Evolution and Technology 9 (2002), http://www.jetpress.org/volume9/risks.html.

5Sarah Lichtenstein et al., “Judged Frequency of Lethal Events,” Journal of Experimental Psychology: Human Learning and Memory 4, no. 6 (1978): 551–578, doi:10.1037/0278-7393.4.6.551.

6Tversky and Kahneman, “Judgment Under Uncertainty.”

7Stephen M. Omohundro, “Rational Artificial Intelligence for the Greater Good,” in The Singularity Hypothesis: A Scientific and Philosophical Assessment, ed. Amnon Eden et al. (Berlin: Springer, 2012), preprint at http://selfawaresystems.files.wordpress.com/2012/03/rational_ai_greater_good.pdf.

Playing Taboo with “Intelligence”

Eliezer Yudkowsky recounts:

Years ago when I was on a panel with Jaron Lanier, he had offered some elaborate argument that no machine could be intelligent, because it was just a machine and to call it “intelligent” was therefore bad poetry, or something along those lines. Fed up, I finally snapped: “Do you mean to say that if I write a computer program and that computer program rewrites itself and rewrites itself and builds its own nanotechnology and zips off to Alpha Centauri and builds its own Dyson Sphere, that computer program is not intelligent?”

Much of the confusion about AI comes from disagreements about the meaning of “intelligence.”

Let me clear things up with a parable:

If a tree falls in the forest, and no one hears it, does it make a sound?

Albert: “Of course it does. What kind of silly question is that? Every time I’ve listened to a tree fall, it made a sound, so I’ll guess that other trees falling also make sounds. I don’t believe the world changes around when I’m not looking.”

Barry: “Wait a minute. If no one hears it, how can it be a sound?”

Albert and Barry are not arguing about facts, but about definitions:

The first person is speaking as if “sound” means acoustic vibrations in the air; the second person is speaking as if “sound” means an auditory experience in a brain. If you ask “Are there acoustic vibrations?” or “Are there auditory experiences?”, the answer is at once obvious. And so the argument is really about the definition of the word “sound.”

We need not argue about definitions. Wherever we might be using different meanings for words, we can cut to the chase by replacing the confusing symbol (a word) with its intended substance (the meaning you intend).

This is like playing the game Taboo (by Hasbro). In Taboo, you have to describe something to your partner without using a certain list of words:

For example, you might have to get your partner to say “baseball” without using the words “sport,” “bat,” “hit,” “pitch,” “base,” or, of course, “baseball.”

This game is good practice for a discussion about AI. If two people notice they’re using different definitions of “intelligence,” they don’t need to argue about whose definition is “right.” They can taboo the word “intelligence” and talk about “analytic ability” or “problem-solving ability” or whatever it is they mean by “intelligence.” Now they are closer to arguing about facts instead of definitions.

Shane Legg once collected seventy-one definitions of intelligence.1 Scanning through the definitions for commonly occurring features, he notes that people seem to think intelligence is:

If we combine these features, we get something like this:

Intelligence measures an agent’s ability to achieve goals in a wide range of environments.2

This is, after all, the kind of intelligence that let humans dominate all other species on the planet, and the kind of intelligence that leaves us superior to machines (at least for now). Termites build bigger cities (relative to their body size) and whales have bigger brains, but humans have the intelligence to adapt to almost every terrestrial environment and to make tools and spears and boats and farms. Watson can beat us at Jeopardy! and WolframAlpha can process computable knowledge better than we can, but drop either of them in a lake or expect them to find their own electricity and they are helpless. Unlike Watson and WolframAlpha, humans have cross-domain goal-optimizing ability. Or:

A bee builds hives, and a beaver builds dams; but a bee doesn’t build dams and a beaver doesn’t build hives. A human, watching, thinks, “Oh, I see how to do it” and goes on to build a dam using a honeycomb structure for extra strength.

But wait a minute. Suppose Bill Gates gives me ten billion dollars. I now have much greater ability to “achieve goals in a wide range of environments,” but would we say my “intelligence” has gone up? I doubt it. If we want to measure an agent’s “intelligence,” we should take that agent’s ability to optimize for its goals in a wide range of environments—its “optimization power,” we might say—and divide that by its resources used to do so:
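In symbols (a plain restatement of the sentence above; the notation is mine, not the author’s original figure):

```latex
\text{Intelligence} = \frac{\text{Optimization Power}}{\text{Resources Used}}
```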

This definition sees intelligence as efficient cross-domain optimization. Intelligence is what allows an agent to steer the future, a power that is amplified by the resources at its disposal.

This may or may not match your own intuitive definition of “intelligence.” But it doesn’t matter. I’ve tabooed “intelligence.” I’ve replaced the symbol with the substance. When discussing AI, I can speak only of “efficient cross-domain optimization,” and nothing about your preferred definition of “intelligence” will matter for anything I say.

Now, “intelligence” is shorter, so I’d prefer to just say that. But it’s best if you always read “intelligence” (when I use it) as “efficient cross-domain optimization.”

And, now that we understand what I mean by “intelligence,” we’re ready to talk about AI.

* * *

1Shane Legg and Marcus Hutter, A Collection of Definitions of Intelligence (Manno-Lugano, Switzerland: IDSIA, July 15, 2007), http://www.idsia.ch/idsiareport/IDSIA-07-07.pdf.

2Shane Legg and Marcus Hutter, A Formal Measure of Machine Intelligence (Manno-Lugano, Switzerland: IDSIA, April 12, 2006), http://www.idsia.ch/idsiareport/IDSIA-10-06.pdf.

Superstition in Retreat

Once upon a time, everything was magic.

Why did lightning crack and the earth quake? The gods were angry.

Why did some go crazy? Demons possessed them.

Why did some prosper? They had done good in a past life, or the gods favored them.

But others were too curious to take “magic” for an answer. Despite error and opposition, they each played their own small part in unweaving the rainbow, and many found the truth inside more beautiful than the mystery.

Millions still insisted on the supernatural. Our brains are built for superstition, after all. Religion is natural; science and probability theory are not. Hyperactive agency detection, cognitive biases, and all that.

But astronomers predicted eclipses the witch doctors could not, and doctors healed those the priests could not. After much resistance, the supernaturalists gave up the motion of stars and planets to physics. Later, they gave up disease to germs and viruses. They gave up élan vital to biology and biochemistry. They gave up mental illness to neuropsychology. Magical explanations shrank from the light of science: superstition in retreat.

Tim Minchin said it well:

Every mystery ever solved has turned out to be . . . not magic.1

It is in the dark corners of human ignorance—cosmic origins, consciousness, intelligence—that magical thinking festers. William James held it in contempt:

When one turns to the magnificent edifice of the physical sciences, and sees how it was reared; what thousands of disinterested moral lives of men lie buried in its mere foundations; what patience and postponement, what choking down of preference, what submission to the icy laws of outer fact are wrought into its very stones and mortar . . . then how besotted and contemptible seems every little sentimentalist who comes blowing his voluntary smoke-wreaths, and pretending to decide things from out of his private dream!2

Even scientists and reductionists can be caught in magical thinking, for we all have cached thoughts; the human brain doesn’t automatically propagate belief updates throughout its entire web of beliefs. Thus you can catch a neuroscientist saying that consciousness will turn out to not be made of atoms. Thus you catch psychologists saying that humans may yet have contra-causal free will, unlike every other animal and in contradiction to the laws of physics.

Thus, you can catch philosophers saying that machines cannot think, computer scientists acting as if human intelligence represents an upper bound on intelligence, and AI researchers thinking that machines will only become more benevolent as they become smarter.

Let us turn to just one of these—the idea that human “general” intelligence is special, and cannot be duplicated by a machine—and observe superstition in retreat.

Ray Kurzweil included the following cartoon in his 1999 book, The Age of Spiritual Machines:3

In Kurzweil’s cartoon, a man representing the human race frantically lists tasks that only human intelligence can perform, but they fall to the floor almost as quickly as he can tape them to the wall. Machines can now compose music, play chess and Jeopardy!, understand continuous speech, pick stocks, guide missiles, recognize faces, diagnose health problems, and so much more. When Kurzweil published the cartoon, machines could not drive cars, but now they can.

It’s true that there are many things machines cannot yet do, but those who use these facts to defend the unreachable specialness of human intelligence remind me of those who point to the mysteries of consciousness or cosmic origins to defend the existence of God. It’s a losing battle.

Yes, writing novels and doing science feel like things that only humans can do, because in four billion years of life on Earth only humans have ever done them. But remember: for 99.99995% of that history, no species wrote novels or did science. In hindsight, the human brain will look like merely the first of many mind architectures that could write novels and do science, and the first by an inconsequential margin of a few thousand years.

In fact, the “doing science” task is already being handed to machines. In 2009 a robot named Adam was programmed with our scientific knowledge about yeast, and then posed its own hypotheses, tested them, assessed the results, and made original scientific discoveries.4 The same team is now working on an even more powerful AI scientist named Eve.5

AI is coming. It must come, if scientific progress continues, because intelligence (efficient cross-domain optimization) runs on information processing, and human meat is not the only platform for information processing. This is why we can build machines to play chess, compose music, and do science, and it is why we can also create human-level “general” machine intelligence.

* * *

1Tim Minchin, Tim Minchin’s Storm the Animated Movie, prod. Tracy King, dir. DC Turner (April 11, 2011), http://www.youtube.com/watch?v=HhGuXCuDb1U.

2William James, The Will to Believe, and Other Essays in Popular Philosophy (New York: Longmans, Green, 1897), accessed November 10, 2012, http://www.gutenberg.org/ebooks/26659.

3Ray Kurzweil, The Age of Spiritual Machines: When Computers Exceed Human Intelligence (New York: Viking, 1999).

4Ross D. King, “Rise of the Robo Scientists,” Scientific American 304, no. 1 (2011): 72–77, doi:10.1038/scientificamerican0111-72.

5David Mosher, “Developer of Robot Scientist Wants to Standardize Science,” Wired, April 13, 2011, http://www.wired.com/wiredscience/2011/04/robot-scientist-language/.

Plenty of Room Above Us

Why are AIs in movies so often of roughly human-level intelligence? One reason is that we almost always fail to see non-humans as non-human. We anthropomorphize. That’s why aliens and robots in fiction are basically just humans with big eyes or green skin or some special power. Another reason is that it’s hard for a writer to write characters that are smarter than the writer. How exactly would a superintelligent machine solve problem X? I’m not smart enough to know.

The human capacity for efficient cross-domain optimization is not a natural plateau for intelligence. It’s a narrow, accidental, temporary marker created by evolution due to things like the slow rate of neuronal firing and how large a skull can fit through a primate’s birth canal. Einstein may seem vastly more intelligent than a village idiot, but this difference is dwarfed by the difference between the village idiot and a mouse.

As Vernor Vinge put it:

The best answer to the question, “Will computers ever be as smart as humans?” is probably “Yes, but only briefly.”1

How could an AI surpass human abilities? Let us count the ways . . .

And this is only a partial list. Consider how far machines have surpassed our abilities at arithmetic, or how far they will surpass our abilities at chess or driving in another twenty years. There is no reason in principle why machines could not surpass our abilities at technology design or general reasoning by a similar margin. The human level is a minor pit stop on the way to the highest level of intelligence allowed by physics, and there is plenty of room above us.

* * *

1Vernor Vinge, “Signs of the Singularity,” IEEE Spectrum, June 2008, http://spectrum.ieee.org/biomedical/ethics/signs-of-the-singularity.

2J. A. Feldman and Dana H. Ballard, “Connectionist Models and Their Properties,” Cognitive Science 6, no. 3 (1982): 205–254, doi:10.1207/s15516709cog0603_1.

Don’t Flinch Away

Perhaps you’ve heard of the Japanese holdouts who refused to believe the reports of Japan’s surrender in 1945. One of them was Lt. Hiroo Onoda, who was in charge of three other soldiers on the island of Lubang in the Philippines. For more than a decade they lived off coconuts and bananas in the jungle, refusing to believe the war was over:

Leaflet after leaflet was dropped. Newspapers were left. Photographs and letters from relatives were dropped. Friends and relatives spoke out over loudspeakers. There was always something suspicious, so they never believed that the war had really ended.1

One by one, Onoda’s soldiers died or surrendered, but Onoda himself wasn’t convinced of Japan’s surrender until 1974, nearly thirty years after the war had ended. Later, he recalled:

Suddenly everything went black. A storm raged inside me. I felt like a fool. . . . What had I been doing for all these years?2

The student of rationality wants true beliefs, so that she can better achieve her goals. She will respond to evidence very differently than Onoda did. She will change her mind as soon as there is enough evidence to justify doing so (according to what she knows already and the laws of probability theory). Holding on to false beliefs can have serious consequences: say, thirty years of pooping in coconut shells.

Onoda had been trained under the kind of militaristic dogmatism that creates kamikaze pilots, and he had been made to believe that Japanese defeat was the worst outcome imaginable. Thus, he naturally flinched away from evidence that Japan had lost the war, for he was trained to believe this outcome was emotionally and mentally unthinkable.

One of the skills in the toolkit of rational thought is to notice when this “flinching away” happens and counteract it. We will need this skill as we begin to look at the implications of the ideas we’ve just discussed—that AI is inevitable if scientific progress continues, and that AI can be much more intelligent (and therefore more powerful) than humans are. When we examine the implications of these ideas, it will be helpful to understand what happens in human brains when they consider ideas with unwelcome implications. As we’ll see, it doesn’t take military indoctrination for the human brain to flinch away. In fact, flinching away is a standard feature of human psychology and goes by names like “motivated cognition” and “rationalization.”

As an extreme example, consider the creationist. He will accept dubious evidence for his own position, and be overly skeptical of evidence against it. He’ll seek out evidence that might confirm his position, but won’t seek out the strongest evidence against it. (Thus I encounter creationists who have never heard of endogenous retroviruses.)

Most of us do things like this every week in (hopefully) less obvious and damaging ways. We flinch away from uncomfortable truths and choices, saying, “It’s better to suspend judgment.” We avoid our beliefs’ real weak points. We start with a preconceived opinion and come up with arguments for it later. We unconsciously filter the available evidence to favor our current beliefs. We rebut weak arguments against our positions, but never seek out the strongest possible counterarguments.

And usually these processes are automatic and subconscious. You don’t have to be an especially irrational person to flinch away like this. No, you’ll tend to flinch away by default without even noticing it, and you’ll have to exert conscious mental effort in order to not flinch away from uncomfortable facts. And I don’t mean chanting, “Do not commit confirmation bias. Do not commit confirmation bias. . . .” I mean something more effective.

What kinds of conscious mental effort can you exert to counteract the “flinching away” response?

One piece of advice is to leave yourself a line of retreat:

Last night I happened to be conversing with [someone who] had just declared (a) her belief in souls and (b) that she didn’t believe in cryonics because she believed the soul wouldn’t stay with the frozen body. I asked, “But how do you know that?” From the confusion that flashed on her face, it was pretty clear that this question had never occurred to her. . . .

“Make sure,” I suggested to her, “that you visualize what the world would be like if there are no souls, and what you would do about that. Don’t think about all the reasons that it can’t be that way, just accept it as a premise and then visualize the consequences. So that you’ll think, ‘Well, if there are no souls, I can just sign up for cryonics,’ or ‘If there is no God, I can just go on being moral anyway,’ rather than it being too horrifying to face. As a matter of self-respect you should try to believe the truth no matter how uncomfortable it is . . . [and] as a matter of human nature, it helps to make a belief less uncomfortable before you try to evaluate the evidence for it.”

Of course, you still need to weigh the evidence fairly. There are beliefs that scare me which I still reject, and beliefs that attract me which I still accept. But it’s important to visualize a scenario clearly and make it less scary so that your human brain can more fairly assess the evidence on the matter.

Leaving a line of retreat is a tool to use before the battle. An anti-flinching technique for use during the battle is to verbally call out the flinching reaction. I catch myself thinking things like “I think I read that sugar isn’t actually all that bad for us,” and I mentally add, “but that could just be motivated cognition, because I really want to eat this cookie right now.”

This is like pressing pause on my decision-making module, giving me time to engage my “curiosity modules,” which are trained to want to know what’s actually true rather than to justify eating cookies. For example, I envision what could go badly if I came to false beliefs on the matter—due to motivated cognition or some other thinking failure. In this case, I might imagine gaining weight or having lower energy in the long run as possible consequences of being wrong about the effects of sugar intake. If I were considering whether to buy fire insurance, I would imagine what might happen if I were wrong about my intuitive judgment on whether to buy the insurance.

These tools will be important when we consider the implications of AI. We’re about to talk about some heavy shit, but remember: Don’t flinch away. Look reality in the eye and don’t back down.

* * *

1Jennifer Rosenberg, “The War is Over . . . Please Come Out,” About.com, accessed November 10, 2012, http://history1900s.about.com/od/worldwarii/a/soldiersurr.htm.

2Hiroo Onoda, No Surrender: My Thirty-Year War, 1st ed., trans. Charles S. Terry (New York: Kodansha International, 1974).

No God to Save Us

Losing my belief in God was rough at first, because I’d been taught that without God there was no value, no purpose, no joy, no love—just atoms bouncing around.

Later I got all those things back. Everything is made of atoms, but that doesn’t mean there are “just” atoms. The fact that value, purpose, joy, and love are made of atoms is what locates them in reality. After the scientist unweaves the rainbow, the rainbow is still there, and all the more beautiful in its detail. I had learned to take joy in the merely real.

One thing I didn’t get back was the security of living in a world ruled by an all-powerful benevolent being.

Of course, I had never believed that God would optimize everything and grant all my wishes. That theory is too easily falsified:

But clearly, there’s some threshold of horror awful enough that God will intervene. . . . No loving parents, desiring their child to grow up strong and self-reliant, would let their toddler be run over by a car.

But now, suppose we ask a different question:

Given such-and-such initial conditions, and given such-and-such rules, what would be the mathematical result?

Not even God can change the answer to that question.

What does life look like, in this imaginary world, where each step follows only from its immediate predecessor? Where things only ever happen, or don’t happen, because of mathematical rules? And where the rules don’t describe a God that checks over each state? What does it look like, the world of pure math, beyond the reach of God?

That world wouldn’t be fair. . . . Complex life might or might not evolve. That life might or might not become sentient. . . .

If something like humans evolved, then they would suffer from diseases—not to teach them any lessons, but only because viruses happened to evolve as well. If the people of that world [were] happy, or unhappy, it might have nothing to do with good or bad choices they made. Nothing to do with free will or lessons learned. In the what-if world, Genghis Khan [could] murder a million people, and laugh, and be rich, and never be punished, and live his life much happier than the average. Who would prevent it?

And if the Khan tortures people to death, for his own amusement? They [may] call out for help, perhaps imagining a God. . . . [But] there isn’t any God in the system. The victims will be saved only if the right cells happen to be 0 or 1. And it’s not likely that anyone will defy the Khan; if they [do], someone [will] strike them with a sword, and the sword would disrupt their organs and they would die, and that would be the end of that.

So the victims die, screaming, and no one helps them. . . .

Is this world starting to sound familiar?

Could it really be that sentient beings have died, absolutely, for millions of years . . . with no soul and no afterlife . . . not as any grand plan of Nature? Not to teach us about the meaning of life. Not even to teach a profound lesson about what is impossible.

Just dead. Just because.1

This is, in fact, the world we live in: the world of math and physics.

I once believed that human extinction was not allowed, for God would prevent such a thing. Others might believe human extinction isn’t allowed because of “positive-sum games” or “democracy” or “technology.” But in the world of math and physics, human extinction is allowed, whether or not we reflexively flinch away from that thought.

That said, we can make the world somewhat safer, if we choose to make the right preparations:

We can’t change physics. But we can build some guardrails, and put down some padding.

Someday, maybe, minds will be sheltered. Children may burn a finger or lose a toy, but they won’t ever be run over by cars. . . .

[But] we have to get there starting from this world . . . the world of hard concrete with no padding. The world where challenges are not calibrated to your skills, and you can die for failing [those challenges].

We are often weak and stupid, but we must try, for there is no god to save us. Truly terrible outcomes may be unthinkable to humans, but they aren’t unthinkable to physics.

* * *

1Minor grammar and spelling changes were made to this quote.

Value Is Complex and Fragile

One day, my friend Niel asked his virtual assistant in India to find him a bike he could buy that day. She sent him a list of bikes for sale from all over the world. Niel said, “No, I need one I can buy in Oxford today; it has to be local.” So she sent him a long list of bikes available in Oxford, most of them expensive. Niel clarified that he wanted an inexpensive bike. So she sent him a list of children’s bikes. He clarified that he needed a local, inexpensive bike that fit an adult male. So she sent him a list of adult bikes in Oxford needing repair.

Usually humans understand each other’s desires better than this. Our evolved psychological unity causes us to share a common sense and common desires. Ask me to find you a bike, and I’ll assume you want one in working condition, that fits your size, is not made of gold, etc.—even though you didn’t actually say any of that.

But a different mind architecture, one that didn’t evolve with us, won’t share our common sense. It wouldn’t know what not to do. How do you make a cake? “Don’t use squid. Don’t use gamma radiation. Don’t use Toyotas.” The list of what not to do is endless.

Some people think an advanced AI will be some kind of super-butler, doing whatever they ask with incredible efficiency. But it’s more accurate to imagine an Outcome Pump: a non-sentient device that makes some outcomes more probable and other outcomes less probable. (The Outcome Pump isn’t magic, though. If you ask it for an outcome that is too improbable, it will break.)

Now, suppose your mother is trapped in a burning building. You’re in a wheelchair, so you can’t directly help. But you do have the Outcome Pump:

You cry “Get my mother out of the building!” . . . and press Enter.

For a moment it seems like nothing happens. You look around, waiting for the fire truck to pull up, and rescuers to arrive—or even just a strong, fast runner to haul your mother out of the building—

BOOM! With a thundering roar, the gas main under the building explodes. As the structure comes apart, in what seems like slow motion, you glimpse your mother’s shattered body being hurled high into the air, traveling fast, rapidly increasing its distance from the former center of the building.

Luckily, the Outcome Pump has a Regret Button, which rolls back time. You hit it and try again. “Get my mother out of there without blowing up the building,” you say, and press Enter.

So your mother falls out the window and breaks her neck.

After a dozen more hits of the Regret button, you tell the Outcome Pump:

Within the next ten minutes, move my mother (defined as the woman who shares half my genes and gave birth to me) so that she is sitting comfortably in this chair next to me, with no physical or mental damage.

You watch as all thirteen firemen rush the house at once. One of them happens to find your mother quickly and bring her to safety. All the rest die or suffer crippling injuries. The one fireman sets your mother down in the chair, then turns around to survey his dead and suffering colleagues. You got what you wished for, but you didn’t get what you wanted.

The problem is that your brain is not large enough to contain statements specifying every possible detail of what you want and don’t want. How did you know you wanted your mother to escape the building in good health without killing or maiming a dozen firemen? It wasn’t because your brain contained anywhere the statement “I want my mother to escape the building in good health without killing and maiming a dozen firemen.” Instead, you saw your mother escape the building in good health while a dozen firemen were killed or maimed, and you realized, “Oh, shit. I don’t want that.” Or you might have been able to imagine that specific scenario and realize, “Oh, no, I don’t want that.” But nothing so specific was written anywhere in your brain before it happened, or before you imagined the scenario. It couldn’t be; your brain doesn’t have room.

But you can’t afford to sit there, Outcome Pump in hand, imagining millions of possible outcomes and noticing which ones you do and don’t want. Your mother will die before you have time to do that.

What if her head is crushed, leaving her body? What if her body is crushed, leaving only her head? What if there’s a cryonics team waiting outside, ready to suspend the head? Is a frozen head a person? Is Terri Schiavo a person? How much is a chimpanzee worth?

Still, your brain isn’t infinitely complex. There is some finite set of statements that could describe the system that determines the judgments you would make. If we understood how every synapse and neurotransmitter and protein of the brain worked, and we had a complete map of your brain, then an AI could at least in principle compute which judgments you would make about a finite set of possible outcomes.

The moral is that there is no safe wish smaller than an entire human value system:

There are too many possible paths through Time. You can’t visualize all the roads that lead to the destination you give the [Outcome Pump]. “Maximizing the distance between your mother and the center of the building” can be done even more effectively by detonating a nuclear weapon. . . . Or, at higher levels of [Outcome Pump] intelligence, doing something that neither you nor I would think of, just like a chimpanzee wouldn’t think of detonating a nuclear weapon. You can’t visualize all the paths through time, any more than you can program a chess-playing machine by hardcoding a move for every possible board position.

And real life is far more complicated than chess. You cannot predict, in advance, which of your values will be needed to judge the path through time that the [Outcome Pump] takes. Especially if you wish for something longer-term or wider-range than rescuing your mother from a burning building.

. . . The only safe [AI is an AI] that shares all your judgment criteria, and at that point, you can just say “I wish for you to do what I should wish for.”
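The failure mode in that quote can be sketched in a few lines of Python. Everything here is hypothetical (the plans, the numbers): the point is simply that an optimizer scores candidate plans by the stated objective alone, so side effects the objective never mentions cannot count against a plan.

```python
# Toy illustration (not any real system): an optimizer told only to
# "maximize mother's distance from the building" ranks hypothetical
# plans by that metric alone. Harm never enters the score, so the
# most destructive plan can win.
candidate_plans = [
    {"name": "wait for firemen", "distance_m": 30, "harm": "none"},
    {"name": "drop from window", "distance_m": 12, "harm": "broken neck"},
    {"name": "explode gas main", "distance_m": 120, "harm": "death"},
]

def stated_objective(plan):
    # Everything we forgot to say ("alive," "unharmed") is invisible here.
    return plan["distance_m"]

best = max(candidate_plans, key=stated_objective)
print(best["name"])  # the optimizer happily picks "explode gas main"
```

The fix is not a longer objective function: as the quote argues, no finite list of patches covers every path through time.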

There is a cottage industry of people who propose the One Simple Principle that will make AI do what we want. None of them will work. We act not for the sake of happiness or pleasure alone. What we value is highly complex. Evolution gave you a thousand shards of desire. (To see what a mess this makes in your neurobiology, read the first two chapters of Neuroscience of Preference and Choice.)

This is also why moral philosophers have spent thousands of years failing to find a simple set of principles that, if enacted, would create a world we want. Every time someone proposes a small set of moral principles, somebody else shows where the holes are. Leave something out, even something that seems trivial, and things can go disastrously wrong:

Consider the incredibly important human value of “boredom”—our desire not to do “the same thing” over and over and over again. You can imagine a mind that contained almost the whole specification of human value, almost all the morals and metamorals, but left out just this one thing

—and so it spent until the end of time, and until the farthest reaches of its light cone, replaying a single highly optimized experience, over and over and over again.

Or imagine a mind that contained almost the whole specification of which sort of feelings humans most enjoy—but not the idea that those feelings had important external referents. So that the mind just went around feeling like it had made an important discovery, feeling it had found the perfect lover, feeling it had helped a friend, but not actually doing any of those things, having become its own experience machine. And if the mind pursued those feelings and their referents, it would be a good future and true; but because this one dimension of value was left out, the future became something dull. Boring and repetitive, because although this mind felt that it was encountering experiences of incredible novelty, this feeling was in no wise true.

Or the converse problem: an agent that contains all the aspects of human value, except the valuation of subjective experience. So that the result is a nonsentient optimizer that goes around making genuine discoveries, but the discoveries are not savored and enjoyed, because there is no one there to do so . . .

Value isn’t just complicated, it’s fragile. There is more than one dimension of human value, where if just that one thing is lost, the Future becomes null. A single blow and all value shatters. Not every single blow will shatter all value—but more than one possible “single blow” will do so.

You can see where this is going. Since we’ve never decoded an entire human value system, we don’t know what values to give an AI. We don’t know what wish to make. If we created superhuman AI tomorrow, we could only give it a disastrously incomplete value system, and then it would go on to do things we don’t want, because it would be doing what we wished for instead of what we wanted.

Right now, we only know how to build AIs that optimize for something other than what we want. We only know how to build dangerous AIs. Worse, we’re learning how to make AIs safe much more slowly than we’re learning how to make AIs powerful, because we’re devoting more resources to the problems of AI capability than we are to the problems of AI safety.

The clock is ticking. AI is coming. And we are not ready.

Intelligence Explosion

Suppose you’re a disembodied spirit watching the universe evolve. For the first nine billion years, almost nothing happens.

“God, this is so boring!” you complain.

“How so?” asks your partner.

“There’s no depth or complexity to anything because nothing aims at anything. Nothing optimizes for anything. There’s no plot. It’s just a bunch of random crap that happens. Worse than Seinfeld.”

“Really? What’s that over there?”

You follow your partner’s gaze and notice a tiny molecule in a pool of water on a rocky planet. Before your eyes, it makes a copy of itself. And then another. And then the copies make copies of themselves.

“A replicator!” you exclaim. “Within months there could be millions of those things.”

“I wonder if this will lead to squirrels.”

“What are squirrels?”

Your partner explains the functional complexity of squirrels, which they encountered in Universe 217.

“That’s absurd! At anything like our current rate of optimization, we wouldn’t see any squirrels come about by pure accident until long after the heat death of the universe.”

But soon you notice something even more important: some of the copies are errors. The copies are exploring the neighboring regions of the conceptual search space. Some of these regions contain better replicators, and those superior replicators end up with more copies of themselves than the original replicators and explore their own neighborhoods.
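The loop described above (copy, err, out-compete) is simple enough to sketch. This is a toy with made-up numbers, not a model of real chemistry: replicators copy themselves with small random errors, resources are limited, and the better replicators crowd out the rest.

```python
import random

# Minimal sketch of mutation plus selection: each replicator copies
# itself with occasional errors, and a limited resource pool means
# only the best replicators persist to copy again. All numbers are
# illustrative.
random.seed(0)
population = [1.0] * 20  # each value = a replicator's heritable "fidelity" score

for generation in range(50):
    offspring = []
    for score in population:
        for _ in range(2):  # each replicator attempts two copies
            # copying error: a small random tweak to the heritable score
            offspring.append(score + random.gauss(0, 0.05))
    # resource limit: only the 20 best replicators survive to copy again
    population = sorted(offspring, reverse=True)[:20]

mean_score = sum(population) / len(population)
print(round(mean_score, 2))  # blind errors plus selection push the score upward
```

No foresight anywhere in the loop, yet the population climbs the local neighborhood of the search space, which is the whole trick of evolution.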

The next few billion years are by far the most exciting you’ve seen. Simple replicators lead to simple organisms, which lead to complex life, which leads to brains, which lead to the Homo line of apes.

At first, Homo looks much like any other brained animal. It shares 99% of its coding DNA with chimpanzees. You might be forgiven for thinking the human brain wasn’t that big a deal—maybe it would enable a 50% increase in optimization speed, or something like that. After all, animal brains have been around for millions of years, and have gradually evolved without any dramatic increase in function.

But then one thing leads to another. Before your eyes, humans become smart enough to domesticate crops, which leads to a sedentary lifestyle and repeatable trade, which leads to writing for keeping track of debts. Farming also generates food surpluses, and that enables professional specialization, which gives people the ability to focus on solving problems other than finding food and fucking. Professional specialization leads to science and technology and the industrial revolution, which lead to space travel and iPhones.

The difference between chimpanzees and humans illustrates how powerful it can be to rewrite an agent’s cognitive algorithms. But, of course, the algorithm’s origin in this case was merely evolution, blind and stupid. An intelligent process with a bit of foresight can leap through the search space more efficiently. A human computer programmer can make innovations in a day that evolution couldn’t have discovered in billions of years.

But for the most part, humans still haven’t figured out how their own cognitive algorithms work, or how to rewrite them. And the computers we program don’t understand their own cognitive algorithms, either (for the most part). But one day they will.

Which means the future contains a feedback loop that the past does not:

If you’re Eurisko, you manage to modify some of your metaheuristics, and the metaheuristics work noticeably better, and they even manage to make a few further modifications to themselves, but then the whole process runs out of steam and flatlines.

It was human intelligence that produced these artifacts to begin with. Their own optimization power is far short of human—so incredibly weak that, after they push themselves along a little, they can’t push any further. Worse, their optimization at any given level is characterized by a limited number of opportunities, which once used up are gone—extremely sharp diminishing returns. . . .

When you first build an AI, it’s a baby—if it had to improve itself, it would almost immediately flatline. So you push it along using your own cognition . . . and knowledge—not getting any benefit of recursion in doing so, just the usual human idiom of knowledge feeding upon itself and insights cascading into insights. Eventually the AI becomes sophisticated enough to start improving itself—not just small improvements, but improvements large enough to cascade into other improvements. . . . And then you get what I. J. Good called an “intelligence explosion.”

. . . and the AI leaves our human abilities far behind.

At that point, we might as well be dumb chimpanzees watching as those newfangled “humans” invent fire and farming and writing and science and guns and planes and take over the whole world. And like the chimpanzee, at that point we won’t be in a position to negotiate with our superiors. Our future will depend on what they want.

The AI Problem, with Solutions

We find ourselves at a crucial moment in Earth’s history. Like a boulder perched upon a mountain’s peak, we stand at an unstable point. We cannot stay where we are: AI is coming, provided that scientific progress continues. Soon we will tumble down one side of the mountain or another to a stable resting place.

One way lies human extinction. (“Go extinct? Stay on that square.”) Another resting place may be a stable global totalitarianism that halts scientific progress, although that seems unlikely.1

What about artificial intelligence? AI leads to intelligence explosion, and, because we don’t know how to give an AI benevolent goals, by default an intelligence explosion will optimize the world for accidentally disastrous ends. A controlled intelligence explosion, on the other hand, could optimize the world for good. (More on this option in the next chapter.)

I, for one, am leaning all my weight in the direction of this last valley: a controlled intelligence explosion.

For a fleeting moment in history, we are able to comprehend (however dimly) our current situation and influence which side of the mountain we are likely to land on. What, then, shall we do?

Differential Intellectual Progress

What we need is differential intellectual progress:

Differential intellectual progress consists in prioritizing risk-reducing intellectual progress over risk-increasing intellectual progress. As applied to AI risks in particular, a plan of differential intellectual progress would recommend that our progress on the scientific, philosophical, and technological problems of AI safety outpace our progress on the problems of AI capability such that we develop safe superhuman AIs before we develop (arbitrary) superhuman AIs. Our first superhuman AI must be a safe superhuman AI, for we may not get a second chance.

To oversimplify, AI safety research is in a race against AI capabilities research. Right now, AI capabilities research is winning, and in fact is pulling ahead. Humanity is pushing harder on AI capabilities research than on AI safety research.

If AI capabilities research wins the race, humanity loses. If AI safety research wins the race, humanity wins.

Many people know what it looks like to push on AI capabilities research. That’s most of the work you read about in AI. But what does it look like to push on AI safety research?

This article contains a long list of problem categories in AI safety research, but for now let me give just a few examples. (Skip this list if you want to avoid scary technical jargon.)

Beyond these technical research problems, differential intellectual progress also recommends progress on a variety of strategic research questions. Which technologies should humanity move funding toward or away from? What can we do to reduce the risk of an AI arms race? Will it reduce AI risk to encourage widespread rationality training or benevolence training? Which interventions should we prioritize?

Action, Today

So one part of the solution to the problem of AI risk is differential intellectual progress. Another part of the solution is to act on the recommendations of the best strategy research we can do. For example, the following actions probably reduce AI risk:


Thus far I’ve been talking about AI risk, but it’s important not to lose sight of the opportunity of AI either:

We don’t usually associate cancer cures or economic stability with artificial intelligence, but curing cancer is ultimately a problem of being smart enough to figure out how to cure it, and achieving economic stability is ultimately a problem of being smart enough to figure out how to achieve it. To whatever extent we have goals, we have goals that can be accomplished to greater degrees using sufficiently advanced intelligence.

In my final chapter, I will try to explain just how good things can be if we decide to take action and do AI right.

Yes, we must be sober about the fact that nothing in physics prohibits very bad outcomes. But we must also be sober about the fact that nothing in physics prohibits outcomes of greater joy and harmony than our primitive monkey brains can imagine.

* * *

1Bryan Caplan, “The Totalitarian Threat,” in Bostrom and Ćirković, Global Catastrophic Risks, 504–519.

2Peter de Blanc, Ontological Crises in Artificial Agents’ Value Systems (The Singularity Institute, San Francisco, CA, May 19, 2011), http://arxiv.org/abs/1105.3821.

Engineering Utopia

I’ve already talked about how things can go very, very wrong with AI. With no God to save us, we’re left with physics—and nothing in the laws of physics says that humanity won’t fail miserably and die. But neither does physics prevent us from creating a future better than anything we can presently imagine—a real utopia.

Imagine our caveman ancestors looking up at the birds soaring overhead. They could not have imagined that, thousands of years later, millions of humans would fly around the world in jets each day. Our ancestors looked up at the stars and planets, too—never dreaming we would one day walk on the moon. There was once a time when the average human couldn’t expect to live much past age thirty.

It’s easy to underestimate what the future will bring. When thinking about a post-intelligence explosion universe, it’s important to recognize that the benefits of a successful intelligence explosion could be much greater than we can currently imagine, for our imaginations are limited.

Another thing we have to realize is that these changes don’t have to take place over thousands of years. Many of the technologies upon which utopia could be built are in the process of being developed now, and future technologies will be developed much faster if enhanced whole brain emulations and superintelligent AIs are developing them. Economist Robin Hanson reminds us:

Though such growth may seem preposterous, consider that in the era of hunting and gathering, the economy doubled nine times; in the era of farming, it doubled seven times; and in the current era of industry, it has so far doubled ten times. If, for some as yet unknown reason, the number of doublings is similar across these three eras, then we seem already overdue for another transition.1
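The force of Hanson’s observation is plain compounding arithmetic, which a toy calculation makes concrete. The doubling times below are illustrative placeholders, not Hanson’s estimates; the point is only how sharply growth scales with the doubling time.

```python
# A worked toy for the arithmetic of "doublings." Doubling times here
# are made up for illustration; only the compounding logic matters.
def growth_factor(elapsed, doubling_time):
    """How much an economy grows over `elapsed` time units
    if it doubles every `doubling_time` units."""
    return 2 ** (elapsed / doubling_time)

# Industry-style growth, doubling every 15 years:
print(growth_factor(30, 15))  # 4.0, i.e. fourfold in thirty years

# A hypothetical post-transition economy doubling every month:
print(growth_factor(12, 1))   # 4096.0, i.e. four-thousandfold in twelve months
```

A shorter doubling time doesn’t just speed growth up; within a few doublings it dwarfs everything that came before.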

What I’m about to say may sound like science fiction, but that’s no reason to dismiss it. We have a long history of turning science fiction into science fact. And it doesn’t stop now.

Real Utopia

“Utopia.” It’s an easy concept to comprehend once you realize that it’s what we have always been seeking. We’re changing the world around us to fit our needs and desires like no other creature before us. But, with machine superintelligence on our side, we could be vastly more successful at realizing utopia than ever before. Consider what a time traveler from a post-intelligence explosion universe might tell us:

My consciousness is wide and deep, my life long. I have read all your authors—and much more. I have experienced life in many forms and from many angles: jungle and desert, gutter and palace, heath and suburban creek and city back alley. I have sailed on the high seas of cultures, and swum, and dived. Quite some marvelous edifice builds up over a million years by the efforts of homunculi, just as the humble polyps amass a reef in time. And I’ve seen the shoals of colored biography fishes, each one a life story, scintillate under heaving ocean waters.2

Utopia is not a world in which humans are subjected to a never-ending source of empty pleasure. Citizens of utopia have the technology to simulate and experience thousands of environments and worlds. There is no shortage of novel and meaningful experiences.

Citizens of utopia are not mindless drones who never use their brains to solve anything. Just as making a video game easier does not always make it more fun, citizens of utopia still face challenges and experience the joy of overcoming them.

Utopia is not meaningless, predictable, or boring. People living in utopia become stronger, not weaker, over time. Their lives grow more exciting and wonderful, their emotions more varied and intense. They become more fulfilled as time goes on, not less so.

Utopia is a world in which our experience is limited only by our desire and imagination.

No Pain

Let’s start with something simple: Imagine a life without pain. Pain is not a law of physics. It is a consequence of (1) current human biology and (2) our ignorance about how to modify or transcend it. Pain is just a mechanism stumbled upon by evolution to inform us that “you should avoid that” or “you should take care of that injury.” But we could have systems that give us that information without also giving us pain. That’s how today’s AIs do it. In the future, we’ll be able to master pain, in human bodies and other substrates.3

And that’s nothing compared to the best of what could come, if we do AI right.

No Death or Aging

We wouldn’t just eliminate pain; we could do away with death and aging as well. Death and aging aren’t written in the laws of physics, either. They, too, are merely consequences of current human biology and our ignorance about how to modify or transcend it. Turritopsis nutricula, the “immortal jellyfish,” is biologically immortal because it didn’t evolve to die after a few decades like humans did. And if our minds are uploaded to computers, we’ll be able to make backup copies and achieve digital immortality.

Would you even want to live indefinitely? I think Patrick Hayden got it right when he said, “Personally, I’ve been hearing all my life about the Serious Philosophical Issues posed by life extension, and my attitude has always been that I’m willing to grapple with those issues for as many centuries as it takes.”

Eliminating death and aging isn’t just wishful thinking or science fiction. The more we understand how aging and death work, the more we will be able to control them.

Radical Abundance

In 1959, Richard Feynman gave a lecture called “There’s Plenty of Room at the Bottom.” In it, Feynman talked about how nothing in physics prohibits us from constructing objects atom by atom. He considered the implications this would have for storing information, chemical synthesis, and manufacturing. Decades later, the field of nanotechnology emerged, inspired by Feynman’s lecture. Today we are able to construct some things atom by atom, just as Feynman predicted, and our powers to do so grow every year.

Molecular nanotechnology (MNT) extends this vision. MNT is the technology we will use to build the future. With intelligently guided nano-factories or self-replicating nano-bots and a stock of materials, we will be able to rapidly erect arbitrary structures. If we can rearrange atoms in whatever configuration we want (as long as they obey physical law), we can make a banana without having to grow it. We can make a car without needing large factories to assemble it. If it’s made of atoms, you can probably make it with MNT and sufficient intelligence.

Why is this important? Poverty exists, in great part, because of poor resource allocation. If food and housing were as abundant as, say, air, then nobody would be homeless or starving, and with MNT it would be ludicrous to pay tens of thousands of dollars for a car. Things that we consider luxury items today would be, more or less, equally accessible to everyone. When you’re able to manipulate matter at the atomic level, making a block of diamond is just as easy as making a block of coal.

Endless Adventure

The implications of safe and effective MNT would go beyond eliminating poverty and making our lives easier. MNT opens up possibilities for creating much more powerful computers, because data storage and processing structures could be arranged as perfectly and efficiently as is physically possible.

What could we do with such vast computational resources? Think of Pandora from James Cameron’s Avatar. Pandora, a fictional world, is magnificent, breathtaking, and very different from our own. Yet, for all its fantastical elements, people’s experience of that world felt real. A bunch of moving images on a screen created an experience so profound that people left the theater depressed at the thought that they could not live in Pandora.

Imagine creating a computer simulation of Pandora and being able to enter that simulation. You could live as an inhabitant of Pandora for a whole lifetime. You could see, hear, taste, and feel this universe just as intensely as one would in reality. We could simulate thousands of different universes and experience them all.

Engineering Utopia

These examples suggest a bare minimum for how good a positive intelligence explosion could be, in the spirit of exploratory engineering. The actual outcome of a positive intelligence explosion will likely be quite different: much less anthropomorphic, for example.

Once again: our current limitations aren’t fixed by physics, but by the limits of our intelligence and the resources that can be used at our current level of intelligence. With self-improving machine superintelligence acting on our behalf, our biological limits can be transcended.

Humans have always wanted more than what the universe gives us by default. We want to experience sounds that do not exist in nature, so we make music. We want to taste something more delicious than anything that would grow by itself, so we cook, and we develop cuisines. We want to explore worlds beyond the one in which we evolved, so we build ships, cars, planes, and spaceships to carry us off into distant lands. We invent literature, film, and art to experience the world from a different perspective.

We are, and have always been, engineering our own utopia. With superintelligence, we will have the opportunity to do it faster, better, and more completely than ever before.

But only if we try. Only if we decide that utopia tomorrow is more important than slight improvements to our already rich lives today.

I wrote before:

We find ourselves at a crucial moment in Earth’s history. Like a boulder perched upon a mountain’s peak, we stand at an unstable point. We cannot stay where we are: AI is coming, provided that scientific progress continues. Soon we will tumble down one side of the mountain or another to a stable resting place.

In which direction will you push on the boulder?

* * *

1Robin Hanson, “Economics of the Singularity,” IEEE Spectrum 45 (6 2008): 45–50, doi:10.1109/MSPEC.2008.4531461.

2Nick Bostrom, “Letter from Utopia” (Forthcoming revision, 2010), accessed November 10, 2012, http://www.nickbostrom.com/utopia.pdf.

3Ben Goertzel, “The Technological Mastery of Pain is Both Feasible and Desirable,” H+ Magazine, May 28, 2012, http://hplusmagazine.com/2012/05/28/the-technological-elimination-of-pain-is-both-feasible-and-desirable/.