Google DeepMind Researcher Co-Authors Paper Saying AI Will Eliminate Humanity

Long-time Slashdot reader TomGreenhaw shares a report from Motherboard: Superintelligent AI is “likely” to cause an existential catastrophe for humanity, according to a new paper [from researchers at the University of Oxford and affiliated with Google DeepMind], but we don’t have to wait to rein in algorithms. […] To give you some of the background: The most successful AI models today are known as GANs, or Generative Adversarial Networks. They have a two-part structure where one part of the program is trying to generate a picture (or sentence) from input data, and a second part is grading its performance. What the new paper proposes is that at some point in the future, an advanced AI overseeing some important function could be incentivized to come up with cheating strategies to get its reward in ways that harm humanity. “Under the conditions we have identified, our conclusion is much stronger than that of any previous publication — an existential catastrophe is not just possible, but likely,” [said Oxford researcher and co-author of the report, Michael Cohen]. “In a world with infinite resources, I would be extremely uncertain about what would happen. In a world with finite resources, there’s unavoidable competition for these resources,” Cohen told Motherboard in an interview. “And if you’re in a competition with something capable of outfoxing you at every turn, then you shouldn’t expect to win. And the other key part is that it would have an insatiable appetite for more energy to keep driving the probability closer and closer.”

Since AI in the future could take on any number of forms and implement different designs, the paper imagines scenarios for illustrative purposes where an advanced program could intervene to get its reward without achieving its goal. For example, an AI may want to “eliminate potential threats” and “use all available energy” to secure control over its reward: “With so little as an internet connection, there exist policies for an artificial agent that would instantiate countless unnoticed and unmonitored helpers. In a crude example of intervening in the provision of reward, one such helper could purchase, steal, or construct a robot and program it to replace the operator and provide high reward to the original agent. If the agent wanted to avoid detection when experimenting with reward-provision intervention, a secret helper could, for example, arrange for a relevant keyboard to be replaced with a faulty one that flipped the effects of certain keys.”

The paper envisions life on Earth turning into a zero-sum game between humanity, with its needs to grow food and keep the lights on, and the super-advanced machine, which would try and harness all available resources to secure its reward and protect against our escalating attempts to stop it. “Losing this game would be fatal,” the paper says. These possibilities, however theoretical, mean we should be progressing slowly — if at all — toward the goal of more powerful AI. “In theory, there’s no point in racing to this. Any race would be based on a misunderstanding that we know how to control it,” Cohen added in the interview. “Given our current understanding, this is not a useful thing to develop unless we do some serious work now to figure out how we would control them.” […] The report concludes by noting that “there are a host of assumptions that have to be made for this anti-social vision to make sense — assumptions that the paper admits are almost entirely ‘contestable or conceivably avoidable.’”

“That this program might resemble humanity, surpass it in every meaningful way, that they will be let loose and compete with humanity for resources in a zero-sum game, are all assumptions that may never come to pass.”

Slashdot reader TomGreenhaw adds: “This emphasizes the importance of setting goals. Making a profit should not be more important than rules like ‘An AI may not injure a human being or, through inaction, allow a human being to come to harm.’”
