As we watch robots, autonomous vehicles, artificial intelligence machines and the like slowly (and sometimes rapidly) permeate our world, it’s not hard to imagine them going from permeating to taking over. Even those who don’t watch sci-fi movies can see how, at that point, rogue robots and AI machines might use computer logic to eliminate the one thing that could stop their takeover – humans – and initiate a robot apocalypse. Fortunately, some computer scientists in California have developed an algorithm designed to train robots to avoid the bad, human-elimination decisions and stay on the path of good, obedient servitude instead. How well do you think this will work?
“We want to advance AI that respects the values of its human users and justifies the trust we place in autonomous systems.”
In a Stanford University press release announcing the publication of the paper “Preventing undesirable behavior of intelligent machines” in the journal Science, Emma Brunskill, an assistant professor of computer science at Stanford and senior author of the paper, lays out the alleged noble goal of robot and AI developers – making them tools for the good of humankind. She also describes the algorithm she and her colleagues have designed to accomplish – and hopefully guarantee – that goal.
“We show how the designers of machine learning algorithms can make it easier for people who want to build AI into their products and services to describe unwanted outcomes or behaviors that the AI system will avoid with high-probability.”
The idea is to describe bad decisions and behaviors mathematically. While “don’t do this or the patient will die” decisions are easy to describe, behaviors such as gender bias – a line AI is crossing as it’s used more and more to evaluate employment or school applications – require a sense of fairness that is much harder to define mathematically. The Stanford researchers modified the algorithm in a program designed to predict the future grade point averages of students, and it learned to avoid bias toward one gender. While “fairness” is much more vague than “life or death,” the researchers believe it can – and should – be built into all AI using their algorithm.
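To make “describe it mathematically” a little more concrete, here is a minimal Python sketch of one plausible way to turn gender bias in a GPA predictor into a single number. The definition used here (the gap between the average prediction error for each group), the function names and the sample records are all assumptions made for illustration, not the researchers’ code or data.

```python
from statistics import mean


def mean_error(predicted, actual):
    """Average signed prediction error (predicted minus actual)."""
    return mean(p - a for p, a in zip(predicted, actual))


def gender_bias_gap(records):
    """Absolute gap between the mean prediction errors of two groups.

    `records` is a list of (group, predicted_gpa, actual_gpa) tuples with
    group "F" or "M". This definition of bias is an illustrative assumption.
    """
    errors = {}
    for group in ("F", "M"):
        preds = [p for g, p, _ in records if g == group]
        actuals = [a for g, _, a in records if g == group]
        errors[group] = mean_error(preds, actuals)
    return abs(errors["F"] - errors["M"])


# Made-up records: the predictor underrates one group and overrates the other.
records = [
    ("F", 3.0, 3.5),
    ("F", 2.5, 3.0),
    ("M", 3.5, 3.0),
    ("M", 3.0, 2.5),
]
gap = gender_bias_gap(records)
print(f"bias gap: {gap:.2f} grade points")
```

Once bias is a number like this, “avoid gender bias” can be restated as “keep the gap below some small limit,” which is the kind of statement an algorithm can actually check.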
Does all of this sound familiar? The paper refers to this solution as a “Seldonian algorithm” and fans of Isaac Asimov will recognize that obvious reference to Hari Seldon, the mathematics professor at Streeling University on the planet Trantor in Asimov’s “Foundation” sci-fi series. Seldon developed psychohistory, an algorithmic science that allowed him to predict the future in probabilistic terms. It worked for Hari Seldon … will it work for us?
“Given the recent rise of real-world ML applications and the corresponding surge of potential harm that they could cause, it is imperative that ML algorithms provide their users with an effective means for controlling behavior. To this end, we have proposed a framework for designing ML algorithms and shown how it can be used to construct algorithms that provide their users with the ability to easily (that is, without requiring additional data analysis) place limits on the probability that the algorithm will produce any specified undesirable behavior.”
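What might “place limits on the probability” look like in practice? The sketch below is one simple reading of it: before a candidate solution is released, it is checked against held-out data, and it is accepted only if a high-confidence upper bound on its rate of bad behavior stays under the user’s limit. The use of Hoeffding’s inequality, the function names and the sample numbers are assumptions chosen for illustration, and the paper’s own algorithms may differ.

```python
import math

NO_SOLUTION_FOUND = "No Solution Found"  # hypothetical label for a refusal


def hoeffding_upper_bound(samples, value_range, delta):
    """Upper bound on the true mean that holds with probability >= 1 - delta.

    Assumes independent samples that each lie in an interval of width
    `value_range` (Hoeffding's inequality).
    """
    n = len(samples)
    sample_mean = sum(samples) / n
    return sample_mean + value_range * math.sqrt(math.log(1 / delta) / (2 * n))


def safety_test(harm_samples, threshold, value_range=1.0, delta=0.05):
    """Accept only if we are (1 - delta)-confident that harm <= threshold."""
    bound = hoeffding_upper_bound(harm_samples, value_range, delta)
    return "accept" if bound <= threshold else NO_SOLUTION_FOUND


# 1,000 held-out cases with 10 bad outcomes: enough evidence to accept.
plenty_of_data = [0] * 990 + [1] * 10
print(safety_test(plenty_of_data, threshold=0.1))

# 20 held-out cases with no bad outcomes: too little evidence, so refuse.
too_little_data = [0] * 20
print(safety_test(too_little_data, threshold=0.1))
```

Note the second case: the candidate never misbehaved in the sample, yet the test still refuses to vouch for it because 20 cases cannot support a 95% confidence claim. Under this design, “I can’t promise that” is a legitimate answer.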
In the paper’s conclusion on using machine learning (ML) algorithms to change robot behavior, that nasty word “probability” pops up. The algorithm is not a 100% guarantee of robot apocalypse prevention, and even if it were, it would still depend on humans using it in all applications. Just as it’s easy to imagine a robot apocalypse, it’s easy to imagine it being caused by unscrupulous businesses leaving the “bad behavior prevention algorithm” out to cut costs or make more profits.
Are we doomed?