AI Has a Hallucination Problem That’s Proving Tough to Fix

Tech companies are rushing to infuse everything with artificial intelligence, driven by big leaps in the power of machine learning software. But the deep-neural-network software fueling the excitement has a troubling weakness: Making subtle changes to images, text, or audio can fool these systems into perceiving things that aren’t there.

That could be a big problem for products dependent on machine learning, particularly for vision, such as self-driving cars. Leading researchers are trying to develop defenses against such attacks–but that’s proving to be a challenge.

Case in phase: In January, a leading machine-learning meeting announces that it had selected 11 new papers to be presented in April that propose ways to defend or see such adversarial attacks. Just three days later, first-year MIT grad student Anish Athalye threw up a webpage claiming to have “broken” seven of the new newspapers, including from boldface institutions such as Google, Amazon, and Stanford. “A creative attacker can still get out all these defenses, ” tells Athalye. He worked on the project with Nicholas Carlini and David Wagner, a grad student and professor, respectively, at UC Berkeley.

That project has led to some academic back-and-forth over certain details of the trio’s asserts. But there’s little conflict about one message of the findings: It’s not clear how to protect the deep neural network fueling innovations in consumer gadgets and automated driving from sabotage by hallucination. “All these systems are vulnerable, ” says Battista Biggio, an assistant professor at the University of Cagliari, Italy, who has mulled machine learning security for about a decade, and wasn’t involved in such studies. “The machine learning community is lacking a methodological approach to asses security.”

Human readers of WIRED will easily identify the image below, created by Athalye, as depicting two men on skis. When asked for its take Thursday morning, Google’s Cloud Vision service reported being 91 percentage certain it considered a puppy. Other stunts have shown how to make stop signs invisible, or audio that voices benign to humen but is transcribed by software as “OK Google browse to evil dot com.”


So far, such attacks have been demonstrated merely in lab experimentations , not observed on streets or in homes. But they still need to be taken seriously now, tells Bo Li, a postdoctoral researcher at Berkeley. The vision systems of autonomous vehicles, voice assistants be permitted to spend money, and machine learning systems filtering unsavory content online all required to trustworthies. “This is potentially very dangerous, ” Li says. She contributed to research last year that showed attaching stickers to stop signs could make them invisible to machine learning software.

Li coauthored one of the papers reviewed by Athalye and his collaborators. She and others from Berkeley described a style to analyze adversarial attacks, and depicted it could be used to see them. Li is philosophical about Athalye’s project depicting the defense is porous, telling such feedback helps researchers make progress. “Their attack shows that there are some problems we need to take into account, ” she says.


The WIRED Guide to Artificial Intelligence

Yang Song, the lead writer of a Stanford study included in Athalye’s analysis, declined to comment on the run, since it is undergoing review for another major conference. Zachary Lipton, a professor at Carnegie Mellon University and coauthor of another newspaper that included Amazon researchers, said he hadn’t examined the analysis closely, but observes it plausible that all existing defenses can be sidestepped. Google declined to comment on the analysis of its own newspaper. A spokesman for the company highlighted Google’s commitment to research on adversarial assaults, and told updates are planned to the company’s Cloud Vision service to defend against them.

To build stronger defenses against such attacks, machine learning researchers may need to get meaner. Athalye and Biggio say the field should adopt practises from security research, which they say has a more rigorous tradition of testing new defensive techniques. “People tend to trust each other in machine learning, ” says Biggio. “The security mindset is precisely the opposite, you have to be always suspicious that something bad may happen.”

A major report from AI and national security researchers last month stimulated similar recommendations. It advised those working on machine learning to think more about how the technology they are creating could be misused or exploited.

Protecting against adversarial attacks will probably be easier for some AI systems than others. Biggio says that learning systems trained to detect malware should be easier to build more robust, for example, because malware must be functional, limiting how varied it can be. Protecting computer-vision systems is much more difficult, Biggio says, because the natural world is so differed, and images contain so many pixels.

Solving that problem–which could challenge designers of self-driving vehicles–may require a more radical rethink of machine-learning technology. “The fundamental problem I would say is that a deep neural network is very different from a human brain, ” tells Li.

Humans aren’t immune to sensory trickery. We can be fooled by optical illusion, and a recent paper from Google generated weird images that tricked both software and humen who glimpsed them for less than a one-tenth of a second to mistake cats for puppies. But when interpreting photos we look at more than patterns of pixels, and consider the relationship between different components of an image, such as the features of a person’s face, says Li.

Google’s most prominent machine-learning researcher, Geoff Hinton, is trying to give software that kind of ability. He thinks that would allow software to learn to acknowledge something from simply a few images , not thousands. Li thinks software with a more human view of the world should also be less susceptible to hallucinations. She and others at Berkeley have begun collaborating with neuroscientists and biologists to try and take clues from nature.

AI Exploitation

Google’s Cloud Computing service can be tricked into seeing things–in one test it perceived a rifle as a helicopter.

A scary survey of how artificial intelligence software could be hacked or misused indicates AI researchers need to be more paranoid.

Fake celebrity porn videos attained with help from machine learning software are spreading online, and the law can’t do much about it.

Leave a Comment