Thanks to advances in machine learning, computers have gotten really good at identifying what’s in photographs. They started beating humans at the task years ago, and can now even generate fake images that look eerily real. While the technology has come a long way, it’s still not entirely foolproof. In particular, researchers have found that image recognition algorithms remain susceptible to a class of problems called adversarial examples.
Adversarial examples are like optical (or audio) illusions for AI. By altering a handful of pixels, a computer scientist can fool a machine learning classifier into thinking, say, a picture of a rifle is actually one of a helicopter. But to you or me, the image would still look like a gun; it almost seems like the algorithm is hallucinating. As image recognition technology is used in more places, adversarial examples may present a troubling security risk. Researchers have shown they can be used to do things like cause a self-driving car to ignore a stop sign, or make a facial recognition system falsely identify someone.
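To make the idea concrete, here is a minimal sketch of how a tiny change can flip a classifier's answer. It is not the attack used in any study mentioned here. It assumes a made-up linear classifier over a 10,000-pixel "image", with random weights and hypothetical names such as `predict` and `adversarial_perturbation`. It also nudges every pixel by a very small amount in the direction that hurts most (in the spirit of the fast gradient sign method) rather than changing only a handful of pixels.

```python
import numpy as np

def predict(weights, bias, image):
    """Toy linear classifier: 1 ("helicopter") if the score is positive, else 0 ("rifle")."""
    return int(weights @ image + bias > 0)

def adversarial_perturbation(weights, bias, image, margin=1e-3):
    """Smallest equal-sized per-pixel nudge that pushes the score across zero."""
    score = weights @ image + bias
    # Moving each pixel by epsilon along sign(weight) shifts the score by
    # epsilon * sum(|weights|), so this epsilon just clears the boundary.
    epsilon = (abs(score) + margin) / np.abs(weights).sum()
    direction = -np.sign(score) * np.sign(weights)
    return epsilon * direction, epsilon

# Hypothetical setup: random weights and a random image (assumptions for illustration).
rng = np.random.default_rng(0)
n_pixels = 10_000
weights = rng.normal(size=n_pixels)
image = rng.uniform(0.1, 0.9, size=n_pixels)
bias = -(weights @ image) - 5.0  # chosen so the clean image scores about -5 ("rifle")

delta, epsilon = adversarial_perturbation(weights, bias, image)
clean_label = predict(weights, bias, image)
adversarial_label = predict(weights, bias, image + delta)

print("clean label:", clean_label)
print("adversarial label:", adversarial_label)
print("largest change to any pixel:", float(np.max(np.abs(delta))))
```

Because the image has so many pixels, a change far too small to notice in any one of them adds up to a large change in the classifier's score. Real image classifiers are not linear, so actual attacks estimate the most damaging direction from the model's gradients instead.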
Organizations like Google and the US Army have studied adversarial examples, but what exactly causes them is still largely a mystery. Part of the problem is that the visual world is incredibly complex, and photos can contain millions of pixels. Another issue is determining whether adversarial examples are a product of the original photos, or of how an AI is trained to look at them. Some researchers have hypothesized they are a high-dimensional statistical phenomenon, or that they arise when the AI isn’t trained on enough data.
Now, a leading group of researchers from MIT has found a different answer, in a paper that was presented earlier this week: adversarial examples only look like hallucinations to people. In reality, the AI is picking up on tiny details that are imperceptible to the human eye. While you might look at an animal’s ears to differentiate a dog from a cat, AI sees minuscule patterns in the photo’s pixels and uses those to classify it. “The only thing that makes these features special is that we as humans are not sensitive to them,” says Andrew Ilyas, a PhD student at MIT and a lead author of the work, which has yet to be peer-reviewed.
The explanation makes intuitive sense, but is difficult to document because it’s hard to untangle which features an AI uses to classify an image. To conduct their study, the researchers used a novel technique to separate “robust” features of images, which humans can often perceive, from the “non-robust” ones that only an AI can see. Then in one experiment, they trained a classifier using an intentionally mismatched dataset of images. According to the robust features (i.e., what the pictures looked like to the human eye), the photos were of dogs. But according to the non-robust features, invisible to us, the photos were in fact of cats, and that’s how the classifier was trained: to think the photos were of cats.
The researchers then showed the classifier new, normal pictures of cats it hadn’t seen before. It was able to identify the cats correctly, indicating the AI was relying on the hidden, non-robust features embedded in the training set. That suggests these invisible features represent real patterns in the visual world, just ones that humans can’t see. And adversarial examples are instances where these patterns don’t line up with how we view the world.
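The logic of that experiment can be imitated on toy data. The sketch below is an illustration under loud assumptions, not the MIT team's code or method. Each "image" is just a row of numbers: one strong "visible" feature stands in for the robust features, and fifty faint "hidden" features stand in for the non-robust ones. The classifier is a plain least-squares linear fit. The toy also simplifies the mismatch: in its training set, what a picture "looks like" is assigned at random, so it tells you nothing about the label. All names (`make_images`, `test_accuracy`, and so on) are hypothetical.

```python
import numpy as np

def make_images(visible_class, hidden_class, rng, n_hidden=50):
    """Toy 'images': one strong visible feature plus many faint hidden ones."""
    n = len(visible_class)
    # Robust stand-in: a large signal a person would notice immediately.
    visible = visible_class + 0.1 * rng.normal(size=n)
    # Non-robust stand-in: each feature is weak and noisy on its own.
    hidden = 0.05 * hidden_class[:, None] + 0.1 * rng.normal(size=(n, n_hidden))
    return np.column_stack([visible, hidden])

rng = np.random.default_rng(0)

# Mismatched training set: labels follow the hidden features, while the
# visible feature is assigned independently of the label.
train_hidden = rng.choice([-1.0, 1.0], size=2000)
train_visible = rng.choice([-1.0, 1.0], size=2000)
X_train = make_images(train_visible, train_hidden, rng)
y_train = train_hidden

# Fit a linear classifier by least squares.
weights, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

# Normal test set: visible and hidden features agree with the true class.
test_class = rng.choice([-1.0, 1.0], size=2000)
X_test = make_images(test_class, test_class, rng)
test_accuracy = float(np.mean(np.sign(X_test @ weights) == test_class))

# How often a training picture "looks like" its label to a person.
looks_like_label = float(np.mean(train_visible == y_train))

# Accuracy on the test set with the visible feature blanked out.
X_hidden_only = X_test.copy()
X_hidden_only[:, 0] = 0.0
hidden_only_accuracy = float(np.mean(np.sign(X_hidden_only @ weights) == test_class))

print("training pictures that look like their label:", looks_like_label)
print("accuracy on normal test pictures:", test_accuracy)
print("accuracy with the visible feature removed:", hidden_only_accuracy)
```

In this toy, the training pictures match their labels only about as often as a coin flip by appearance, yet the classifier still does well on normal test data, and it continues to do so when the visible feature is erased. That mirrors the article's point: the faint features are genuinely predictive, even though a person would never use them.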
When algorithms fall for an adversarial example, they’re not hallucinating; they’re seeing something that people don’t. “It’s not something that the model is doing weird, it’s just that you don’t see these things that are really predictive,” says Shibani Santurkar, a PhD student at MIT and another lead author on the paper. “It’s about humans not being able to see these things in the data.”
The study calls into question whether computer scientists can really explain how their algorithms make decisions. “If we know that our models are relying on these microscopic patterns that we don’t see, then we can’t pretend that they are interpretable in a human fashion,” says Santurkar. That may be problematic, say, if someone needs to prove in court that a facial recognition algorithm identified them incorrectly. There might not be a way to account for why the algorithm thought they were someone they’re not.
Engineers may ultimately need to make a choice between building automated systems that are the most accurate and ones that are the most similar to humans. If you force an algorithm to rely solely on robust features, there’s a chance it might make more mistakes than if it also used hidden, non-robust ones. But if the AI also leans on those invisible features, it may be more susceptible to attacks like adversarial examples. As image recognition tech is increasingly used for tasks like identifying hate speech and scanning luggage at the airport, deciding how to navigate these kinds of trade-offs will only become more important.