
Neural Networks: What They Are and How They Know the Internet is Full of Cats

By Will Smith

Google built a 16,000-CPU neural network to parse images from YouTube videos. We're excited about this, but not for the reasons you might think.

Cats. The Internet is full of them. Researchers at Google X, the Goog's in-house skunkworks, have created a neural network out of a massive cluster of computers to detect patterns in YouTube videos. What patterns did the cluster detect? Cat faces. Somehow, I bet you aren't surprised.

It's important to understand exactly what's happened here, because it's simultaneously very exciting and kind of mundane. Detecting patterns in images is something that the human brain is so exceptionally good at that you don't realize how difficult a task it actually is. The fact that you can recognize that those shiny things in the carpark are, in fact, automobiles is amazing. The fact that you can pick your car out from a group of hundreds is damn near miraculous.

If your brain is atop the leaderboard for pattern recognition, computers are near the bottom, somewhere below brine shrimp and some forms of protozoa. Anyone who has used facial recognition software in popular photo managers knows exactly how bad computers are at detecting faces. Conversely, studies have shown that parts of the human brain actually specialize in detecting faces. This is why the Thatcher Effect optical illusion works. You can thank the neural network in your brain for that.

Computer-based neural networks have much greater success at recognizing patterns in data than traditional computational models. They do this by mimicking the massively connected nature of neurons. The simulated neurons are arranged in layers, with each neuron in a layer connected to all the neurons in the layers above and below it. This is a gross oversimplification, but data enters the input layer of the network, which triggers a series of signals. Those signals propagate through the network and eventually exit the output side. That output contains the information that the neural network uncovered.
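If you'd like to see what that looks like in code, here's a toy forward pass written in Python with NumPy. The layer sizes and random weights are made-up illustrative values; Google's actual network was vastly larger and more sophisticated.

    import numpy as np

    def sigmoid(x):
        # Squash each neuron's summed input into (0, 1), a stand-in for
        # how strongly that neuron "fires."
        return 1.0 / (1.0 + np.exp(-x))

    rng = np.random.default_rng(0)

    # Three layers: 4 inputs -> 5 hidden neurons -> 2 outputs. Each weight
    # matrix connects every neuron in one layer to every neuron in the next.
    w_hidden = rng.normal(size=(4, 5))
    w_output = rng.normal(size=(5, 2))

    x = np.array([0.2, 0.9, 0.1, 0.5])   # data entering the input layer
    hidden = sigmoid(x @ w_hidden)       # signals propagate to the hidden layer
    output = sigmoid(hidden @ w_output)  # ...and exit the output side
    print(output)                        # the information the network uncovered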

Typically, before you can use a neural network to detect a pattern in data, you need to seed it with examples of the pattern you want to find. This trains the network to look for specific patterns. If you want to find pictures of cats, you seed the network with bunches of pictures of cats.
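In code, that training step is a loop: show the network a labeled example, measure how wrong its guess was, and nudge its weights in the direction that makes it less wrong. Here's a deliberately tiny sketch using a single-layer network and invented "cat features" standing in for real image data.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    rng = np.random.default_rng(1)

    # Fake feature vectors for 100 "images," with labels: 1 = cat, 0 = not cat.
    # The pattern (features 0 and 2 are both high) is invented for the demo.
    features = rng.normal(size=(100, 4))
    labels = (features[:, 0] + features[:, 2] > 0).astype(float)

    weights = np.zeros(4)
    for _ in range(500):
        guesses = sigmoid(features @ weights)
        # Nudge the weights to shrink the gap between guesses and labels.
        # Repeating this is the "seeding" that tunes the network to the pattern.
        weights -= 0.5 * (features.T @ (guesses - labels)) / len(labels)

    print(sigmoid(features @ weights).round()[:10])  # should mostly match...
    print(labels[:10])                               # ...the true labels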

The Google X team did something a little different. They ran millions of images culled from YouTube videos through the neural network, but they didn't seed the network with patterns first. Instead, they just let the algorithm find patterns in the data on its own, all of the patterns. They reported that the network found human faces, human bodies, and cats. The top-performing networks were almost twice as accurate at detection as previous efforts.
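The unsupervised version looks different in code: instead of labels, you give the network a job, like compressing each input and reconstructing it, and it has to discover whatever patterns make that possible. Google's actual system was a deep, sparse autoencoder spread across thousands of machines; this little linear autoencoder is only meant to show the principle, with a made-up correlated pattern standing in for all those cat faces.

    import numpy as np

    rng = np.random.default_rng(2)

    # Unlabeled "images": 200 samples, 8 features. Features 0 and 1 always
    # move together; that hidden correlation is our stand-in for a cat face.
    latent = rng.normal(size=(200, 1))
    data = np.hstack([latent, latent, rng.normal(scale=0.1, size=(200, 6))])

    # A linear autoencoder with a 2-unit bottleneck: compress, reconstruct,
    # and minimize reconstruction error. No labels anywhere.
    w = rng.normal(scale=0.1, size=(8, 2))
    for _ in range(2000):
        hidden = data @ w     # compress each sample to 2 numbers
        recon = hidden @ w.T  # try to rebuild the original 8 from them
        g = 2 * (recon - data) / len(data)
        w -= 0.05 * (data.T @ (g @ w) + g.T @ (data @ w))  # gradient step

    # The learned weights now concentrate on the correlated features; the
    # network "found the pattern" without ever being told what to look for.
    print(np.abs(w).round(2))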

(Image caption: This is not a cat.)

The big news here isn't that the Internet is full of cats. We knew that. The proof of concept that neural networks still work when they're scaled up to massive clusters of machines and equally massive data sets is the huge step forward. Google X processed 10,000,000 images using a 16,000-CPU cluster in about three days, and the images were significantly larger than the ones typically used in this kind of research.

However, that doesn't mean this type of processing will be coming to your Android phone anytime soon. And your computer still doesn't understand the philosophical implications of Magritte paintings; it doesn't grasp the metaphysical difference between a picture of a cat and a cat. Even the fanciest neural network yet built is nothing more than a pattern recognizer.