Taking inspiration from genetic screening techniques, researchers from Harvard and MIT have demonstrated a way to build better artificial visual systems with the help of low-cost, high-performance gaming hardware. The neural processing involved in visually recognizing even the simplest object in a natural environment is profound -- and profoundly difficult to mimic. Neuroscientists have made broad advances in understanding the visual system, but much of the inner workings of biologically based systems remains a mystery.
Using graphics processing units (GPUs), the same technology video game designers use to render lifelike graphics, researchers are now making progress faster than ever before. A new study, co-led by David Cox, Principal Investigator of the Visual Neuroscience Group at the Rowland Institute at Harvard, and Nicolas Pinto, a Ph.D. candidate in James DiCarlo's laboratory at the McGovern Institute for Brain Research and the Department of Brain and Cognitive Sciences at MIT, was published in the November 26th issue of PLoS Computational Biology.
"Reverse engineering a biological visual system -- a system with hundreds of millions of processing units -- and building an artificial system that works the same way is a daunting task," says Cox. "It is not enough to simply assemble together a huge amount of computing power. We have to figure out how to put all the parts together so that they can do what our brains can do."
"While studying the brain has yielded critical information about how the brain is wired, we currently don't have enough information to build a computer system that works like the brain does," adds Pinto. "Even if we take all of the clues that we have available from experimental neuroscience, there is still an enormous range of possible models for us to explore."
To tackle this problem, the team drew inspiration from screening techniques in molecular biology, where a multitude of candidate organisms or compounds are screened in parallel to find those that have a particular property of interest. Rather than building a single model and seeing how well it could recognize visual objects, the team constructed thousands of candidate models, and screened for those that performed best on an object recognition task.
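The screening idea can be illustrated with a minimal Python sketch. It is not the study's code: the parameter names (`num_layers`, `filter_size`, `threshold`) and the scoring formula are hypothetical placeholders standing in for building a real vision model and measuring its accuracy on an object recognition task. The sketch only shows the overall shape of the procedure: generate many candidate configurations at random, score each one, and keep the best performers.

```python
import random


def sample_candidate(rng):
    """Draw one hypothetical model configuration at random."""
    return {
        "num_layers": rng.choice([1, 2, 3]),
        "filter_size": rng.choice([3, 5, 7, 9]),
        "threshold": rng.uniform(0.0, 1.0),
    }


def score_candidate(params):
    """Placeholder for building a model and testing it on a recognition task.

    The formula below is invented for illustration; a real screen would
    train and evaluate an actual model here and return its accuracy.
    """
    return (
        0.5
        + 0.1 * params["num_layers"]
        - 0.02 * abs(params["filter_size"] - 5)
        - 0.2 * abs(params["threshold"] - 0.3)
    )


def screen(num_candidates, top_k, seed=0):
    """Score many random candidates and return the top_k as (score, params)."""
    rng = random.Random(seed)
    candidates = [sample_candidate(rng) for _ in range(num_candidates)]
    scored = [(score_candidate(p), p) for p in candidates]
    # Sort on the score only, so the parameter dicts are never compared.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    for score, params in screen(num_candidates=1000, top_k=5):
        print(round(score, 3), params)
```

Because each candidate is scored independently of the others, the scoring step is the part that lends itself to parallel hardware.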
The resulting models outperformed a crop of state-of-the-art computer vision systems across several test sets, more accurately identifying objects on random natural backgrounds despite variation in position, scale, and rotation.
On conventional central processing units (CPUs), the effort would have required either years of computation or millions of dollars of hardware. By harnessing modern graphics hardware instead, the team completed the analysis in just one week, at a small fraction of the cost.
"GPUs are a real game-changer for scientific computing. We made a powerful parallel computing system from cheap, readily available off-the-shelf components, delivering over hundred-fold speed-ups relative to conventional methods," says Pinto. "With this expanded computational power, we can discover new vision models that traditional methods miss."
This high-throughput approach could be applied to other areas of computer vision, such as face identification, object tracking, pedestrian detection for automotive applications, and gesture and action recognition. Moreover, as scientists learn which components make a good artificial vision system, they can use those insights to guide their study of real brains.
"Reverse and forward engineering the brain is a virtuous cycle. The more we learn about one, the more we can learn about the other," says Cox. "Tightly coupling experimental neuroscience and computer engineering holds the promise to greatly accelerate both fields."
Cox and Pinto's co-authors included David Doukhan and James J. DiCarlo, both of the McGovern Institute for Brain Research and the Department of Brain and Cognitive Sciences at MIT. Hardware for the study was donated by NVIDIA Corporation.