Thursday, August 2, 2012

MIT offers a new programming language for the visual web

Article from http://gigaom.com/2012/08/01/mit-offers-a-new-programming-language-for-the-visual-web/

MIT released Halide, a programming language that makes it easier to process photos without resorting to slow, custom algorithms. Halide might be the software equivalent of a sewing machine for sites such as Instagram that previously had to stitch their image-processing code together by hand.

Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory have built a new programming language called Halide that they hope will make writing image-processing software easier. The language is built specifically for working with images in constrained compute environments and would replace the custom algorithms currently written to perform those image-processing functions.

If so, that’s a good thing, given our love of visually rich web sites and our predilection for snapping and sharing mobile photos as easily as we once made voice calls. Not only will image-processing software be easier to write, but Halide might also help spare our mobile batteries by using our processors more efficiently.

The challenge of computer sight

Getting a digital camera to “see” like the human eye is no easy task. Heck, exactly how the human eye “sees” isn’t fully understood in the first place. Fortunately for us, our brains handle all the complexities of compensating for lighting, discerning color from different wavelengths and pulling it all together into something meaningful to a human being. But for cameras on mobile phones, or pictures sent to the web for editing, image processing is the result of many different steps, all of which take a lot of processing power.

The MIT folks explain it like this in their release on Halide:

One reason that image processing is so computationally intensive is that it generally requires a succession of discrete operations. After light strikes the sensor in a cellphone camera, the phone combs through the image data for values that indicate malfunctioning sensor pixels and corrects them. Then it correlates the readings from pixels sensitive to different colors to deduce the actual colors of image regions. Then it does some color correction, and then some contrast adjustment, to make the image colors better correspond to what the human eye sees. At this point, the phone has done so much processing that it takes another pass through the data to clean it up.

And that’s just to display the image on the phone screen.
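To make the cost concrete, here is a hypothetical, much-simplified version of that pattern in plain C++. The stages and constants are invented for illustration (they are not from the MIT release or any real camera pipeline); the point is that each stage is its own loop over the whole image, so every stage means another full trip through memory.

    #include <cstdint>
    #include <vector>

    // Illustrative only: two stages of a simplified pipeline, written the
    // conventional way. Each stage makes a separate full pass over the
    // image and materializes a complete intermediate buffer.
    std::vector<uint8_t> fix_dead_pixels(std::vector<uint8_t> img) {
        for (auto &px : img)
            if (px == 0) px = 128;  // stand-in for real defect correction
        return img;                 // first full trip through memory
    }

    std::vector<uint8_t> adjust_contrast(std::vector<uint8_t> img) {
        for (auto &px : img)
            px = static_cast<uint8_t>(px * 9 / 10 + 12);  // stand-in curve
        return img;                 // second full trip through memory
    }

    int main() {
        std::vector<uint8_t> raw(8 * 1024 * 1024);  // one 8-megapixel channel
        // A real pipeline chains four or five such passes, so the whole
        // multi-megabyte image gets hauled through memory again and again.
        auto out = adjust_contrast(fix_dead_pixels(raw));
        return out.empty();
    }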

The problem is getting bigger and the features are getting richer

The problem is that our many-megapixel cameras are gathering more information, which takes the relatively weak processors on a mobile phone a lot longer to turn into an image, never mind editing it for red-eye correction or balancing the light. Hence the need for fancy algorithms that can divvy up that processing among the multiple cores present in desktops and phones. But as the bits in our photos bloat, so do those algorithms, becoming longer, more complex and device-dependent.

That’s what Halide aims to solve. Those algorithms are still useful, but instead of making the algorithm worry about how to divide the job among the available processors, Halide splits the program in two: a scheduler that worries about where to send the data, and the algorithm itself, which worries about the actual processing. This means the programmer can adapt a program to different machines by adjusting only the scheduler (after all, that’s the part that cares about how many cores the processor has), and she can also try out new optimizations in the scheduler without having to rewrite the algorithm.
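Halide itself is embedded in C++. As a rough sketch of that separation, here is a small two-stage blur pipeline modeled on the example the Halide authors use; treat the exact API spelling as approximate, since it has varied across Halide versions. The first two definitions are the algorithm; the calls after them are the scheduler’s half of the program.

    #include "Halide.h"
    using namespace Halide;

    int main() {
        ImageParam input(UInt(16), 2);  // a two-dimensional 16-bit image
        Func blur_x("blur_x"), blur_y("blur_y");
        Var x("x"), y("y"), xi("xi"), yi("yi");

        // The algorithm: a separable 3x3 box blur, stated purely as what
        // each output pixel is in terms of its neighbors.
        blur_x(x, y) = (input(x - 1, y) + input(x, y) + input(x + 1, y)) / 3;
        blur_y(x, y) = (blur_x(x, y - 1) + blur_x(x, y) + blur_x(x, y + 1)) / 3;

        // The schedule: how to run it on this machine. Tile the output,
        // vectorize the innermost loop and spread rows across CPU cores.
        blur_y.tile(x, y, xi, yi, 256, 32)
              .vectorize(xi, 8)
              .parallel(y);
        // Compute the intermediate stage inside each tile so it stays in
        // cache, instead of making a separate full pass over the image.
        blur_x.compute_at(blur_y, x).vectorize(x, 8);

        return 0;
    }

Retargeting the pipeline to a machine with a different core count or vector width means editing only the scheduling calls; the two algorithm lines never change.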

By rewriting some common image-processing algorithms in Halide, researchers were able to make image processing two or three times faster, or even six times faster, while also making the code about a third shorter. The MIT release notes that in one instance the Halide program was actually longer than the original, but the speedup was 70-fold.

The code, which was developed by Jonathan Ragan-Kelley, a graduate student in the Department of Electrical Engineering and Computer Science, and Andrew Adams, a CSAIL postdoc, is available online.
