Friday, May 22, 2015

MPEG’s Compact Descriptors for Visual Search (CDVS)

Recently, I found these new documents regarding MPEG-7, the Multimedia Content Description Interface.

Past decades have seen an exponential growth in usage of digital media. Early solutions to the management of these massive amounts of digital media fell short of expectations, stimulating intensive research in areas such as Content Based Image Retrieval (CBIR) and, most recently, Visual Search (VS) and Mobile Visual Search (MVS).

The field of Visual Search has been researched for more than a decade leading to recent deployments in the marketplace. As many companies are coming up with proprietary solutions to address the VS challenges, resulting in a fragmented technological landscape and a plethora of non-interoperable systems, MPEG introduces a new worldwide standard for the VS and MVS technology.

MPEG’s Compact Descriptors for Visual Search (CDVS) aims to standardize technologies, in order to enable an interoperable, efficient and cross-platform solution for internet-scale visual search applications and services.

The forthcoming CDVS standard is particularly important because it will ensure interoperability of visual search applications and databases, enable a high level of performance in implementations conformant to the standard, and simplify the design of descriptor extraction and matching for visual search applications. It will also enable low-complexity, low-memory hardware support for descriptor extraction and matching in mobile devices, and significantly reduce the load on wireless networks carrying visual search-related information. All this will stimulate the creation of an ecosystem benefiting consumers, manufacturers, and content and service providers alike.


MAV Urban Localization from Google Street View Data

A.L. Majdik, D. Verda, Y. Albers-Schoenberg, D. Scaramuzza, Air-ground Matching: Appearance-based GPS-denied Urban Localization of Micro Aerial Vehicles, Journal of Field Robotics, 2015.

Read the paper

The state of the art in robotics - highlights from the ICRA 2015 conference

Dyson 360 Eye and Baidu Deep Learning at the Embedded Vision Summit in Santa Clara

Article From

While vision has been a research priority for decades, the results have often remained out of reach of the consumer. Huge strides have been made, but the final, and perhaps toughest, hurdle is how to integrate vision into real world products. It’s a long road from concept to finished machine, and to succeed, companies need clear objectives, a robust test plan, and the ability to adapt when those fail. 

Image from ExtremeTech: Dyson 360 Eye: Dyson’s ‘truly intelligent’ robotic vacuum cleaner is finally here

The Dyson 360 Eye robot vacuum cleaner uses computer vision as its primary localization technology. Ten years in the making, it was taken from bleeding-edge academic research to a robust, reliable and manufacturable solution by Mike Aldred and his team at Dyson.

Mike Aldred’s keynote at next week's Embedded Vision Summit (May 12th in Santa Clara) will chart some of the highs and lows of the project, the challenges of bridging academia and business, and how to use a diverse team to take an idea from the lab into real homes.

Enabling Ubiquitous Visual Intelligence Through Deep Learning

Ren Wu 

Distinguished Scientist, Baidu Institute of Deep Learning

Deep learning techniques have been making headlines lately in computer vision research. Using techniques inspired by the human brain, deep learning employs massive replication of simple algorithms which learn to distinguish objects through training on vast numbers of examples. Neural networks trained in this way are gaining the ability to recognize objects as accurately as humans. Some experts believe that deep learning will transform the field of vision, enabling the widespread deployment of visual intelligence in many types of systems and applications. But there are many practical problems to be solved before this goal can be reached. For example, how can we create the massive sets of real-world images required to train neural networks? And given their massive computational requirements, how can we deploy neural networks into applications like mobile and wearable devices with tight cost and power consumption constraints? 

Ren Wu’s morning keynote at next week's Embedded Vision Summit (May 12th in Santa Clara) will share an insider’s perspective on these and other critical questions related to the practical use of neural networks for vision, based on the pioneering work being conducted by his team at Baidu.

Read More

Love affair with CBIR

Article from Technical Insanity

In this article, I will walk through the sample application I have created to demonstrate how CBIR works. This is a continuation of a series of articles, so if you can't follow this post, read the previous ones to understand the context.

In the last article, you probably read about the theory involved in finding similar images. But you won’t get it until you see working code. That was my frustration too when I was researching CBIR. I found tons of papers from various universities around the world, but it was hard to find a working demo, especially in .NET. I googled for days and found blogs and articles, but most of them don’t share implementation code. Most computer vision work happens in C++, Matlab, or Python; there are very few takers for .NET languages.

That’s why I decided to help poor souls like me who are searching for a CBIR implementation in .NET. They should benefit from this proof-of-concept application, especially college students who are trying to learn the C# language and the .NET framework and to build CBIR, all at the same time. This would be a boon for them, like an oasis in the desert of the internet.

I will be showing you a demo on the Wang image dataset. You can download 1000 test images, which contain 10 sets of 100 images each. This will help you understand how reverse image search works, and how effective the algorithm is at finding similar images.
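To make the idea of reverse image search concrete, here is a minimal sketch in Python. It is not the article's .NET application, and it does not use EMGU CV or the Accord Framework. It assumes one simple, well-known CBIR feature (a coarse color histogram compared by histogram intersection), and it uses small hand-made pixel lists in place of real images. The names `color_histogram`, `histogram_intersection`, `search`, and the toy image names are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Quantize each RGB pixel into a coarse bin and return a normalized histogram."""
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bin_: n / total for bin_, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: the sum of per-bin minimums of two normalized histograms."""
    return sum(min(value, h2.get(bin_, 0.0)) for bin_, value in h1.items())


def search(query_pixels, index, top_k=3):
    """Rank indexed images by similarity to the query, most similar first."""
    query_hist = color_histogram(query_pixels)
    scored = [(histogram_intersection(query_hist, hist), name)
              for name, hist in index.items()]
    # Sort by descending score; break ties by name so the order is deterministic.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:top_k]]


# Toy stand-ins for images (assumption: each "image" is just a list of RGB tuples).
database = {
    "beach_1": [(230, 200, 150)] * 60 + [(40, 120, 220)] * 40,
    "beach_2": [(225, 205, 140)] * 50 + [(50, 110, 210)] * 50,
    "forest_1": [(20, 120, 30)] * 90 + [(90, 60, 20)] * 10,
}

# Indexing step: compute every histogram once, ahead of any query.
index = {name: color_histogram(pixels) for name, pixels in database.items()}

# Query step: a sand-and-sea colored image should retrieve the beach images.
query = [(235, 195, 145)] * 58 + [(45, 115, 215)] * 42
results = search(query, index, top_k=2)
print(results)
```

On a labeled collection such as the Wang dataset, one plausible way to judge a feature like this is to take each image as a query and count how many of the top results come from the same set of 100 images as the query.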

You can download Image Database application from


Note: This application, which I have built, is under the GPLv3 license, but the libraries it uses (EMGU CV, Accord Framework, etc.) aren’t under the same license. If you need to use it in a commercial application, please check the licenses of the individual libraries used in this application, and use them appropriately.

This utility has been developed in WPF on .NET Framework 4.5 as a proof of concept for various image recognition algorithms.

This utility consists of three areas


Continue Reading