AI Applications in the fields of Multimedia, Computer Vision and Robotics: TOP-SURF

Monday, December 6, 2010

TOP-SURF

Αναρτήθηκε από Savvas Chatzichristofis

TOP-SURF is an image descriptor that combines interest points with visual words, resulting in a high performance yet compact descriptor that is designed with a wide range of content-based image retrieval applications in mind. TOP-SURF offers the flexibility to vary descriptor size and supports very fast image matching. In addition to the source code for the visual word extraction and comparisons, we also provide a high level API and very large pre-computed codebooks targeting web image content for both research and teaching purposes.

Authors
Bart Thomee, Erwin M. Bakker and Michael S. Lew

Licenses
The TOP-SURF descriptor is completely open source, although the libraries it depends on use different licenses. As the original SURF descriptor is closed source, we used the open source alternative called OpenSURF, which is released under the GNU GPL version 3 license. OpenSURF itself is dependent on OpenCV that is released under the BSD license. Furthermore we used FLANN for approximate nearest neighbor matching, which is also released under the BSD license. To represent images we used CxImage, which is released under the zlib license. Our own code is licensed under the GNU GLP version 3 license, and also under the Creative Commons Attribution version 3 license. The latter license simply asks you give us credit whenever you use our library. All the aforementioned open source licenses are compatible with each other.

Visualization of the visual words

Figure 1. Visualizing the visual words.

Comparing descriptors

Figure 2. Comparing the descriptors of several images using cosine normalized difference, which ranges between 0 (identical) and 1 (completely different). The first image is the original image, the second is the original image but significantly changed in saturation, the third image is the original image but framed with black borders and the fourth image is a completely different one. Using a dictionary of 10,000 words, the distance between the first and second images is 0.42, the distance between the first and the third is 0.64 and the distance between the first and the fourth is 0.98. We have noticed that a (seemingly high) threshold of around 0.80 appears to be able to separate the near-duplicates from the non-duplicates, although this value requires more validation.

http://press.liacs.nl/researchdownloads/topsurf/

Pages

Monday, December 6, 2010

TOP-SURF

No comments: