Recently, standard benchmark databases and evaluation campaigns have been created allowing a quantitative comparison of CBIR systems. These benchmarks allow the comparison of image retrieval systems under different aspects: usability and user interfaces, combination with text retrieval, or overall performance of a system.
1. WANG database
The WANG database is a subset of 1,000 images of the Corel stock photo database which have been manually selected and which form 10 classes of 100 images each. The WANG database can be considered similar to common stock photo retrieval tasks with several images from each category and a potential user having an image from a particular category and looking for similar images which have e.g. cheaper royalties or which have not been used by other media. The 10 classes are used for relevance estimation: given a query image, it is assumed that the user is searching for images from the same class, and therefore the remaining 99 images from the same class are considered relevant and the images from all other classes are considered irrelevant
2. The MIRFLICKR-25000 Image Collection
Access to the collection is simple and reliable, with image copyright clearly established. This is realized by selecting only images offered under the Creative Commons license. See the copyright section below.
Images are also selected based on their high interestingness rating. As a result the image collection is representative for the domain of original and high-quality photography.
In particular for the research community dedicated to improving image retrieval. We have collected the user-supplied image Flickr tags as well as the EXIF metadata and make it available in easy-to-access text files. Additionally we provide manual image annotations on the entire collection suitable for a variety of benchmarks.
MIRFLICKR-25000 is an evolving effort with many ideas for extension. So far the image collection, metadata and annotations can be downloaded below. If you enter your email address before downloading, we will keep you posted of the latest updates.
3. UW database
The database created at the University of Washington consists of a roughly categorized collection of 1,109 images.These images are partly annotated using keywords. The remaining images were annotated by our group to allow the annotation to be used for relevance estimation; our annotations are publicly available10.The images are of various sizes and mainly include vacation pictures from various locations. There are 18 categories,for example “spring ﬂowers”, “Barcelona”, and “Iran”. Some example images with annotations are shown in Figure 2. The complete annotation consists of 6,383 words with a vocabulary of 352 unique words. On the average, each image has about 6 words of annotation. The maximum number of key-words per image is 22 and the minimum is 1. The database is freely available11. The relevance assessment for the experiments with this database were performed using the annotation: an image is considered to be relevant w.r.t. a given query image if the two images have a common keyword in the annotation. On the average, 59.3 relevant images correspond to each image. The keywords are rather general; thus for example images showing sky are relevant w.r.t. each other,which makes it quite easy to ﬁnd relevant images (high precision is likely easy) but it can be extremely diﬃcult to obtain a high recall since some images showing sky might have hardly any visual similarity with a given query.This task can be considered a personal photo retrieval task,e.g. a user with a collection of personal vacation pictures is looking for images from the same vacation, or showing the same type of building.
4. IRMA-10000 database
5. ZuBuD database
The “Zurich Buildings Database for Image Based Recognition”(ZuBuD) is a database which has been created by the Swiss Federal Institute of Technology in Zurich. The database consists of two parts, a training part of 1,005images of 201 buildings, 5 of each building and a query part of 115 images. Each of the query images contains one of the buildings from the main part of the database. The pictures of each building are taken from diﬀerent viewpoints and some of them are also taken under diﬀerent weather conditions and with two diﬀerent cameras. Given a query image, only images showing exactly the same building are considered relevant.
6. UCID database (Suggested)
The UCID database13 was created as a benchmark database for CBIR and image compression applications. This database is similar to the UW database as it consists of vacation images and thus poses a similar task.For 264 images, manual relevance assessments among all database images were created, allowing for performance evaluation. The images that are judged to be relevant are images which are very clearly relevant, e.g. for an image showing a particular person, images showing the same person are searched and for an image showing a football game, images showing football games are considered to be relevant. The used relevance assumption makes the task easy on one hand,because relevant images are very likely quite similar, but on the other hand, it makes the task diﬃcult, because there are likely images in the database which have a high visual similarity but which are not considered relevant. Thus, it can be diﬃcult to have high precision results using the given rel-evance assessment, but since only few images are considered relevant, high recall values might be rather easy to obtain.
<Yaroslav Bulatov> I've collected this dataset for a project that involves automatically reading bibs in pictures of marathons and other races. This dataset is larger than robust-reading dataset of ICDAR 2003 competition with about 20k digits and more uniform because it's digits-only. I believe it is more challenging than the MNIST digit recognition dataset.
I'm now making it publicly available in hopes of stimulating progress on the task of robust OCR. Use it freely, with only requirement that if you are able to exceed 80% accuracy, you have to let me know ;)
The dataset file contains raw data (images), as well as Weka-format ARFF file for simple set of features.
For completeness I include matlab script used to for initial pre-processing and feature extraction, Python script to convert space-separated output into ARFF format. Check "readme.txt" for more details.
- Database of thousands of weakly labelled, high-res images. Please, click here to download the database.
- Pixel-wise labelled image database v1 (240 images, 9 object classes). Please, click here to download the database. This database was used in paper 1 below and in the above demo video.
- Pixel-wise labelled image database v2(591 images, 23 object classes). Please, click here to download the database.
- Pixel-wise labelled image database of textile materials. Please, click here to download the database.