Saturday, December 27, 2008

A new section of the Compact Composite Descriptors (CCD Layer 2) is now ready.

Two new papers will be presented at the Sixth IASTED International Conference on Signal Processing, Pattern Recognition and Applications (SPPRA 2009).

In the first paper we present a new low-level compact composite descriptor for content-based medical image retrieval.
Abstract: The rapid advances made in the field of radiology, the increased frequency in which oncological diseases appear, as well as the demand for prevailing medical checks, led to the creation of a large database of radiology images in every hospital or medical center. There is now an imperative need to create an effective method for the indexing and retrieval of these images. This paper proposes a new method for content based medical image retrieval. The description of images relies on a new Composite Descriptor (CD) which includes global image features, capturing both brightness and texture characteristics at the same time. Image information is extracted using a set of fuzzy approaches. To be applicable in the design of large medical image databases, the proposed descriptor is compact, requiring only 48 bytes per image. Experiments demonstrate the effectiveness of the proposed technique. Authors: Savvas A. Chatzichristofis and Yiannis Boutalis.

The second paper presents a method for automatically selecting the proper compact composite descriptor for retrieving natural color images.
Abstract: Compact Composite Descriptors (CCD) are global image features capturing both, color and texture characteristics, at the same time in a very compact representation. In this paper we propose a combination of two recently introduced CCDs (CEDD and FCTH) into a Joint Composite Descriptor (JCD). We further present a method for descriptor selection to approach the best ANMRR that would result from CEDD and FCTH. With our approach the most appropriate descriptor in terms of maximization of information content can be found on a per image basis without knowledge of the data set as a whole. Experiments conducted on three known benchmarking image databases demonstrate the effectiveness of the proposed technique. Authors: Savvas A. Chatzichristofis, Mathias Lux and Yiannis Boutalis.

The descriptors will be added to the CCD section soon.

Monday, December 22, 2008

Video summary generation: a Java-based open source project

Mathias Lux is working on a summary tool that extracts still images from a video. The goal of the tool is to find the frames that describe the video in an optimal way. It is now in a (rather) stable state and ready for release. For Windows users it should take a single click to start it; on Linux you need to install ffmpeg first. Note that the tool is open source and the code is licensed under the GPL.


Saturday, December 20, 2008

New search-by-style options for Google Image Search

Many of us use Google Image Search to find imagery of people, clip art for presentations, diagrams for reports, and of course symbols and patterns for artistic inspiration. Unfortunately, searching for the perfect image can be challenging if the search results match the meaning of your query but aren't in a style that's useful to you. So some time ago we launched face search, which lets you limit your search results to only images containing faces (see a search without and with this option). More recently we also rolled out photo search, which limits results to images that contain photographic elements, ignoring many cartoons and drawings which may not be useful to you (see a search without and with this option). Today we're pleased to extend this capability to clip art and line drawings. To see the effect of these new options, let's take a look at the first few results for "Christmas," one of our most popular queries on Image Search right now.

All of these options can be selected from the "Any content" drop down in the blue title bar on any search results page, or by selecting one of the "Content types" on the Advanced Image Search page. The good news: no extra typing! In all these examples our query remained exactly the same; we just restricted our results to different visual styles. So whether you're interested in holiday wreaths, Celtic patterns, or office clip art, it just became a lot easier to find the images you're looking for.

Thursday, December 18, 2008

Painting of the Mona Lisa using Microsoft paint

I found this amazing video on YouTube.

Original painting time: 2 hrs 30 mins.

Things You Need to Know Before You Buy a Digital Camera

Digital cameras come in many sizes, colors, brands, zooms, resolutions, playbacks, etc. So many features and qualities are being packed into these devices that buyers, especially first-timers, become overwhelmed and dizzy with the array of gadgets. This is even without counting the various advertisements and ratings used to promote these products.
So what should you look for if you want to buy a digital camera? To answer this, there are 2 sets of information you have to know before you can decide. The first is defining what YOU need and want in a digital camera. To do this, you can ask yourself the following questions:
What do you want to take with your digital camera? Before you buy a digital camera, it is important to determine what kind of pictures you want to take with it. If you are a digital photography enthusiast, not just any digital camera will do. You have to look for features that can support the zooming you need, the resolution, etc.
How much is your budget? This is a very important question for anyone who intends to buy a digital camera, because no matter what your needs and wants are for the device, your financial resources will play a huge part in dictating the type of digital camera you will buy.
What are your resources? When you buy a digital camera, sometimes the spending does not end there. You also have to consider the capacity and power of the computer and the printer you will be hooking your camera up to for your editing and printing needs. Editing software is already included when you buy a digital camera, but other devices aren’t. Aside from a printer, ink and paper for printing, you might also need additional memory cards for your camera and a more powerful computer to support image editing, storage and retrieval.

Wednesday, December 17, 2008

Benchmark databases for CBIR

Recently, standard benchmark databases and evaluation campaigns have been created allowing a quantitative comparison of CBIR systems. These benchmarks allow the comparison of image retrieval systems under different aspects: usability and user interfaces, combination with text retrieval, or overall performance of a system.

1. WANG database

The WANG database is a subset of 1,000 images of the Corel stock photo database which have been manually selected and which form 10 classes of 100 images each. The WANG database can be considered similar to common stock photo retrieval tasks, with several images from each category and a potential user who has an image from a particular category and is looking for similar images which have, e.g., cheaper royalties or which have not been used by other media. The 10 classes are used for relevance estimation: given a query image, it is assumed that the user is searching for images from the same class; therefore the remaining 99 images from the same class are considered relevant and the images from all other classes are considered irrelevant.
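
The class-based relevance rule above can be sketched in a few lines of Python. This is only an illustration: the function name `precision_at_k`, the class labels and the ranked list are hypothetical, and the ranking is assumed to exclude the query image itself.

```python
from typing import Sequence


def precision_at_k(query_class: str, ranked_classes: Sequence[str], k: int) -> float:
    """Fraction of the top-k retrieved images that share the query's class.

    Under the WANG convention, an image is relevant exactly when its class
    label equals the class label of the query image.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_classes[:k]
    if not top_k:
        return 0.0
    relevant = sum(1 for image_class in top_k if image_class == query_class)
    return relevant / len(top_k)


# Hypothetical class labels of the first five results for a "beach" query.
ranked = ["beach", "beach", "bus", "beach", "horse"]
p_at_5 = precision_at_k("beach", ranked, 5)
print(p_at_5)
```

Since every class holds 100 images, each query has exactly 99 relevant images, so recall at rank k would be the number of same-class hits divided by 99.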



2. The MIRFLICKR-25000 Image Collection

The new MIRFLICKR-25000 collection consists of 25000 images downloaded from the social photography site Flickr through its public API.


  • OPEN
    Access to the collection is simple and reliable, with image copyright clearly established. This is realized by selecting only images offered under the Creative Commons license. See the copyright section below.
    Images are also selected based on their high interestingness rating. As a result the image collection is representative of the domain of original and high-quality photography.
    The collection is aimed in particular at the research community dedicated to improving image retrieval. We have collected the user-supplied Flickr tags as well as the EXIF metadata and make them available in easy-to-access text files. Additionally, we provide manual image annotations on the entire collection, suitable for a variety of benchmarks.

MIRFLICKR-25000 is an evolving effort with many ideas for extension. So far the image collection, metadata and annotations can be downloaded below. If you enter your email address before downloading, we will keep you posted of the latest updates.



3. UW database

The database created at the University of Washington consists of a roughly categorized collection of 1,109 images. These images are partly annotated using keywords. The remaining images were annotated by our group to allow the annotation to be used for relevance estimation; our annotations are publicly available. The images are of various sizes and mainly include vacation pictures from various locations. There are 18 categories, for example “spring flowers”, “Barcelona”, and “Iran”. Some example images with annotations are shown in Figure 2. The complete annotation consists of 6,383 words with a vocabulary of 352 unique words. On average, each image has about 6 words of annotation. The maximum number of keywords per image is 22 and the minimum is 1. The database is freely available. The relevance assessment for the experiments with this database was performed using the annotation: an image is considered relevant w.r.t. a given query image if the two images have a common keyword in the annotation. On average, 59.3 relevant images correspond to each image. The keywords are rather general; thus, for example, images showing sky are relevant w.r.t. each other, which makes it quite easy to find relevant images (high precision is likely easy), but it can be extremely difficult to obtain a high recall since some images showing sky might have hardly any visual similarity with a given query. This task can be considered a personal photo retrieval task, e.g. a user with a collection of personal vacation pictures is looking for images from the same vacation, or showing the same type of building.
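
As a rough sketch of the keyword-based relevance rule described above, the following Python snippet treats two images as relevant to each other when their annotation sets share at least one keyword. The image ids and keywords are made up for illustration and are not taken from the UW annotation files.

```python
from typing import Dict, List, Set


def relevant_images(query_id: str, annotations: Dict[str, Set[str]]) -> List[str]:
    """Return the ids of all images sharing at least one keyword with the query."""
    query_keywords = annotations[query_id]
    return sorted(
        image_id
        for image_id, keywords in annotations.items()
        # The query itself is skipped; a non-empty intersection means "relevant".
        if image_id != query_id and query_keywords & keywords
    )


# Hypothetical annotations (image id -> set of keywords).
annotations = {
    "img_a": {"sky", "trees", "building"},
    "img_b": {"sky", "water"},
    "img_c": {"flowers", "grass"},
    "img_d": {"building", "people"},
}
print(relevant_images("img_a", annotations))
```

A general keyword such as "sky" makes many images mutually relevant, which matches the observation above that precision is easy to reach while high recall is hard.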


4. IRMA-10000 database

The IRMA database consists of 10,000 fully annotated radiographs taken randomly from medical routine at the RWTH Aachen University Hospital. The images are split into 9,000 training and 1,000 test images. The images are subdivided into 57 classes. The IRMA database was used in the ImageCLEF 2005 image retrieval evaluation for the automatic annotation task. For CBIR, the relevances are defined by the classes: given a query image from a certain class, all database images from the same class are considered relevant.

5. ZuBuD database

The “Zurich Buildings Database for Image Based Recognition” (ZuBuD) is a database which has been created by the Swiss Federal Institute of Technology in Zurich. The database consists of two parts: a training part of 1,005 images of 201 buildings, 5 of each building, and a query part of 115 images. Each of the query images contains one of the buildings from the main part of the database. The pictures of each building are taken from different viewpoints and some of them are also taken under different weather conditions and with two different cameras. Given a query image, only images showing exactly the same building are considered relevant.

6. UCID database (Suggested)

The UCID database was created as a benchmark database for CBIR and image compression applications. This database is similar to the UW database as it consists of vacation images and thus poses a similar task. For 264 images, manual relevance assessments among all database images were created, allowing for performance evaluation. The images that are judged to be relevant are images which are very clearly relevant, e.g. for an image showing a particular person, images showing the same person are searched and for an image showing a football game, images showing football games are considered to be relevant. The relevance assumption used makes the task easy on the one hand, because relevant images are very likely quite similar, but on the other hand it makes the task difficult, because there are likely images in the database which have a high visual similarity but which are not considered relevant. Thus, it can be difficult to obtain high-precision results using the given relevance assessment, but since only few images are considered relevant, high recall values might be rather easy to obtain.

7. Yaroslav Bulatov OCR dataset

<Yaroslav Bulatov> I've collected this dataset for a project that involves automatically reading bibs in pictures of marathons and other races. This dataset is larger than the robust-reading dataset of the ICDAR 2003 competition, with about 20k digits, and more uniform because it's digits-only. I believe it is more challenging than the MNIST digit recognition dataset.
I'm now making it publicly available in hopes of stimulating progress on the task of robust OCR. Use it freely, with the only requirement that if you are able to exceed 80% accuracy, you have to let me know ;)
The dataset file contains raw data (images), as well as a Weka-format ARFF file for a simple set of features.
For completeness I include the MATLAB script used for initial pre-processing and feature extraction, and a Python script to convert space-separated output into ARFF format. Check "readme.txt" for more details.


8. Microsoft Object Class Recognition

  1. Database of thousands of weakly labelled, high-res images.
  2. Pixel-wise labelled image database v1 (240 images, 9 object classes).
  3. Pixel-wise labelled image database v2 (591 images, 23 object classes).
  4. Pixel-wise labelled image database of textile materials.

9. Images from Digital Image Processing, 3rd ed., by Gonzalez and Woods.


1. Deselaers, T., Keysers, D., and Ney, H. 2008. Features for image retrieval: an experimental comparison. Inf. Retr. 11, 2 (Apr. 2008), 77-107.
2. Chatzichristofis, S. A., Zagoris, K., Boutalis, Y. S., and Papamarkos, N. 2009. Accurate image retrieval based on compact composite descriptors and relevance feedback information. International Journal of Pattern Recognition and Artificial Intelligence (IJPRAI), to appear.

Monday, December 15, 2008

Second Update in a Week for img(Rummager)

2008-12-15 1.0.3271
1. Spatial Fuzzy Brightness and Texture Directionality Descriptor (New Descriptor). This descriptor is suitable for Content Based Medical Image Retrieval.
2. New Quantization Tables for Brightness and Texture Directionality Descriptor
3. Gustafson-Kessel Classifier for Color Reduction

Img Retrieval
1. Spatial Fuzzy Brightness and Texture Directionality Descriptor (New Descriptor). This descriptor is suitable for Content Based Medical Image Retrieval.
2. Get the precision-recall graph after the retrieval procedure.
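
For readers unfamiliar with precision-recall graphs, here is a minimal Python sketch of how the points of such a graph can be computed from a ranked result list. It is not img(Rummager)'s implementation; the function name and the sample ranking are assumptions made for illustration.

```python
from typing import List, Tuple


def precision_recall_points(relevance: List[bool], total_relevant: int) -> List[Tuple[float, float]]:
    """Return one (recall, precision) pair for each rank position.

    relevance[i] tells whether the result at rank i + 1 is relevant;
    total_relevant is the number of relevant images in the whole database.
    """
    if total_relevant <= 0:
        raise ValueError("total_relevant must be positive")
    points = []
    hits = 0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
        points.append((hits / total_relevant, hits / rank))
    return points


# Hypothetical ranking: relevant results at ranks 1 and 3, two relevant images overall.
ranking = [True, False, True, False]
points = precision_recall_points(ranking, total_relevant=2)
for recall, precision in points:
    print(f"recall={recall:.2f} precision={precision:.2f}")
```

Plotting precision against recall for these pairs gives the graph; a curve that stays high as recall grows indicates a better ranking.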

Download the latest version

Friday, December 12, 2008

Multiple Image Resizer .NET v2.4.0.0 Released!

With Multiple Image Resizer .NET, you can resize, add borders, add text, overlay images, crop, rotate and flip - with a few simple mouse clicks.
And what's more, Multiple Image Resizer .NET is FREE for personal and educational use! Commercial users of Multiple Image Resizer .NET should buy a commercial use license from us.
Multiple Image Resizer .NET also has a completely customisable user interface that you can arrange to suit yourself.
Have a look at the features page for more information about what Multiple Image Resizer .NET is capable of.

Latest News, 3rd December 2008:
The software has been updated to support the Greek language.
Multiple Image Resizer .NET's user interface now supports 14 different languages. If you would like to see the software in your own language then why not provide a translation - see the translate page for more information.

Multiple Image Resizer .NET Version is now available for download.


Imprezzeo, backed by Independent News and Media PLC, has launched its new image search product, providing a new way of searching for digital images. The beta product uses images, rather than text, to search for images.
Press Release:
Images can be scenes; landmarks; objects; graphics; people or even personalities. Irrespective of the size of the image collection Imprezzeo Image Search helps users find the right image - fast. For a demonstration, visit
The potential for this technology is huge. Stock photo libraries and news agencies can provide more relevant search results to image buyers (and sell content that might not have been found using traditional text-based search); search engines can provide users with a far more sophisticated image search experience than is currently available; photo sharing sites can offer search by example image, rather than search that solely relies on user’s tagging; consumers can organise their personal photo collections by content and person rather than by date; even retailers can recommend similar products for purchase (for example, if a consumer is searching for a ‘red handbag’, Imprezzeo could be used to find all similar products).
The proprietary search software is more sophisticated than any other image-based search technology on the market, combining content-based image retrieval (CBIR) and facial recognition technologies. Imprezzeo’s sophisticated image analytics ensures the Imprezzeo platform can deliver great results when applied to a whole range of different image content and when used in a whole range of applications.
The new technology generates image search results that closely match a sample image either chosen by the user from an initial set of search results that can then be refined, or from an image uploaded by the user.
Dermot Corrigan, CEO of Imprezzeo, says: “This will fundamentally change the way users and consumers expect to search for images, whether that’s in a photo stock library, the desktop or the web. We know that currently many image searches are abandoned at the first set of results, because the returned results are not what the user is looking for. This technology changes that.”

Thursday, December 11, 2008

Image Processing Using C#

Please Update your bookmarks: New URL:

Sunday, December 7, 2008

img(Rummager) Update

Change Log


1. Tamura Texture Directionality Histogram (Bug Fixed)

2. Auto Correlograms using several methods and max distances

3. Color Histogram Crisp Linking

4. Brightness and Texture Directionality Descriptor (New Descriptor)

5. Scalable Fuzzy Brightness and Texture Directionality Descriptor (New Descriptor)

6. Spatial Color Layout (Beta Version) (New Descriptor)

7. Color Reduction Using Gustafson Kessel

Img Retrieval

8. Joint Composite Descriptor (Final Version) (New Descriptor)

9. Auto Descriptor Selector (Final Version)

10. Color Histograms (RGB)

11. Auto Correlogram

12. Tamura Texture 

Evaluation Methods

13. Evaluate retrieval results using ANMRR and/or Mean Average Precision


14. Faster creation of index files

15. .Net Framework 3.5 Support


16. Retrieve images from sketches using the beta version of "Spatial Color Layout" (New Descriptor)

17. Retrieve images from sketches using "Color Layout Descriptor"
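
The change log above mentions evaluating retrieval results with Mean Average Precision. As a hedged illustration only (this is not img(Rummager)'s code, and ANMRR is not covered here), average precision for one query and its mean over several queries can be sketched as follows; the function names and sample data are assumptions.

```python
from typing import List, Tuple


def average_precision(relevance: List[bool], total_relevant: int) -> float:
    """Average of the precision values measured at each relevant result."""
    if total_relevant <= 0:
        raise ValueError("total_relevant must be positive")
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    # Relevant images that were never retrieved contribute zero.
    return precision_sum / total_relevant


def mean_average_precision(runs: List[Tuple[List[bool], int]]) -> float:
    """Mean of the average precision over all queries."""
    if not runs:
        raise ValueError("at least one query is required")
    return sum(average_precision(rel, n) for rel, n in runs) / len(runs)


# Two hypothetical queries: (relevance flags by rank, number of relevant images).
runs = [([True, False, True], 2), ([False, True], 1)]
map_score = mean_average_precision(runs)
print(round(map_score, 4))
```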


Thursday, December 4, 2008

ImageSorter V3 (Beta 3)

ImageSorter is an image browsing application, which allows an automatic sorting by color, date taken, name, or size.
The idea of ImageSorter is to help you find images whose appearance you remember but whose folder you have forgotten. If one or several folders are selected, all images from these folders will be visually arranged such that similar images are close to each other. In this sorted display it will be much easier to find a particular image. Selected images can be copied, moved or deleted (right mouse click). ImageSorter caches thumbnails and sortings, so after images have been loaded once, everything will be much faster.
The current version of ImageSorter is 3.0 BETA 3 for Windows and V2.0.2 for Mac OS X. ImageSorter 3 introduced an Internet image search (Yahoo! and Flickr) and the possibility to search for similar images on the local disk or the Internet. The software profits from greatly improved stability and run-time performance. Furthermore, quite a few suggestions made in the forum have been included. See the change log for a detailed list of changes.
Download the ImageSorter software

Tuesday, December 2, 2008


pixolu is a prototype of an image search system combining keyword search with visual similarity search and semi-automatically learned inter-image relationships. Enter a search term and pixolu searches the image indexes of Yahoo and Flickr. Compared to other image search systems, pixolu retrieves more images in an initial phase. Thanks to a visually sorted display, up to several hundred images can be easily inspected, which in most cases is sufficient to get good representations of the entire result set. The user can quickly identify images which are good candidates for his/her desired search result. In the next step the selected candidate images are used to refine the result by filtering out visually non-similar images from an even larger result set. In addition, pixolu learns the inter-image relationships from the candidate sets of different users. This helps pixolu to suggest other images that are semantically similar to the candidate images.

Monday, December 1, 2008

13th International Conference on Computer Analysis of Images and Patterns (CAIP09)

CAIP is a series of biennial international conferences devoted to all aspects of computer vision, image analysis and processing, pattern recognition and related fields. It has been held since 1985 and the last conference sites were Warsaw (2001), Groningen (2003), Paris (2005), and Vienna (2007). Additional information about the CAIP conference series can be found at:
The proposed scope of CAIP09 includes, but is not limited to, the following areas:
* 3D Vision
* 3D TV
* Biometrics
* Color and texture
* Document analysis
* Graph-based Methods
* Image and video indexing and database retrieval
* Image and video processing
* Image-based modeling
* Kernel methods
* Medical imaging
* Mobile multimedia
* Model-based vision approaches
* Motion Analysis
* Non-photorealistic animation and rendering
* Object recognition
* Performance evaluation
* Segmentation and grouping
* Shape representation and analysis
* Structural pattern recognition
* Tracking
* Applications

Invited speakers (list not yet complete)
David G. Stork (Ricoh Innovations and Stanford University, USA)
Aljoscha Smolic (Fraunhofer Institute for Telecommunications, Germany)

Saturday, November 29, 2008

Yahoo's New VideoTagGame Lets You Tag Within Videos

The transfer of human intelligence to the machine is something the internet makes easy to do. With reCAPTCHA, we keep spammers at bay while helping digitize old books; Amazon's Mechanical Turk lets us crowdsource small tasks to a dynamic human workforce available on demand; and Google Image Labeler makes the tedious task of tagging fun. Now Yahoo is trying to tap into that human machine through their new VideoTagGame, a game that encourages participants to tag sections within a video for better retrieval.
The first VideoTagGame ran back in summer of 2007 during a Yahoo! party in Amsterdam. Now they're ready to take their experiment to the public through the Yahoo! Sandbox so they can collect more statistics on its usage.

The objective of the VideoTagGame is to collect time-based annotations of the video which could then enable the retrieval of relevant parts in a video when a search is performed, rather than returning the entire video itself. These annotations are collected in the context of a multi-player game.

How To Play
To play the VideoTagGame, participants must sign in with their Yahoo! ID and join a new game. There will always be at least three players in each game. After a 3-second countdown, the video will begin to play. As it plays, participants enter tags that correspond to the various parts of the video. When two players agree on a tag (that is, they enter the same tag), they each get points. The closer together the tags were entered, the more points are awarded. After the video ends, participants can then watch as it plays again, this time with the tags overlaid on top of the video.
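
The scoring idea described above (matching tags earn points, and tags entered closer together in time earn more) could look roughly like the Python sketch below. Yahoo's actual formula is not given in the article, so the linear decay, the 10-second window, the 100-point maximum and all names here are purely hypothetical.

```python
from typing import Dict, List, Tuple

Tag = Tuple[str, str, float]  # (player, tag text, video time in seconds)


def score_matches(tags: List[Tag], window: float = 10.0, max_points: int = 100) -> Dict[str, int]:
    """Award points to each pair of different players who entered the same tag.

    Assumption: points fall linearly from max_points (same instant) to zero
    (window seconds apart); matches further apart than the window score nothing.
    """
    scores: Dict[str, int] = {}
    for i, (player_a, text_a, time_a) in enumerate(tags):
        for player_b, text_b, time_b in tags[i + 1:]:
            if player_a == player_b:
                continue
            if text_a.strip().lower() != text_b.strip().lower():
                continue
            gap = abs(time_a - time_b)
            if gap > window:
                continue
            points = round(max_points * (1 - gap / window))
            scores[player_a] = scores.get(player_a, 0) + points
            scores[player_b] = scores.get(player_b, 0) + points
    return scores


# Hypothetical game log: ann and bob agree on "dog" two seconds apart.
tags = [("ann", "dog", 12.0), ("bob", "Dog", 14.0), ("eve", "tree", 30.0), ("ann", "tree", 45.0)]
print(score_matches(tags))
```

Because each agreed tag carries a timestamp, the same log also yields the time-based annotations that the game is meant to collect.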


Friday, November 28, 2008


Yottalook™ is a free radiology-centric web search engine that provides decision support at the point of care using proprietary relevance and ranking algorithms by iVirtuoso. Yottalook™ is designed to provide practicing radiologists with the most important and most relevant information they need at the time of patient care.
Core Technologies

Yottalook™ is based on core technologies developed by iVirtuoso to achieve optimized search results. The first is automated analysis of the search term to understand what the radiologist is trying to look for; this core technology is called "natural query analysis".

Yottalook™ has also developed a thesaurus of medical terminologies that not only identifies synonyms of terms but also defines relationships between terms. This second core technology is called "semantic ontology" and is based on existing medical ontologies that have been enhanced by iVirtuoso, such as RadLex - A Lexicon for Uniform Indexing and Retrieval of Radiology Information Resources developed by the Radiological Society of North America.
The third core technology is a "relevance algorithm" for image search that differentiates medical terms from other words in text associated with medical images and uses them to create the ranking for Yottalook image search.
The fourth core technology is a specialized content delivery system called "Yottalinks" that provides high yield content based on the search term. This content may also be provided by a third party vendor licensing Yottalook search. Yottalook™ can be integrated with any web based medical application so that context relevant information is provided to the physician at the point of care.

Thursday, November 27, 2008

Pen with digital capabilities is a truly innovative and fun way to take notes and record audio

Livescribe's Pulse "smartpen" is part pen, part voice recorder, and part nothing you've ever seen before. Remember Picture Pages? If not, watch this YouTube clip, and then imagine Livescribe's Pulse as the Picture Pages pen on a combination of steroids, hallucinogens, and time-travel pills. It's fun to use, and it could prove to be a groundbreaking, useful tool for students, meeting-hoppers, and journalists.

Tuesday, November 25, 2008

New version of PhotoEnhancer

Version 2.2 of PhotoEnhancer (image enhancement software) is now available here:

The new version features better visualization techniques for all kinds of screens, plus a new feature that better preserves the correctly exposed image regions. PhotoEnhancer 2.2 is the most complete version of this project.

Monday, November 24, 2008


Kitware, a software company with offices in New York and North Carolina, won an initial $6.7 million contract for what is technically called Video and Image Retrieval and Analysis Tool, or VIRAT. In a statement about the contract award, Kitware projected that through its proposed system, “the most high-value intelligence content will be clearly and intuitively presented to the video analyst, resulting in substantial reductions in analyst workload per mission as well as increasing the quality and accuracy of intelligence yield.” Anthony Hoogs, Kitware’s project leader, said, “This project will really make a difference to the war fighter.” To carry out the project, Kitware said it was teaming up with two leading military technology companies, Honeywell and General Dynamics, as well as a number of academic researchers.

Jena – A Semantic Web Framework for Java

Jena is a Java framework for building Semantic Web applications. It provides a programmatic environment for RDF, RDFS, OWL and SPARQL, and includes a rule-based inference engine.
Jena is open source and grew out of work with the HP Labs Semantic Web Programme.

The Jena Framework includes:
Reading and writing RDF in RDF/XML, N3 and N-Triples
In-memory and persistent storage
SPARQL query engine


Sunday, November 23, 2008

CVPR 2009

CVPR 2009 will be held at the Fontainebleau Hotel in Miami, Florida.
Papers in the main technical program must describe high-quality, original research.

Topics of interest include all aspects of computer vision and pattern recognition (applied to images and video) including, but not limited to, the following areas:
Early and Biologically-inspired Vision
Color and Texture
Segmentation and Grouping
Computational Photography and Video
Motion and Tracking
Stereo and Structure from Motion
Image-Based Modeling
Illumination and Reflectance Modeling
Shape Representation and Matching
Object Detection, Recognition, and Categorization
Video Analysis and Event Recognition
Face and Gesture Analysis
Statistical Methods and Learning
Performance Evaluation
Medical Image Analysis
Image and Video Retrieval
Vision for Graphics
Vision for Robotics
Vision for Internet
Applications of Computer Vision

OpenCV: How to keystone an image

Saturday, November 22, 2008

CBMI 2009

Following the six successful previous events (Toulouse 1999, Brescia 2001, Rennes 2003, Riga 2005, Bordeaux 2007, London 2008), CBMI 2009 will be held on June 3-5, 2009 in the picturesque city of Chania, on the island of Crete, Greece. It will be organized by the Image, Video and Multimedia Laboratory of the National Technical University of Athens. CBMI 2009 aims at bringing together the various communities involved in the different aspects of content-based multimedia indexing, such as image processing and information retrieval, with current industrial trends and developments. CBMI 2009 is supported by IEEE and EURASIP. The technical program of CBMI 2009 will include invited plenary talks, special sessions, and regular sessions with contributed research papers.

Topics of interest include, but are not limited to:

Multimedia indexing and retrieval (image, audio, video, text)
Matching and similarity search
Construction of high level indices
Multimedia content extraction
Identification and tracking of semantic regions in scenes
Multi-modal and cross-modal indexing
Content-based search
Multimedia data mining
Metadata generation, coding and transformation
Large scale multimedia database management
Summarisation, browsing and organization of multimedia content
Presentation and visualization tools
User interaction and relevance feedback
Personalization and content adaptation
Evaluation and metrics

Thursday, November 20, 2008


ACM International Conference on Image and Video Retrieval
July 8-10, 2009, Santorini Island, Greece
Image and video retrieval have now reached a state where successful techniques and applications are starting to flourish. The ACM International Conference on Image and Video Retrieval (ACM-CIVR) series of conferences is the ideal opportunity to present and encounter such developments. Originally set up to illuminate the state of the art in image and video retrieval throughout the world, it is now a reference event in the field where researchers and practitioners exchange knowledge and ideas. CIVR2009 is seeking original, high-quality special sessions addressing innovative research in the broad field of image and video retrieval. We wish to highlight significant and emerging areas of the main problem of search and retrieval but also the equally important related issues of multimedia content management, user interaction and community-based management.
Example topics of interest include but are not limited to: social network information mining, unsupervised methods for data exploration, large scale issues for algorithms and data set generation.
Each special session will consist of 5 invited papers. The organizers' role is to attract the speakers and chair the session itself. Proposals will be evaluated based on the timeliness of the topic, relevance to CIVR, the degree to which they will bring together key researchers in the area, introduce the area to the larger research community, and further develop the area, and their potential to establish a larger community around the area. Please note that all papers in the proposed session will undergo the same review process as regular papers. If, after the reviewing process, fewer than the necessary number of papers solicited for a special session are selected, the special session will be cancelled, and the solicited papers that passed the review process will be presented within regular sessions of the conference.

Photo Tourism: Exploring Photo Collections in 3D

Photo Tourism is a system for browsing large collections of photographs in 3D. Our approach takes as input large collections of images from either personal photo collections or Internet photo sharing sites, and automatically computes each photo's viewpoint and a sparse 3D model of the scene. Our photo explorer interface enables the viewer to interactively move about the 3D space by seamlessly transitioning between photographs, based on user control.
Microsoft Live Labs has turned these research ideas into a streaming multi-resolution Web-based service called Photosynth.
You can also read about newer research we have been doing in this area at the University of Washington Photo Tourism project page.

Paper and video

Noah Snavely, Steven M. Seitz, and Richard Szeliski, "Photo Tourism: Exploring photo collections in 3D," ACM Transactions on Graphics, 25(3), August 2006. (Video: WMV, MOV)

We have developed a system for interactively browsing and exploring large unstructured collections of photographs of a scene using a novel 3D interface. Our system consists of an image-based modeling front end, which automatically computes the viewpoint of each photograph as well as a sparse 3D model of the scene and image to model correspondences. Our photo navigation tool uses image-based rendering techniques to smoothly transition between photographs, while also enabling full 3D navigation and exploration of the set of images and world geometry, along with auxiliary information such as overhead maps. Our system also makes it easy to construct photo tours of scenic or historic locations, as well as to annotate image details, which are automatically transferred to other relevant images in the collection. We demonstrate our system on several large personal photo collections as well as images gathered from photo sharing Web sites on the Internet.

Wednesday, November 19, 2008

EngLab - Open Source Mathematical Platform

EngLab is a cross-compile mathematical platform with a C-like syntax, intended to be used both by engineers and by users with little programming knowledge. The initiative was started by a group of students a year ago.

"Our goal is to develop an easy-to-use computation and simulation platform with a C++ like syntax. We have adopted Matlab's structure philosophy and C++'s structured language syntax. There are various toolboxes (packages of functions relative to a certain scientific field), which depend on open-source libraries." EngLab Team

EngLab is distributed in two basic releases: EngLab Console and EngLab GUI. EngLab Console runs EngLab from the console (Linux or Windows). EngLab GUI lets you use EngLab through a graphical user interface. It is implemented with the open-source library wxWidgets 2.8 and provides additional usability compared to the EngLab Console edition. EngLab GUI is independent, so EngLab Console does not need to be installed in order to install and run EngLab GUI properly.

Toolboxes are distributed as separate packages and can be installed through either EngLab Console or EngLab GUI. They are packaged separately because they depend on open-source libraries that have to be installed beforehand. This way users are not forced to install those libraries and can add only the packages and toolboxes they want.

For the time being, the EngLab Console edition is available for Windows and Linux, and EngLab GUI is available for Linux only.

So far, EngLab has the following features:

- 16 types of variable declaration (int, float, ...)
- Variable declaration with unlimited number of dimensions.
- Loop structures (for, while, ...)
- Arithmetic, logical and binary operations
- Constant number declaration (pi, phi, ...)
- Graphical manipulation of variable values of any dimension (EngLab GUI)
- Adjustable graphical environment (EngLab GUI)
- Editor for writing *.eng functions (EngLab GUI)
- Command history for the last 5 sessions
- Immediate access to variables, constants and functions (EngLab GUI)
- Recent files opened through EngLab (EngLab GUI)

Toolboxes that have been fully or partially implemented:

- a package containing fundamental functions of C (trigonometric, hyperbolic trigonometric, ...)
- a package containing some statistical functions
- a package containing functions that allow conversions of the variable type

All these toolboxes accompany the basic two EngLab editions, since they do not depend on another open-source library. Moreover, some other toolboxes have been partially implemented:

- a package that contains functions for the manipulation of 2-D matrices (determinant, inverse array, ...). This package depends on the open-source library NewMat10.
- a package that contains functions for image processing. This package depends on the open-source library CImg.
- a package that contains functions for image processing. This package depends on the open-source library OpenCV.
- a toolbox for visual data representation (plots etc.)
- a toolbox that contains functions for manipulating polynomials, root detection, computation of integrals and derivatives, special functions and more.

Tuesday, November 18, 2008

Smile And Robot Smiles With You

Humanoid 'Jules' is a disembodied androgynous robotic head that automatically copies the movement and expressions of a human face.
The technology works using 10 stock human emotions - for instance happiness, sadness, concern - that have been programmed into the robot.
The software then maps what it sees to Jules' face to combine expressions instantly to mimic those being shown by a human subject.
Controlled only by its own software, Jules can grin and grimace, furrow its brow, and 'speak' as the software translates real expressions observed through video camera 'eyes'.
"If you want people to be able to interact with machines, then you've got to be able to do it naturally... When it moves it has to look natural in the same way that human expressions are, to make interaction useful." - Chris Melhuish, head of the Bristol Robotics Laboratory
The robot - made by US roboticist David Hanson - then copies the facial expressions of the human by converting the video image into digital commands that make the robot's inner workings produce mirrored movements.
And it all happens in real time as Jules is bright enough to interpret the commands at 25 frames per second.
The project was developed over more than three years at the Bristol Robotics Laboratory, a lab run by the University of the West of England and the University of Bristol under the leadership of Chris Melhuish, Neill Campbell and Peter Jaeckel.
The aim of the developers was to make it easier for humans to interact with 'artificial intelligence', in other words to create a 'feelgood' factor.
The BRL's Peter Jaeckel said: "Realistic, life-like robot appearance is crucial for sophisticated face-to-face robot-human interaction.
"Researchers predict that one day robotic companions will work, or assist humans in space, care and education. Robot appearance and behaviour need to be well matched to meet expectations formed by our social experience."
But a warning note has been sounded.
Kerstin Dautenhahn, a robotics researcher at the University of Hertfordshire, believes that people may be disconcerted by humanoid automatons that simply look 'too human'.
"People might easily be fooled into thinking that this robot not only looks like a human and behaves like a human, but that it can also feel like a human. And that's not true," she pointed out.

Monday, November 17, 2008

Zunavision: Novel software by Stanford university students/prof to embed video/image inside video

Stanford artificial intelligence researchers have developed software that makes it easy to reach inside an existing video and place a photo on the wall so realistically that it looks like it was there from the beginning. The photo is not pasted on top of the existing video, but embedded in it. It works for videos as well: you can play a video on a wall inside your video. The technology can cheaply do some of the tricks normally performed by expensive commercial editing systems.

Friday, November 14, 2008

New Experimental Features on img(Anaktisi)

Two new experimental features are implemented in the on-line image retrieval system img(Anaktisi):
1. Draw a sketch to retrieve similar images from our database. The method is based on a new spatial compact color descriptor.

2. Automatic keyword annotation. Select a combination of words and retrieve images. The method is based on a fuzzy support vector machine system. The network was trained using a combination of CEDD and FCTH descriptors.

Note that both techniques are still under study.

Thursday, November 13, 2008


TouchGraph lets you see how your friends are connected, and who has the most photos together.

The TouchGraph Facebook Browser shows connections between users based on friendships and common photo appearances.
Friendships are shown as dark gray lines, and common photo appearances are shown as a lighter gray line with a number in the center. The number indicates how many photos the two people appeared in together.
Personal vs. Friends social networks
* When launched from one's own profile, information about all of one's friends and their friendships is loaded.
* When launched from another user's profile (using the "TouchGraph Friends + Photos" link below their profile picture), only people tagged in their photos will appear in the graph.
One cannot see another person's whole social network because Facebook only allows applications to get a list of one's own friends. For other users it is only possible to get a list of people that they appear in photos with. Perhaps Facebook's policy will change in the future.
The TouchGraph Facebook Browser determines the clusters/cliques to which one's Friends belong and uses different colors to show each clique. Cliques are characterized by having lots of friendships within a group of friends and few connections to members outside the group.
Friends are assigned a Rank so that one can reduce clutter by only showing a set of 'Top' friends. TouchGraph gives the highest rank to friends who are connectors between different cliques. Finding connectors involves a metric called Betweenness Centrality, which is an established measure of a person's importance within a social network.
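
To make the ranking idea concrete, here is a minimal sketch of betweenness centrality for an unweighted friendship graph, using Brandes' algorithm. It only illustrates the metric; how TouchGraph actually computes or combines its ranks is not described here, and the `friends` graph and its names are made up for the example.

```python
from collections import deque

def betweenness_centrality(graph):
    """Brandes' algorithm for an unweighted, undirected graph.

    graph: dict mapping each node to a list of its neighbours.
    Returns a dict node -> number of shortest paths between other
    pairs of nodes that pass through it (fractional when tied).
    """
    centrality = {v: 0.0 for v in graph}
    for source in graph:
        # Breadth-first search from source, counting shortest paths.
        order = []
        predecessors = {v: [] for v in graph}
        num_paths = {v: 0 for v in graph}
        distance = {v: -1 for v in graph}
        num_paths[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    num_paths[w] += num_paths[v]
                    predecessors[w].append(v)
        # Accumulate dependencies, starting from the farthest nodes.
        dependency = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += num_paths[v] / num_paths[w] * (1.0 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # In an undirected graph every pair is seen from both endpoints.
    return {v: c / 2.0 for v, c in centrality.items()}

# Hypothetical example: two cliques of three friends joined by Cat and Dan.
friends = {
    "Ann": ["Bob", "Cat"],
    "Bob": ["Ann", "Cat"],
    "Cat": ["Ann", "Bob", "Dan"],
    "Dan": ["Cat", "Eve", "Fay"],
    "Eve": ["Dan", "Fay"],
    "Fay": ["Dan", "Eve"],
}
scores = betweenness_centrality(friends)
top_friends = sorted(scores, key=scores.get, reverse=True)[:2]
```

In this toy graph Cat and Dan are the only connectors between the two cliques, so they end up with the highest scores, while everyone else scores zero.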

Wednesday, November 12, 2008

A New Era for Image Annotation

Searching for images on the Web has traditionally been more complicated than text search. For instance, a Google image search for “tiger” not only yields images of tigers, but also returns images of Tiger Woods, tiger sharks and many others that are ‘related’ to the text in the query string. This is because contemporary search engines look for images using any ‘text’ linked to images rather than the ‘content’ of the picture itself. In an effort to improve the recall of image searches, folks from UC San Diego are working on a search engine that works differently, one that analyzes the image itself. “You might finally find all those unlabeled pictures of your kids playing soccer that are on your computer somewhere,” says Nuno Vasconcelos, a professor of electrical engineering at the UCSD Jacobs School of Engineering. They claim that their Supervised Multiclass Labeling System “may be folded into next-generation image search engines for the Internet; and in the shorter term, could be used to annotate and search commercial and private image collections.”

Sunday, November 9, 2008


PhotoEnhancer is an experimental image enhancement software, which employs the characteristics of the ganglion cells of the Human Visual System. The image captured by a camera and the image in our eyes are often dramatically different, especially when there are shadows or highlights in the same scene. In these cases our eyes can distinguish many details in the shadows or the highlights, while the image captured by the camera suffers from loss of visual information in these regions.

PhotoEnhancer attempts to bridge the gap between "what you see" and "what the camera sees". It enhances the shadow or the highlight regions of an image, while keeping intact all the correctly exposed ones. The final result is a lot closer to the human perception of the scene than the original captured image, revealing visual information that otherwise wouldn't be available to the human observer.

PhotoEnhancer 2.4

The latest version of PhotoEnhancer (2.4) has been released. Version 2.4 features a 'Batch Processing' mode, for the quick enhancement of many image files with just a few clicks, as well as an improved user interface.

Version History:

PhotoEnhancer 2.3

Version 2.3 of PhotoEnhancer features the method of "Multi-Scale Image Contrast Enhancement". This additional algorithm locally enhances the contrast of images, maximizing the available visual information. It can be applied to foggy scenes, aerial or satellite images, images with smoke, or medical images.



Saturday, November 8, 2008


GazoPa is a similar-image search service on the web, in private beta, by Hitachi. Users can search for images on the web based on their own photos, drawings, images found on the web, and keywords. GazoPa enables users to search for similar images using characteristics such as color or shape extracted from the image itself. There are abundant quantities of images on the web; however, many of these simply cannot be described by keywords. Since GazoPa uses image features to search for other similar images, a vast range of images can be retrieved from the web. GazoPa is a new visual search service that can navigate users to new territories on the web.
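
As a rough illustration of searching by a color characteristic, the sketch below builds a coarse RGB histogram per image and ranks candidates by histogram intersection. This is a generic textbook technique, not GazoPa's actual feature set (which is not described here); the function names, bin count, and tiny pixel lists are assumptions made for the example.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalised histogram over quantised (r, g, b) pixels in 0..255."""
    step = 256 // bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    count = 0
    for r, g, b in pixels:
        index = ((r // step) * bins_per_channel + g // step) * bins_per_channel + b // step
        hist[index] += 1
        count += 1
    return [h / count for h in hist]

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical color distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2))

# Hypothetical images, each given as a flat list of RGB pixels.
query = [(250, 10, 10)] * 3 + [(10, 10, 250)]          # mostly red, some blue
mostly_red = [(240, 20, 20)] * 3 + [(20, 20, 240)]     # similar colors
all_green = [(10, 250, 10)] * 4                        # different colors

query_hist = color_histogram(query)
similarity = {
    "mostly_red": histogram_intersection(query_hist, color_histogram(mostly_red)),
    "all_green": histogram_intersection(query_hist, color_histogram(all_green)),
}
ranking = sorted(similarity, key=similarity.get, reverse=True)
```

A histogram ignores where colors appear in the image, which is why real systems usually combine it with shape or texture features.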

Wednesday, November 5, 2008

Multimodal and Mobile Personal Image Retrieval: A User Study

Over the last few months, I have been collaborating on a research project on Multimodal Information Retrieval of digital pictures collected through camera phones. Recently, one of the papers summarizing the results of the research was presented at the International Workshop on Mobile Information Retrieval, held in conjunction with SIGIR in Singapore. Here are the reference and the abstract.
X. Anguera, N. Oliver, and M. Cherubini. Multimodal and mobile personal image retrieval: A user study. In K. L. Chan, editor, Proceedings of the International Workshop on Mobile Information Retrieval (MobIR’08), pages 17–23, Singapore, 20-24 July 2008.
Mobile phones have become multimedia devices. Therefore it is not uncommon to observe users capturing photos and videos on their mobile phones. As the amount of digital multimedia content expands, it becomes increasingly difficult to find specific images in the device. In this paper, we present our experience with MAMI, a mobile phone prototype that allows users to annotate and search for digital photos on their camera phone via speech input. MAMI is implemented as a mobile application that runs in real-time on the phone. Users can add speech annotations at the time of capturing photos or at a later time. Additional metadata is also stored with the photos, such as location, user identification, date and time of capture and image-based features. Users can search for photos in their personal repository by means of speech without the need of connectivity to a server. In this paper, we focus on our findings from a user study aimed at comparing the efficacy of the search and the ease-of-use and desirability of the MAMI prototype when compared to the standard image browser available on mobile phones today.

Adobe Photoshop Lightroom 2

Adobe Photoshop Lightroom 2 is best categorized as a Digital Processor. From bringing images onto your computer, cataloging them for later retrieval (and, if you want, backing them up to ensure protection against accidental loss), and enhancing and fine-tuning them, all the way to printing and/or digital distribution, you can do it all from Lightroom. One of the strongest reasons to use Lightroom, however, is the freedom to play with images: you can create as many versions and variations as you like without damaging the original. Any alteration you make on an image in Lightroom is only how Lightroom lets you "see" the image. Nothing is changed in the image itself unless you save a copy with those changes. The biggest negative about Lightroom is that the interface constantly changes, depending upon what you've clicked. This makes "hacking" the program a challenge, and the manual does not help, since it fails to explain the conditions under which you will see what is being described. Despite the steep learning curve, there is much to like in Lightroom.

Neural Networks on C#

It is a known fact that there are many problems for which it is difficult to find formal algorithms. Some problems cannot be solved easily with traditional methods; some do not even have a solution yet. Neural networks can be applied to many such problems and demonstrate rather good results across a wide range of them. The history of neural networks starts in the 1950s, when the simplest neural network architecture was presented. After the initial work in the area, the idea of neural networks became rather popular. But then the area had a crash, when it was discovered that the neural networks of those times were very limited in terms of the number of tasks they could be applied to. In the 1970s, the area got another boom, when the idea of multi-layer neural networks with the back propagation learning algorithm was presented. Since that time, many researchers have studied the area of neural networks, which has led to a vast range of neural architectures applied to a great range of problems. Today, neural networks can be applied to tasks such as classification, recognition, approximation, prediction, clusterization, memory simulation, and many others, and their number is growing.
In this article, a C# library for neural network computations is described. The library implements several popular neural network architectures and their training algorithms, like Back Propagation, Kohonen Self-Organizing Map, Elastic Network, Delta Rule Learning, and Perceptron Learning. The usage of the library is demonstrated on several samples:
Classification (one-layer neural network trained with perceptron learning algorithms);
Approximation (multi-layer neural network trained with back propagation learning algorithm);
Time Series Prediction (multi-layer neural network trained with back propagation learning algorithm);
Color Clusterization (Kohonen Self-Organizing Map);
Traveling Salesman Problem (Elastic Network).
The attached archives contain source codes for the entire library, all the above listed samples, and some additional samples which are not listed and discussed in the article.
The article is not intended to provide the entire theory of neural networks, which can be found easily on the great range of different resources all over the Internet, and on CodeProject as well. Instead of this, the article assumes that the reader has general knowledge of neural networks, and that is why the aim of the article is to discuss a C# library for neural network computations and its application to different problems.
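
For readers who want a feel for the simplest of the samples listed above (a one-layer network trained with perceptron learning), here is a minimal sketch. It is written in Python rather than C#, it does not use the library described in the article, and all names (`train_perceptron`, `classify`, `and_gate`) are made up for the illustration.

```python
def train_perceptron(samples, learning_rate=1.0, epochs=100):
    """Train a single threshold neuron with the perceptron learning rule.

    samples: list of (inputs, target) pairs with target 0 or 1.
    Returns the learned (weights, bias).
    """
    n_inputs = len(samples[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for inputs, target in samples:
            activation = bias + sum(w * x for w, x in zip(weights, inputs))
            output = 1 if activation > 0 else 0
            delta = target - output
            if delta != 0:
                # Move the decision boundary towards the misclassified sample.
                errors += 1
                weights = [w + learning_rate * delta * x
                           for w, x in zip(weights, inputs)]
                bias += learning_rate * delta
        if errors == 0:
            break  # every sample is classified correctly
    return weights, bias

def classify(weights, bias, inputs):
    """Return 1 if the neuron fires for the given inputs, else 0."""
    return 1 if bias + sum(w * x for w, x in zip(weights, inputs)) > 0 else 0

# A linearly separable toy problem: the logical AND of two inputs.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
predictions = [classify(weights, bias, inputs) for inputs, _ in and_gate]
```

A single neuron like this can only separate classes with a straight line (or hyperplane), which is the limitation that multi-layer networks trained with back propagation address.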

Special Issue: Advances in Medical Intelligent Decision Support Systems

Intelligent Decision Technologies (IDT) journal seeks original manuscripts for a Special Issue on Advances in Medical Decision Support Systems scheduled to appear in Vol. 3, No. 2, 2009. The last few decades have witnessed significant advancements in intelligent computation techniques. Driven by the need to solve complex real-world problems, powerful and sophisticated intelligent data analysis technologies have been exploited or emerged, such as neural networks, support vector machines, evolutionary algorithms, clustering methods, fuzzy logic, particle swarm optimization, data mining, etc. In recent years, the volume of biological data has been increasing exponentially, thus allowing significant learning and experimentation to be carried out using a multidisciplinary approach, which gives rise to many challenging problems. The foundation for any medical decision support is the medical knowledge base which contains the necessary rules and facts. This knowledge needs to be acquired from information and data in the fields of interest, such as medicine. Clinical decision-making is a challenging, multifaceted process. Its goals are precision in diagnosis and institution of efficacious treatment. Achieving these objectives involves access to pertinent data and application of previous knowledge to the analysis of new data in order to recognise patterns and relations. As the volume and complexity of data have increased, use of digital computers to support data analysis has become a necessity. In addition to computerisation of standard statistical analysis, several other techniques for computer-aided data classification and reduction, generally referred to as intelligent systems, have evolved.
This special issue will focus on illustrative and detailed information about medical intelligent decision support systems and feature extraction/selection for automated diagnostic systems. The focus of this special issue is on advances in medical intelligent decision support systems, including determination of optimum classification schemes for the problem under study and inference of clues about the extracted features. Topics include, but are not limited to, the following:
* Bioinformatics and Computational Biology
* Neural Networks, Fuzzy Logic Systems and Support Vector Machines in Biological Signal Processing
* Decision Support Systems and Computer Aided Diagnosis
* Biomedical Signal Processing
* Biomedical Imaging and Image Processing
* Modelling, Simulation, Systems, and Control
Paper submission: Submitted articles must not have been previously published or currently submitted for journal publication elsewhere. As an author, you are responsible for understanding and adhering to our submission guidelines. Please read them thoroughly before submitting your manuscript. Each paper will go through a rigorous review process.

Please note the following important dates:

Paper submission for review: November 30, 2008 (Final deadline)
Review results: January 15, 2009
Revised Paper submission: February 20, 2009
Final acceptance: March 1, 2009
Manuscript delivery to the publisher: April 15, 2009

Interested authors should submit digital copies (PDF preferred) of their papers (suggested paper length: 15 pages), including all tables, diagrams, and illustrations, to the Guest Editor, Dr. Vassilis S. Kodogiannis, by e-mail.

Tuesday, November 4, 2008


Create 3D scenes using language. Share them with others.

Here is an example:

Text: "a tiny grey manatee is in the aquarium. it is facing right. the fishdead of the aquarium is invisible. the manatee is two inches above the tank_sand of the aquarium. the ground is tile. there is a large brick wall behind the aquarium."


Monday, October 27, 2008

Photo enhancement - Transform Your Digital Images

SBL’s image enhancement team can transform your digital images to make ordinary shots look brilliant. From adjusting saturation, color balance, contrast, brightness and density of images to applying filters, removing or inserting backgrounds, cropping and removing blemishes, noise and grains in images, we offer professional and expert services in quick turnarounds and at highly competitive rates.

Sunday, October 26, 2008


Picsearch connects its users to the vast visual resources of the internet. Picsearch uses its own technology to crawl the web and has created a searchable index of images. When a user sends a query to Picsearch the result is received as a set of thumbnail images that are sorted to ensure that they are as highly relevant as possible. When the user clicks on a thumbnail they are linked to the original web site where that image is located.
Picsearch image search technology has three main features that make it unique. It has a relevancy unrivalled on the web due to its patent-pending indexing algorithms. Also, Picsearch has a family friendliness that allows children to surf in safety as all offensive material is filtered out by our advanced filtering systems. The site is also very user friendly as it's designed to be simple, fast and accurate. Due to all of these features, Picsearch is perfect for fun, school, business and families!

Saturday, October 25, 2008

TinEye image search engine

TinEye is the first image search engine on the web to use image identification technology. Given an image to search for, TinEye tells you where and how that image appears all over the web—even if it has been modified.
Just as you are familiar with entering text in a regular search engine such as Google to find web pages that contain that text, TinEye lets you submit an image to find web pages that contain that image.
Every day TinEye's spiders crawl the web for additional images. Using sophisticated pattern recognition algorithms, TinEye creates a unique and compact digital signature or 'fingerprint' for each one and adds it to the index.
When you want to find out where an image is being used on the web, you submit it to TinEye. The attributes of the image are analyzed instantly, and its fingerprint is compared to the fingerprint of every single image in the TinEye search index. The result? A detailed list of any websites using that image, worldwide.
Use TinEye to find out where and how an image appears on the web, even if it has been cropped or heavily modified.
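
The fingerprint idea can be illustrated with a very simple perceptual hash. The sketch below is a generic "average hash" compared by Hamming distance; it is not TinEye's algorithm, which is not described here, and the function names and tiny synthetic images are assumptions for the example. Images are plain 2D lists of grayscale values so the code needs no imaging library.

```python
def average_hash(pixels, hash_size=8):
    """Compute a 64-bit (by default) fingerprint of a grayscale image.

    pixels: 2D list of brightness values, at least hash_size rows and
    columns. The image is averaged down to a hash_size x hash_size grid
    and each cell contributes one bit: 1 if brighter than the grid mean.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    for gy in range(hash_size):
        y0, y1 = gy * height // hash_size, (gy + 1) * height // hash_size
        for gx in range(hash_size):
            x0, x1 = gx * width // hash_size, (gx + 1) * width // hash_size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits

def hamming_distance(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")

# Synthetic 16x16 test images.
original = [[x * 16 for x in range(16)] for y in range(16)]      # left-to-right gradient
brightened = [[min(255, v + 10) for v in row] for row in original]
different = [[y * 16 for x in range(16)] for y in range(16)]     # top-to-bottom gradient

distance_modified = hamming_distance(average_hash(original), average_hash(brightened))
distance_other = hamming_distance(average_hash(original), average_hash(different))
```

A small distance suggests the same picture after a mild edit, while a large distance suggests a different picture. A hash this simple would not survive heavy cropping, so it should be read only as an illustration of compact fingerprints and fast comparison.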

Friday, October 24, 2008

Signal Processing, Pattern Recognition and Applications (SPPRA 2009)

This conference is an international forum for researchers and practitioners interested in the advances in, and applications of, signal processing and pattern recognition. It is an opportunity to present and observe the latest research, results, and ideas in these areas. All papers submitted to this conference will be double-blind reviewed by at least two reviewers. Acceptance will be based primarily on originality and contribution. The conference chair makes the final decision on the acceptance or rejection of the paper.

SPPRA 2009 will be held in conjunction with the IASTED International Conferences on:
• Artificial Intelligence and Applications (AIA 2009)
• Software Engineering (SE 2009)
• Parallel and Distributed Computing and Networks (PDCN 2009)
• Computer Graphics and Imaging (CGIM 2009)

Innsbruck is nestled in the valley of the Inn River and tucked between the Austrian Alps and the Tuxer mountain range. It has twice hosted the Winter Olympics and is surrounded by the eight ski regions of the Olympic Ski World, including the Stubai Glacier, which offers skiing year round. Climbing the 14th century Stadtturm on Herzog Friedrich Strasse provides a stunning view of the town and the breathtaking scenery that surrounds it. Concerts at Ambras Castle provide listening pleasure in a beautiful renaissance setting. The sturdy medieval houses and sidewalk cafés of Old Town Innsbruck beckon you to sit for a while and watch people stroll by.
Innsbruck, with its unique blend of historical, intellectual, and recreational pursuits, offers something for every visitor. SPPRA 2009 will be held at the world-famous Congress Innsbruck, located in the heart of the city, near the historical quarter.

Submissions due November 27, 2008 (NEW)
Notification of acceptance December 12, 2008
Camera-ready manuscripts due January 6, 2009

I am planning to submit 2 papers to this conference. :)

Lire 0.7 Released

Lire 0.7 is a major release fixing a lot of bugs and introducing several new features, including a new descriptor, a simplified way to use descriptors through new generic searchers and indexers, and a generalized interface for image descriptors. There are also several improvements in indexing and search speed (especially in autocolorcorrelogram). Furthermore, retrieval performance was optimized based on the Wang 1000 data set. If you use Lire 0.7 to update an existing version, please make sure that your indices are created anew from scratch. All new features have also found their way into LireDemo, which now also supports multi-threaded indexing.

Download Lire 0.7 and/or LireDemo 0.7
Screencast: Introduction to LireDemo

Saturday, October 11, 2008

Image processing useful links

Colour and Vision Research Laboratories
A very interesting site about color models. This web site provides an annotated library of easily-downloadable standard data sets relevant to color and vision research.

Computer Vision Homepage
The Computer Vision Homepage was established at Carnegie Mellon University in 1994 to provide a central location for World Wide Web links relating to computer vision research. The emphasis of the Computer Vision Homepage is on computer vision research rather than on commercial products.

Color and Computers
Web site on color, color quantization, palettes

Computer Vision, University of Nevada
Useful material on Computer Vision

NeuQuant: Fast High-Quality Image Quantization
The NeuQuant Neural-Net image quantization algorithm is a replacement for the common Median Cut algorithm.

Intelligent Control Systems Laboratory, Georgia Tech
The Intelligent Control Systems Laboratory (ICSL) at Georgia Tech is the campus center for research and academic studies in soft computing for control applications. The lab is equipped with state-of-the-art control systems and modern software development packages that enable applied research in a number of application domains such as biomedical engineering, diagnostics and prognostics, and unmanned aerial vehicles.

The Face Recognition Home Page
Relevant information in the area of face recognition. Information pool for the face recognition community. Entry point for novices as well as a centralized information resource.

Document Understanding and Character Recognition
This system serves as a repository for Document Image Understanding and Optical Character Recognition (OCR) information and resources.

Amara's Wavelet Page
The fundamental idea behind wavelets is to analyze according to scale. Indeed, some researchers in the wavelet field feel that, by using wavelets...

Matlab Wavelet Toolbox (Rice University)

Wavelets at Imager
This site offers several services intended to foster the exchange of knowledge and viewpoints related to theory and applications of wavelets.

An Introduction to Wavelets
Here at Imager, we've been doing some work with wavelets, those clever little multiresolution basis functions.

Wavelets, Signal Processing Algorithms, Orthogonal Basis Functions, Wavelet Applications

Rainer Lienhart Home Page
Research Interests: Image processing, 3D reconstruction, object tracking

Filip Room Home Page
Research Interests include image/video/audio content analysis, machine learning, scalable signal processing, scalable learning, scalable and adaptive algorithms, ubiquitous and distributed media computing in heterogeneous networks, and peer-to-peer networking and mass media sharing.

Anca Doloc-Mihu Home Page
Research interests: Digital image processing and restoration

Computer Vision Bibliography

The pattern recognition files
Information on the pattern recognition research area.

The Lena Story
The Lenna (or Lena) picture is one of the most widely used standard test images for compression algorithms.

Nikos Papamarkos
Nikos Papamarkos's current research interests are in digital image processing, computer vision, document processing, analysis and recognition, pattern recognition, neural networks, signal processing, filter design and optimization algorithms. He has published a number of journal and conference papers. He is also the author of three Greek books.

Homepage of Thomas Deselaers
Thomas Deselaers is a research and teaching assistant and PhD student at the Human Language Technology and Pattern Recognition Group of RWTH Aachen University.

Friday, October 3, 2008


The MLDM´2009 conference is the sixth event in a series of Machine Learning and Data Mining meetings, initially organised as international workshops. The aim of MLDM´2009 is to bring together researchers from all over the world who work on machine learning and data mining, in order to discuss the current state of research in the field and to direct its further development.
Basic research papers as well as application papers are welcome. All kinds of applications are welcome, but special preference will be given to multimedia related applications, biomedical applications, and webmining. Paper submissions should be related but not limited to any of the following topics:

applications of clustering
applications in medicine
aspects of data mining
automatic semantic annotation of media content
Bayesian models and methods
conceptional learning and clustering
case-based reasoning and learning
classification and interpretation of images, text, video
classification and model estimation
case-based ... and more


We are pleased to announce that the Tenth International Conference on Document Analysis and Recognition (ICDAR'2009), sponsored by the International Association for Pattern Recognition (IAPR) TC-10 (Graphics Recognition) and TC-11 (Reading Systems) will be held at the Universitat Autònoma de Barcelona, Catalonia, Spain during July 26-29, 2009. ICDAR is an outstanding international forum for researchers and practitioners at all levels of experience for identifying, encouraging and exchanging ideas on the state-of-the-art in document analysis, understanding, retrieval, and performance evaluation, including various forms of multimedia documents.

The topics of interest include, but are not limited to:
Character Recognition
Handwriting Recognition
Graphics Recognition
Document Image Analysis
Document Understanding
Document Analysis Systems
Basic Research and Methodologies for Document Processing
Camera-based Document Processing
Document Databases and Digital Libraries
Multimedia Documents
Sketching Interfaces
Performance Evaluation
Forensic Science
Analysis of Historical Documents

Paper Submission
Authors are encouraged to submit manuscripts of at most five pages. Papers must describe original work on any of the ICDAR related topics. The format templates and instructions for paper submission will be available on the conference web site. The deadline for paper submission is January 12, 2009.

WIAMIS 2009 Call for Papers

The International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS) is one of the main international fora for the presentation and discussion of the latest technological advances in interactive multimedia services. The objective of the workshop is to bring together researchers and developers from academia and industry working in all areas of image, video and audio applications, with a special focus on analysis.

Topics of interest include, but are not limited to:
Multimedia content analysis and understanding
Content-based browsing, indexing and retrieval of images, video and audio
2D/3D feature extraction
Advanced descriptors and similarity metrics for audio and video
Relevance feedback and learning systems
Segmentation of objects in 2D/3D image sequences
Motion analysis and tracking
Video analysis and event recognition
Analysis for coding efficiency and increased error resilience
Analysis and tools for content adaptation
Multimedia content adaptation tools, transcoding and transmoding
Content summarization and personalization strategies
End-to-end quality of service support for Universal Multimedia Access
Semantic mapping and ontologies
Multimedia analysis for new and emerging applications
Multimedia analysis hardware and middleware
Semantic web and social networks
Advanced interfaces for content analysis and relevance feedback

Paper Submission
The intention is to publish the proceedings in Springer's Lecture Notes in Computer Science series and to make them available in IEEE Xplore. Authors are requested to send their submissions in English. All submissions will be peer reviewed by at least three members of the technical program committee. Accepted papers will be distributed via IEEE Xplore™. Papers must be formatted according to the IEEE Computer Society standards, and their length must not exceed 4 IEEE double column style pages including all figures, tables, and references.

Important Dates:
Proposal for Special Session: November 21, 2008
Paper Submission: December 1, 2008
Notification of Acceptance: January 16, 2009
Camera-ready Papers: February 06, 2009

IAPR Pattern Recognition Resources web site

"We have recently completed the first round of development on the IAPR Pattern Recognition Education Resources web site:

This work was initiated by the International Association for Pattern Recognition (
The goal was a web site that can support students, researchers and staff.
Of course, advances in pattern recognition and its subfields mean that developing the site will be a never-ending process. However, we believe that the current site is now well developed enough for general use.

What resources does the IAPR Education web site have?

The most important resources are for students, researchers and educators. These include lists with URLs to:

- Tutorials and surveys
- Explanatory text
- Online demos
- Datasets
- Book lists
- Free code
- Course notes
- Lecture slides
- Course reading lists
- Coursework/homework
- A list of course web pages at many universities

There are many areas for extension in the web pages, but they already link to more than 3000 resources.

These resources are subdivided into five areas. Of course, the boundaries are never distinct and we undoubtedly will also provoke a few dissenting opinions. However, we have tried to address the main work done by the IAPR community, as clustered into 3 core technology areas and 2 broad families of application areas:

1. Symbolic Pattern Recognition
2. Statistical Pattern Recognition
3. Machine Learning
4. 1D Signal Analysis
5. Computer vision/Image Processing/Machine Vision


Initial website development was by Christos Papadopoulos and Apostolos Antonacopoulos. Content entry was by Edinburgh University PhD students Kisuh Ahn, Edwin Bonilla, Lei Chen, Tim Hospedales, Gail Sinclair, and Narayanan Unny E, supervised by Bob Fisher. Content advice was supplied by the 2006-8 Education Committee: Bruce Maxwell, Sudeep Sarkar, Xiaoyi Jiang, Laurent Heutte and Sergios Theodoridis. Funding was provided by the EC funded euCognition network, the British Machine Vision Association and the UK's EPSRC."

Prof. Robert B. Fisher, School of Informatics, Univ. of Edinburgh