AI Applications in the fields of Multimedia, Computer Vision and Robotics: July 2009

Friday, July 17, 2009

New version of img(Rummager)

Posted by Savvas Chatzichristofis

List of Changes

Laboratory
1. New version of the Spatial Color Layout Descriptor
2. New (Final) version of the BTDH descriptor
3. New Fuzzy Linking System (HSV-8 bin linking histogram)

img Retrieval
4. New (Final) version of BTDH. This descriptor is now more compact, so you have to re-index your images.
5. New version of SpCD. This descriptor combines color and spatial color distribution information. Descriptors of this type can be used for image retrieval with hand-drawn sketch queries, since they capture the layout information of the color feature.
6. Design Custom Search Method Bug (X23227) Fixed
7. EHD bug Fixed (Thanks to Mattia Broilo :) )

Extras
8. img(Paint.Anaktisi) uses the latest version of SpCD
9. img(Finder) uses the latest version of SpCD

http://savvash.blogspot.com/2008/06/imgrummager-in-now-available-for.html

Thursday, July 16, 2009

Bill Gates: Project Natal coming to PCs too

Posted by Savvas Chatzichristofis

Last month at E3, everyone was talking about "Project Natal", the code name for Microsoft's controller-less camera technology for the Xbox 360 game console. When asked if the technology could come to the PC as well, Microsoft execs wouldn't commit, saying only that they were just talking about the Xbox 360's uses during E3.
Well, no less than Microsoft founder Bill Gates has blown Project Natal's Xbox 360 exclusivity. News.com reports that Gates has confirmed that they are bringing the gesture-based technology to Windows as well. Gates is quoted as saying, "Both the Xbox guys and the Windows guys latched onto that and now even since they latched onto it the idea of how it can be used in the office is getting much more concrete, and is pretty exciting."
Office use, Bill? We are betting that when Project Natal tech does come to Windows we will see some PC game apps as well.

http://news.bigdownload.com/2009/07/15/bill-gates-project-natal-tech-coming-to-the-pc-too/

Sony Venture Animates Human Face in Picture

Posted by Savvas Chatzichristofis

An infinite number of facial expressions can be created from a single picture. Its eyes rotate and lips move. It smiles and cries. A familiar face gazes and talks at you in unfamiliar ways.

An image processing technology with such capabilities is drawing attention as a promotional tool on the Internet in Japan. MotionPortrait Inc, a subsidiary of leading Internet provider So-net Entertainment Corp, developed the technology.

Since it was established in July 2007, the company has provided the technology to more than 30 companies, including cosmetics firm Shiseido Co Ltd and shaver company Schick Japan KK, as a sales promotion tool. It has also been used in various kinds of software for home game consoles. Because the technology deals with human faces, among the most familiar objects to us, it is drawing consumers' interest and giving them a sense of affinity.

The image processing technology is a brainchild of Sony Corp's Kihara Laboratory, which was closed in 2006. It was established by Nobutoshi Kihara, the company's former senior managing director who developed the first tape recorder and household video tape recorder in Japan.

The laboratory is also famous for developing technologies for PlayStation 2, Sony's home game console. Its image processing and computer graphics technologies are about to blossom again with Internet technologies.

http://techon.nikkeibp.co.jp/english/NEWS_EN/20090602/171114/

Friday, July 10, 2009

Apple preps iPhone face recognition

Posted by Savvas Chatzichristofis

Article From http://www.theregister.co.uk/2009/07/09/more_july_apple_patents/

The US Patent and Trademark Office published 33 new Apple patent applications on Thursday, bringing the total filed in July to 55 - and we're not even a third of the way through the month.

Today's cluster of creativity ranged from flexible cabling to scrolling lyrics, but the bulk of the filings described new powers for the ubiquitous iPhone and its little brother, the iPod touch - especially when the 'Pod is equipped with a camera, which it seems destined to be.

Two of the filings are directly camera-related. One focuses on object identification and the other on face recognition. The former is targeted specifically for handhelds, while the latter's reach extends both into your pocket and out to the entire universe of consumer electronics.

The object-identification filing describes a system in which a handheld's camera captures an image of an object either in visible light or infrared, then compares that image with information stored over a network. The network then asks you what information about that object you'd like to download, then provides it.

The filing also describes the system using an RFID reader rather than a camera, but the detect-compare-download sequence remains the same.

Apple uses a museum visit to illustrate the utility of this technology: You could simply point your iPhone at a work of art and quickly be presented with info about its artist, genre, provenance, and the availability of T-shirts featuring that work in the museum store - which another of Thursday's filings, on online shopping, could help you buy.

Tapping into a handheld's GPS and digital compass could also enable the system to provide location-based resources - the filing suggests a "RESTAURANT mode" to help you find eats in Vegas - and to support the captioned landscapes provided by augmented reality that are getting so much press lately.

[Figure: Apple object-recognition patent illustration. Caption: Claude Monet at his most minimalist, captured and ID-ed by your iPhone]

Finally, the filing includes a way for you to capture a log of all the identified objects, complete with downloaded multimedia content for creating a record of your peregrinations. Look for such a media-rich slideshow to appear in some student's "How I Spent My Summer Vacation" school assignment.

The face-recognition filing describes two different systems: one that merely checks for a face - any face - and a second that matches what it sees with a database of whom it knows.

The system that doesn't care who you are is merely looking to see if anyone's using it - and if someone is, it won't time out, as would an iPhone, or fire up a screensaver, as would a PC. The system that knows you by your dashing good looks would also be aware of your privilege level, and allow access to its services based on that level.

Apple, as usual in its patent applications, isn't shy about the scope of this technology. It lists 37 different devices that could incorporate it, from personal communications devices to vehicle operating systems to automatic teller machines. Then, just in case it forgot anything, it adds "any like computing device capable of interfacing with a person."

That should just about cover it.

Wednesday, July 8, 2009

Introducing the Google Chrome OS

Posted by Savvas Chatzichristofis

Article From “The Official Google Blog”
It's been an exciting nine months since we launched the Google Chrome browser. Already, over 30 million people use it regularly. We designed Google Chrome for people who live on the web — searching for information, checking email, catching up on the news, shopping or just staying in touch with friends. However, the operating systems that browsers run on were designed in an era where there was no web. So today, we're announcing a new project that's a natural extension of Google Chrome — the Google Chrome Operating System. It's our attempt to re-think what operating systems should be.
Google Chrome OS is an open source, lightweight operating system that will initially be targeted at netbooks. Later this year we will open-source its code, and netbooks running Google Chrome OS will be available for consumers in the second half of 2010. Because we're already talking to partners about the project, and we'll soon be working with the open source community, we wanted to share our vision now so everyone understands what we are trying to achieve.
Speed, simplicity and security are the key aspects of Google Chrome OS. We're designing the OS to be fast and lightweight, to start up and get you onto the web in a few seconds. The user interface is minimal to stay out of your way, and most of the user experience takes place on the web. And as we did for the Google Chrome browser, we are going back to the basics and completely redesigning the underlying security architecture of the OS so that users don't have to deal with viruses, malware and security updates. It should just work.
Google Chrome OS will run on both x86 as well as ARM chips and we are working with multiple OEMs to bring a number of netbooks to market next year. The software architecture is simple — Google Chrome running within a new windowing system on top of a Linux kernel. For application developers, the web is the platform. All web-based applications will automatically work and new applications can be written using your favorite web technologies. And of course, these apps will run not only on Google Chrome OS, but on any standards-based browser on Windows, Mac and Linux thereby giving developers the largest user base of any platform.

Comparing the Scientific Impact of Conference and Journal Publications in Computer Science

Posted by Savvas Chatzichristofis

Rahm, E.
Comparing the Scientific Impact of Conference and Journal Publications in Computer Science
Proc. Academic Publishing in Europe (APE08) 2008

The impact of scientific publications is often estimated by the number of citations they receive, i.e. how frequently they are referenced by other publications. Since publications have associated authors, originating institutions and publication venues (e.g. journals, conference proceedings), citations have also been used to compare their scientific impact. For instance, one commonly considered indicator of the quality of a journal is its impact factor [AM00]. The impact factors are published yearly by Thomson ISI in the Journal Citation Report (JCR) by counting the citations from articles of thousands of journals. However, research results in computer science are often published in high-quality conferences which are not covered by the JCR citation databases [MV07]. Other commercial citation data sources such as Elsevier Scopus also focus on journals and contain comparatively few conference publications. Hence these data sources cover only a fraction of quality scientific publications in computer science. Furthermore, they miss many citations even for journal articles, since references that originate from conference papers or other papers not included in the publication database are not captured.

Download - Download Slides

Saturday, July 4, 2009

Preprocessing for Content-Based Image Retrieval

Posted by Savvas Chatzichristofis

Rodhetbhai, W. (2009) Preprocessing for Content-Based Image Retrieval. PhD thesis, University of Southampton.

Abstract

The research focuses on image retrieval problems where the query is formed as an image of a specific object of interest. The broad aim is to investigate pre-processing for retrieval of images of objects when an example image containing the object is given. The object may be against a variety of backgrounds. Given the assumption that the object of interest is fairly centrally located in the image, the normalized cut segmentation and region growing segmentation are investigated to segment the object from the background but with limited success. An alternative approach comes from identifying salient regions in the image and extracting local features as a representation of the regions. The experiments show an improvement for retrieval by local features when compared with retrieval using global features from the whole image. For situations where object retrieval is required and where the foreground and background can be assumed to have different characteristics, it is useful to exclude salient regions which are characteristic of the background if they can be identified before matching is undertaken. This thesis proposes techniques to filter out salient regions believed to be associated with the background area. Background filtering using background clusters is the first technique which is proposed in the situation where only the background information is available for training. The second technique is the K-NN classification based on the foreground and background probability. In the last chapter, the support vector machine (SVM) method with PCA-SIFT descriptors is applied in an attempt to improve classification into foreground and background salient region classes. Retrieval comparisons show that the use of salient region background filtering gives an improvement in performance when compared with the unfiltered method.

Friday, July 3, 2009

ICAART 2010 (International Conference on Agents and Artificial Intelligence)

Posted by Savvas Chatzichristofis

ICAART 2010 (International Conference on Agents and Artificial Intelligence - http://www.icaart.org) has an open call for papers, whose deadline is at the end of July. We hope you can participate in this prestigious conference by submitting a paper reflecting your current research.

In cooperation with the Association for the Advancement of Artificial Intelligence (AAAI), the Portuguese Association for Artificial Intelligence (Associação Portuguesa Para a Inteligência Artificial - APPIA), the Spanish Association for Artificial Intelligence (Asociación Española de Inteligencia Artificial - AEPIA), the Workflow Management Coalition (WfMC) and the Association for Computing Machinery (ACM SIGART), ICAART brings together top researchers and practitioners in several areas of Artificial Intelligence, from multiple areas of knowledge, such as Agents, Multi-Agent Systems and Software Platforms, Distributed Problem Solving and Distributed AI in general, including web applications, on one hand, and within the area of non-distributed AI, including the more traditional areas such as Knowledge Representation, Planning, Learning, Scheduling, Perception and also not so traditional areas such as Reactive AI Systems, Evolutionary Computing and other aspects of Computational Intelligence and many other areas related to intelligent systems, on the other hand.

ICAART will be held in Valencia, Spain, on January 22-24, 2010, and the paper submission deadline is July 28, 2009.

The conference program features a number of Keynote Lectures to be delivered by distinguished world-class researchers.

Submitted papers will be subject to a double-blind review process. All accepted papers will be published in the conference proceedings, under an ISBN reference, on paper and on CD-ROM support. The proceedings will be indexed by DBLP and INSPEC.

Additionally, a selection of the best papers of the conference will be published in a book by Springer-Verlag. Best paper awards will be given during the conference.

Please check further details at the ICAART conference web site (http://www.icaart.org/). There you will find detailed information about the conference structure and its main topic areas.

Thursday, July 2, 2009

MoleExpert micro software

Posted by Savvas Chatzichristofis

The MoleExpert software is a product based on many years of experience with the automated analysis of pigmented skin lesions.

An important requirement of this software project was that the software be usable with a wide variety of imaging systems.

High-quality, evenly and well-illuminated incident-light (epiluminescence) microscopy images of the lesions are the most important condition for the software to work.

MoleExpert micro software

MoleExpert micro was developed to support diagnostic identification. For this reason the system does not issue a diagnosis, but supplies measurement data on the asymmetry, the border of the lesion, the color and the size. These parameters of the ABCD rule have been recognized for some years as important dermatoscopic parameters. Using a special algorithm adapted to image analysis, the four ABCD values are combined into a total score, which can take values between zero and one. Lesions with a high score are more likely to be melanoma than lesions with a low score.
Download demo version from here: MoleExpert micro

http://melanoma.blogsome.com/category/skin-image-processing

DullRazor

Posted by Savvas Chatzichristofis

Recently, there has been an increase in the number of studies using image processing techniques to analyze melanocytic lesions for atypia and possible malignancy, and for total-body mole mapping. Such lesions, however, can be partially obscured by body hair; and, to date, no study has fully addressed the problem of human hair occluding imaged lesions. In a previous study we designed an automatic segmentation program to differentiate skin lesions from normal healthy skin. Our program performed well with most images — the exception being images where dark thick hair covers part of the lesions. Dark hair confused the program which resulted in unsatisfactory segmentation results.

Presented here is a method to remove hair from images using a pre-processing program called DullRazor. DullRazor performs the following steps:

It identifies the dark hair locations by a generalized grayscale morphological closing operation,
It verifies the shape of the hair pixels as thin and long structures, and replaces the verified pixels by a bilinear interpolation, and
It smooths the replaced hair pixels with an adaptive median filter.
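
The first of these steps can be illustrated with a small sketch. This is not the DullRazor C code: it is a simplified pure-Python version that assumes a grayscale image stored as a list of rows, only two line-shaped structuring elements (horizontal and vertical), and an arbitrary difference threshold. The names `hair_mask`, `length` and `threshold` are hypothetical and chosen for illustration only.

```python
# Simplified sketch of closing-based dark hair detection (step 1 only).
# Assumptions: 0 = black, 255 = white; element length and threshold are guesses.

def _morph(img, offsets, pick):
    """Apply max (dilation) or min (erosion) over a structuring element."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Pixels outside the image are simply ignored.
            out[y][x] = pick(img[y + dy][x + dx]
                             for dy, dx in offsets
                             if 0 <= y + dy < h and 0 <= x + dx < w)
    return out

def closing(img, offsets):
    """Grayscale closing: dilation followed by erosion."""
    return _morph(_morph(img, offsets, max), offsets, min)

def hair_mask(img, length=5, threshold=40):
    """Mark pixels that a closing brightens strongly, i.e. thin dark structures."""
    r = length // 2
    horizontal = [(0, d) for d in range(-r, r + 1)]
    vertical = [(d, 0) for d in range(-r, r + 1)]
    closed_h = closing(img, horizontal)
    closed_v = closing(img, vertical)
    return [[max(closed_h[y][x], closed_v[y][x]) - img[y][x] > threshold
             for x in range(len(img[0]))]
            for y in range(len(img))]
```

A thin dark line is filled in by the closing along the direction that crosses it, so the difference to the original image is large there, while a dark region wider than the structuring element in both directions (such as the lesion itself) is left unchanged. The interpolation and smoothing steps would then operate only on the masked pixels.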

The algorithm has been implemented in C on a SunOS 4.x workstation. (The program can be run on Sun Solaris workstations as well.) It has been tested on real nevi images with satisfactory results. Figure 1 shows a lesion covered by thick hair and Figure 2 shows the result with hair removed. The man.txt file provides more details.

DullRazor can be downloaded and used without fee for non-commercial purposes. The full license is included in the download.

Download the Unix version of DullRazor (dullrazor.zip, 87KB): follow this link.
Download the Windows version of DullRazor (dullrazor_wins.zip, 327KB): follow this link.

http://www.dermweb.com/dull_razor/

Konstantinos Zagoris PhD Thesis

Posted by Savvas Chatzichristofis

My colleague Konstantinos Zagoris presented his PhD Thesis.

ABSTRACT
In recent years, the world has experienced significant growth in the volume of multimedia data without any indexing information, driven by the ease of creating images with scanners or digital cameras. To exploit these quantities of images satisfactorily, it is necessary to develop effective techniques to browse, store and retrieve them. The present PhD thesis introduces five methods that improve content-based image retrieval systems.

The first method proposes a new color clustering technique based on a combination of a neural network and a fuzzy classifier. Initially, the colors are reduced using the Kohonen Self Organized Feature Map (KSOFM). Each initial color is then classified into one of the KSOFM output classes. In the final stage, the KSOFM results initialize the Gustafson-Kessel Fuzzy Classifier (GKFC). The final clustering results obtained by the GKFC form the color palette of the final image. The experimental results show that the technique retains the image's dominant colors. It can also merge areas of the image with similar colors, producing uniform color areas. From this point of view, the proposed technique can be used for color segmentation.
The second method introduces a relevance feedback technique based on four MPEG-7-like descriptors. A user searching for a subset of images sometimes does not have a clear and accurate picture of these images: he/she has a general notion of the image sought but not its exact visual depiction. Also, sometimes there is no appropriate query image to use for retrieval. So, the system must provide a mechanism to fine-tune the retrieval results. First, the one-dimensional descriptor of the initial query image is transformed into a three-dimensional vector based on the inner features of the descriptor. This vector stores the user's search history and is initialized by the original query descriptor. When the user selects a relevant image from the retrieval results, each bin of that selected image's descriptor updates the corresponding value of the three-dimensional vector. The final descriptor used to query the image database is formed from the values of the three-dimensional vector, and the new results are presented to the user. The proposed relevance feedback technique improves the original retrieval results, is simple to implement and has low computational cost.
The third method detects and extracts homogeneous text in document images regardless of font type and size. It uses connected components analysis to detect the objects, Document Structure Elements (DSE) to construct a descriptor, and Support Vector Machines (SVMs) to tag the appropriate objects as text. First, connected components analysis detects and extracts the object blocks that reside inside the image. From every such block, a descriptor is extracted that is constructed from a set of document structure elements. The length of the descriptor can be reduced from the 510 initial DSEs to any number using an algorithm called Feature Standard Deviation Analysis of Structure Elements (FSDASE). Finally, the SVM uses the descriptors to classify each block as text or not, and those blocks are extracted from the original image or located on it.
The proposed technique can adapt to the peculiarities of each document image database, since the features adjust to it. It also makes it possible to increase or decrease text localization speed by manipulating the block descriptor length.
The fourth method addresses the document retrieval problem using a word matching procedure. It performs the word matching directly on the document images, bypassing OCR and using word images as queries. The entire system consists of an Offline and an Online procedure.
In the Offline procedure, which is transparent to the user, the document images are analyzed and the results are stored in a database. This procedure consists of three main stages. Initially, the document images pass through the preprocessing stage, which consists of a median filter, to deal with noise (e.g. in historical or badly maintained documents), and the Otsu binarization method. The word segmentation stage follows the preprocessing stage. Its primary goal is to detect the word limits, which is accomplished using the Connected Components Labeling and Filtering method. A set of features capable of capturing the word shape while discarding detailed differences due to noise or font differences is used for the word-matching process. These features are: Width to Height Ratio, Word Area Density, Center of Gravity, Vertical Projection, Top-Bottom Shape Projections, Upper Grid Features and Down Grid Features. Finally, these features form a 93-dimensional vector, the word descriptor, which is stored in a database.
In the Online procedure, the user enters a query word and the proposed system creates an image of it with font height equal to the average height of all the word boxes obtained through the Offline operation. Then, the system calculates the descriptor of the query word image. Finally, using the Minkowski L1 distance, the system presents the documents that contain the words whose descriptors are closest to the query descriptor. The experimental results show that the proposed system performs better than a commercial OCR package.
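The final matching step of the Online procedure can be sketched as follows. This is only an illustration under stated assumptions: the descriptors are toy three-element lists rather than the 93-dimensional vectors of the thesis, and the names `l1_distance`, `rank_words` and the `(document_id, descriptor)` index layout are hypothetical.

```python
# Minimal sketch of ranking word descriptors by Minkowski L1 distance.
# The index layout and the function names are assumptions for illustration.

def l1_distance(a, b):
    """Minkowski distance of order 1: the sum of absolute differences."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))

def rank_words(query, index, top_k=3):
    """Return up to top_k (distance, document_id) pairs, closest first.

    index is a list of (document_id, word_descriptor) pairs.
    """
    scored = [(l1_distance(query, descriptor), doc_id)
              for doc_id, descriptor in index]
    scored.sort(key=lambda pair: pair[0])
    return scored[:top_k]

# Toy usage with made-up descriptors.
toy_index = [
    ("doc_a", [0.5, 0.2, 0.9]),
    ("doc_b", [0.1, 0.1, 0.1]),
    ("doc_c", [0.5, 0.25, 0.85]),
]
closest = rank_words([0.5, 0.2, 0.9], toy_index)
```

A linear scan like this is the simplest possible choice; the thesis abstract does not say how the database is searched, so any indexing structure on top of it would be a separate design decision.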
The last method involves an MPEG-like compact shape descriptor that contains conventional contour and region shape features, with wide applicability from arbitrary shapes to document retrieval through word spotting. It is called the Compact Shape Portrayal Descriptor, and its computation can be easily parallelized since each feature can be calculated separately. These features are the Width to Height Ratio, the Vertical-Horizontal Projections and the Top-Bottom Shape Projections, which together form a 41-dimensional descriptor.
To compress the descriptor even further, the values of the feature vector are quantized to a binary representation of three bits per element, so the storage requirement is 3 x 41 = 123 bits. The values of the descriptor are concentrated within small ranges, so they must be quantized non-linearly in order to minimize the overall number of bits. Also, the features are unrelated to each other, so they need different quantization values. MPEG-7 quantizes its compact descriptors, too. The quantization is performed by the Gustafson-Kessel Fuzzy Classifier (GKFC), which produces eight clusters, each defined by a center and a positive-definite matrix adapted to the topological structure of the data inside the cluster. The output of the GKFC thus maps the descriptor values from the decimal range [0,1] to the integer range [0,7], or equivalently the binary range [000,111].
In addition to the descriptor, a relevance feedback technique based on Support Vector Machines (SVMs) is provided that employs the descriptor in order to measure how well it performs. When the system presents the initial retrieval results to the user, he/she can tag one or more images as wrongly or rightly retrieved. The system uses this information by grouping the descriptors of those word images (including the original query descriptor) as training data for the SVMs. Then, all the word images are presented to the user ranked by the normalized SVM decision function.
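The three-bit quantization and the 123-bit storage figure described above can be made concrete with a short sketch. It does not implement the Gustafson-Kessel classifier: as a stand-in, each value is assigned to the nearest of eight cluster centers, and the centers in `HYPOTHETICAL_CENTERS` are invented for illustration (in the thesis each feature would have its own learned clusters). The function names are also assumptions.

```python
# Sketch: non-linear 3-bit quantization and packing of a 41-element descriptor.
# Nearest-center assignment replaces the GKFC; the centers below are made up.

HYPOTHETICAL_CENTERS = [0.01, 0.03, 0.06, 0.10, 0.18, 0.30, 0.50, 0.80]

def quantize(value, centers=HYPOTHETICAL_CENTERS):
    """Map a value in [0, 1] to the index (0..7) of its nearest center."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))

def pack(codes, bits=3):
    """Concatenate fixed-width codes into one integer, first code highest."""
    packed = 0
    for code in codes:
        if not 0 <= code < (1 << bits):
            raise ValueError("code does not fit in the given number of bits")
        packed = (packed << bits) | code
    return packed

def unpack(packed, count, bits=3):
    """Recover the list of codes written by pack()."""
    mask = (1 << bits) - 1
    return [(packed >> (bits * (count - 1 - i))) & mask for i in range(count)]

# Toy descriptor: 41 made-up values in [0, 1].
descriptor = [(i % 10) / 10.0 for i in range(41)]
codes = [quantize(v) for v in descriptor]
stored = pack(codes)  # needs at most 3 * 41 = 123 bits
```

Because the centers are spaced unevenly, small values get finer resolution than large ones, which is the point of quantizing non-linearly when the data is concentrated in a narrow range.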
The main advantages of the Compact Shape Portrayal Descriptor are its very small size (only 123 bits), its low computational cost and its general applicability without compromising retrieval accuracy.
In conclusion, the present thesis offers solutions to real problems of content-based image retrieval systems, such as image segmentation, text localization, relevance feedback algorithms and shape/word descriptors. All the proposed methods can be combined to create a fast and modern MPEG-7 compatible content-based image retrieval system.

Download the Thesis (In Greek)

Congratulations Konstantinos

Wednesday, July 1, 2009

An Empirical Study on Large-Scale Content-Based Image Retrieval

Posted by Savvas Chatzichristofis

One key challenge in content-based image retrieval (CBIR) is to develop a fast solution for indexing high-dimensional image contents, which is crucial to building large-scale CBIR systems. In this paper, we propose a scalable content-based image retrieval scheme using locality-sensitive hashing (LSH), and conduct extensive evaluations on a large image test bed of a half million images. To the best of our knowledge, there are few comprehensive studies on large-scale CBIR evaluation with a half million images. Our empirical results show that our proposed solution is able to scale to hundreds of thousands of images, which is promising for building web-scale CBIR systems.
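
To give a feel for how locality-sensitive hashing avoids scanning every image, here is a minimal sketch. The abstract does not say which LSH family the paper uses, so this example assumes sign random projections (random hyperplanes); the class name `HyperplaneLSH` and all parameters are hypothetical, and the feature vectors are toy values.

```python
# Toy LSH index: similar vectors tend to fall into the same hash bucket.
# Assumption: sign-random-projection hashing, not necessarily the paper's scheme.
import random
from collections import defaultdict


class HyperplaneLSH:
    def __init__(self, dim, num_bits=8, num_tables=4, seed=0):
        rng = random.Random(seed)
        # One independent set of random hyperplanes per hash table.
        self.planes = [
            [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]
            for _ in range(num_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(num_tables)]
        self.vectors = {}

    @staticmethod
    def _key(planes, vector):
        # One bit per hyperplane: the side of the plane the vector lies on.
        return tuple(
            sum(p * x for p, x in zip(plane, vector)) >= 0.0 for plane in planes
        )

    def add(self, image_id, vector):
        self.vectors[image_id] = vector
        for planes, table in zip(self.planes, self.tables):
            table[self._key(planes, vector)].append(image_id)

    def query(self, vector, top_k=5):
        # Collect candidates from the matching bucket of every table,
        # then re-rank only those candidates by squared Euclidean distance.
        candidates = set()
        for planes, table in zip(self.planes, self.tables):
            candidates.update(table.get(self._key(planes, vector), []))

        def sq_dist(image_id):
            stored = self.vectors[image_id]
            return sum((a - b) ** 2 for a, b in zip(stored, vector))

        return sorted(candidates, key=sq_dist)[:top_k]
```

Only the images that share a bucket with the query are compared exactly, which is what lets the approach scale; more bits per key make buckets smaller, and more tables reduce the chance of missing a true neighbor.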

http://www.cais.ntu.edu.sg/~chhoi/paper_pdf/ICME07LCBIR.pdf

RDF TV - The Baloney Detection Kit - Michael Shermer

Posted by Savvas Chatzichristofis

The Richard Dawkins Foundation, Michael Shermer, Josh Timonen

The LIRE (Lucene Image REtrieval) library provides a simple way to retrieve images and photos based on their color and texture characteristics. LIRE creates a Lucene index of image features for content-based image retrieval (CBIR). Three of the available image features are taken from the MPEG-7 Standard: ScalableColor, ColorLayout and EdgeHistogram; a fourth one, the Auto Color Correlogram, has been implemented based on recent research results. Furthermore, simple methods for searching the index and browsing results are provided by LIRE. The LIRE library and the LIRE Demo application, as well as all the source code, are available under the GNU GPL license.
