
Friday, July 17, 2009

New version of img(Rummager)

List of Changes

Laboratory
1. New version of the Spatial Color Layout Descriptor
2. New (final) version of the BTDH descriptor
3. New Fuzzy Linking System (HSV 8-bin linking histogram)

img Retrieval
4. New (final) version of BTDH. This descriptor is now more compact; you have to re-index your images.
5. New version of SpCD. This descriptor combines color and spatial color distribution information. Descriptors of this type can be used for image retrieval with hand-drawn sketch queries, since they capture the layout of the color features (see the sketch after this list).
6. Design Custom Search Method bug (X23227) fixed.
7. EHD bug fixed (thanks to Mattia Broilo :) ).

Extras
8. img(Paint.Anaktisi) uses the latest version of SpCD
9. img(Finder) uses the latest version of SpCD
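
To make item 5 concrete, here is a minimal sketch of the general idea behind a spatial color descriptor: quantize the color content of each grid cell, so the descriptor encodes where colors sit in the image, which is what makes hand-drawn sketch queries workable. The 3x3 grid, the 8-bin hue histogram and the L1 matching below are illustrative assumptions, not the actual SpCD algorithm.

import numpy as np

def spatial_color_descriptor(hsv_image, grid=3, hue_bins=8):
    """hsv_image: (H, W, 3) array with hue in [0, 360). Illustrative only."""
    h, w = hsv_image.shape[:2]
    cells = []
    for i in range(grid):
        for j in range(grid):
            cell = hsv_image[i * h // grid:(i + 1) * h // grid,
                             j * w // grid:(j + 1) * w // grid]
            # Per-cell hue histogram, normalized so cell size does not matter.
            hist, _ = np.histogram(cell[..., 0], bins=hue_bins, range=(0, 360))
            cells.append(hist / max(hist.sum(), 1))
    return np.concatenate(cells)  # grid * grid * hue_bins values

def l1_distance(d1, d2):
    # Smaller distance = more similar color layout.
    return np.abs(d1 - d2).sum()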

http://savvash.blogspot.com/2008/06/imgrummager-in-now-available-for.html

Thursday, July 16, 2009

Bill Gates: Project Natal coming to PCs too

Last month at E3, everyone was talking about "Project Natal", the code name for Microsoft's controller-less camera technology for the Xbox 360 game console. When asked if the technology could come to the PC as well, Microsoft execs wouldn't commit, saying only that they were talking about the Xbox 360's uses during E3.
Well, no less than Microsoft founder Bill Gates has blown Project Natal's Xbox 360 exclusivity. News.com reports that Gates has confirmed that they are bringing the gesture-based technology to Windows as well. Gates is quoted as saying, "Both the Xbox guys and the Windows guys latched onto that and now even since they latched onto it the idea of how it can be used in the office is getting much more concrete, and is pretty exciting."
Office use, Bill? We are betting that when Project Natal tech does come to Windows, we will see some PC game apps as well.

http://news.bigdownload.com/2009/07/15/bill-gates-project-natal-tech-coming-to-the-pc-too/

Sony Venture Animates Human Face in Picture

An infinite number of facial expressions can be created from a single picture. Its eyes rotate and lips move. It smiles and cries. A familiar face gazes and talks at you in unfamiliar ways.

An image processing technology with such capabilities is drawing attention as a promotional tool on the Internet in Japan. MotionPortrait Inc, a subsidiary of leading Internet provider So-net Entertainment Corp, developed the technology.

Since its establishment in July 2007, the company has provided the technology to more than 30 companies, including the cosmetics firm Shiseido Co Ltd and the shaver maker Schick Japan KK, as a sales promotion tool. It has also been used in various kinds of software for home game consoles. Because it deals with human faces, one of the most familiar objects to us, the technology is attracting consumer interest and creating a sense of affinity.

The image processing technology is the brainchild of Sony Corp's Kihara Laboratory, which was closed in 2006. The laboratory was established by Nobutoshi Kihara, the company's former senior managing director, who developed Japan's first tape recorder and household video tape recorder.

The laboratory is also famous for developing technologies for PlayStation 2, Sony's home game console. Its image processing and computer graphics technologies are about to blossom again with Internet technologies.

http://techon.nikkeibp.co.jp/english/NEWS_EN/20090602/171114/

Friday, July 10, 2009

Apple preps iPhone face recognition

Article From http://www.theregister.co.uk/2009/07/09/more_july_apple_patents/

The US Patent and Trademark Office published 33 new Apple patent applications on Thursday, bringing the total filed in July to 55 - and we're not even a third of the way through the month.

Today's cluster of creativity ranged from flexible cabling to scrolling lyrics, but the bulk of the filings described new powers for the ubiquitous iPhone and its little brother, the iPod touch - especially when the 'Pod is equipped with a camera, which it seems destined to be.

Two of the filings are directly camera-related. One focuses on object identification and the other on face recognition. The former is targeted specifically for handhelds, while the latter's reach extends both into your pocket and out to the entire universe of consumer electronics.

The object-identification filing describes a system in which a handheld's camera captures an image of an object either in visible light or infrared, then compares that image with information stored over a network. The system then asks you what information about that object you'd like to download, then provides it.

The filing also describes the system using an RFID reader rather than a camera, but the detect-compare-download sequence remains the same.
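
As a rough illustration, the detect-compare-download sequence could be sketched as follows. Every name here (ObjectService, MatchResult, the toy catalog) is hypothetical: the filing describes a flow, not an API, and as noted an RFID reader could replace the camera without changing the sequence.

from dataclasses import dataclass

@dataclass
class MatchResult:
    object_id: str
    available_info: list  # topics the network can offer for this object

class ObjectService:
    """Toy in-memory stand-in for the network side of the sequence."""
    catalog = {
        "monet-001": {"artist": "Claude Monet", "genre": "Impressionism",
                      "provenance": "Giverny, 1899"},
    }

    def identify(self, image_bytes):
        # 1. Detect and compare: match the captured frame (visible light or
        #    infrared) against stored reference data. A real system would run
        #    image matching here; the toy version always recognizes the Monet.
        return MatchResult("monet-001", list(self.catalog["monet-001"]))

    def download(self, object_id, topics):
        # 3. Download: return only the information the user asked for.
        info = self.catalog[object_id]
        return {t: info[t] for t in topics if t in info}

# 2. The handheld asks the user which topics to fetch, then downloads them.
service = ObjectService()
match = service.identify(b"<captured image bytes>")
print(service.download(match.object_id, ["artist", "genre"]))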

Apple uses a museum visit to illustrate the utility of this technology: You could simply point your iPhone at a work of art and quickly be presented with info about its artist, genre, provenance, and the availability of T-shirts featuring that work in the museum store - which another of Thursday's filings, on online shopping, could help you buy.

Tapping into a handheld's GPS and digital compass could also enable the system to provide location-based resources - the filing suggests a "RESTAURANT mode" to help you find eats in Vegas - and to support the captioned landscapes provided by augmented reality that are getting so much press lately.

[Figure: Apple object-recognition patent illustration - Claude Monet at his most minimalist, captured and ID-ed by your iPhone]

Finally, the filing includes a way for you to capture a log of all the identified objects, complete with downloaded multimedia content for creating a record of your peregrinations. Look for such a media-rich slideshow to appear in some student's "How I Spent My Summer Vacation" school assignment.

The face-recognition filing describes two different systems: one that merely checks for a face - any face - and a second that matches what it sees with a database of whom it knows.

The system that doesn't care who you are is merely looking to see if anyone's using it - and if someone is, it won't time out, as would an iPhone, or fire up a screensaver, as would a PC. The system that knows you by your dashing good looks would also be aware of your privilege level, and would allow access to its services based on that level.

Apple, as usual in its patent applications, isn't shy about the scope of this technology. It lists 37 different devices that could incorporate it, from personal communications devices to vehicle operating systems to automatic teller machines. Then, just in case it forgot anything, it adds "any like computing device capable of interfacing with a person."

That should just about cover it.

Wednesday, July 8, 2009

Introducing the Google Chrome OS

Article from “The Official Google Blog”
It's been an exciting nine months since we launched the Google Chrome browser. Already, over 30 million people use it regularly. We designed Google Chrome for people who live on the web — searching for information, checking email, catching up on the news, shopping or just staying in touch with friends. However, the operating systems that browsers run on were designed in an era where there was no web. So today, we're announcing a new project that's a natural extension of Google Chrome — the Google Chrome Operating System. It's our attempt to re-think what operating systems should be.
Google Chrome OS is an open source, lightweight operating system that will initially be targeted at netbooks. Later this year we will open-source its code, and netbooks running Google Chrome OS will be available for consumers in the second half of 2010. Because we're already talking to partners about the project, and we'll soon be working with the open source community, we wanted to share our vision now so everyone understands what we are trying to achieve.
Speed, simplicity and security are the key aspects of Google Chrome OS. We're designing the OS to be fast and lightweight, to start up and get you onto the web in a few seconds. The user interface is minimal to stay out of your way, and most of the user experience takes place on the web. And as we did for the Google Chrome browser, we are going back to the basics and completely redesigning the underlying security architecture of the OS so that users don't have to deal with viruses, malware and security updates. It should just work.
Google Chrome OS will run on both x86 as well as ARM chips and we are working with multiple OEMs to bring a number of netbooks to market next year. The software architecture is simple — Google Chrome running within a new windowing system on top of a Linux kernel. For application developers, the web is the platform. All web-based applications will automatically work and new applications can be written using your favorite web technologies. And of course, these apps will run not only on Google Chrome OS, but on any standards-based browser on Windows, Mac and Linux thereby giving developers the largest user base of any platform.

Comparing the Scientific Impact of Conference and Journal Publications in Computer Science

Rahm, E.
Comparing the Scientific Impact of Conference and Journal Publications in Computer Science
Proc. Academic Publishing in Europe (APE08) 2008

The impact of scientific publications is often estimated by the number of citations they receive, i.e. how frequently they are referenced by other publications. Since publications have associated authors, originating institutions and publication venues (e.g. journals, conference proceedings), citations have also been used to compare their scientific impact. For instance, one commonly considered indicator of the quality of a journal is its impact factor [AM00]. The impact factors are published yearly by Thomson ISI in the Journal Citation Report (JCR) by counting the citations from articles of thousands of journals. However, research results in computer science are often published in high-quality conferences which are not covered by the JCR citation databases [MV07]. Other commercial citation data sources such as Elsevier Scopus also focus on journals and contain comparatively few conference publications. Hence these data sources cover only a fraction of the quality scientific publications in computer science. Furthermore, they miss many citations even for journal articles, since references that originate from conference papers or other papers not included in the publication database are not captured.

Download - Download Slides


Saturday, July 4, 2009

Preprocessing for Content-Based Image Retrieval

Rodhetbhai, W. (2009) Preprocessing for Content-Based Image Retrieval. PhD thesis, University of Southampton.

Published version: PDF

Abstract

The research focuses on image retrieval problems where the query is formed as an image of a specific object of interest. The broad aim is to investigate pre-processing for retrieval of images of objects when an example image containing the object is given. The object may be against a variety of backgrounds. Given the assumption that the object of interest is fairly centrally located in the image, normalized cut segmentation and region growing segmentation are investigated to segment the object from the background, but with limited success. An alternative approach comes from identifying salient regions in the image and extracting local features as a representation of the regions. The experiments show an improvement for retrieval by local features when compared with retrieval using global features from the whole image.

For situations where object retrieval is required and where the foreground and background can be assumed to have different characteristics, it is useful to exclude salient regions which are characteristic of the background, if they can be identified before matching is undertaken. This thesis proposes techniques to filter out salient regions believed to be associated with the background area. Background filtering using background clusters is the first technique, proposed for the situation where only background information is available for training. The second technique is K-NN classification based on the foreground and background probability. In the last chapter, the support vector machine (SVM) method with PCA-SIFT descriptors is applied in an attempt to improve classification into foreground and background salient region classes. Retrieval comparisons show that the use of salient region background filtering gives an improvement in performance when compared with the unfiltered method.
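
A hedged sketch of the first technique (background filtering using background clusters): cluster local descriptors taken from background-only training images, then discard query regions that fall near any background cluster. The cluster count and distance threshold are illustrative, and scikit-learn's KMeans stands in for whatever clustering the thesis actually uses.

import numpy as np
from sklearn.cluster import KMeans

def fit_background_clusters(background_descriptors, n_clusters=50, seed=0):
    """background_descriptors: (N, D) local features from background regions."""
    return KMeans(n_clusters=n_clusters, random_state=seed).fit(background_descriptors)

def filter_foreground(descriptors, background_model, threshold=0.5):
    """Keep only descriptors far from every background cluster center."""
    dists = background_model.transform(descriptors)  # (M, n_clusters) distances
    # The threshold depends on the descriptor's scale; tune it on held-out data.
    return descriptors[dists.min(axis=1) > threshold]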

Friday, July 3, 2009

ICAART 2010 (International Conference on Agents and Artificial Intelligence)

ICAART 2010 (International Conference on Agents and Artificial Intelligence - http://www.icaart.org) has an open call for papers, with a submission deadline at the end of July. We hope you can participate in this prestigious conference by submitting a paper reflecting your current research.

In cooperation with the Association for the Advancement of Artificial Intelligence (AAAI), the Portuguese Association for Artificial Intelligence (Associação Portuguesa Para a Inteligência Artificial - APPIA), the Spanish Association for Artificial Intelligence (Asociación Española de Inteligencia Artificial - AEPIA), the Workflow Management Coalition (WfMC) and the Association for Computing Machinery (ACM SIGART), ICAART brings together top researchers and practitioners in several areas of Artificial Intelligence. These range, on one hand, over Agents, Multi-Agent Systems and Software Platforms, Distributed Problem Solving and Distributed AI in general, including web applications, and, on the other hand, over non-distributed AI, including traditional areas such as Knowledge Representation, Planning, Learning, Scheduling and Perception, as well as less traditional areas such as Reactive AI Systems, Evolutionary Computing, other aspects of Computational Intelligence, and many other areas related to intelligent systems.

ICAART will be held in Valencia, Spain, on January 22-24, 2010, and the paper submission deadline is July 28, 2009.

The conference program features a number of Keynote Lectures to be delivered by distinguished world-class researchers, including those listed below.

Submitted papers will be subject to a double-blind review process. All accepted papers will be published in the conference proceedings, under an ISBN reference, in print and on CD-ROM. The proceedings will be indexed by DBLP and INSPEC.

Additionally, a selection of the best papers of the conference will be published in a book by Springer-Verlag. Best paper awards will be given during the conference.

Please check further details at the ICAART conference web site (http://www.icaart.org/). There you will find detailed information about the conference structure and its main topic areas.

Thursday, July 2, 2009

MoleExpert micro software

The MoleExpert software is a product based on many years of experience with the automated analysis of pigmented skin lesions.

An important requirement for this software project was that the software be usable with a wide variety of photographic systems.

High-quality, even and well-illuminated incident-light (epiluminescence) microscopy pictures of the lesions are the most important condition for the operability of this software.


MoleExpert micro was developed to support diagnostic identification. For this reason the system does not output a diagnosis, but instead supplies measurement data on asymmetry, on the delimitation (border) of the lesion, on color and on size. These parameters of the ABCD rule have been recognized for some years as important dermatoscopic parameters. Using an algorithm adapted to image analysis, the four ABCD values are combined into a total score that can take values between zero and one. Lesions with a high score are more likely to be melanoma than lesions with a low score.
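
As a purely illustrative sketch of how four normalized ABCD measurements could be combined into a single score between zero and one: the weights below follow the classical dermatoscopic ABCD rule (asymmetry 1.3, border 0.1, color 0.5, differential structures/size 0.5), not MoleExpert's unpublished algorithm.

def abcd_score(asymmetry, border, color, size):
    """All inputs assumed pre-normalized to [0, 1]; returns a score in [0, 1]."""
    weights = {"A": 1.3, "B": 0.1, "C": 0.5, "D": 0.5}  # assumed weighting
    total = (weights["A"] * asymmetry + weights["B"] * border
             + weights["C"] * color + weights["D"] * size)
    return total / sum(weights.values())  # normalize to [0, 1]

# A score near 1 suggests a higher probability of melanoma than one near 0.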
Download demo version from here: MoleExpert micro

http://melanoma.blogsome.com/category/skin-image-processing

DullRazor

Recently, there has been an increase in the number of studies using image processing techniques to analyze melanocytic lesions for atypia and possible malignancy, and for total-body mole mapping. Such lesions, however, can be partially obscured by body hair; and, to date, no study has fully addressed the problem of human hair occluding imaged lesions. In a previous study we designed an automatic segmentation program to differentiate skin lesions from normal healthy skin. Our program performed well with most images, the exception being images where dark thick hair covers part of the lesions. Dark hair confused the program, which resulted in unsatisfactory segmentation results.

Presented here is a method to remove hair from images using a pre-processing program called DullRazor. DullRazor performs the following steps (a rough sketch in code follows the list):

  1. It identifies the dark hair locations by a generalized grayscale morphological closing operation,
  2. It verifies the shape of the hair pixels as thin and long structures, and replaces the verified pixels by bilinear interpolation, and
  3. It smooths the replaced hair pixels with an adaptive median filter.
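
A rough re-implementation sketch of these three steps with OpenCV (cv2) and NumPy is shown below. The kernel sizes and threshold are assumptions, the thin-and-long shape verification is simplified away, and inpainting stands in for the paper's bilinear interpolation along each hair.

import cv2
import numpy as np

def remove_hair(bgr_image):
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)

    # 1. Generalized grayscale morphological closing: close with line-shaped
    #    structuring elements (two orientations here; a real implementation
    #    would use more) and keep the maximum response per pixel.
    kernels = [cv2.getStructuringElement(cv2.MORPH_RECT, (17, 1)),
               cv2.getStructuringElement(cv2.MORPH_RECT, (1, 17))]
    closed = np.maximum.reduce(
        [cv2.morphologyEx(gray, cv2.MORPH_CLOSE, k) for k in kernels])

    # Hair candidates are pixels much darker than the closed image.
    hair_mask = ((closed.astype(np.int16) - gray) > 10).astype(np.uint8) * 255

    # 2. Replace the hair pixels. The mask is dilated slightly so hair borders
    #    are covered; inpainting replaces the bilinear interpolation step.
    hair_mask = cv2.dilate(hair_mask, np.ones((3, 3), np.uint8))
    repaired = cv2.inpaint(bgr_image, hair_mask, 5, cv2.INPAINT_TELEA)

    # 3. Smooth only the replaced pixels with a median filter (the original
    #    uses an adaptive median filter; a fixed 3x3 one is used here).
    smoothed = cv2.medianBlur(repaired, 3)
    return np.where(hair_mask[..., None] > 0, smoothed, bgr_image)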

The algorithm has been implemented in C on a SunOS 4.x workstation. (The program can be run on Sun Solaris workstations as well.) It has been tested on real nevi images with satisfactory results. Figure 1 shows a lesion covered by thick hair and Figure 2 shows the result with the hair removed. The man.txt file provides more details.


DullRazor can be downloaded and used without fee for non-commercial purposes. The full license is included in the download.

Download the Unix version of DullRazor (dullrazor.zip, 87KB): follow this link.
Download the Windows version of DullRazor (dullrazor_wins.zip, 327KB): follow this link.

http://www.dermweb.com/dull_razor/

Konstantinos Zagoris PhD Thesis

My colleague Konstantinos Zagoris presented his PhD Thesis.

ABSTRACT
In recent years, the world has experienced significant growth in the volume of multimedia data that lacks any indexing information; such images have multiplied thanks to the ease of creating them with scanners or digital cameras. In order to exploit these quantities of images satisfactorily, it is necessary to develop effective techniques to browse, store and retrieve them. The present PhD thesis introduces five methods that improve content-based image retrieval systems.


The first method proposes a new color clustering technique based on a combination of a neural network and a fuzzy classifier. Initially, the colors are reduced using the Kohonen Self-Organized Feature Map (KSOFM). After this, each initial color is classified into one of the output KSOFM classes. In the final stage, the KSOFM results initialize the Gustafson-Kessel Fuzzy Classifier (GKFC). The final clustering results obtained by the GKFC form the color palette of the final image.
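
As a rough illustration of the KSOFM stage, here is a NumPy sketch of a one-dimensional self-organizing map trained on pixel colors. The map size, learning rate and decay schedule are assumptions, and the Gustafson-Kessel refinement stage is omitted.

import numpy as np

def som_color_palette(pixels, n_colors=16, iters=5000, lr0=0.5, seed=0):
    """pixels: (N, 3) float array of RGB values in [0, 1]."""
    rng = np.random.default_rng(seed)
    # 1-D map of n_colors units, initialized from randomly chosen pixels.
    weights = pixels[rng.integers(0, len(pixels), n_colors)].copy()
    sigma0 = n_colors / 2.0
    for t in range(iters):
        x = pixels[rng.integers(0, len(pixels))]
        # Best-matching unit for this sample.
        bmu = int(np.argmin(((weights - x) ** 2).sum(axis=1)))
        # Exponentially decaying learning rate and neighborhood width.
        lr = lr0 * np.exp(-t / iters)
        sigma = sigma0 * np.exp(-t / iters)
        dist = np.abs(np.arange(n_colors) - bmu)
        h = np.exp(-(dist ** 2) / (2 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights  # the reduced color palette

# Usage: palette = som_color_palette(image.reshape(-1, 3) / 255.0)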
The experimental results have shown the ability to retain the image's dominant colors. The method can also merge areas of the image with similar colors, producing uniform color areas; from this point of view, the proposed technique can be used for color segmentation.

The second method introduces a relevance feedback technique based on four MPEG-7-like descriptors. The user searching for a subset of images sometimes does not have a clear and accurate vision of these images: he/she has a general notion of the image in question but not its exact visual depiction. Sometimes, too, there is no appropriate query image to use for retrieval. So the system must provide a mechanism to fine-tune the retrieval results.
Primarily, the one-dimensional descriptor of the initial image query is transformed into a three-dimensional vector, based on the inner features of the descriptor, which stores the user's search history and is initialized by the original query descriptor. When the user selects a relevant image from the retrieval results, each bin of that selected image's descriptor updates the corresponding value of the three-dimensional vector. The final descriptor used to query the image database is formed from the values of the three-dimensional vector, and the new results are presented to the user. The proposed relevance feedback technique improves the original retrieval results, is simple to implement and has low computational cost.
The third method detects and extracts homogeneous text in document images, regardless of font type and size, by using connected components analysis to detect the objects, Document Structure Elements (DSE) to construct a descriptor, and Support Vector Machines to tag the appropriate objects as text. Primarily, the connected components analysis detects and extracts the object blocks that reside inside the image. From every such block a descriptor is extracted, constructed from a set of document structure elements. The length of the descriptor can be reduced from the 510 initial DSEs to any number using an algorithm called Feature Standard Deviation Analysis of Structure Elements (FSDASE). Finally, the SVM uses the descriptors to classify each block as text or not, and to extract those blocks from the original image or locate them on it.
The proposed technique has the ability to adapt to the peculiarities of each document image database, since the features adjust to it. It also provides the ability to increase or decrease text localization speed by manipulating the length of the block descriptor.
The fourth technique addresses the document retrieval problem using a word matching procedure. This technique performs the word matching directly in the document images, bypassing OCR and using word-images as queries. The entire system consists of an Offline and an Online procedure.
In the Offline procedure, which is transparent to the user, the document images are analyzed and the results are stored in a database. This procedure consists of three main stages. Initially, the document images pass through the preprocessing stage, which consists of a median filter, in order to cope with noise (e.g. in historical or badly maintained documents), and the Otsu binarization method. The word segmentation stage follows the preprocessing stage; its primary goal is to detect the word limits. This is accomplished using the Connected Components Labeling and Filtering method. A set of features capable of capturing the word shape while discarding detailed differences due to noise or font differences is used for the word-matching process. These features are: Width to Height Ratio, Word Area Density, Center of Gravity, Vertical Projection, Top-Bottom Shape Projections, Upper Grid Features and Down Grid Features (a few of these features are sketched in code below). Finally, these features create a 93-dimensional vector that is the word descriptor, and it is stored in a database.

In the Online procedure, the user enters a query word and the proposed system creates an image from it, with font height equal to the average height of all the word-boxes obtained through the Offline operation. Then the system calculates the descriptor of the query word image. Finally, using the Minkowski L1 distance, the system presents the documents that contain the words whose descriptors are closest to the query descriptor. The experimental results show that the proposed system performs better than a commercial OCR package.
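
A hedged NumPy sketch of a few of the features named above, computed from a binarized word image (True = ink, assumed to contain at least one ink pixel). The exact definitions and normalizations in the thesis may differ, and the grid features are omitted.

import numpy as np

def word_shape_features(word, bins=10):
    """word: 2-D boolean array for one segmented word (True = ink)."""
    h, w = word.shape
    ys, xs = np.nonzero(word)
    features = {
        "width_to_height_ratio": w / h,
        "word_area_density": word.mean(),
        # Center of gravity of the ink pixels, normalized to [0, 1].
        "center_of_gravity": (xs.mean() / w, ys.mean() / h),
    }
    # Vertical projection: fraction of ink per column, resampled to a fixed
    # number of bins so words of different widths stay comparable.
    idx = np.linspace(0, w - 1, bins).astype(int)
    features["vertical_projection"] = word.mean(axis=0)[idx]
    # Top/bottom shape projections: first and last ink row per column,
    # normalized by the word height (ink-free columns get 0).
    has_ink = word.any(axis=0)
    top = np.where(has_ink, word.argmax(axis=0) / h, 0.0)
    bottom = np.where(has_ink, (h - 1 - word[::-1].argmax(axis=0)) / h, 0.0)
    features["top_shape"], features["bottom_shape"] = top[idx], bottom[idx]
    return features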
The last method involves an MPEG-like compact shape descriptor that contains conventional contour and region shape features, with wide applicability ranging from arbitrary shapes to document retrieval through word spotting. It is called the Compact Shape Portrayal Descriptor, and its computation can easily be parallelized, since each feature can be calculated separately. These features are the Width to Height Ratio, the Vertical and Horizontal Projections, and the Top and Bottom Shape Projections, which together construct a 41-dimensional descriptor.

In order to compress the descriptor even more, the values of the feature vectors are quantized to a binary representation of three bits per element, so the storage requirement is 3 x 41 = 123 bits. The values of the descriptor are concentrated within small ranges, so they must be quantized non-linearly in order to minimize the overall number of bits; also, since the features are not related to one another, they must have differing quantization values. (MPEG-7 likewise quantizes its compact descriptors.) The quantization is achieved by the Gustafson-Kessel Fuzzy Classifier (GKFC), which produces eight clusters, each defined by a center and a positive-definite matrix adapted to the topological structure of the data inside the cluster. The output of the GKFC thus maps descriptor values from the decimal range [0, 1] to the integer range [0, 7], i.e. the binary range [000, 111].
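
A minimal sketch of such a 3-bit quantizer, with plain one-dimensional k-means standing in for the Gustafson-Kessel fuzzy classifier: eight levels are learned from training values, and each descriptor element is then mapped to the index of its nearest level.

import numpy as np

def learn_levels(training_values, n_levels=8, iters=50):
    """1-D k-means on observed descriptor values; returns sorted centers."""
    centers = np.quantile(training_values, np.linspace(0, 1, n_levels))
    for _ in range(iters):
        labels = np.argmin(np.abs(training_values[:, None] - centers), axis=1)
        for k in range(n_levels):
            if np.any(labels == k):
                centers[k] = training_values[labels == k].mean()
    return np.sort(centers)

def quantize(descriptor, centers):
    # Nearest-center code in 0..7, i.e. 3 bits per element: 41 * 3 = 123 bits.
    return np.argmin(np.abs(descriptor[:, None] - centers), axis=1).astype(np.uint8)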
In addition to the descriptor, a relevance feedback technique is provided that employs the above descriptor in order to measure how well it performs. It is based on Support Vector Machines (SVMs). When the system presents the initial retrieval results to the user, he/she is able to tag one or more images as wrongly or rightly retrieved. The system utilizes this information by grouping the descriptors of those word-images (including the original query descriptor) as training data for the SVMs. Then all the word-images are presented to the user ranked by the normalized SVM decision function.

The main advantages of the Compact Shape Portrayal Descriptor are its very small size (only 123 bits), its low computational cost and its general applicability without compromising retrieval accuracy. In summary, the present thesis presents solutions to real problems of content-based image retrieval systems, such as image segmentation, text localization, relevance feedback algorithms and shape/word descriptors. All the proposed methods can be combined to create a fast, modern, MPEG-7-compatible content-based image retrieval system.

Download the Thesis (In Greek)

Congratulations Konstantinos

Wednesday, July 1, 2009

An Empirical Study on Large-Scale Content-Based Image Retrieval

One key challenge in content-based image retrieval (CBIR) is to develop a fast solution for indexing high-dimensional image contents, which is crucial to building large-scale CBIR systems. In this paper, we propose a scalable content-based image retrieval scheme using locality-sensitive hashing (LSH), and conduct extensive evaluations on a large image test bed of half a million images. To the best of our knowledge, there has been no comparably comprehensive study of large-scale CBIR evaluation with half a million images. Our empirical results show that our proposed solution is able to scale to hundreds of thousands of images, which is promising for building web-scale CBIR systems.
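
For readers unfamiliar with LSH, here is a minimal sketch of random-projection (sign-based) LSH over high-dimensional image features. It illustrates the general indexing idea the paper builds on, not the authors' exact scheme; the table count and bit count below are illustrative assumptions.

import numpy as np
from collections import defaultdict

class LSHIndex:
    def __init__(self, dim, n_tables=8, n_bits=16, seed=0):
        rng = np.random.default_rng(seed)
        # One set of random hyperplanes per hash table.
        self.planes = rng.standard_normal((n_tables, n_bits, dim))
        self.tables = [defaultdict(list) for _ in range(n_tables)]

    def _keys(self, x):
        # Sign pattern of the projections gives one integer key per table.
        bits = (self.planes @ x) > 0
        return [int("".join("1" if b else "0" for b in row), 2) for row in bits]

    def add(self, x, item_id):
        for table, key in zip(self.tables, self._keys(x)):
            table[key].append(item_id)

    def query(self, x):
        # Union of the buckets the query falls into; re-rank these few
        # candidates by exact distance instead of scanning the whole corpus.
        candidates = set()
        for table, key in zip(self.tables, self._keys(x)):
            candidates.update(table[key])
        return candidates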

http://www.cais.ntu.edu.sg/~chhoi/paper_pdf/ICME07LCBIR.pdf

RDF TV - The Baloney Detection Kit - Michael Shermer

The Richard Dawkins Foundation, Michael Shermer, Josh Timonen