Monday, December 26, 2011

Image retrieval, Research Assistant (19 months)

This post offers the opportunity to work on PhotoBrief, a project funded by the UK Engineering and Physical Sciences Research Council and the Technology Strategy Board. The project will create a platform for writing briefs that specify the need for images, finding a suitable image, and negotiating for it. It will provide users with an environment for interactive brief creation and dynamic negotiation of images. The project considers the image information seeking and retrieval behaviour of different user groups, including professionals. Partners from industry include an SEO (search engine optimisation) company, a mobile information company, and a media advertising company.

The project is based within the Centre for Interactive Systems Research, which has 20 years' experience of R&D in search algorithms and technologies, including a well-known search algorithm.

Applicants should be qualified in Information Retrieval (or a related area). More specifically, knowledge of meta-data creation and of text, image and content-based retrieval techniques will be needed. The qualification sought is a PhD or equivalent experience in the field.

In this post you will develop novel brief creation and negotiation techniques that use meta-data about images, captured to make those images more findable. You will implement and integrate these in an image retrieval system. The project will make use of current state-of-the-art meta-data techniques, collections, relevant open-source initiatives, and evaluation methods. You will be involved in integrating these, creating user interfaces, and developing new system functionality where necessary.

Strong programming skills are essential, and experience of any of the following areas is an advantage: Information Retrieval (IR); User-centred IR system design, implementation, and evaluation; Context learning, Adaptive IR, Recommender systems, Social media; Computer Supported Cooperative Working (CSCW); User Interface design and evaluation. Experience in collaborating with partners from industry is desirable.

For informal enquiries, please contact Dr. Ayse Goker:
Further details of the position available at:

The post will also be advertised on

Closing date: 23 January, 2012
Interview date: 1 February, 2012
Start date: March 2012

Further information:

Saturday, December 24, 2011

Merry Christmas and a happy 2012


Thursday, December 22, 2011

2011 PhD Challenge Winner - "Dirty Old Man"

The goal of this year’s challenge was to get one of the nicknames “DIRTY OLD MAN” or “CRAZY CAT LADY” included in the byline for at least one author of a final, camera-ready version of a peer-reviewed academic paper.


And the winner is:

Coherence Progress: A Measure of Interestingness Based on Fixed Compressors

By Tom Schaul, IDSIA/TU Munich
In Proceedings of the Conference on Artificial General Intelligence (August 2011)

Annual Best Doctoral Thesis Award of the Informatics and Telematics Institute

The Informatics and Telematics Institute (ITI) is once again announcing its Annual Best Doctoral Thesis Award, for a thesis awarded by a Greek university in the broader field of Informatics and Telematics during the previous calendar year.

The terms of the competition are as follows:

  • The doctoral degree must have been awarded by a Greek university during 2011.
  • The doctoral thesis may be written in either Greek or English.
  • The doctoral thesis must relate to one of the topics in which ITI is active in research:
  1. Image processing
  2. Computer vision
  3. Pattern recognition
  4. Signal processing
  5. Artificial intelligence
  6. Multimedia
  7. Virtual and augmented reality
  8. Networks and communications

The doctoral thesis is to be submitted electronically. It must be accompanied by a certificate from the school's secretariat stating that the doctoral degree was awarded within the year specified in the call, together with the names and e-mail addresses of the three-member committee.

The judging committee will consist of ITI researchers.

The award will be presented at the ITI Open Day and will consist of a certificate and a cash prize of 600 euros. ITI will cover the winner's travel expenses within Greece to attend the event and receive the award in person. The winning thesis will be featured on the ITI website and in a press release.

Deadline for submissions: 15 January 2012.

Applications are submitted through the website

ClustTour By CERTH-ITI

ClustTour is a better way to search, discover and browse interesting city areas, POIs and events. Whether you are planning a trip or just want to check out what a place looks like, ClustTour offers a large collection of photos, maps and descriptions. You may wonder, what's new in this? ClustTour is not based on "official" guides and "experts" but on how people like you capture what is interesting every day.

Key Features:
* Explore over 30 cities with new cities added regularly.
* Browse and search in areas, spots, POIs and events views.
* See groups of photos, descriptions and where they are on the map.
* The map interactively groups spots for easier browsing. Zoom in and out to see more or less.
* Search the Web with a single tap to learn more about the spot you browse.
* Mark your favorites for easy review and sharing.
* Enjoy the most interesting spots in every city!

How it works:
ClustTour applies advanced analysis technologies to large-scale anonymous user contributions to discover the most interesting locations and events. These range from famous landmarks, such as Thessaloniki's White Tower, to events and activities like music concerts, shopping, bookcrossing and partying, but also off-the-beaten-path spots, from hidden eateries to local artists' collections. Every city area, POI and event is connected to informative photo groups, descriptions and search options for when you want to learn more.
Everything in ClustTour is automatic. And when we say everything we mean everything! There is no human intervention in selecting areas, spots, events and photos. Well-tuned algorithms select what is most interesting based on people's contributions. As a result, we might sometimes miss well-known landmarks (in any case, you know where to find these), and some of the spots may not fully make sense, but this way we find the hidden gems and we are sure that what ClustTour offers is what people think, not what officials and experts think!

Wednesday, December 21, 2011

IBM Next 5 in 5: 2011

IBM unveils its sixth annual "Next 5 in 5" -- a list of innovations with the potential to change the way people work, live and play over the next five years. The Next 5 in 5 is based on market and societal trends expected to transform our lives, as well as emerging technologies from IBM's Labs around the world that can make these innovations possible.

In this installment: you will be able to power your home with the energy you create yourself; you will never need a password again; mind reading is no longer science fiction; the digital divide will cease to exist; and junk mail will become priority mail.

Tuesday, December 20, 2011

Vision Algorithm to Automatically Parse Addresses in Google Street View

Reprinted from

Extracting Door Numbers: Building numbers come in a huge variety of shapes, colors and sizes, making them difficult to model for machine vision. (Netzer et al.)

Extrapolating numbers and letters from digital images is still a tough task, even for the best computer programmers. But it would be handy to extract business names, or graffiti, or an address from pictures that are already stored online. Aiming to make its Street View service even more accurate, Google would like to extract your house number from its own Street View photo cache.

Say what you will about Street View (and Helicopter View, Amazon View and so on): beyond the novelty factor, the images are full of potentially useful data. Using street numbers on the side of a house or business could make navigation programs more accurate, and help motorists or pedestrians find the right door by providing a preview on the Internet or a mobile device. But while handwriting algorithms are pretty advanced, software systems are still limited in their ability to extract information from images. Factors like blurring, lighting and other distortions can be a problem.

To improve matters, researchers at Google and Stanford devised a new feature-learning algorithm for a set of street numbers captured from a Street View database. They used 600,000 images from various countries and extracted the house numbers using a basic visual algorithm. Then the team used Amazon’s Mechanical Turk system to verify the arrangement of the numbers. The result was two sets of images: One with house number images as they appeared, and one with house numbers all resized to the same resolution.

Initially, traditional handcrafted visual learning algorithms didn’t work very well to extract the numbers. Next, the Google-Stanford team tried feature learning algorithms, which use various sets of parameters to learn recognition patterns. The new feature learning methods worked much better than the regular visual learning method: One of the algorithms (a K-means-based feature learning system) achieved 90 percent accuracy, compared to 98 percent for a human.
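To give a feel for what a K-means-based feature learner does, here is a minimal sketch in plain Python. It is not the Google-Stanford code: the article gives no implementation details, so the function names, the tiny toy "patches", and the "triangle" encoding (activation = mean distance to all centroids minus distance to this centroid, clipped at zero, a scheme known from the K-means feature learning literature) are all assumptions chosen for illustration.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm; returns k centroids (the learned 'dictionary')."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: dist(p, centroids[j]))
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster went empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def encode(patch, centroids):
    """Triangle encoding (assumed here): max(0, mean distance - distance)."""
    d = [dist(patch, c) for c in centroids]
    mean_d = sum(d) / len(d)
    return [max(0.0, mean_d - dj) for dj in d]


# Hypothetical 2-pixel "patches": dark ones and bright ones.
patches = [[0.0, 0.1], [0.1, 0.0], [0.1, 0.1], [0.9, 1.0], [1.0, 0.9], [1.0, 1.0]]
centroids = kmeans(patches, k=2)
features = encode([0.95, 0.95], centroids)
```

The resulting feature vector, one value per centroid, would then be fed to an ordinary classifier; in a real system the patches would be small pixel windows cut from the house-number images.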

The system still needs improvement, but it could be useful for extracting number data from billions of images, the researchers say. Ultimately, this could make Street View a lot more accurate. Without a house-number-based view, an address in Street View is a default panorama, which might not actually be the address you want. Type in “30 Rockefeller Plaza,” for instance, and the first thing you see is a chocolatier next to the 30 Rock observation deck. You have to click and drag to see the NBC building.

“With the house number-based view angle, the user will be led to the desired address immediately, without any further interaction needed,” the paper authors explain.


ICMR 2012

Effectively and efficiently retrieving information based on user needs is one of the most exciting areas in multimedia research. The Annual ACM International Conference on Multimedia Retrieval (ICMR) offers a great opportunity for exchanging leading-edge multimedia retrieval ideas among researchers, practitioners and other potential users of multimedia retrieval systems. This conference, which brings together the long-standing experience of the former ACM CIVR and ACM MIR series, is set up to illuminate the state of the art in multimedia (text, image, video and audio) retrieval.

ACM ICMR 2012 is soliciting original high-quality papers addressing challenging issues in the broad field of multimedia retrieval. See the call-for-papers page for a detailed list of topics of interest.

Paper Submission Deadline: January 15, 2012

Monday, December 19, 2011

sFly: Search and Rescue of Victims in GPS-Denied Environments (no Vicon, no Laser, No GPS!)

Final demo of the sFly European Project (2009-2011). This demo simulates a search and rescue mission in an outdoor GPS-denied factory environment. Neither laser nor GPS is used for navigation and mapping, just a single camera and an IMU onboard each vehicle. All the processing runs onboard, on a Core2Duo.
Three quadrocopters take off using visual SLAM, explore the environment, build a 3D map of it, and use it to localize themselves and potential victims by measuring the signal strengths of WiseNodes carried by each victim.

MPEG news: a report from the 98th meeting, Geneva, Switzerland

Re-Print from (Author: Christian Timmerer)

MPEG news from its 98th meeting in Geneva, Switzerland, in fewer than 140 characters and with a lot of acronyms. The official press release is, as usual, here. As you can see from the press release, MPEG produced significant results, namely:

  • MPEG Dynamic Adaptive Streaming over HTTP (DASH) ratified
  • 3D Video Coding: Evaluation of responses to Call for Proposals
  • MPEG royalty free video coding: Internet Video Coding (IVC) + Web Video Coding (WebVC)
  • High Efficiency Coding and Media Delivery in Heterogeneous Environments: MPEG-H comprising MMT, HEVC, 3DAC
  • Compact Descriptors for Visual Search (CDVS): Evaluation of responses to the Call for Proposals
  • Call for requirements: Multimedia Preservation Description Information (MPDI)
  • MPEG Augmented Reality (AR)

As you can see, that is a long list of achievements for a single meeting, but let's dig in. For each topic I've also tried to provide some research issues which I think are worth investigating both inside and outside MPEG.

MPEG Dynamic Adaptive Streaming over HTTP (DASH): DASH=IS ✔

As the official press release states, MPEG has ratified its draft standard for DASH and, even better, the standard should become publicly available, which I expect to happen early next year, around March 2012 or maybe earlier. I say "should" because there is no guarantee that this will actually happen, but the signs are good. In the meantime, feel free to use our software to play around; we expect to update it to the latest version of the standard as soon as possible. Finally, IEEE Computer Society Computing Now has put together a theme on Video for the Universal Web featuring DASH.
Research issues: performance, bandwidth estimation, request scheduling (aka adaptation logic), and Quality of Service/Experience.
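To make the "bandwidth estimation" and "adaptation logic" research issues a bit more concrete, here is a minimal sketch of a throughput-based rate selector. The DASH standard deliberately leaves the adaptation logic to the client, so everything below (the function names, the smoothing factor `alpha`, the `safety` margin and the example bitrate ladder) is a hypothetical illustration, not something taken from the standard or from any particular player.

```python
def update_estimate(previous_kbps, sample_kbps, alpha=0.8):
    """Exponentially weighted moving average of measured segment throughput."""
    if previous_kbps is None:  # first measurement: nothing to smooth against
        return sample_kbps
    return alpha * previous_kbps + (1 - alpha) * sample_kbps


def choose_representation(bitrates_kbps, estimate_kbps, safety=0.9):
    """Pick the highest bitrate that fits within a safety margin of the estimate.

    Falls back to the lowest available bitrate when nothing fits.
    """
    budget = safety * estimate_kbps
    candidates = [b for b in sorted(bitrates_kbps) if b <= budget]
    return candidates[-1] if candidates else min(bitrates_kbps)


# Hypothetical bitrate ladder, as it might be advertised in a manifest.
ladder = [500, 1000, 2000, 4000]
estimate = update_estimate(None, 2500)      # first segment measured at 2500 kbit/s
next_bitrate = choose_representation(ladder, estimate)
```

With a 2500 kbit/s estimate and a 10% safety margin the client would request the 2000 kbit/s representation for the next segment. Real clients typically also take the playout buffer level into account, which this sketch ignores.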

3D Video Coding: 3DVC=CfP eval ✔

MPEG evaluated more than 20 proposals submitted in response to the call issued back in April 2011. The evaluation of the proposals comprised subjective quality assessments conducted by 13 highly qualified test laboratories distributed around the world and coordinated by the COST Action IC1003 QUALINET. The report of the subjective test results from the call for proposals on 3D video coding will be available by the end of this week. MPEG documented the standardization tracks considered in 3DVC (i.e., compatible with MVC, AVC base-view, HEVC, ...) and agreed on a common software based on the best-performing proposals.
Research issues: encoding efficiency of 3D depth maps and compatibility for the various target formats (AVC, MVC, HEVC) as well as depth map estimation at the client side.

MPEG royalty free video coding: IVC vs. WebVC

In addition to the evaluation of the responses to the call for 3DVC, MPEG also evaluated the responses to the Internet Video Coding call. Based on the responses, MPEG decided to follow up with two approaches, namely Internet Video Coding (IVC) and Web Video Coding (WebVC). The former, IVC, is based on MPEG-1 technology, which is assumed to be royalty-free. However, it requires some performance boosts in order to make it ready for the Internet. MPEG's approach is a common platform called Internet video coding Test Model (ITM), which serves as the basis for further improvements. The latter, WebVC, is based on the AVC constrained baseline profile, whose performance is well known and satisfactory but, unfortunately, it is not clear which patents of the AVC patent pool apply to this profile. Hence, a working draft (WD) of WebVC will be provided (also publicly available) in order to get patent statements from companies. The WD will be publicly available by December 19th.
Further information:

Research issues: coding efficiency using only royalty-free coding tools, where the optimization targets royalty-freeness first and efficiency second.


A new star is born: MPEG-H, short for "High Efficiency Coding and Media Delivery in Heterogeneous Environments", comprising three parts: Pt. 1 MMT, Pt. 2 HEVC, Pt. 3 3D Audio. There's a document called context and objective of MPEG-H, but I can't find out whether it's public (I'll come back to this later).

Part 1: MMT (MPEG Media Transport) is progressing (slowly), but the next step should definitely be to check the relationship between MMT and DASH, for which an Ad-hoc Group has been established (N12395); subscribe here if interested.
Research issues: very general at the moment, what is the best delivery method (incl. formats) for future multimedia applications? Answer: It depends, ... ;-)
Part 2: HEVC (High-Efficiency Video Coding) made significant progress at the last meeting, in particular: only one entropy coder (note: AVC has two, CABAC and CAVLC which are supported in different profiles), 8 bit decoding (could be also 10 bit, probably done in some profiles), specific integer transform, stabilized and more complete high-level syntax and HRD description (i.e., reference picture buffering, tiles, slices, and wavefronts enabling parallel decoding process). Finally, a prototype has been demonstrated decoding HEVC in software on an iPad 2 at WVGA resolution and the 10min Big Buck Bunny sequence at SD resolution with avg. 800 kbit/s which clearly outperformed the corresponding AVC versions.
Research issues: well, coding efficiency, what else? The ultimate goal is a performance gain of more than 50% compared to its predecessor, AVC.

Part 3: 3D Audio Coding (3DAC) is in its early stages, but there will be an event during the San Jose meeting which will be announced here. As of now, use cases are provided (home theatre, personal TV, smartphone TV, multichannel TV) as well as candidate requirements and evaluation methods. One important aspect seems to be user experience for highly immersive audio (i.e., 22.2, 10.2, 5.1), including bitstream adaptation for low bandwidth and low complexity.
Research issues: sorry, I'm not really an audio guy but I assume it's coding efficiency, specifically for 22.2 channels ;-)

Compact Descriptors for Visual Search (CDVS)

For CDVS, responses to the call for proposals (from 10 companies/institutions) have been evaluated and a test model has been established based on the best-performing proposals. The next steps include improving the test model towards inclusion in the MPEG-7 standard.
Research issues: descriptor efficiency for the intended application as well as precision on the information retrieval results.

Multimedia Preservation Description Information (MPDI)

The aim of this new work item is to provide "standard technology helping users to preserve digital multimedia that is used in many different domains, including cultural heritage, scientific research, engineering, education and training, entertainment, and fine arts for long-term across system, organizational, administrative and generational boundaries". It comes along with two public documents, the current requirements and a call for requirements, which are due at the 100th MPEG meeting in April 2012.
Research issues: What and how to preserve digital multimedia information?

Augmented Reality (AR)

MPEG's newest project is on Augmented Reality (AR), starting with an application format for which a working draft exists. Furthermore, draft requirements and use cases are available. These three documents will be available on Dec 31st.
Research issues: N/A


Saturday, December 17, 2011

Free Access to selected highly cited papers from IJPRAI

World Scientific announces free access to selected highly cited papers in the International Journal of Pattern Recognition and Artificial Intelligence (IJPRAI). This is valid until 31 December 2011, so do take advantage of the access today!


ACRI 2012 Santorini Island, Greece, 24-27 September 2012

Cellular automata (CA) present a very powerful approach to the study of spatio-temporal systems where complex phenomena build up out of many simple local interactions. They often account for real phenomena or solutions to problems whose high complexity could hardly be formalised in other contexts.

Furthermore, the parallelism and locality of CA allow straightforward parallelisation and therefore immediate implementation on parallel computing resources. These characteristics of CA research have resulted in the formation of interdisciplinary research teams. These teams produce remarkable research results and attract scientists from different fields.
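As a small illustration of "simple local interactions" (not part of the call for papers), here is a sketch of a one-dimensional elementary cellular automaton. The function name, the wrap-around boundary and the choice of rule 90 are assumptions made for the example.

```python
def step(cells, rule=90):
    """Advance a 1D binary cellular automaton by one generation.

    Each new cell depends only on its left neighbour, itself and its right
    neighbour (with wrap-around); the 8-bit rule number is the lookup table.
    """
    n = len(cells)
    nxt = []
    for i in range(n):
        left, centre, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (centre << 1) | right
        nxt.append((rule >> index) & 1)
    return nxt


row = [0, 0, 1, 0, 0]
following = step(row)  # rule 90: each cell becomes left XOR right
```

Because every cell is updated from its immediate neighbourhood only, all cells of a generation can be computed independently, which is exactly why CA map so easily onto parallel hardware.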

The main goal of ACRI 2012 (Cellular Automata for Research and Industry), the 10th edition of the conference, is to offer both scientists and engineers in academia and industry an opportunity to express and discuss their views on current trends, challenges, and state-of-the-art solutions to various problems in the fields of arts, biology, chemistry, communication, cultural heritage, ecology, economy, geology, engineering, medicine, physics, sociology, traffic control, etc.

Topics of either theoretical or applied interest about CA and CA-based models and systems include but are not limited to:

  • Algebraic properties and generalization
  • Complex systems
  • Computational complexity
  • Dynamical systems
  • Hardware circuits, architectures, systems and applications
  • Modeling of biological systems
  • Modeling of physical or chemical systems
  • Modeling of ecological and environmental systems
  • Image Processing and pattern recognition
  • Natural Computing
  • Quantum Cellular Automata
  • Parallelism

This edition of the ACRI conference also hosts workshops on recent and important research topics on theory and applications of Cellular Automata like the following: Crowds and Cellular Automata (3rd edition), Asynchronicity and Traffic and Cellular Automata.

Sunday, December 4, 2011

Terrain Surveillance Coverage using Cognitive Adaptive Optimization

A centralized cognitive-based adaptive methodology for optimal surveillance coverage using swarms of MAVs has been developed, mathematically analyzed and tested using extensive simulation experiments. The methodology has been successfully tested on a large variety of 2D and 3D non-convex unknown environments. Moreover, the methodology has been applied using real data from the Birmensdorf test area and the ETHZ’s hospital area. A decentralized version of the methodology has also been proposed and evaluated for the 2D case.


Wednesday, November 30, 2011

Mathias Lux @ ACM MULTIMEDIA 2011

Content based image retrieval with LIRe
You can also download the poster HERE

Thursday, November 17, 2011

Google Scholar Citations Open To All

Article from

A few months ago, we introduced a limited release of Google Scholar Citations, a simple way for authors to compute their citation metrics and track them over time. Today, we’re delighted to make this service available to everyone! Click here and follow the instructions to get started.
Here’s how it works. You can quickly identify which articles are yours, by selecting one or more groups of articles that are computed statistically. Then, we collect citations to your articles, graph them over time, and compute your citation metrics - the widely used h-index; the i-10 index, which is simply the number of articles with at least ten citations; and, of course, the total number of citations to your articles. Each metric is computed over all citations and also over citations in articles published in the last five years.
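For readers who want to check the numbers themselves, the three metrics described above can be sketched in a few lines. This is only an illustration of the definitions given in the article, not Google's implementation; the function names and the sample citation counts are made up.

```python
def h_index(citations):
    """Largest h such that h articles have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def i10_index(citations):
    """Number of articles with at least ten citations."""
    return sum(1 for c in citations if c >= 10)


# Hypothetical citation counts, one entry per article.
counts = [25, 12, 10, 4, 3, 0]
metrics = {
    "h-index": h_index(counts),
    "i10-index": i10_index(counts),
    "total citations": sum(counts),
}
```

For the sample profile this gives an h-index of 4, an i10-index of 3 and 54 citations in total. The "last five years" variants would apply the same functions to counts restricted to recent citing articles.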
Your citation metrics will update automatically as we find new citations to your articles on the web. You can also set up automated updates for the list of your articles, or you can choose to review the suggested updates. And you can, of course, manually update your profile by adding missing articles, fixing bibliographic errors, and merging duplicate entries.
As one would expect, you can search for profiles of colleagues, co-authors, or other researchers using their name, affiliation, or areas of interest, e.g., researchers at US universities or researchers interested in genomics. You can add links to your co-authors, if they already have a profile, or you can invite them to create one.
You can also make your profile public, e.g., Alex Verstak, Anurag Acharya. If you choose to make your profile public, it can appear in Google Scholar search results when someone searches for your name, e.g., [alex verstak]. This will make it easier for your colleagues worldwide to follow your work.
We would like to thank the participants in the limited release of Scholar Citations for their detailed feedback. They were generous with their time and patient with an early version. Their feedback greatly helped us improve the service. The key challenge was to make profile maintenance as hands-free as possible for those of you who prefer the convenience of automated updates, while providing as much flexibility as possible for those who prefer to curate their profile themselves.
Here is hoping that Google Scholar Citations will help researchers everywhere view and track the worldwide influence of their own and their colleagues’ work.

Monday, November 14, 2011



The ColorHug is an open source display colorimeter. It allows you to calibrate your screen for accurate color matching.

The ColorHug is a small accessory that measures displayed colors very accurately. It is held on your display and plugged into a spare USB port on the computer for the duration of the calibration.

Have you ever taken a photo and wondered why it does not look the same on your screen as it did on the camera?

It's probably because the LCD display on your computer has never been calibrated. This means colors can look washed-out, tinted with certain shades or with different color casts.

About 2 years ago I began working on color management in Linux. It soon became apparent that there was no integrated color management system. The color management support which did exist was often disabled by default in many applications. I have worked hard to make calibrating displays easy ever since. It is my goal to make color management accessible to end users. The hardware for color managing screens was bulky, slow and expensive. With a background in electronics, I thought I could create a device which was smaller, faster and cheaper.

Using the ColorHug it takes about a minute to take several hundred measurements from which the client software creates an ICC color profile. This color profile file can then be saved and used to make colors look correct on your monitor.

Face Recognition Makes the Leap From Sci-Fi


FACIAL recognition technology is a staple of sci-fi thrillers like “Minority Report.”

But of bars in Chicago?

SceneTap, a new app for smart phones, uses cameras with facial detection software to scout bar scenes. Without identifying specific bar patrons, it posts information like the average age of a crowd and the ratio of men to women, helping bar-hoppers decide where to go. More than 50 bars in Chicago participate.

As SceneTap suggests, techniques like facial detection, which perceives human faces but does not identify specific individuals, and facial recognition, which does identify individuals, are poised to become the next big thing for personalized marketing and smart phones. That is great news for companies that want to tailor services to customers, and not so great news for people who cherish their privacy. The spread of such technology — essentially, the democratization of surveillance — may herald the end of anonymity.

And this technology is spreading. Immersive Labs, a company in Manhattan, has developed software for digital billboards using cameras to gauge the age range, sex and attention level of a passer-by. The smart signs, scheduled to roll out this month in Los Angeles, San Francisco and New York, deliver ads based on consumers’ demographics. In other words, the system is smart enough to display, say, a Gillette ad to a male passer-by rather than an ad for Tampax.

Those endeavors pale next to the photo-tagging suggestion tool introduced by Facebook this year. When a person uploads photos to the site, the “Tag Suggestions” feature uses facial recognition to identify that user’s friends in those photos and automatically suggests name tags for them. It’s a neat trick that frees people from the cumbersome task of repeatedly typing the same friends’ names into their photo albums.

“Millions of people are using it to add hundreds of millions of tags,” says Simon Axten, a Facebook spokesman. Other well-known programs like Picasa, the photo editing software from Google, and third-party apps like PhotoTagger, from, work similarly.

But facial recognition is proliferating so quickly that some regulators in the United States and Europe are playing catch-up. On the one hand, they say, the technology has great business potential. On the other, because facial recognition works by analyzing and storing people’s unique facial measurements, it also entails serious privacy risks.

Using off-the-shelf facial recognition software, researchers at Carnegie Mellon University were recently able to identify about a third of college students who had volunteered to be photographed for a study — just by comparing photos of those anonymous students to images publicly available on Facebook. By using other public information, the researchers also identified the interests and predicted partial Social Security numbers of some students.

“It’s a future where anonymity can no longer be taken for granted, even when we are in a public space surrounded by strangers,” says Alessandro Acquisti, an associate professor of information technology and public policy at Carnegie Mellon who directed the studies. If his team could so easily “infer sensitive personal information,” he says, marketers could someday use more invasive techniques to identify random people on the street along with, say, their credit scores.

Today, facial detection software, which can perceive human faces but not identify specific people, seems benign.

Some video chat sites are using software from, an Israeli company, to make sure that participants are displaying their faces, not other body parts, says Gil Hirsch, the chief executive of The software also has retail uses, like virtually trying out eyeglasses at, and entertainment applications, like, a site that adds a handle bar mustache to a face in a photo.

But privacy advocates worry about more intrusive situations.

Now, for example, advertising billboards that use facial detection might detect a young adult male and show him an ad for, say, Axe deodorant. Companies that make such software, like Immersive Labs, say their systems store no images or data about passers-by nor do they analyze their emotions.

But what if the next generation of mall billboards could analyze skin quality and then publicly display an ad for acne cream, or detect sadness and serve up an ad for antidepressants?


Monday, November 7, 2011

3D Photo Ring

Is the default ‘Photos’ app on the iPhone too limiting, too boring and not convenient enough for your needs? Do you want to use a novel and amazing 3D interface for browsing, searching, and presenting your photos with the iPhone?

Photo Ring turns your iPhone into a convenient 3D photo browser with a stunning interface that enables you to keep track of hundreds of photos at a glance. Moreover, its color sorting technology saves you time, and its 3D slideshow feature allows for an automated presentation of your photos. Thanks to its innovative and natural 3D arrangement and powerful color-based organization feature, searching for photos on your iPhone and showing them to your friends becomes an exciting and fun task!


  • Innovative and intuitive 3D browsing interface (zoomable 3D Ring)
  • Interactive 3D slideshow (animated 3D Wall; with pause/fast-forward/reverse feature)
  • Convenient interaction (e.g., kinetic ring rotation by wipe or tilt)
  • Sorting of photos by recording time
  • Sorting of photos by color
  • Inspection of EXIF/TIFF metadata of photos
  • Browsing of photos from different folders/events
  • Fullscreen photo mode with convenient switching function

PhotoRing is available for iPhone and iPad!

Important note (Nov 5, 2011): if your iPhone/iPad doesn’t run iOS 5 already, please wait until our next update (v2.2, available in a few days), which will fix a serious bug that only occurs for older iOS versions and prevents the app from loading your photos.


Saturday, November 5, 2011

The 6th FTRA International Conference on Multimedia and Ubiquitous Engineering (MUE 2012)
Madrid, Spain, 10-13 July 2012
Papers due: January 15, 2012
Notification: March 15, 2012
Camera ready: April 15, 2012
Conference dates: July 10-13, 2012

The new multimedia standards (for example, MPEG-21) facilitate the seamless integration of multiple modalities into interoperable multimedia frameworks, transforming the way people work and interact with multimedia data. These key technologies and multimedia solutions interact and collaborate with each other in increasingly effective ways, contributing to the multimedia revolution and having a significant impact across a wide spectrum of consumer, business, healthcare, education, and governmental domains. Moreover, emerging mobile computing and ubiquitous networking technologies enable users to access broadband mobile applications and new services anytime and anywhere. Continuous efforts have been dedicated to research and development in this wide area, including wireless mobile networks, ad-hoc and sensor networks, smart user devices and advanced sensor devices, mobile and ubiquitous computing platforms, and new applications and services including location-based, context-aware, or social networking services.
This conference provides an opportunity for academic and industry professionals to discuss recent progress in the area of multimedia and ubiquitous environments, including models and systems, new directions, and novel applications associated with the utilization and acceptance of ubiquitous computing devices and systems. MUE 2012 is the next event in the highly successful series of International Conferences on Multimedia and Ubiquitous Engineering: MUE-11 (Loutraki, Greece, June 2011), MUE-10 (Cebu, Philippines, August 2010), MUE-09 (Qingdao, China, June 2009), MUE-08 (Busan, Korea, April 2008), and MUE-07 (Seoul, Korea, April 2007).

Topics of interest
* Ubiquitous Computing and Technology
* Context-Aware Ubiquitous Computing
* Parallel/Distributed/Grid Computing
* Novel Machine Architectures
* Semantic Web and Knowledge Grid
* Smart Home and Generic Interfaces
* AI and Soft Computing in Multimedia
* Computer Graphics and Simulation
* Multimedia Information Retrieval (images, videos, hypertexts, etc.)
* Internet Multimedia Mining
* Medical Image and Signal Processing
* Multimedia Indexing and Compression
* Virtual Reality and Game Technology
* Current Challenges in Multimedia
* Protocols for Ubiquitous Services
* Ubiquitous Database Methodologies
* Ubiquitous Application Interfaces
* IPv6 Foundations and Applications
* Smart Home Network Middleware
* Ubiquitous Sensor Networks / RFID
* U-Commerce and Other Applications
* Databases and Data Mining
* Multimedia RDBMS Platforms
* Multimedia in Telemedicine
* Multimedia Embedded Systems
* Multimedia Network Transmission/Streaming
* Entertainment Industry
* E-Commerce and E-Learning
* Novel Multimedia Applications
* Computer Graphics
* Security in Commerce and Industry
* Security in Ubiquitous Databases
* Key Management and Authentication
* Privacy in Ubiquitous Environment
* Sensor Networks and RFID Security
* Multimedia Information Security
* Forensics and Image Watermarking
* Cyber Security
* Intrusion detection
* Biometric Security
* New developments in handheld and mobile information appliances
* New paradigms: mobile cloud, personal networks, social and crowd computing, etc
* Operating systems aspects for personal mobile devices
* New technological advances for personal mobile devices
* End-user interface issues in the design and use of personal technologies
* Enabling technologies for personal multimedia and ubiquitous computing
* Multimedia applications and techniques for personal computing devices
* Usage of personal devices for on-line learning

Submissions should not exceed 8 pages in IEEE CS proceedings paper format, including
tables and figures. All paper submissions must represent original and unpublished work.
Submission of a paper should be regarded as an undertaking that, should the paper be
accepted, at least one of the authors will register for the conference and present the
work. Submissions will be conducted electronically on the conference website.

Wednesday, October 26, 2011

Nano-springs make transparent, super-stretchy skin-like sensors

Article from:


When the nanotubes are airbrushed onto the silicone, they tend to land in randomly oriented little clumps. When the silicone is stretched, some of the "nano-bundles" get pulled into alignment in the direction of the stretching.

When the silicone is released, it rebounds back to its original dimensions, but the nanotubes buckle and form little nanostructures that look like springs.

"After we have done this kind of pre-stretching to the nanotubes, they behave like springs and can be stretched again and again, without any permanent change in shape," Bao said.

Stretching the nanotube-coated silicone a second time, in the direction perpendicular to the first direction, causes some of the other nanotube bundles to align in the second direction. That makes the sensor completely stretchable in all directions, with total rebounding afterward.

Additionally, after the initial stretching to produce the "nano-springs," repeated stretching below the length of the initial stretch does not change the electrical conductivity significantly, Bao said. Maintaining the same conductivity in both the stretched and unstretched forms is important because the sensors detect and measure the force being applied to them through these spring-like nanostructures, which serve as electrodes.

The sensors consist of two layers of the nanotube-coated silicone, oriented so that the coatings are face-to-face, with a layer of a more easily deformed type of silicone between them.

The middle layer of silicone stores electrical charge, much like a battery. When pressure is exerted on the sensor, the middle layer of silicone compresses, which alters the amount of electrical charge it can store. That change is detected by the two films of carbon nanotubes, which act like the positive and negative terminals on a typical automobile or flashlight battery.
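The relationship described above, where squeezing the middle layer changes how much charge it can store, can be illustrated with the textbook parallel-plate capacitor formula C = eps0 * eps_r * A / d. This is only a sketch under that idealized assumption: the article does not give the sensor's geometry or material constants, so the electrode area, layer thickness and relative permittivity used below are hypothetical values chosen for illustration.

```python
# Idealized parallel-plate model: C = eps0 * eps_r * area / gap.
# All numeric inputs are hypothetical; the article gives no device dimensions.
EPS0 = 8.854e-12  # vacuum permittivity in F/m


def capacitance(area_m2, gap_m, eps_r):
    """Capacitance in farads of an ideal parallel-plate capacitor."""
    return EPS0 * eps_r * area_m2 / gap_m


def relative_change(area_m2, gap_rest_m, gap_pressed_m, eps_r):
    """Fractional capacitance change when the gap shrinks under pressure."""
    c_rest = capacitance(area_m2, gap_rest_m, eps_r)
    c_pressed = capacitance(area_m2, gap_pressed_m, eps_r)
    return (c_pressed - c_rest) / c_rest


if __name__ == "__main__":
    # Assumed: 1 cm^2 electrodes, a 100 um layer squeezed to 80 um, eps_r = 3.
    change = relative_change(1e-4, 100e-6, 80e-6, 3.0)
    print(f"capacitance rises by {change:.0%}")
```

In this simplified model the capacitance depends only on the gap for a fixed area, which is why both compression and stretching (each bringing the films closer together) would register as an increase.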

The change sensed by the nanotube films is what enables the sensor to transmit what it is "feeling." Whether the sensor is being compressed or extended, the two nanofilms are brought closer together, which seems like it might make it difficult to detect which type of deformation is happening. But Lipomi said it should be possible to detect the difference by the pattern of pressure.

Using carbon nanotubes bent to act as springs, Stanford researchers have developed a stretchable, transparent skin-like sensor. The sensor can be stretched to more than twice its original length and bounce back perfectly to its original shape. It can sense pressure from a firm pinch to thousands of pounds. The sensor could have applications in prosthetic limbs, robotics and touch-sensitive computer displays. Darren Lipomi, a postdoctoral researcher in Chemical Engineering and Zhenan Bao, associate professor in Chemical Engineering, explain their work.

(Photo Credit: Steve Fyffe, Stanford News Service)

With compression, you would expect to see sort of a bull's-eye pattern, with the greatest deformation at the center and decreasing deformation as you go farther from the center.

"If the device was gripped by two opposing pincers and stretched, the greatest deformation would be along the straight line between the two pincers," Lipomi said. Deformation would decrease as you moved farther away from the line.

Bao's research group previously created a sensor so sensitive to pressure that it could detect pressures "well below the pressure exerted by a 20 milligram bluebottle fly carcass" that the researchers tested it with. This latest sensor is not quite that sensitive, she said, but that is because the researchers were focused on making it stretchable and transparent.

"We did not spend very much time trying to optimize the sensitivity aspect on this sensor," Bao said.

"But the previous concept can be applied here. We just need to make some modifications to the surface of the electrode so that we can have that same sensitivity."


Artificial intelligence community mourns John McCarthy

Article from

Artificial intelligence researcher John McCarthy has died. He was 84.

The American scientist invented the computer language LISP.

It went on to become the programming language of choice for the AI community, and is still used today.

Professor McCarthy is also credited with coining the term "Artificial Intelligence" in 1955 when he detailed plans for the first Dartmouth conference. The brainstorming sessions helped focus early AI research.

Prof McCarthy's proposal for the event put forward the idea that "every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it".

The conference, which took place in the summer of 1956, brought together experts in language, sensory input, learning machines and other fields to discuss the potential of information technology.

Other AI experts describe it as a critical moment.

"John McCarthy was foundational in the creation of the discipline Artificial Intelligence," said Noel Sharkey, Professor of Artificial Intelligence at the University of Sheffield.

"His contribution in naming the subject and organising the Dartmouth conference still resonates today."


Prof McCarthy devised LISP at Massachusetts Institute of Technology (MIT), which he detailed in an influential paper in 1960.

The computer language used symbolic expressions, rather than numbers, and was widely adopted by other researchers because it gave them the ability to be more creative.

"The invention of LISP was a landmark in AI, enabling AI programs to be easily read for the first time," said Prof David Bree, from the Turin-based Institute for Scientific Interchange.

"It remained the AI language, especially in North America, for many years and had no major competitor until Edinburgh developed Prolog."


In 1971 Prof McCarthy was awarded the Turing Award from the Association for Computing Machinery in recognition of his importance to the field.

He later admitted that the lecture he gave to mark the occasion was "over-ambitious", and he was unhappy with the way he had set out his new ideas about how commonsense knowledge could be coded into computer programs.

However, he revisited the topic in later lectures and went on to win the National Medal of Science in 1991.

After retiring in 2000, Prof McCarthy remained Professor Emeritus of Computer Science at Stanford University, and maintained a website where he gathered his ideas about the future of robots, the sustainability of human progress and some of his science fiction writing.

"John McCarthy's main contribution to AI was his founding of the field of knowledge representation and reasoning, which was the main focus of his research over the last 50 years," said Prof Sharkey.

"He believed that this was the best approach to developing intelligent machines and was disappointed by the way the field seemed to have turned into high speed search on very large databases."

Prof Sharkey added that Prof McCarthy wished he had called the discipline Computational Intelligence, rather than AI. However, he said he recognised his choice had probably attracted more people to the subject.


Tuesday, October 25, 2011

Throwable Camera Creates 360-Degree Panoramic Images

Article from

Are you, like so many others, tired of all those old-fashioned cameras you have to hold in order to take pictures? Well here’s a camera you get to throw.

The Throwable Panoramic Ball Camera is a foam-padded ball studded with 36 fixed-focus, 2-megapixel mobile phone camera modules capable of taking a 360-degree panoramic photo.

You use the camera by throwing it directly in the air. When the camera reaches the apex — measured by an accelerometer in the camera — all 36 cameras automatically take a picture. These distinct pictures are then digitally stitched together and uploaded via USB where they are presented in a spherical panoramic viewer. This lets users interactively explore their photos including a zoom function.
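One plausible way to time the shot at the apex can be sketched as follows: integrate the accelerometer readings taken during the throw to estimate the launch velocity, then use simple ballistics to predict when the ball stops rising. This is an illustrative assumption, not the camera's documented firmware; the function names, the sampling interval and the sample values are all hypothetical, and air resistance is ignored.

```python
# Sketch of apex prediction from launch acceleration (hypothetical approach).
G = 9.81  # gravitational acceleration in m/s^2


def launch_velocity(accel_samples, dt):
    """Integrate net upward acceleration (m/s^2, gravity already removed)
    sampled every dt seconds while the hand is pushing the ball."""
    return sum(a * dt for a in accel_samples)


def time_to_apex(v0):
    """Seconds after release until vertical velocity reaches zero."""
    return v0 / G


def apex_height(v0):
    """Height in metres gained after release, ignoring air resistance."""
    return v0 * v0 / (2 * G)


if __name__ == "__main__":
    # Assumed throw: a constant 49.05 m/s^2 push lasting 0.2 s, sampled at 100 Hz.
    v0 = launch_velocity([49.05] * 20, 0.01)
    print(f"release speed {v0:.2f} m/s")
    print(f"trigger cameras after {time_to_apex(v0):.2f} s")
    print(f"expected height gain {apex_height(v0):.2f} m")
```

Triggering at the predicted apex matters because the ball is momentarily at its slowest there, which keeps motion blur low for fixed-focus modules.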



The results — as seen in the video above — are pretty darn impressive, but the Ball Camera is definitely not meant for shaky hands. Any spin on the ball when it’s thrown could distort the final image and you certainly wouldn’t want to drop the thing despite its 3D-printed foam padding. The 2-megapixel cameras are adequate but the quality drops as soon as users try to zoom in on distant elements. Besides, it looks a little difficult to fit the thing into a purse, let alone your pocket.

Right now, the Throwable Panoramic Ball Camera is not available to buy, though its creators have a patent pending. Cool idea, but is it practical? Would you ever buy a camera you could throw? Let us know in the comments.

Monday, October 24, 2011

Rendering Synthetic Objects into Legacy Photographs

Kevin Karsch, Varsha Hedau, David Forsyth, Derek Hoiem
To be presented at SIGGRAPH Asia 2011


We propose a method to realistically insert synthetic objects into existing photographs without requiring access to the scene or any additional scene measurements. With a single image and a small amount of annotation, our method creates a physical model of the scene that is suitable for realistically rendering synthetic objects with diffuse, specular, and even glowing materials while accounting for lighting interactions between the objects and the scene. We demonstrate in a user study that synthetic images produced by our method are confusable with real scenes, even for people who believe they are good at telling the difference. Further, our study shows that our method is competitive with other insertion methods while requiring less scene information. We also collected new illumination and reflectance datasets; renderings produced by our system compare well to ground truth. Our system has applications in the movie and gaming industry, as well as home decorating and user content creation, among others.

Top 10 ACM SIGMM Downloads

Here we present the most downloaded ACM SIGMM articles from the ACM Digital Library, from July 2010 to June 2011. We hope that this list gives much-deserved exposure to ACM SIGMM's best articles.

  1. Guo-Jun Qi, Xian-Sheng Hua, Yong Rui, Jinhui Tang, Tao Mei, Meng Wang, Hong-Jiang Zhang. Correlative multilabel video annotation with temporal kernels. In ACM Trans. Multimedia Comput. Commun. Appl. 5(1), 2008
  2. Michael S. Lew, Nicu Sebe, Chabane Djeraba, and Ramesh Jain. Content-based multimedia information retrieval: State of the art and challenges. In ACM Trans. Multimedia Comput. Commun. Appl. 2(1), 2006
  3. Ba Tu Truong, Svetha Venkatesh. Video abstraction: A systematic review and classification. In ACM Trans. Multimedia Comput. Commun. Appl. 3(1), 2007
  4. Yu-Fei Ma, Hong-Jiang Zhang. Contrast-based image attention analysis by using fuzzy growing. In ACM Multimedia 2003
  5. Simon Tong and Edward Chang. Support vector machine active learning for image retrieval. In ACM Multimedia 2001
  6. J.-P. Courtiat, R. Cruz de Oliveira, L. F. Rust da Costa Carmo. Towards a new multimedia synchronization mechanism and its formal definition. In ACM Multimedia 1994
  7. Gabriel Takacs, Vijay Chandrasekhar, Natasha Gelfand, Yingen Xiong, Wei-Chao Chen, Thanos Bismpigiannis, Radek Grzeszczuk, Kari Pulli, Bernd Girod. Outdoors augmented reality on mobile phone using loxel-based visual feature organization. In ACM SIGMM MIR 2008
  8. Jiajun Bu, Shulong Tan, Chun Chen, Can Wang, Hao Wu, Lijun Zhang, Xiaofei He. Music recommendation by unified hypergraph: combining social media information and music content. In ACM Multimedia 2010
  9. Mathias Lux, Savvas A. Chatzichristofis. Lire: lucene image retrieval: an extensible java CBIR library. In ACM Multimedia 2008

Thursday, October 20, 2011

FaceLight – Silverlight 4 Real-Time Face Detection

This article describes a simple face detection method that searches for a skin-colored region of a certain size in a webcam snapshot. This technique is not as accurate as a professional computer vision library like OpenCV and the Haar-like features it uses, but it runs in real time and works for most webcam scenarios.
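The idea of looking for a skin-colored region can be sketched in a few lines. The example below is written in Python for brevity rather than Silverlight, and it is not the FaceLight code: the RGB thresholds are a commonly used rule-of-thumb heuristic, and `is_skin`, `skin_bounding_box` and the `min_pixels` parameter are hypothetical names introduced for illustration.

```python
# Minimal skin-color region finder (illustrative; thresholds are a heuristic).

def is_skin(r, g, b):
    """Rule-of-thumb RGB test for skin tones under ordinary lighting."""
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15 and r > g and r > b)


def skin_bounding_box(image, min_pixels=4):
    """Return (left, top, right, bottom) around all skin pixels, or None.

    image is a list of rows, each row a list of (r, g, b) tuples.
    Regions with fewer than min_pixels skin pixels are rejected as noise.
    """
    xs, ys = [], []
    for y, row in enumerate(image):
        for x, (r, g, b) in enumerate(row):
            if is_skin(r, g, b):
                xs.append(x)
                ys.append(y)
    if len(xs) < min_pixels:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


if __name__ == "__main__":
    blue, skin = (0, 0, 255), (200, 140, 120)
    frame = [[blue] * 4 for _ in range(4)]
    for y in (1, 2):
        for x in (1, 2):
            frame[y][x] = skin
    print(skin_bounding_box(frame))
```

A single bounding box over all skin pixels is the crudest possible grouping; it works when one face dominates the frame, which is the typical webcam case the post mentions.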

Friday, October 14, 2011


#include <iostream>
int main() {
    std::cout << "Goodbye Dennis Ritchie";
    return 0;
}

Tuesday, October 11, 2011

ACM International Conference on Multimedia Retrieval (ICMR) 2012

Effectively and efficiently retrieving information based on user needs is one of the most exciting areas in multimedia research. The Annual ACM International Conference on Multimedia Retrieval (ICMR) offers a great opportunity for exchanging leading-edge multimedia retrieval ideas among researchers, practitioners and other potential users of multimedia retrieval systems. This conference, which brings together the long-standing experience of the former ACM CIVR and ACM MIR series, is set up to illuminate the state of the art in multimedia (text, image, video and audio) retrieval.

ACM ICMR 2012 is soliciting original high quality papers addressing challenging issues in the broad field of multimedia retrieval.

Topics of Interest (not limited to)
• Content/semantic/affective based indexing and retrieval
• Large-scale and web-scale multimedia processing
• Integration of content, meta data and social network
• Scalable and distributed search
• User behavior and HCI issues in multimedia retrieval
• Advanced descriptors and similarity metrics
• Multimedia fusion
• High performance indexing algorithms
• Machine learning for multimedia retrieval
• Ontology for annotation and search
• 3D video and model processing
• Large-scale summarization and visualization
• Performance evaluation
• Very large scale multimedia corpus
• Navigation and browsing on the Web
• Retrieval from multimodal lifelogs
• Database architectures for storage and retrieval
• Novel multimedia data management systems and applications
• Applications in forensic, biomedical image and video collections

Important Dates

Paper Submission: January 15, 2012
Notification of Acceptance: March 15, 2012
Camera-Ready Papers Due: April 5, 2012
Conference Date: June 5 - 8, 2012

Monday, October 10, 2011

Kinect Object Datasets: Berkeley's B3DO, UW's RGB-D, and NYU's Depth Dataset

Article from

Why Kinect?

The Kinect, made by Microsoft, is starting to become quite a common item in Robotics and Computer Vision research.  While the Robotics community has been using the Kinect as a cheap laser sensor which can be used for obstacle avoidance, the vision community has been excited about using the 2.5D data associated with the Kinect for object detection and recognition.  The possibility of building object recognition systems which have access to pixel features as well as 2.5D features is truly exciting for the vision hacker community!


Berkeley's B3DO

First of all, I would like to mention that it looks like the Berkeley Vision Group jumped on the Kinect bandwagon.  But the data collection effort will be crowdsourced -- they need your help!  They need you to use your Kinect to capture your own home/office environments and upload the data to their servers.  This way, a very large dataset will be collected, and we, the vision hackers, can use machine learning techniques to learn what sofas, desks, chairs, monitors, and paintings look like.  The Berkeley hackers have a paper on this at one of the ICCV 2011 workshops in Barcelona; here is the paper information:

A Category-Level 3-D Object Dataset: Putting the Kinect to Work
Allison Janoch, Sergey Karayev, Yangqing Jia, Jonathan T. Barron, Mario Fritz, Kate Saenko, Trevor Darrell
ICCV-W 2011

UW's RGB-D Object Dataset

On another note, if you want to use 3D for your own object recognition experiments then you might want to check out the following dataset: University of Washington's RGB-D Object Dataset.  With this dataset you'll be able to compare against UW's current state-of-the-art.


In this dataset you will find RGB+Kinect3D data for many household items taken from different views.  Here is the really cool paper which got me excited about the RGB-D Dataset:

A Scalable Tree-based Approach for Joint Object and Pose Recognition
Kevin Lai, Liefeng Bo, Xiaofeng Ren, and Dieter Fox
In the Twenty-Fifth Conference on Artificial Intelligence (AAAI), August 2011.

NYU's Depth Dataset

I have to admit that I did not know about this dataset (created by Nathan Silberman of NYU) until after I blogged about the other two datasets.  Check out the NYU Depth Dataset homepage. However, the internet is great, and only a few hours after I posted this short blog post, somebody let me know that I left out this really cool NYU dataset.  In fact, it looks like this particular dataset might be at the LabelMe-level regarding dense object annotations, but with accompanying Kinect data.  Rob Fergus & Co strike again!

Nathan Silberman, Rob Fergus. Indoor Scene Segmentation using a Structured Light Sensor. To Appear: ICCV 2011 Workshop on 3D Representation and Recognition

Sunday, October 9, 2011

The PHD Movie!!!

Screening @ Cyprus University of Technology
11/16/2011 - 7:00PM - CUT
Organized by: Cyprus University of Technology

Visit the movie's website to find a screening at your school. Is The PHD Movie not coming to your school? Ask your administration to sponsor a screening!

Saturday, October 8, 2011

Exploring Photobios - SIGGRAPH 2011


Friday, October 7, 2011

Panasonic unveils first robotic hairdresser

The annual CEATEC technology show in Tokyo gives Japanese technology companies a chance to let their hair down and show off robots far, far too odd for Western consumption.
Robot unicyclists and 'robot companions' are regulars at the show - often unveiled by otherwise normal technology companies. This year, Panasonic unveiled the first robotic hairdresser - as well as a robot 'doctor'.
Panasonic's robot hair washer uses advanced robot 'fingers' to massage the scalp while washing your head with jets of water and soap - rather like a car wash for your skull.

Thursday, October 6, 2011

Recent Image Retrieval Techniques


Tuesday, October 4, 2011

“Practical Image and Video Processing Using MATLAB®”



This is the first book to combine image and video processing with a practical MATLAB(R)-oriented approach in order to demonstrate the most important image and video techniques and algorithms. Utilizing minimal math, the contents are presented in a clear, objective manner, emphasizing and encouraging experimentation.

The book has been organized into two parts. Part I: Image Processing begins with an overview of the field, then introduces the fundamental concepts, notation, and terminology associated with image representation and basic image processing operations. Next, it discusses MATLAB(R) and its Image Processing Toolbox with the start of a series of chapters with hands-on activities and step-by-step tutorials. These chapters cover image acquisition and digitization; arithmetic, logic, and geometric operations; point-based, histogram-based, and neighborhood-based image enhancement techniques; the Fourier Transform and relevant frequency-domain image filtering techniques; image restoration; mathematical morphology; edge detection techniques; image segmentation; image compression and coding; and feature extraction and representation.

Part II: Video Processing presents the main concepts and terminology associated with analog video signals and systems, as well as digital video formats and standards. It then describes the technically involved problem of standards conversion, discusses motion estimation and compensation techniques, shows how video sequences can be filtered, and concludes with an example of a solution to object detection and tracking in video sequences using MATLAB(R).

Extra features of this book include:

More than 30 MATLAB(R) tutorials, which consist of step-by-step guides to exploring image and video processing techniques using MATLAB(R)

Chapters supported by figures, examples, illustrative problems, and exercises

Useful websites and an extensive list of bibliographical references

This accessible text is ideal for upper-level undergraduate and graduate students in digital image and video processing courses, as well as for engineers, researchers, software developers, practitioners, and anyone who wishes to learn about these increasingly popular topics on their own.

Call for participation in the ICPR 2012 Contests

We are happy to announce the opening of the six ICPR 2012 Contests, to be held on November 11, 2012 in conjunction with the 21st International Conference on Pattern Recognition. The aim of the contests is to encourage better scientific development through comparing competing approaches on a common dataset.

The Contests (see the conference website for full details and links):

    Gesture Recognition Challenge and Kinect Grand Prize
    HEp-2 Cells Classification
    Human activity recognition and localization
    Kitchen Scene Context based Gesture Recognition
    Mitosis Detection in Breast Cancer
    People tracking in wide baseline camera networks

There are no 'publications' for the contest participants other than what each contest organizer prepares. Contest participants are encouraged to submit their results as a normal paper to the main conference where it
will be reviewed as normal. Short introductions for each contest are planned to be included in the main proceedings.

Attending the contest sessions requires registration for the contest, which can be done using the main conference registration form. Registration for the main conference is not obligatory, but is necessary if you want to also attend the main conference.

Each Contest has its own time schedule. See the website of each contest for the dates.
The results of the competitions will be announced at the conference: November 11, 2012

CFP: VII Conf. on Articulated Motion and Deformable Objects (AMDO 2012)

Andratx, Mallorca, Spain
11-13 July, 2012

The Spanish Association for Pattern Recognition and Image Analysis (AERFAI) and the Mathematics and Computer Science Department of UIB are organising the seventh international conference AMDO 2012, which will take place in Puerto de Andratx, Mallorca. This conference is the natural evolution of the previous AMDO workshops. The new goal of this conference is to promote interaction and collaboration among researchers working directly in the areas covered by the main tracks. New perceptual user interfaces and emerging technologies strengthen the relationship between areas involved with human-computer interaction. The perspective of the AMDO 2012 conference will be to strengthen the relationship between the many areas that have as a key point the study of the human body using computer technologies as the main tool.
It is a great opportunity to encourage links between research in the areas of computer vision, computer graphics, advanced multimedia applications and multimodal interfaces that share common problems and frequently use similar techniques or tools. In this particular edition the related topics are divided into several tracks, including the topics proposed above.

AMDO 2012 will consist of three days of lecture sessions, both regular and invited presentations, a poster session and international tutorials. The conference fee (approx. 450 euro) includes a social program (conference dinner, coffee breaks, snacks and cultural activities). Students, AERFAI and EG members can register at a reduced fee.

TOPICS INCLUDE (but not restricted to):
Track 1: Advanced Computer Graphics (Human Modelling & Animation)
Track 2: Human Motion (Analysis, Tracking, 3D Reconstruction & Recognition)
Track 3: Multimodal User Interaction & Applications
Track 4: Affective Interfaces (recognition and interpretation of emotions, ECAs - Embodied Conversational Agents in HCI)

Papers should describe original and unpublished work about the above or closely related topics. Please submit your paper electronically at our website (see URL above) using the software provided. All submissions should be in Adobe Acrobat (pdf). The AMDO2012 secretariat must receive your paper before March 12, 2012, 17:00 GMT (London time). Up to ten pages will be considered. All papers submitted will be subjected to a blind review process by at least three members of the program committee. The review paper must not provide names and affiliations, and should include a title, a 150-word abstract, keywords and the paper manuscript. Accepted papers will appear in the LNCS Springer-Verlag international proceedings that will be published and distributed to all participants at the workshop. For more details and news, visit our web page. Selected papers will be nominated to be published in an extended version in a newsletter with impact index.

N.B. Submission implies the willingness of at least one of the authors to register and to present the communication at the conference, if accepted.

Submission of papers: March 12, 2012
Notification of acceptance: April 12, 2012
Camera-ready: April 30, 2012
Early registration: May 31, 2012
Late registration: June 30, 2012
VII AMDO Conference 2012: 11-13 July 2012

Monday, September 12, 2011

Job offer: engineer position in multimedia retrieval systems' evaluation

Position: research engineer.
Title: evaluation of content-based image and video document indexing systems
Duration: from 18 to 24 months.
Target starting date: 1st November 2011.
Location: Laboratory of Informatics of Grenoble
Team: Multimedia Information Indexing and Retrieval
Salary: between 1900 and 2200 € net per month depending upon experience.
Contact: Georges Quénot (Researcher at CNRS, HDR)

In the context of the Quaero Programme, the recruited person will:
* participate in the development or adaptation of image and video corpus annotation tools;
* manage the use of these tools by a team of annotators for the effective creation of annotated corpora;
* participate in the creation or adaptation of tools for the evaluation of content-based image and video document indexing systems;
* participate in the organization of evaluation campaigns for such systems;
* participate in the administrative management of the project.
Expected skills: Unix/Linux, Windows, C/C++, Java, XML, HTML/CGI.

Congratulations to DEMIR research team for winning the ImageCLEF-MED 2011 Ad-hoc image-based retrieval task using CEDD



Abstract. This paper presents the details of the participation of the DEMIR (Dokuz Eylul University Multimedia Information Retrieval) research team in the ImageCLEF 2011 Medical Retrieval task. This year, we evaluated a fusion and re-ranking method based on the best low-level image feature combined with the best text retrieval result. We improved results by examining different weighting models for retrieved text data and low-level features. We tested multi-modality image retrieval in the ImageCLEF 2011 medical retrieval task and obtained the best seven ranks in mixed retrieval, which includes textual and visual modalities. The results clearly show that proper fusion of different modalities improves the overall retrieval performance.
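To give a feel for what fusing textual and visual results can look like, here is a minimal sketch of weighted late fusion: each modality's scores are min-max normalized, then combined linearly and re-ranked. This is a generic illustration and not the DEMIR team's method; the function names, the `text_weight` parameter and the example scores are all hypothetical.

```python
# Generic weighted late fusion of two ranked lists (illustrative only).

def min_max_normalize(scores):
    """Scale a {doc_id: score} mapping into the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse(text_scores, visual_scores, text_weight=0.7):
    """Return doc ids ranked by a weighted sum of normalized scores.

    A document missing from one modality contributes 0 for that modality.
    Ties are broken by doc id so the ranking is deterministic.
    """
    t = min_max_normalize(text_scores)
    v = min_max_normalize(visual_scores)
    fused = {
        doc: text_weight * t.get(doc, 0.0) + (1 - text_weight) * v.get(doc, 0.0)
        for doc in set(t) | set(v)
    }
    return sorted(fused, key=lambda doc: (-fused[doc], doc))


if __name__ == "__main__":
    text = {"a": 10.0, "b": 5.0, "c": 0.0}    # e.g. text retrieval scores
    visual = {"a": 0.1, "b": 0.9, "c": 0.5}   # e.g. visual similarity scores
    print(fuse(text, visual, text_weight=0.7))
    print(fuse(text, visual, text_weight=0.3))
```

Changing `text_weight` shifts the ranking between the two modalities, which is the kind of weighting choice the abstract says was examined.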
