AI Applications in the fields of Multimedia, Computer Vision and Robotics: December 2011

Monday, December 26, 2011

Image retrieval, Research Assistant (19 months)

Posted by Savvas Chatzichristofis

This post offers the opportunity to work on PhotoBrief, a project funded by the UK Engineering and Physical Sciences Research Council and the Technology Strategy Board. The project will build a platform for creating briefs that specify the need for images, for finding a suitable image, and for negotiating for it. It will provide users with an environment for interactive brief creation and dynamic negotiation of images. The project considers the image information seeking and retrieval behaviour of different user groups, including professionals. Partners from industry include an SEO (search engine optimisation) company, a mobile information company, and a media advertising company.

The project is based within the Centre for Interactive Systems Research, which has 20 years' experience of R&D in search algorithms and technologies, including a well-known search algorithm.

Applicants should be qualified in Information Retrieval (or a related area). More specifically, knowledge of meta-data creation, text and image retrieval, and content-based techniques will be needed. The qualification sought is a PhD or equivalent experience in the field.

In this post you will develop novel brief creation and negotiation techniques that use meta-data about images, captured to make those images more findable. You will implement and integrate these techniques in an image retrieval system. The project will make use of current state-of-the-art meta-data techniques, collections, relevant open-source initiatives, and evaluation methods. You will be involved in integrating these, creating user interfaces, and developing new system functionality where necessary.

Strong programming skills are essential, and experience of any of the following areas is an advantage: Information Retrieval (IR); User-centred IR system design, implementation, and evaluation; Context learning, Adaptive IR, Recommender systems, Social media; Computer Supported Cooperative Working (CSCW); User Interface design and evaluation. Experience in collaborating with partners from industry is desirable.

For informal enquiries, please contact Dr. Ayse Goker:
ayse.goker.1@soi.city.ac.uk
Further details of the position are available at:
http://tinyurl.com/bpevd9y

The post will also be advertised on www.jobs.ac.uk

Closing date: 23 January, 2012
Interview date: 1 February, 2012
Start date: March 2012

Further information:
www.soi.city.ac.uk/is/research/cisr
www.soi.city.ac.uk/~sbbb872

Saturday, December 24, 2011

Merry Christmas and a happy 2012

Posted by Savvas Chatzichristofis


Thursday, December 22, 2011

2011 PhD Challenge Winner - "Dirty Old Man"

Posted by Savvas Chatzichristofis

The goal of this year’s challenge was to get one of the nicknames “DIRTY OLD MAN” or “CRAZY CAT LADY” included in the byline for at least one author of a final, camera-ready version of a peer-reviewed academic paper.

And the winner is:

Coherence Progress: A Measure of Interestingness Based on Fixed Compressors

By Tom Schaul, IDSIA/TU Munich
In Proceedings of the Conference on Artificial General Intelligence (August 2011)

Annual Best PhD Thesis Award of the Informatics and Telematics Institute

Posted by Savvas Chatzichristofis

The Informatics and Telematics Institute (ITI) is once again announcing its Annual Best PhD Thesis Award, for a thesis awarded by a Greek university in the broader field of Informatics and Telematics during the previous calendar year.

The terms of the competition are as follows:

The doctoral degree must have been awarded by a Greek university during 2011.

The thesis may be written in either Greek or English.

The thesis must address one of the topics in which ITI is active in research:

Image processing
Computer vision
Pattern recognition
Signal processing
Artificial intelligence
Multimedia
Virtual and augmented reality
Networks and communications

The thesis is to be submitted electronically. It must be accompanied by a certificate from the school's secretariat confirming that the doctoral degree was awarded within the year stated in the call, together with the names and e-mail addresses of the three-member committee.

The judging committee will consist of ITI researchers.

The award will be presented at the ITI Open Day and consists of a certificate and a cash prize of 600 euros. ITI will cover the winner's travel expenses within Greece to attend the event and receive the award in person. The winning thesis will be featured on the ITI website and in a press release.

Deadline for submitting applications: 15 January 2012.

Submit applications via the web page http://www.iti.gr/itiPHD

ClustTour By CERTH-ITI

Posted by Savvas Chatzichristofis

ClustTour is a better way to search, discover and browse interesting city areas, POIs and events. Whether you are planning a trip or just want to check out what a place looks like, ClustTour offers a large collection of photos, maps and descriptions. You may wonder, what's new in this? ClustTour is not based on "official" guides and "experts" but on how people like you capture what is interesting every day.

Key Features:
* Explore over 30 cities with new cities added regularly.
* Browse and search in areas, spots, POIs and events views.
* See groups of photos, descriptions and where they are on the map.
* The map interactively groups spots for easier browsing. Zoom in and out to see more or less.
* Search the Web with a single tap to learn more about the spot you browse.
* Mark your favorites for easy review and sharing.
* Enjoy the most interesting spots in every city!

How it works:
ClustTour applies advanced analysis technologies to large-scale anonymous user contributions to discover the most interesting locations and events. These range from famous landmarks, such as Thessaloniki's White Tower, to events and activities like music concerts, shopping, bookcrossing and partying, but also off-the-beaten-path spots, from hidden eateries to local artists' collections. Every city area, POI and event is connected to informative photo groups, descriptions and search options for when you want to learn more.
Everything in ClustTour is automatic. And when we say everything we mean everything! There is no human intervention in selecting areas, spots, events and photos. Well-tuned algorithms select what is most interesting based on people's contributions. As a result, we might sometimes miss well-known landmarks (in any case you know where to find these) and some of the spots may not fully make sense, but this is how we find the hidden gems, and we are sure that what ClustTour offers is what people, not officials and experts, find interesting!

http://itunes.apple.com/app/clusttour/id487608260?mt=8

Wednesday, December 21, 2011

IBM Next 5 in 5: 2011

Posted by Savvas Chatzichristofis

IBM unveils its sixth annual "Next 5 in 5" -- a list of innovations with the potential to change the way people work, live and play over the next five years. The Next 5 in 5 is based on market and societal trends expected to transform our lives, as well as emerging technologies from IBM's Labs around the world that can make these innovations possible.

In this installment: you will be able to power your home with the energy you create yourself; you will never need a password again; mind reading is no longer science fiction; the digital divide will cease to exist; and junk mail will become priority mail.

Tuesday, December 20, 2011

Vision Algorithm to Automatically Parse Addresses in Google Street View

Posted by Savvas Chatzichristofis

Reprinted from http://www.popsci.com/technology/article/2011-12/google-wants-computer-vision-algorithms-can-read-addresses-street-view?utm_medium=referral&utm_source=pulsenews

Extracting Door Numbers: Building numbers come in a huge variety of shapes, colors and sizes, making them difficult to model for machine vision. (Netzer et al.)

Extrapolating numbers and letters from digital images is still a tough task, even for the best computer programmers. But it would be handy to extract business names, or graffiti, or an address from pictures that are already stored online. Aiming to make its Street View service even more accurate, Google would like to extract your house number from its own Street View photo cache.

Say what you will about Street View (and Helicopter View, Amazon View and so on): beyond the novelty factor, the images are full of potentially useful data. Using street numbers on the side of a house or business could make navigation programs more accurate, and help motorists or pedestrians find the right door by providing a preview on the Internet or a mobile device. But while handwriting algorithms are pretty advanced, software systems are still limited in their ability to extract information from images. Factors like blurring, lighting and other distortions can be a problem.

To improve matters, researchers at Google and Stanford devised a new feature-learning algorithm for a set of street numbers captured from a Street View database. They used 600,000 images from various countries and extracted the house numbers using a basic visual algorithm. Then the team used Amazon’s Mechanical Turk system to verify the arrangement of the numbers. The result was two sets of images: One with house number images as they appeared, and one with house numbers all resized to the same resolution.

Initially, traditional handcrafted visual learning algorithms didn’t work very well to extract the numbers. Next, the Google-Stanford team tried feature learning algorithms, which use various sets of parameters to learn recognition patterns. The new feature learning methods worked much better than the regular visual learning method: One of the algorithms (a K-means-based feature learning system) achieved 90 percent accuracy, compared to 98 percent for a human.
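
To make the idea concrete, here is a minimal sketch of K-means-based feature learning in plain Python. It is not the Google-Stanford pipeline: the function names (`extract_patches`, `kmeans`, `encode`), the hard one-hot encoding and the tiny patch size are all assumptions chosen for illustration. The general recipe is to cut images into small patches, cluster the patches, and then describe a new patch by the learned centroid it is closest to.

```python
import random


def extract_patches(image, size):
    """All size x size patches of a 2D list-of-lists image, each flattened."""
    h, w = len(image), len(image[0])
    return [
        [image[r + dr][c + dc] for dr in range(size) for dc in range(size)]
        for r in range(h - size + 1)
        for c in range(w - size + 1)
    ]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(p, centroids):
    """Index of the centroid closest to p."""
    return min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns k centroids (the learned 'features')."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster went empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def encode(patch, centroids):
    """Hard-assignment feature vector: 1.0 for the closest centroid, else 0.0."""
    j = nearest(patch, centroids)
    return [1.0 if i == j else 0.0 for i in range(len(centroids))]
```

A classifier would then be trained on these encodings rather than on raw pixels. Real systems typically add steps such as patch normalisation and pooling, which are omitted here.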

The system still needs improvement, but it could be useful for extracting number data from billions of images, the researchers say. Ultimately, this could make Street View a lot more accurate. Without a house-number-based view, an address in Street View is a default panorama, which might not actually be the address you want. Type in “30 Rockefeller Plaza,” for instance, and the first thing you see is a chocolatier next to the 30 Rock observation deck. You have to click and drag to see the NBC building.

“With the house number-based view angle, the user will be led to the desired address immediately, without any further interaction needed,” the paper authors explain.


ICMR 2012

Posted by Savvas Chatzichristofis

Effectively and efficiently retrieving information based on user needs is one of the most exciting areas in multimedia research. The Annual ACM International Conference on Multimedia Retrieval (ICMR) offers a great opportunity for exchanging leading-edge multimedia retrieval ideas among researchers, practitioners and other potential users of multimedia retrieval systems. This conference, which brings together the long-standing experience of the former ACM CIVR and ACM MIR series, is set up to illuminate the state of the art in multimedia (text, image, video and audio) retrieval.

ACM ICMR 2012 is soliciting original, high-quality papers addressing challenging issues in the broad field of multimedia retrieval. See the call-for-papers page for a detailed list of topics of interest.

Paper Submission Deadline: January 15, 2012

http://www.icmr2012.org/

Monday, December 19, 2011

sFly: Search and Rescue of Victims in GPS-Denied Environments (no Vicon, no Laser, No GPS!)

Posted by Savvas Chatzichristofis

Final demo of the sFly European Project (2009-2011). This demo simulates a search-and-rescue mission in an outdoor GPS-denied factory environment. Neither laser nor GPS is used for navigation and mapping, just a single camera and an IMU onboard each vehicle. All the processing runs onboard, on a Core2Duo.
Three quadrocopters take off using visual SLAM, explore the environment, build a 3D map of it, and use it to localize themselves and potential victims by measuring the signal strengths of WiseNodes carried by each victim.

MPEG news: a report from the 98th meeting, Geneva, Switzerland

Posted by Savvas Chatzichristofis

Re-Print from http://multimediacommunication.blogspot.com/ (Author: Christian Timmerer)

MPEG news from its 98th meeting in Geneva, Switzerland, in fewer than 140 characters and with a lot of acronyms. The official press release is, as usual, here. As you can see from the press release, MPEG produced significant results, namely:

MPEG Dynamic Adaptive Streaming over HTTP (DASH) ratified
3D Video Coding: Evaluation of responses to Call for Proposals
MPEG royalty free video coding: Internet Video Coding (IVC) + Web Video Coding (WebVC)
High Efficiency Coding and Media Delivery in Heterogeneous Environments: MPEG-H comprising MMT, HEVC, 3DAC
Compact Descriptors for Visual Search (CDVS): Evaluation of responses to the Call for Proposals
Call for requirements: Multimedia Preservation Description Information (MPDI)
MPEG Augmented Reality (AR)

As you can see, that is a long list of achievements for a single meeting, so let's dig in. For each topic I've also tried to list some research issues that I think are worth investigating both inside and outside MPEG.

MPEG Dynamic Adaptive Streaming over HTTP (DASH): DASH=IS ✔

As the official press release states, MPEG has ratified its draft standard for DASH and, even better, the standard should become publicly available, which I expect to happen early next year, around March 2012 or maybe earlier. I say "should" because there is no guarantee that this will actually happen, but the signs are good. In the meantime, feel free to use our software to play around; we expect to update it to the latest version of the standard as soon as possible. Finally, IEEE Computer Society Computing Now has put together a theme on Video for the Universal Web featuring DASH.
Research issues: performance, bandwidth estimation, request scheduling (aka adaptation logic), and Quality of Service/Experience.
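
As a rough illustration of the bandwidth estimation and adaptation logic issues, a DASH client might smooth its throughput measurements and then pick the highest bitrate that fits. The sketch below is only an example under stated assumptions, not anything defined by the standard: `update_throughput`, `pick_bitrate`, the smoothing weight `alpha` and the `safety` margin are hypothetical names and values.

```python
def update_throughput(estimate_bps, segment_bits, download_seconds, alpha=0.8):
    """Exponentially weighted moving average of the measured throughput."""
    measured = segment_bits / download_seconds
    if estimate_bps is None:  # first segment: no history to smooth against
        return measured
    return alpha * estimate_bps + (1 - alpha) * measured


def pick_bitrate(available_bps, estimate_bps, safety=0.9):
    """Highest representation under the safety-scaled estimate, else the lowest."""
    budget = safety * estimate_bps
    fitting = [b for b in available_bps if b <= budget]
    return max(fitting) if fitting else min(available_bps)


if __name__ == "__main__":
    ladder = [500_000, 1_000_000, 2_000_000, 4_000_000]  # assumed bitrates in bit/s
    estimate = None
    # (segment size in bits, download time in seconds) for three segments
    for bits, seconds in [(4_000_000, 2.0), (4_000_000, 1.0), (4_000_000, 4.0)]:
        estimate = update_throughput(estimate, bits, seconds)
        print(round(estimate), pick_bitrate(ladder, estimate))
```

A larger `alpha` reacts more slowly to bandwidth changes but switches quality less often, which is exactly the kind of trade-off between Quality of Service and Quality of Experience that makes the adaptation logic a research topic.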

3D Video Coding: 3DVC=CfP eval ✔

MPEG evaluated more than 20 proposals submitted as a response to the call issued back in April 2011. The evaluation of the proposals comprised subjective quality assessments conducted by 13 highly qualified test laboratories distributed around the world and coordinated by the COST Action IC1003 QUALINET. The report of the subjective test results from the call for proposals on 3D video coding will be available by the end of this week. MPEG documented the standardization tracks considered in 3DVC (i.e., compatible with MVC, AVC base-view, HEVC, ...) and agreed on a common software based on the best-performing proposals.
Research issues: encoding efficiency of 3D depth maps and compatibility for the various target formats (AVC, MVC, HEVC) as well as depth map estimation at the client side.

MPEG royalty free video coding: IVC vs. WebVC

In addition to the evaluation of the responses to the call for 3DVC, MPEG also evaluated the responses to the Internet Video Coding call. Based on the responses, MPEG decided to follow up with two approaches, namely Internet Video Coding (IVC) and Web Video Coding (WebVC). The former, IVC, is based on MPEG-1 technology, which is assumed to be royalty-free. However, it requires some performance boosts in order to make it ready for the Internet. MPEG's approach is a common platform called Internet video coding Test Model (ITM), which serves as the basis for further improvements. The latter, WebVC, is based on the AVC constrained baseline profile, whose performance is well known and satisfactory but, unfortunately, it is not clear which patents of the AVC patent pool apply to this profile. Hence, a working draft (WD) of WebVC will be made publicly available in order to get patent statements from companies. The WD will be publicly available by December 19th.
Research issues: coding efficiency when using only royalty-free coding tools, where the optimization targets royalty-free status first and efficiency second.

MPEG-H

A new star is born: MPEG-H, referred to as "High Efficiency Coding and Media Delivery in Heterogeneous Environments" and comprising three parts: Pt. 1 MMT, Pt. 2 HEVC, Pt. 3 3D Audio. There's a document called context and objective of MPEG-H, but I can't find out whether it's public (I'll come back to this later).

Part 1: MMT (MPEG Media Transport) is progressing (slowly), but a next step should definitely be to check the relationship between MMT and DASH, for which an ad-hoc group has been established (N12395); subscribe here if interested.
Research issues: very general at the moment, what is the best delivery method (incl. formats) for future multimedia applications? Answer: It depends, ... ;-)
Part 2: HEVC (High-Efficiency Video Coding) made significant progress at the last meeting, in particular: only one entropy coder (note: AVC has two, CABAC and CAVLC which are supported in different profiles), 8 bit decoding (could be also 10 bit, probably done in some profiles), specific integer transform, stabilized and more complete high-level syntax and HRD description (i.e., reference picture buffering, tiles, slices, and wavefronts enabling parallel decoding process). Finally, a prototype has been demonstrated decoding HEVC in software on an iPad 2 at WVGA resolution and the 10min Big Buck Bunny sequence at SD resolution with avg. 800 kbit/s which clearly outperformed the corresponding AVC versions.
Research issues: well, coding efficiency, what else? The ultimate goal is to have a performance gain of more than 50% compared to its predecessor, AVC.

Part 3: 3D Audio Coding (3DAC) is in its early stages, but there will be an event during the San Jose meeting which will be announced here. As of now, use cases are provided (home theatre, personal TV, smartphone TV, multichannel TV) as well as candidate requirements and evaluation methods. One important aspect seems to be user experience for highly immersive audio (i.e., 22.2, 10.2, 5.1), including bitstream adaptation for low-bandwidth and low-complexity scenarios.
Research issues: sorry, I'm not really an audio guy but I assume it's coding efficiency, specifically for 22.2 channels ;-)

Compact Descriptors for Visual Search (CDVS)

For CDVS, responses to the call for proposals (from 10 companies/institutions) have been evaluated and a test model has been established based on the best-performing proposals. The next steps include improving the test model towards inclusion in the MPEG-7 standard.
Research issues: descriptor efficiency for the intended application as well as the precision of the information retrieval results.

Multimedia Preservation Description Information (MPDI)

The aim of this new work item is to provide "standard technology helping users to preserve digital multimedia that is used in many different domains, including cultural heritage, scientific research, engineering, education and training, entertainment, and fine arts for long-term across system, organizational, administrative and generational boundaries". It comes along with two public documents, the current requirements and a call for requirements which are due at the 100th MPEG meeting in April 2012.
Research issues: What digital multimedia information should be preserved, and how?

Augmented Reality (AR)

MPEG's newest project is on Augmented Reality (AR), starting with an application format for which a working draft exists. Furthermore, draft requirements and use cases are available. These three documents will be available on Dec 31st.
Research issues: N/A


Saturday, December 17, 2011

Free Access to selected highly cited papers from IJPRAI

Posted by Savvas Chatzichristofis

World Scientific announces free access to selected highly cited papers in the International Journal of Pattern Recognition and Artificial Intelligence (IJPRAI). This is valid until 31 December 2011, so do take advantage of the access today!

Selected highly cited papers

ROBUST OBJECT TRACKING USING JOINT COLOR-TEXTURE HISTOGRAM
INTELLIGENT SURVEILLANCE BASED ON NORMALITY ANALYSIS TO DETECT ABNORMAL BEHAVIORS
FACIAL BIOMETRICS USING NONTENSOR PRODUCT WAVELET AND 2D DISCRIMINANT TECHNIQUES
FACE DETECTION AND RECOGNITION USING MAXIMUM LIKELIHOOD CLASSIFIERS ON GABOR GRAPHS
PRECISE EYE AND MOUTH LOCALIZATION
ACCURATE IMAGE RETRIEVAL BASED ON COMPACT COMPOSITE DESCRIPTORS AND RELEVANCE FEEDBACK INFORMATION
PLANT LEAF IDENTIFICATION BASED ON VOLUMETRIC FRACTAL DIMENSION
GRAPH CLASSIFICATION BASED ON VECTOR SPACE EMBEDDING
K-MEANS CLUSTERING FOR PROBLEMS WITH PERIODIC ATTRIBUTES
CLASSIFICATION OF IMBALANCED DATA: A REVIEW

ACRI 2012 Santorini Island, Greece, 24-27 September 2012

Posted by Savvas Chatzichristofis

Cellular automata (CA) present a very powerful approach to the study of spatio-temporal systems where complex phenomena build up out of many simple local interactions. They often account for real phenomena or solutions to problems whose high complexity could hardly be formalised in other contexts.

Furthermore, the parallelism and locality of CA allow straightforward parallelisation and therefore immediate implementation on parallel computing resources. These characteristics of CA research have led to the formation of interdisciplinary research teams, which produce remarkable research results and attract scientists from different fields.
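
The "many simple local interactions" idea is easy to see in a one-dimensional binary CA. The following minimal sketch assumes Wolfram's elementary rule numbering and a periodic boundary; the function names `step` and `run` are illustrative. Every cell is updated from only its left neighbour, itself and its right neighbour, so all cells of a generation could be computed in parallel.

```python
def step(cells, rule):
    """Advance a 1D binary cellular automaton by one generation.

    The three-cell neighbourhood (left, self, right) forms a 3-bit number
    that selects one bit of the 8-bit rule number. The boundary is periodic.
    """
    n = len(cells)
    return [
        (rule >> (cells[(i - 1) % n] << 2 | cells[i] << 1 | cells[(i + 1) % n])) & 1
        for i in range(n)
    ]


def run(cells, rule, generations):
    """Return the list of successive generations, starting with the input."""
    history = [list(cells)]
    for _ in range(generations):
        history.append(step(history[-1], rule))
    return history


if __name__ == "__main__":
    start = [0] * 15
    start[7] = 1  # a single live cell in the middle
    for row in run(start, rule=90, generations=7):
        print("".join("#" if c else "." for c in row))
```

Because each new cell reads only the previous generation, the list comprehension in `step` maps directly onto a data-parallel loop on parallel hardware.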

The main goal of ACRI 2012 (Cellular Automata for Research and Industry), the 10th edition of the conference, is to offer both scientists and engineers in academia and industry an opportunity to express and discuss their views on current trends, challenges, and state-of-the-art solutions to various problems in the fields of arts, biology, chemistry, communication, cultural heritage, ecology, economy, geology, engineering, medicine, physics, sociology, traffic control, etc.

Topics of either theoretical or applied interest about CA and CA-based models and systems include but are not limited to:

Algebraic properties and generalization
Complex systems
Computational complexity
Dynamical systems
Hardware circuits, architectures, systems and applications
Modeling of biological systems
Modeling of physical or chemical systems
Modeling of ecological and environmental systems
Image Processing and pattern recognition
Natural Computing
Quantum Cellular Automata
Parallelism

This edition of the ACRI conference also hosts workshops on recent and important research topics in the theory and applications of Cellular Automata, such as: Crowds and Cellular Automata (3rd edition), Asynchronicity, and Traffic and Cellular Automata.

http://acri2012.duth.gr/

Sunday, December 4, 2011

Terrain Surveillance Coverage using Cognitive Adaptive Optimization

Posted by Savvas Chatzichristofis

A centralized cognitive-based adaptive methodology for optimal surveillance coverage using swarms of MAVs has been developed, mathematically analyzed and tested using extensive simulation experiments. The methodology has been successfully tested on a large variety of 2D and 3D non-convex unknown environments. Moreover, the methodology has been applied using real data from the Birmensdorf test area and the ETHZ’s hospital area. A decentralized version of the methodology has also been proposed and evaluated for the 2D case.

The LIRE (Lucene Image REtrieval) library provides a simple way to retrieve images and photos based on their color and texture characteristics. LIRE creates a Lucene index of image features for content-based image retrieval (CBIR). Three of the available image features are taken from the MPEG-7 Standard: ScalableColor, ColorLayout and EdgeHistogram. A fourth one, the Auto Color Correlogram, has been implemented based on recent research results. Furthermore, LIRE provides simple methods for searching the index and browsing the results. The LIRE library and the LIRE Demo application, as well as all the source code, are available under the GNU GPL license.
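
LIRE itself is a Java library built on Lucene, and the snippet below does not use its API. It is only a minimal Python sketch of the underlying idea of colour-based retrieval: describe each image by a colour histogram and rank the indexed images by histogram distance. The function names, the 4 bins per channel and the L1 distance are assumptions for illustration; the MPEG-7 descriptors in LIRE are more elaborate.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalised joint RGB histogram of an iterable of (r, g, b) tuples (0-255)."""
    n = bins_per_channel
    hist = [0.0] * (n ** 3)
    count = 0
    for r, g, b in pixels:
        # Map each 0-255 channel value to a bin index in 0..n-1.
        ri, gi, bi = (min(v * n // 256, n - 1) for v in (r, g, b))
        hist[(ri * n + gi) * n + bi] += 1.0
        count += 1
    return [h / count for h in hist] if count else hist


def l1_distance(h1, h2):
    """Sum of absolute bin differences; 0.0 means identical histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def search(query_hist, index, top_k=3):
    """Rank indexed images (name -> histogram) by ascending L1 distance."""
    ranked = sorted(index.items(), key=lambda item: l1_distance(query_hist, item[1]))
    return [name for name, _ in ranked[:top_k]]


if __name__ == "__main__":
    # Toy "images": flat lists of pixels standing in for decoded image data.
    index = {
        "red.png": color_histogram([(255, 0, 0)] * 16),
        "blue.png": color_histogram([(0, 0, 255)] * 16),
    }
    query = color_histogram([(200, 10, 10)] * 16)
    print(search(query, index, top_k=1))
```

Building the histograms once and storing them in an index is what keeps query time low: a search only compares short feature vectors and never touches the original pixels.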
