AI Applications in the fields of Multimedia, Computer Vision and Robotics

Saturday, October 5, 2013

Need software libraries and tools for your research?

Posted by Savvas Chatzichristofis

Attend the ACM MM'13 Open Source Software Competition:

M. Hildebrand, M. Brinkerink, R. Gligorov, M. Van Steenbergen, J. Huijkman, and J. Oomen: Waisda? Video Labeling Game
C.-Y. Huang, D.-Y. Chen, C.-H. Hsu, and K.-T. Chen: GamingAnywhere: An Open-Source Cloud Gaming Testbed
J. Wagner, F. Lingenfelser, T. Baur, I. Damian, F. Kistler, and E. Andre: The Social Signal Interpretation (SSI) Framework – Multimodal Signal Processing and Recognition in Real-Time
F. Eyben, F. Weninger, F. Groß, and B. Schuller: Recent Developments in openSMILE, the Munich Open-Source Multimedia Feature Extractor
I. Tsampoulatidis, D. Ververidis, P. Tsarchopoulos, S. Nikolopoulos, I. Kompatsiaris, and N. Komninos: ImproveMyCity – An open source platform for direct citizen-government communication
M. Lux: LIRE: Open Source Image Retrieval in Java
L. Tsochatzidis, C. Iakovidou, S. Chatzichristofis, and Y. Boutalis: Golden Retriever – A Java Based Open Source Image Retrieval Engine
R. Aamulehto, M. Kuhna, and P. Oittinen: Stage Framework – An HTML5 and CSS3 Framework for Digital Publishing
D. Bogdanov, N. Wack, E. Gómez, S. Gulati, P. Herrera, O. Mayor, G. Roma, J. Salamon, J. Zapata, and X. Serra: ESSENTIA: an Audio Analysis Library for Music Information Retrieval
C. Flynn, D. Monaghan, and N.E. O'Connor: SCReen Adjusted Panoramic Effect – SCRAPE
H. Yviquel, A. Lorence, K. Jerbi, G. Cocherel, A. Sanchez, and M. Raulet: Orcc: Multimedia development made easy

NTT DoCoMo Intelligent Glass Demo at CEATEC 2013

Posted by Savvas Chatzichristofis

Wednesday, September 25, 2013

Matlab implementation of CEDD

Posted by Savvas Chatzichristofis

The MATLAB implementation of CEDD is finally available online.

The source code is simple and should be easy for any user to work with. A single main function extracts the CEDD descriptor from a given image.

Download the Matlab implementation of CEDD (For academic purposes only)

A few words about the CEDD descriptor:

Descriptors that combine more than one feature in a compact histogram belong to the family of Compact Composite Descriptors (CCDs). A typical example of a CCD is the CEDD descriptor. The structure of CEDD consists of 6 texture areas. Each texture area is divided into 24 sub-regions, with each sub-region describing a color. CEDD's color information comes from 2 fuzzy systems that map the colors of the image to a custom 24-color palette. To extract texture information, CEDD uses a fuzzy version of the five digital filters proposed by the MPEG-7 EHD (Edge Histogram Descriptor). The CEDD extraction procedure is outlined as follows: when an image block (a rectangular part of the image) enters the system that extracts a CCD, it passes through 2 units simultaneously. The first unit, the color unit, classifies the image block into one of the 24 shades used by the system. Let the classification be the color $m, m \in [0,23]$. The second unit, the texture unit, classifies the same block into the texture area $a, a \in [0,5]$. The image block is then assigned to the bin $a \times 24 + m$. The process is repeated for all the blocks of the image. Once the process is complete, the histogram is normalized to the interval [0,1] and quantized for binary representation at three bits per bin.
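The bin arithmetic described above can be sketched in a few lines of Python. This is only an illustration, not the released MATLAB code: the function names are hypothetical, the color and texture classification is assumed to have been done already (each block arrives as a pair of texture area a and color m), normalization is assumed to divide by the total count, and the 3-bit quantization is assumed to be uniform, which may differ from the quantization the actual descriptor uses.

```python
NUM_COLORS = 24      # shades in the custom palette
NUM_TEXTURES = 6     # texture areas
NUM_BINS = NUM_TEXTURES * NUM_COLORS  # 144 bins in total


def bin_index(a, m):
    """Map texture area a in [0,5] and color m in [0,23] to bin a*24 + m."""
    if not (0 <= a < NUM_TEXTURES and 0 <= m < NUM_COLORS):
        raise ValueError("a must be in [0,5] and m in [0,23]")
    return a * NUM_COLORS + m


def build_histogram(blocks):
    """Accumulate (a, m) block classifications and normalize to [0,1].

    Assumption: normalization divides every bin by the total block count.
    """
    hist = [0.0] * NUM_BINS
    for a, m in blocks:
        hist[bin_index(a, m)] += 1.0
    total = sum(hist)
    if total > 0:
        hist = [h / total for h in hist]
    return hist


def quantize_3bit(hist):
    """Quantize each bin in [0,1] to an integer in 0..7 (three bits).

    Assumption: uniform quantization steps, chosen only for illustration.
    """
    return [min(7, int(h * 8)) for h in hist]


if __name__ == "__main__":
    # Hypothetical block classifications: (texture area, color).
    blocks = [(0, 0), (0, 0), (5, 23), (2, 7)]
    descriptor = quantize_3bit(build_histogram(blocks))
    print(len(descriptor), descriptor[0], descriptor[55], descriptor[143])
```

With 144 bins at three bits each, a descriptor built this way occupies 432 bits, that is, 54 bytes per image.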

The most important attribute of CEDD is the very good results it achieves on several well-known benchmark image databases. The following table shows the ANMRR results on 3 image databases. The ANMRR ranges from 0 to 1; the smaller the value, the better the retrieval quality for the query. ANMRR is the evaluation criterion used in all of the MPEG-7 color core experiments.
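For readers who have not met the measure before, here is a minimal Python sketch of how ANMRR (Average Normalized Modified Retrieval Rank) is commonly computed following the MPEG-7 definition. The function names and the input layout (a ranked result list plus a ground-truth set per query) are assumptions made for this example and do not come from the CEDD code.

```python
def nmrr(ranked_ids, relevant_ids, gtm):
    """Normalized Modified Retrieval Rank for one query.

    ranked_ids: retrieved items, best match first.
    relevant_ids: ground-truth items for the query.
    gtm: size of the largest ground-truth set over all queries.
    """
    ng = len(relevant_ids)
    if ng == 0:
        raise ValueError("a query needs at least one ground-truth item")
    k = min(4 * ng, 2 * gtm)  # only the first k results count as retrieved
    position = {item: r for r, item in enumerate(ranked_ids, start=1)}
    ranks = []
    for item in relevant_ids:
        r = position.get(item)
        # Items missing from the first k results get the penalty rank 1.25*k.
        ranks.append(r if r is not None and r <= k else 1.25 * k)
    avr = sum(ranks) / ng                       # average rank
    mrr = avr - 0.5 * (1 + ng)                  # 0 for a perfect result list
    return mrr / (1.25 * k - 0.5 * (1 + ng))    # scale into [0,1]


def anmrr(queries):
    """Average NMRR over (ranked_ids, relevant_ids) pairs."""
    gtm = max(len(rel) for _, rel in queries)
    return sum(nmrr(ranked, rel, gtm) for ranked, rel in queries) / len(queries)


if __name__ == "__main__":
    # Hypothetical toy queries: one perfect, one where nothing relevant is found.
    perfect = (["a", "b", "c", "d"], ["a", "b"])
    missed = (["c", "d", "e", "f"], ["a", "b"])
    print(anmrr([perfect]), anmrr([missed]), anmrr([perfect, missed]))
```

A perfect ranking gives 0 and a ranking that retrieves none of the ground-truth items gives 1, which matches the range stated above.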


Monday, September 23, 2013

A Multi-Objective Exploration Strategy for Mobile Robots under Operational Constraints

Posted by Savvas Chatzichristofis

IEEE Access

Multi-objective robot exploration constitutes one of the most challenging tasks for autonomous robots performing in various operations and different environments. However, the optimal exploration path depends heavily on the objectives and constraints that both these operations and environments introduce. Typical environment constraints include partially known or completely unknown workspaces, limited-bandwidth communications and sparsely or densely cluttered spaces. In such environments, the exploration robots must satisfy additional operational constraints including time-critical goals, kinematic modeling and resource limitations. Finding the optimal exploration path under these multiple constraints and objectives constitutes a challenging non-convex optimization problem. In our approach, we model the environment constraints in cost functions and utilize the Cognitive-based Adaptive Optimization (CAO) algorithm in order to meet time-critical objectives. The exploration path produced is optimal in the sense of globally minimizing the required time as well as maximizing the explored area of a partially unknown workspace. Since obstacles are sensed during operation, initial paths may be blocked, leading to robot entrapment. A supervisor is triggered to signal a blocked passage and subsequently escape from the basin of the cost function's local minimum. Extensive simulations and comparisons in typical scenarios are presented in order to show the efficiency of the proposed approach.

3-Sweep: Extracting Editable Objects from a Single Photo

Posted by Savvas Chatzichristofis

by Tao Chen · Zhe Zhu · Ariel Shamir · Shi-Min Hu · Daniel Cohen-Or

Abstract

We introduce an interactive technique for manipulating simple 3D shapes based on extracting them from a single photograph. Such extraction requires understanding of the components of the shape, their projections, and relations. These simple cognitive tasks for humans are particularly difficult for automatic algorithms. Thus, our approach combines the cognitive abilities of humans with the computational accuracy of the machine to solve this problem. Our technique provides the user the means to quickly create editable 3D parts: human assistance implicitly segments a complex object into its components, and positions them in space. In our interface, three strokes are used to generate a 3D component that snaps to the shape’s outline in the photograph, where each stroke defines one dimension of the component. The computer reshapes the component to fit the image of the object in the photograph as well as to satisfy various inferred geometric constraints imposed by its global 3D structure. We show that with this intelligent interactive modeling tool, the daunting task of object extraction is made simple. Once the 3D object has been extracted, it can be quickly edited and placed back into photos or 3D scenes, permitting object-driven photo editing tasks which are impossible to perform in image-space. We show several examples and present a user study illustrating the usefulness of our technique.

Tuesday, September 17, 2013

PhD Position in Multimodal Person and Social Behaviour Recognition

Posted by Savvas Chatzichristofis

Application Deadline: Tue, 10/01/2013

Location: Denmark

Employer: Aalborg University

At the Faculty of Engineering and Science, Department of Electronic Systems in Aalborg, a PhD stipend in Multimodal Person and Social Behaviour Recognition is available within the general study programme Electrical and Electronic Engineering. The stipend is open for appointment from November 1, 2013, or as soon as possible thereafter.

Job description

The PhD student will work on the research project “Durable Interaction with Socially Intelligent Robots” funded by the Danish Council for Independent Research, Technology and Production Sciences. This project aims at developing methods to make service robots socially intelligent and capable of establishing durable relationships with their users. This relies on developing the capabilities to sense and express, which will be achieved by the fusion of sensor signals in an interactive way. The PhD student will research technologies for vision-based social behaviour recognition, person identification and person tracking in the context of human-robot interaction. Multimodal fusion will be carried out in collaboration with another PhD student who will work on the same research project with a focus on array-based speech processing.

The successful applicant must have a Master's degree in machine learning, statistical signal processing or computer vision. For further information concerning the scientific aspects of the position, contact Associate Professor Zheng-Hua Tan, Department of Electronic Systems, phone: +45 9940 8686, email: zt@es.aau.dk.


The LIRE (Lucene Image REtrieval) library provides a simple way to retrieve images and photos based on their color and texture characteristics. LIRE creates a Lucene index of image features for content-based image retrieval (CBIR). Three of the available image features are taken from the MPEG-7 Standard: ScalableColor, ColorLayout and EdgeHistogram. A fourth one, the Auto Color Correlogram, has been implemented based on recent research results. Furthermore, LIRE provides simple methods for searching the index and browsing results. The LIRE library and the LIRE Demo application, as well as all the source code, are available under the GNU GPL license.

Wednesday, September 18, 2013

3-Sweep: Extracting Editable Objects from a Single Photo
