Wednesday, October 30, 2013

CfP: ACM MMSys 2014 Dataset Track

The ACM Multimedia Systems conference provides a forum for researchers, engineers, and scientists to present and share their latest research findings in multimedia systems. While research on specific aspects of multimedia systems is regularly published in the various proceedings and transactions of the networking, operating system, real-time system, and database communities, MMSys aims to cut across these domains in the context of multimedia data types. This provides a unique opportunity to view the intersections and interplay of the various approaches and solutions developed across these domains to deal with multimedia data types.

Furthermore, MMSys provides an avenue for communicating research that addresses multimedia systems holistically. As an integral part of the conference since 2012, the Dataset Track provides an opportunity for researchers and practitioners to make their work available (and citable) to the multimedia community. MMSys encourages and recognizes dataset sharing, and seeks contributions in all areas of multimedia (not limited to multimedia systems). Authors publishing datasets will benefit from increased public awareness of their effort in collecting the datasets.
In particular, authors of datasets accepted for publication will receive:

  • Dataset hosting from MMSys for at least 5 years
  • Citable publication of the dataset description in the proceedings published by ACM
  • 15 minutes of oral presentation time at the MMSys 2014 Dataset Track

All submissions will be peer-reviewed by at least two members of the technical program committee of MMSys 2014. Datasets will be evaluated by the committee on the basis of the collection methodology and the value of the dataset as a resource for the research community.

Submission Guidelines
Authors interested in submitting a dataset should:
1. Make their data available by providing a public URL for download
2. Write a short paper describing
  a. motivation for data collection and intended use of the data set,
  b. the format of the data collected,
  c. the methodology used to collect the dataset, and
  d. basic characterizing statistics from the dataset.

Papers should be at most 6 pages long (in PDF format), prepared in the ACM style, and written in English.

Submission site:

Important dates
* Dataset paper submission deadline: November 11, 2013
* Notification: December 20, 2013
* MMSys conference: March 19-21, 2014
** MMSys Datasets **

Previously accepted datasets can be accessed at

For further queries and extra information, please contact us at

Monday, October 28, 2013

CBMI 2014

Following the eleven successful previous editions of CBMI (Toulouse 1999, Brescia 2001, Rennes 2003, Riga 2005, Bordeaux 2007, London 2008, Chania 2009, Grenoble 2010, Madrid 2011, Annecy 2012, and Veszprém 2013), it is our pleasure to welcome you to CBMI 2014, the 12th International Content-Based Multimedia Indexing Workshop, in Klagenfurt, Austria, on June 18-20, 2014.

The 12th International CBMI Workshop aims at bringing together the various communities involved in all aspects of content-based multimedia indexing, retrieval, browsing and presentation. The scientific program of CBMI 2014 will include invited keynote talks and regular, special and demo sessions with contributed research papers.

We sincerely hope that a carefully crafted program, the scientific discussions that the workshop will hopefully stimulate, and your additional activities in Klagenfurt and its surroundings, most importantly the lovely Lake Wörthersee, will make your CBMI 2014 participation worthwhile and a memorable experience.

Important dates:

Paper submission deadline: February 16, 2014
Notification of acceptance: March 30, 2014
Camera-ready papers due: April 14, 2014
Author registration: April 14, 2014
Early registration: May 25, 2014

Friday, October 25, 2013

LIRE presentation at the ACM Multimedia Open Source Software Competition 2013

LIRE Solr []

The Solr plugin itself is fully functional for Solr 4.4, and the source is available online. There is a markdown document explaining what can be done with the plugin and how to actually install it. Basically, it can do content-based search and content-based re-ranking of text searches, and it brings along a custom field implementation and sub-linear search based on hashing.
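The hashing-based sub-linear search can be sketched in miniature. The following is a hypothetical illustration of the general idea (random-hyperplane hashing), not the plugin's actual code: each feature vector gets a short binary code, vectors are bucketed by code, and a query re-ranks only the vectors in its own bucket instead of scanning the whole index.

```python
import math
import random
from collections import defaultdict

random.seed(0)
DIM, BITS = 8, 6

# Random hyperplanes: each bit of the code is the sign of one dot product.
planes = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(BITS)]

def lsh(vec):
    bits = 0
    for i, p in enumerate(planes):
        if sum(a * b for a, b in zip(vec, p)) >= 0:
            bits |= 1 << i
    return bits

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# "Index": bucket 1000 made-up feature vectors by their hash code.
db = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(1000)]
buckets = defaultdict(list)
for idx, vec in enumerate(db):
    buckets[lsh(vec)].append(idx)

# Query: only vectors sharing the query's code are re-ranked exactly,
# a small fraction of the full index.
query = db[42]
candidates = buckets[lsh(query)]
best = min(candidates, key=lambda i: dist(db[i], query))
```

Nearby vectors tend to fall on the same side of most hyperplanes, so they collide in the same bucket, which is what makes the candidate set small while still containing the true neighbors with high probability.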

Thursday, October 24, 2013

ACM Multimedia 2013 Open Source Competition winner is….

Essentia!!! Congratulations!!!!

Essentia 2.0 beta is an open-source C++ library for audio analysis and audio-based music information retrieval, released under the Affero GPLv3 license (also available under a proprietary license upon request). It contains an extensive collection of reusable algorithms which implement audio input/output functionality, standard digital signal processing blocks, statistical characterization of data, and a large set of spectral, temporal, tonal and high-level music descriptors. In addition, Essentia can be complemented with Gaia, a C++ library with Python bindings which implements similarity measures and classifications on the results of audio analysis, and generates classification models that Essentia can use to compute high-level descriptions of music (the same license terms apply).

Essentia is not a framework, but rather a collection of algorithms (plus some infrastructure for multithreading and low memory usage) wrapped in a library. It doesn't provide common high-level logic for descriptor computation (so you aren't locked into a certain way of doing things); it focuses instead on the robustness, performance and optimality of the provided algorithms, as well as on ease of use. The flow of the analysis is decided and implemented by the user, while Essentia takes care of the implementation details of the algorithms being used. An example extractor is provided, but it should be considered an example only, not "the" only correct way of doing things.

The library is also wrapped in Python and includes a number of predefined executable extractors for the available music descriptors, which facilitates its use for fast prototyping and allows setting up research experiments very rapidly. Furthermore, it includes a Vamp plugin to be used with Sonic Visualiser for visualization purposes. The library is cross-platform and currently supports Linux, Mac OS X, and Windows systems. Essentia is designed with a focus on the robustness of the provided music descriptors and is optimized in terms of the computational cost of the algorithms. The provided functionality, specifically the music descriptors included in-the-box and signal processing algorithms, is easily expandable and allows for both research experiments and development of large-scale industrial applications.
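To give a flavor of what a spectral descriptor computes, here is the spectral centroid, the magnitude-weighted mean frequency of a spectrum, in plain Python. This is a generic textbook illustration, not Essentia's implementation:

```python
import math

def spectral_centroid(signal, sample_rate):
    """Magnitude-weighted mean frequency of the signal's spectrum.

    Uses a naive O(n^2) DFT over the non-negative frequency bins.
    """
    n = len(signal)
    mags, freqs = [], []
    for k in range(n // 2):
        re = sum(signal[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = sum(-signal[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        mags.append(math.hypot(re, im))
        freqs.append(k * sample_rate / n)
    return sum(f * m for f, m in zip(freqs, mags)) / sum(mags)

# A 500 Hz sine at 8 kHz sampling with 256 samples: 500 Hz is exactly
# DFT bin 16, so the centroid lands on the tone's frequency.
sr, n = 8000, 256
tone = [math.sin(2 * math.pi * 500 * t / sr) for t in range(n)]
centroid = spectral_centroid(tone, sr)
```

Real descriptors of this kind (centroid, roll-off, flux, and so on) are computed per frame over windowed audio; the single-shot version above just shows the core formula.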

Two honorable mentions go to openSMILE and SSI.

List of all the open source projects presented at the ACM Multimedia conference

ACM Multimedia 2013 best paper award goes to….

ACM Multimedia 2013 best paper award goes to Attribute-augmented Semantic Hierarchy for Image Retrieval

This paper presents a novel Attribute-augmented Semantic Hierarchy (A2SH) and demonstrates its effectiveness in bridging both the semantic and intention gaps in Content-based Image Retrieval (CBIR). A2SH organizes the semantic concepts into multiple semantic levels and augments each concept with a set of related attributes, which describe the multiple facets of the concept and act as the intermediate bridge connecting the concept and low-level visual content. A hierarchical semantic similarity function is learnt to characterize the semantic similarities among images for retrieval. To better capture user search intent, a hybrid feedback mechanism is developed, which collects hybrid feedback on attributes and images. This feedback is then used to refine the search results based on A2SH. We develop a content-based image retrieval system based on the proposed A2SH. We conduct extensive experiments on a large-scale data set of over one million Web images. Experimental results show that the proposed A2SH can characterize the semantic affinities among images accurately and can shape user search intent precisely and quickly, leading to more accurate search results as compared to state-of-the-art CBIR solutions.
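The core intuition of a hierarchical semantic similarity is that concepts sharing a deeper common ancestor are more similar. A toy sketch of that intuition, using an invented concept tree and the standard depth-based Wu-Palmer measure rather than the paper's learned similarity function:

```python
# Toy concept hierarchy: child -> parent ("entity" is the root).
parent = {
    "animal": "entity", "vehicle": "entity",
    "dog": "animal", "cat": "animal", "car": "vehicle",
}

def ancestors(node):
    """Path from a node up to the root, inclusive."""
    path = [node]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path

def depth(node):
    return len(ancestors(node))

def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(LCA) / (depth(a) + depth(b))."""
    anc_a = ancestors(a)
    lca = next(n for n in ancestors(b) if n in anc_a)  # lowest common ancestor
    return 2 * depth(lca) / (depth(a) + depth(b))

# Siblings under "animal" score higher than cross-branch concepts:
# wu_palmer("dog", "cat") > wu_palmer("dog", "car")
```

In A2SH the similarity is additionally informed by the attributes attached to each concept, but the hierarchy-aware comparison above is the structural backbone of such measures.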

Wednesday, October 23, 2013

Novaemötions dataset

This dataset contains facial expression images captured using the novaemötions game. It contains over 40,000 images, each labeled with the expression the player was challenged to perform and the expression recognized by the game's algorithm, augmented with labels obtained through crowdsourcing.

If you are interested in obtaining the dataset, contact

Mantis Shrimp



Tuesday, October 22, 2013

GRire paper is now available online (open access to everyone)

Golden Retriever Image Retrieval Engine (GRire) is an open source, lightweight Java library developed for Content Based Image Retrieval (CBIR) tasks, employing the Bag of Visual Words (BOVW) model. It provides a complete framework for creating CBIR systems, including image analysis tools, classifiers, weighting schemes etc., for efficient indexing and retrieval procedures. Its eminent feature is its extensibility, achieved through the open source nature of the library as well as a user-friendly embedded plug-in system. GRire is available online along with installation and development documentation, as well as on its Google Code page. It is distributed either as a Java library or as a standalone Java application, both GPL licensed.
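The BOVW model at the heart of GRire can be sketched in a few lines. This is a language-agnostic toy illustration with a made-up 2-D codebook, not GRire's Java implementation: each local descriptor extracted from an image is quantized to its nearest "visual word" in a codebook, and the image is represented by the resulting word histogram.

```python
import math

# Hypothetical codebook of visual words (cluster centers in descriptor space).
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

def nearest_word(desc):
    """Index of the codeword closest to a descriptor."""
    return min(range(len(codebook)),
               key=lambda i: math.dist(desc, codebook[i]))

def bovw_histogram(descriptors):
    """Quantize each local descriptor to a visual word and count occurrences."""
    hist = [0] * len(codebook)
    for d in descriptors:
        hist[nearest_word(d)] += 1
    return hist

# Four made-up local descriptors from one image -> a 3-bin word histogram.
image_descriptors = [(0.1, 0.1), (0.9, 0.1), (0.1, 0.9), (0.2, 0.0)]
hist = bovw_histogram(image_descriptors)
```

The histograms can then be compared with any vector distance, or weighted with schemes such as tf-idf, which is exactly the kind of pluggable component the library's plug-in system targets.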

Monday, October 21, 2013

Tweets about ACMMM2013

MAV Urban Localization from Google Street View Data

This approach tackles the problem of globally localizing a camera-equipped micro aerial vehicle flying within urban environments for which a Google Street View image database exists. To avoid the caveats of current image-search algorithms in the case of severe viewpoint changes between the query and the database images, the authors propose to generate virtual views of the scene, which exploit the air-ground geometry of the system. To limit the computational complexity of the algorithm, they rely on a histogram-voting scheme to select the best putative image correspondences. The proposed approach is tested on a 2 km image dataset captured with a small quadrocopter flying in the streets of Zurich. The success of the approach shows that the new air-ground matching algorithm can robustly handle extreme changes in viewpoint, illumination, perceptual aliasing, and over-season variations, thus outperforming conventional visual place-recognition approaches.
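The histogram-voting idea is simple to sketch. The following is a generic illustration of the scheme, not the authors' code, with invented match data: each tentative feature match casts a vote for the database image it points to, and only the most-voted images proceed to expensive geometric verification.

```python
from collections import Counter

# Hypothetical tentative matches: (query_feature_id, database_image_id).
matches = [(0, "img_a"), (1, "img_a"), (2, "img_b"), (3, "img_a"),
           (4, "img_c"), (5, "img_a"), (6, "img_b")]

# Each match casts one vote for the database image it belongs to.
votes = Counter(img for _, img in matches)

# Keep only the top-ranked images for the costly verification stage.
top_candidates = [img for img, _ in votes.most_common(2)]
```

Voting costs O(number of matches), so it prunes the candidate set cheaply before any per-image geometric checks are run.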

For more info,
A. Majdik, Y. Albers-Schoenberg, and D. Scaramuzza: MAV Urban Localization from Google Street View Data, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS'13), 2013.
More info at:


Wednesday, October 16, 2013

Qualcomm Zeroth - Biologically Inspired Learning

Computer technology still lags far behind the abilities of the human brain, which has billions of neurons that help us simultaneously process a plethora of stimuli from our many senses. But Qualcomm hopes to close that gap with a new type of computer architecture modeled after the brain, which would be able to learn new skills and react to inputs without needing a human to manually write any code. It's calling its new chips Qualcomm Zeroth Processors, categorized as Neural Processing Units (NPUs), and already has a suite of software tools that can teach computers good and bad behavior without explicit programming.

Qualcomm demoed its technology by creating a robot that learns to visit only white tiles on a gridded floor. The robot first explores the environment, then is given positive reinforcement while on a white tile, and proceeds to only seek out other white tiles. The robot learns to like the white tile due to a simple "good robot" command, rather than any unique algorithm or code.
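At its core, the white-tile demo amounts to a minimal reinforcement-learning loop. Here is an abstract sketch of that loop in Python (illustrative only, with made-up parameters; it has nothing to do with Qualcomm's actual NPU hardware): a reward on white tiles raises their learned value, and the robot then prefers the higher-valued tile color.

```python
import random

random.seed(1)
values = {"white": 0.0, "dark": 0.0}  # learned value of each tile color
alpha = 0.5                           # learning rate

def reward(tile):
    """The 'good robot' signal: positive reinforcement only on white tiles."""
    return 1.0 if tile == "white" else 0.0

# Exploration phase: visit random tiles, nudging each value toward its reward.
for _ in range(50):
    tile = random.choice(["white", "dark"])
    values[tile] += alpha * (reward(tile) - values[tile])

# Exploitation phase: the robot now seeks the color with the higher value.
preferred = max(values, key=values.get)
```

No tile-specific code was written; the preference emerges entirely from the reward signal, which is the behavior the demo is meant to showcase.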

The computer architecture is modeled after biological neurons, which respond to the environment through a series of electrical pulses. This allows the NPU to passively respond to stimuli, waiting for neural spikes to return relevant information for a more effective communication structure. According to MIT Technology Review, Qualcomm is hoping to have a software platform ready for researchers and startups by next year.

Qualcomm isn't the only company working on building a brain-like computer system. IBM has a project known as SyNAPSE that relates to objects and ideas, rather than the typical if-this-then-that computer processing model. This new architecture would someday allow a computer to efficiently recognize a friendly face in a crowd, something that takes significant computing power with today's current technology. Modeling new technology after the human brain may be the next big evolutionary step in creating more powerful computers.

Article from

Thursday, October 10, 2013

Small cubes that self-assemble

Article from

Small cubes with no exterior moving parts can propel themselves forward, jump on top of each other, and snap together to form arbitrary shapes.

In 2011, when an MIT senior named John Romanishin proposed a new design for modular robots to his robotics professor, Daniela Rus, she said, “That can’t be done.”

Two years later, Rus showed her colleague Hod Lipson, a robotics researcher at Cornell University, a video of prototype robots, based on Romanishin’s design, in action. “That can’t be done,” Lipson said.

In November, Romanishin — now a research scientist in MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) — Rus, and postdoc Kyle Gilpin will establish once and for all that it can be done, when they present a paper describing their new robots at the IEEE/RSJ International Conference on Intelligent Robots and Systems.

Known as M-Blocks, the robots are cubes with no external moving parts. Nonetheless, they’re able to climb over and around one another, leap through the air, roll across the ground, and even move while suspended upside down from metallic surfaces.

Inside each M-Block is a flywheel that can reach speeds of 20,000 revolutions per minute; when the flywheel is braked, it imparts its angular momentum to the cube. On each edge of an M-Block, and on every face, are cleverly arranged permanent magnets that allow any two cubes to attach to each other.
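The braking trick is plain conservation of angular momentum: the flywheel's momentum L = Iω is dumped into the cube, spinning the much heavier cube briefly but forcefully. A back-of-the-envelope sketch (the 20,000 rpm figure is from the article; the two moments of inertia are illustrative made-up numbers, not M-Block specs):

```python
import math

# Illustrative parameters (invented), not actual M-Block specifications.
flywheel_inertia = 2e-5      # kg*m^2
flywheel_rpm = 20_000        # flywheel speed quoted in the article
cube_inertia = 2e-3          # kg*m^2; the cube is far "heavier" rotationally

omega_fly = flywheel_rpm * 2 * math.pi / 60   # flywheel speed in rad/s
momentum = flywheel_inertia * omega_fly       # L = I * omega

# Braking transfers the angular momentum to the cube as a whole.
omega_cube = momentum / cube_inertia          # resulting cube spin, rad/s
```

With these numbers the cube picks up roughly 21 rad/s (about 200 rpm), which is the kind of impulsive rotation that lets a cube pivot over an edge or leap off the ground.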

A prototype of a new modular robot, with its innards exposed and its flywheel — which gives it the ability to move independently — pulled out. Photo: M. Scott Brauer

“It’s one of these things that the [modular-robotics] community has been trying to do for a long time,” says Rus, a professor of electrical engineering and computer science and director of CSAIL. “We just needed a creative insight and somebody who was passionate enough to keep coming at it — despite being discouraged.”


Embodied abstraction

As Rus explains, researchers studying reconfigurable robots have long used an abstraction called the sliding-cube model. In this model, if two cubes are face to face, one of them can slide up the side of the other and, without changing orientation, slide across its top.
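The sliding-cube abstraction is small enough to capture in code. A toy 2-D version (invented for illustration, not from the paper): a cube face-to-face with a neighbor may slide up the neighbor's side and then across its top, keeping its orientation, provided both target cells are free.

```python
# Toy 2-D sliding-cube move on an integer grid.

def slide_over(mover, neighbor, occupied):
    """Path for an up-and-over move past a face-adjacent neighbor.

    Returns [start, up, over] if the move is legal, else None.
    """
    x, y = mover
    nx, ny = neighbor
    assert abs(x - nx) + abs(y - ny) == 1, "cubes must be face to face"
    up = (x, y + 1)        # slide up the neighbor's side
    over = (nx, ny + 1)    # slide across the neighbor's top
    if up in occupied or over in occupied:
        return None        # blocked: the model forbids moving through cubes
    return [mover, up, over]

occupied = {(0, 0), (1, 0)}           # two cubes resting on the ground
path = slide_over((0, 0), (1, 0), occupied)
```

Planning in this idealized move set is what makes self-assembly algorithms tractable; the hard part, as the article goes on to explain, is building hardware that actually realizes such moves.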

The sliding-cube model simplifies the development of self-assembly algorithms, but the robots that implement them tend to be much more complex devices. Rus’ group, for instance, previously developed a modular robot called the Molecule, which consisted of two cubes connected by an angled bar and had 18 separate motors. “We were quite proud of it at the time,” Rus says.

According to Gilpin, existing modular-robot systems are also “statically stable,” meaning that “you can pause the motion at any point, and they’ll stay where they are.” What enabled the MIT researchers to drastically simplify their robots’ design was giving up on the principle of static stability.

“There’s a point in time when the cube is essentially flying through the air,” Gilpin says. “And you are depending on the magnets to bring it into alignment when it lands. That’s something that’s totally unique to this system.”

That’s also what made Rus skeptical about Romanishin’s initial proposal. “I asked him to build a prototype,” Rus says. “Then I said, ‘OK, maybe I was wrong.’”

Read More

Geoff Hinton - Recent Developments in Deep Learning

Geoff Hinton presents as part of the UBC Department of Computer Science's Distinguished Lecture Series, May 30, 2013.
Professor Hinton was awarded the 2011 Herzberg Canada Gold Medal for Science and Engineering, among many other prizes. He is also responsible for many technological advances that impact many of us (better speech recognition, image search, etc.).

Wednesday, October 9, 2013

Disney develops way to 'feel' touchscreen images

Disney researchers have found a way for people to "feel" the texture of objects seen on a flat touchscreen.

The technique involves sending tiny vibrations through the display that let people "feel" the shallow bumps, ridges and edges of an object.

The vibrations fooled fingers into believing they were touching a textured surface, said the Disney researchers.

The vibration-generating algorithm should be easy to add to existing touchscreen systems, they added.

Developed by Dr Ali Israr and colleagues at Disney's research lab in Pittsburgh, the vibrational technique re-creates what happens when a finger tip passes over a real bump.

"Our brain perceives the 3D bump on a surface mostly from information that it receives via skin stretching," said Ivan Poupyrev, head of the interaction research group in Pittsburgh.

To fool the brain into thinking it is touching a real feature, the vibrations imparted via the screen artificially stretch the skin on a fingertip so a bump is felt even though the touchscreen surface is smooth.

The researchers have developed an underlying algorithm that can be used to generate textures found on a wide variety of objects.

A video depicting the system in action shows people feeling apples, jellyfish, pineapples, a fossilised trilobite as well as the hills and valleys on a map.

The more pronounced the feature, the greater the vibration is needed to mimic its feel.
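That proportionality can be captured in a tiny model. The following is a hypothetical illustration of the principle, not Disney's actual algorithm: treat the on-screen image as a virtual height map under the finger, and drive the vibration amplitude by the local slope, so steeper edges produce stronger stimulation and flat regions produce none.

```python
# Virtual 1-D height profile of a bump along the finger's path (made up).
heights = [0.0, 0.0, 0.1, 0.4, 1.0, 0.4, 0.1, 0.0, 0.0]

def vibration_amplitudes(profile, gain=1.0):
    """Amplitude proportional to the local slope: sharper feature, stronger buzz."""
    return [gain * abs(profile[i + 1] - profile[i])
            for i in range(len(profile) - 1)]

amps = vibration_amplitudes(heights)
# The steepest flank of the bump gets the largest amplitude;
# the flat regions at either end get none.
```

Because the amplitudes are computed from the visual artefact itself rather than looked up from a library of canned effects, the same mapping can be retuned on the fly for any image, which matches the flexibility Dr Israr describes.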

The vibration system should be more flexible than existing systems used to give tactile feedback on touchscreens, which typically use a library of canned effects, said Dr Israr.

"With our algorithm we do not have one or two effects, but a set of controls that make it possible to tune tactile effects to a specific visual artefact on the fly," he added.

Saturday, October 5, 2013

Need software libraries and tools for your research ?

Attend the ACM MM'13 Open Source Software Competition:

  • M. Hildebrand, M. Brinkerink, R. Gligorov, M. Van Steenbergen, J. Huijkman, and J. Oomen: Waisda? Video Labeling Game
  • C.-Y. Huang, D.-Y. Chen, C.-H. Hsu, and K.-T. Chen: GamingAnywhere: An Open-Source Cloud Gaming Testbed
  • J. Wagner, F. Lingenfelser, T. Baur, I. Damian, F. Kistler, and E. André: The Social Signal Interpretation (SSI) Framework – Multimodal Signal Processing and Recognition in Real-Time
  • F. Eyben, F. Weninger, F. Groß, and B. Schuller: Recent Developments in openSMILE, the Munich Open-Source Multimedia Feature Extractor
  • I. Tsampoulatidis, D. Ververidis, P. Tsarchopoulos, S. Nikolopoulos, I. Kompatsiaris, and N. Komninos: ImproveMyCity – An open source platform for direct citizen-government communication
  • M. Lux: LIRE: Open Source Image Retrieval in Java
  • L. Tsochatzidis, C. Iakovidou, S. Chatzichristofis, and Y. Boutalis: Golden Retriever – A Java Based Open Source Image Retrieval Engine
  • R. Aamulehto, M. Kuhna, and P. Oittinen: Stage Framework – An HTML5 and CSS3 Framework for Digital Publishing
  • D. Bogdanov, N. Wack, E. Gómez, S. Gulati, P. Herrera, O. Mayor, G. Roma, J. Salamon, J. Zapata, and X. Serra: ESSENTIA: an Audio Analysis Library for Music Information Retrieval
  • C. Flynn, D. Monaghan, and N. E. O'Connor: SCReen Adjusted Panoramic Effect – SCRAPE
  • H. Yviquel, A. Lorence, K. Jerbi, G. Cocherel, A. Sanchez, and M. Raulet: Orcc: Multimedia development made easy

NTT DoCoMo Intelligent Glass Demo at CEATEC 2013