
Saturday, October 31, 2009

IAPR International Workshops on Structural and Syntactic Pattern Recognition (SSPR 2010) and Statistical Techniques in Pattern Recognition (SPR 2010)

A satellite event of the
20th International Conference of Pattern Recognition, ICPR 2010

The next joint Statistical Pattern Recognition and Structural and Syntactic Pattern Recognition Workshops (organised by TC1 and TC2 of the International Association for Pattern Recognition (IAPR)) will be held at the Cesme Altin Yunus Hotel, Cesme, Turkey, prior to ICPR 2010 (which itself will be held in Istanbul). The joint workshops aim at promoting interaction and collaboration among researchers working in areas covered by TC1 and TC2. We are also keen to attract participants working in fields that make use of statistical, structural or syntactic pattern recognition techniques (e.g. image processing, computer vision, bioinformatics, chemo-informatics, machine learning, document analysis, etc.). Those working in areas which can make methodological contributions to the field, e.g. mathematicians, statisticians and physicists, are also very welcome.

The workshop will be held in Cesme, a seaside resort on the Aegean coast of Turkey. The area has many interesting attractions, including excellent beaches, interesting fishing villages, and nearby archaeological remains and historical sites, among them Cesme castle and the remains of the ancient Greek city of Erythrae. Cesme can be reached by bus from the airport at Izmir, which has good flight connections to Istanbul.

SSPR Topics

Structural Matching
Syntactic Pattern Recognition
Image Understanding
Shape Analysis
Graphical Models
Graph-Based Models
Spectral Methods for Graph-Based Representations
Probabilistic and Stochastic Structural Models for Pattern Recognition
Structural Learning in Spatial or Spatio-Temporal Signals
Kernel Methods for Structured Data
Image and Video Analysis
Intelligent Sensing Systems
Spatio-Temporal Pattern Recognition
SSPR Methods in Computer Vision
Multimedia Signal Analysis
Image Document Analysis
Structured Text Analysis and Understanding
Novel Applications

SPR Topics

Density Estimation
Large Margin Classifiers
Kernel Methods
Ensemble Methods and Multiple Classifiers
Bayesian Methods
Gaussian Processes
Dimensionality Reduction
Independent Component Analysis
Cluster Analysis
Unsupervised Learning
Data Visualization
Semi-Supervised Learning
Model Selection
Hybrid methods
Comparative Studies
Speech and Image Analysis
Novel Applications

Provisional Dates

Submission of papers: 1st February 2010
Decisions: 1st April 2010
Camera ready copy: 1st May 2010
Workshops: 18-20 August 2010

The 11th European Conference on Computer Vision (ECCV 2010)

The 11th European Conference on Computer Vision (ECCV 2010) will be hosted by the Foundation for Research and Technology-Hellas (FORTH), Crete, Greece from Sunday, 5 September 2010 to Saturday, 11 September 2010.

ECCV is a selective single-track conference on computer vision. High-quality, previously unpublished research contributions are sought on any aspect of computer vision.
Topics include, but are not limited to:

  • Sensors and Early Vision
  • Image Features
  • Color and Texture
  • Segmentation and Grouping
  • Image-Based Modeling
  • Illumination and Reflectance Modeling
  • Motion and Tracking
  • Stereo and Structure from Motion
  • Shape Representation
  • Object Recognition
  • Video Analysis
  • Event Detection and Recognition
  • Face Detection and Recognition
  • Gesture Recognition
  • Statistical Models and Visual Learning
  • Medical Image Analysis
  • Active and Robot Vision
  • Image and Video Retrieval
  • Cognitive & Biologically inspired Vision
  • Vision Systems Engineering & Performance Evaluation

ECCV2010 will also include Tutorials, Workshops, Demonstrations and Industrial Exhibitions.

Conference Papers

  • Abstracts: March 10th, 2010
  • Full Papers: March 17th, 2010
  • Notification: June 1st, 2010
  • Camera-Ready: June 15th, 2010
http://www.ics.forth.gr/eccv2010/intro.php

2010 IEEE International Symposium on Biomedical Imaging (ISBI)

To be held 14-17 April 2010, Rotterdam, The Netherlands

Four-page paper submission deadline: 2 November 2009 !!!

NEW: Best Student Paper Awards (see below)

See http://www.biomedicalimaging.org/ for details

The IEEE International Symposium on Biomedical Imaging (ISBI) is the premier forum for the presentation of technological advances in theoretical and applied biomedical imaging. ISBI 2010 will be the seventh meeting in this series. The previous meetings have played a leading role in facilitating interaction between researchers in medical and biological imaging. The 2010 meeting will continue this tradition of fostering cross-fertilization among different imaging communities and contributing to an integrative approach to biomedical imaging across all scales of observation.

ISBI is a joint initiative of the IEEE Engineering in Medicine and Biology Society (EMBS) and the IEEE Signal Processing Society (SPS). The 2010 meeting will feature an opening morning of tutorials, followed by a scientific program of plenary talks, special sessions, and oral and poster presentations of peer-reviewed contributed papers.

Confirmed plenaries: New clinical imaging technologies (Richard Ehman); Challenges in bioimage informatics (Jason Swedlow); Molecular imaging and applications (Clemens Lowik); Challenges in biomedical image analysis (Milan Sonka). Special Sessions: fMRI & DTI (Carl-Fredrik Westin); High-field clinical MRI (Andrew Webb); Fluorescence guided surgery (Vasilis Ntziachristos); Whole-body imaging and analysis (Faiza Admiraal); Histological and intravital microscopy (Tom Vercauteren); Ultrasound imaging and analysis (Hans Bosch); Multi-parameter biomedical optical imaging and analysis (Atam Dhawan and Metin Gurcan); Computer aided diagnosis (Nico Karssemeijer). Tutorials: Biomedical image registration (Gustavo Rohde and Graeme Penney); Optical microscopy and deconvolution (Hans van der Voort and Erik Manders); Ultrasound imaging and therapeutics (Elisa Konofagou and Constantin Coussios); Machine learning for biomedical image analysis (Marco Loog and David Tax).

Contributed Program:

High-quality papers are solicited describing original contributions to the mathematical, algorithmic, and computational aspects of biomedical imaging, from nano- to macro-scale. Topics of interest include image formation and reconstruction, computational and statistical image processing and analysis, dynamic imaging, visualization, image quality assessment, and physical, biological, and statistical modeling. Papers on molecular, cellular, anatomical, and functional imaging modalities and applications are welcomed. All accepted papers will be published in the proceedings of the symposium and will be available online through the IEEE Xplore database.

Best Student Paper Awards:

ISBI 2010 awards a prize for best student paper, as judged by a special award committee. At most three papers will be awarded a prize of 500 euro each. A paper is eligible if the primary author is a student at the time of paper submission and this person will indeed present the paper at the symposium if it is accepted.

Important Dates:

Deadline for submission of 4-page paper: November 2, 2009
Notification of acceptance/rejection: January 15, 2010
Submission of final accepted 4-page paper: February 15, 2010
Deadline for author registration: February 15, 2010
Deadline for early registration: March 1, 2010

Friday, October 30, 2009

VISIGRAPP - The International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications

The purpose of the International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications is to bring together researchers and practitioners in the areas of computer vision, imaging, computer graphics and information visualization, interested in both theoretical advances and applications in these fields. Computer Vision, Imaging, Computer Graphics and Information Visualization are well-known areas which are becoming more and more interrelated, with important interdisciplinary work often arising from an iterative, combined process of image analysis and synthesis in which models created in one field are used to improve models created in another.
The VISIGRAPP component conferences specialize in the following topics:
  • GRAPP is structured along four main tracks, covering different aspects of Computer Graphics from Modelling to Rendering, including Animation and Interactive Environments.
  • IMAGAPP covers theory, applications and technologies related to image display, colour coding, medical imaging, remote sensing, business document processing, digital fabrication, printing and electronic devices.
  • VISAPP also has four main tracks, namely: Image Formation and Processing, Image Analysis, Image Understanding, and Motion, Tracking and Stereo Vision.
  • IVAPP is structured along several topics related to Information Visualization.

Conference Co-chairs
Paul Richard, University of Angers, France
José Braz, Polytechnic Institute of Setúbal, Portugal

Program Co-chairs
Peter Sturm, INRIA Grenoble - Rhone-Alpes, France
Adrian Hilton, University of Surrey, U.K.

VISAPP
International Conference on Computer Vision Theory and Applications

http://visapp.visigrapp.org

IMAGAPP
International Conference on Imaging Theory and Applications

http://imagapp.visigrapp.org

GRAPP
International Conference on Computer Graphics Theory and Applications

http://grapp.visigrapp.org

IVAPP
International Conference on Information Visualization Theory and Applications

http://ivapp.visigrapp.org

http://www.visigrapp.org/

CIVR 2010

Image and video storage and retrieval continues to be one of the most exciting and fastest-growing research areas in the field of multimedia technology. However, opportunities for the exchange of ideas between different groups of researchers, and between researchers and potential users of image/video retrieval systems, are still limited. The International Conference on Image and Video Retrieval (CIVR) series was originally set up to illuminate the state of the art in image and video retrieval for researchers and practitioners throughout the world. This conference aims to provide an international forum for the discussion of challenges in the fields of image and video retrieval. CIVR 2010 is seeking original, high-quality submissions addressing innovative research in the broad field of image and video retrieval. A unique feature of the conference is the high level of participation from practitioners. Applications papers and presentations suitable for a wide audience are therefore particularly welcome.

Topics of interest include but are not limited to:

•  Content-based indexing, search, and retrieval of images & videos

•  Semantic-based indexing, search, and retrieval of images & videos

•  Affect-based indexing, search, and retrieval of images & videos

•  Advanced descriptors and similarity metrics for images, audio and video

•  Learning and relevance feedback in image/video retrieval

•  3D images and models

•  Ontologies for annotation and search of images and videos

•  Fusion of multimedia

•  Image/video summarization and visualization

•  Evaluation of image and video retrieval systems

•  Database architectures for image/video retrieval

•  Novel image data management systems and applications

•  High performance image/video indexing algorithms

•  Image/video search and browsing on the Web

•  Retrieval from multimodal lifelogs

•  Applications in satellite, forensic, (bio-)medical image and video collections

•  HCI issues in image and video retrieval

•  Neural network techniques for image & video classification and retrieval

Important Dates: 

•  Paper submission deadline: February 1, 2010

•  Notification of acceptance: March 15, 2010

•  Camera ready papers due: April 1, 2010

http://www.civr2010.org/index.htm

Tuesday, October 27, 2009

Adobe Photoshop Lightroom 3

Adobe Photoshop Lightroom software is a digital darkroom and efficient assistant designed for serious amateur and professional photographers. With Lightroom, you can organize, enhance, and showcase your images all from one fast and nimble application that’s available on Macintosh and Windows® platforms.

  • Manage your growing photo collection in a visual library that makes it quick and easy to organize, find, and select your images.
  • Get the absolute best from every shot—whether raw, JPEG, or TIFF—using state-of-the-art nondestructive editing tools.
  • And when you're ready, showcase your images with the impact they deserve using customizable print layouts, powerful slideshows, web gallery creation tools, and connection to online photo sharing sites (may require third-party plug-ins).
The Adobe Photoshop Lightroom 3 Beta

Lightroom 3 beta builds on the vision of the very first Lightroom beta. From day 1, Lightroom was designed for photographers and by photographers to help them focus on what they love—the art and craft of photography. Lightroom provides photographers with an elegant and efficient way to work with their growing digital image collections, bring out the best in their images, and make an impact from raw capture to creative output, all while maintaining the highest possible quality each step of the way.

We're offering a public beta of the next release of Lightroom to give you a chance to preview some of the new features and enhancements in the upcoming version. It's an opportunity for you to evaluate a select portion of the new features planned for Lightroom 3, to help the team discover and address any issues, and to send feedback that the Lightroom team can use to make Lightroom 3 an even better digital darkroom and a more efficient assistant for you.

Download and Discuss

For the development of this latest release, we've focused on what we believe are the fundamental priorities of our customers: performance and image quality. Lightroom has been rebuilt from its core to provide a performance architecture that meets the needs of photographers today and into the future. The raw processing engine has also received an overhaul to ensure that you're maximizing the potential of your images in terms of sharpening and noise reduction. A number of other areas have also had new features added and enhanced. Like any beta, Lightroom 3 beta is unfinished, which means some of the features we have planned are not in this release, and some of the features in the beta are not yet complete.

Some of the new features included for you to play with in the Lightroom 3 beta are:

  • Brand new performance architecture, building for the future of growing image libraries
  • State-of-the-art noise reduction to help you perfect your high ISO shots
  • Watermarking tool that helps you customize and protect your images with ease
  • Portable sharable slideshows with audio—designed to give you more flexibility and impact on how you choose to share your images, you can now save and export your slideshows as videos and include audio
  • Flexible customizable print package creation so your print package layouts are all your own
  • Film grain simulation tool for enhancing your images to look as gritty as you want
  • New import handling designed to make importing streamlined and easy
  • More flexible online publishing options so you can post your images online to certain online photo sharing sites directly from inside Lightroom 3 beta (may require third-party plug-ins)*

For more details on the new functionality in the Lightroom 3 beta:

Download now

http://labs.adobe.com/technologies/lightroom3/

Monday, October 26, 2009

Numenta Vision Toolkit

The Vision Toolkit allows you to create your own image recognition system using Numenta's HTM technology.
It's a powerful application that anyone can use.

Download Beta 2

The Vision Toolkit is free and runs on Windows and Mac OS X.
This is a beta release. Please read the tutorial to learn about the Toolkit and its current limitations.

Using the Vision Toolkit

Step 1: Collect images for training
If you want to distinguish chairs from sofas, find lots of chair and sofa pictures.
You can collect images quickly using the Toolkit's web search feature. 
Step 2: Train
Once you have enough images, click the Train button.
Step 3: Recognize new images
Import a new image, and the Toolkit will try to recognize its category.
For lots more information, see our tutorial and video.
You can upload your vision system to Numenta's free Web Services to share it online.
You can also use our API to incorporate image recognition into your web or mobile app (e.g. an iPhone game).
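
To make the three steps concrete, here is a minimal sketch of the same collect/train/recognize workflow. It is emphatically not Numenta's HTM technology or its API: it is a toy stand-in that compares RGB histograms with a nearest-neighbour rule, and the folder layout and file names are assumptions chosen for illustration.

```python
import numpy as np
from pathlib import Path
from PIL import Image

# Toy stand-in for the collect/train/recognize workflow described above.
# NOT Numenta's HTM: just RGB histograms plus a nearest-neighbour rule.
# Assumed layout: training/chair/*.jpg, training/sofa/*.jpg, etc.

def histogram(path, bins=8):
    """Return a normalized per-channel RGB histogram as the feature vector."""
    img = np.asarray(Image.open(path).convert("RGB").resize((64, 64)))
    feats = [np.histogram(img[..., c], bins=bins, range=(0, 255))[0]
             for c in range(3)]
    v = np.concatenate(feats).astype(float)
    return v / v.sum()

def train(root):
    """Steps 1 and 2: collect labelled images and build the 'model'."""
    return [(histogram(p), d.name)
            for d in Path(root).iterdir() if d.is_dir()
            for p in d.glob("*.jpg")]

def recognize(samples, path):
    """Step 3: label a new image by its nearest training example."""
    q = histogram(path)
    return min(samples, key=lambda s: np.linalg.norm(s[0] - q))[1]

model = train("training")
print(recognize(model, "new_image.jpg"))
```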

http://www.numenta.com/vision/vision-toolkit.php

IEEE Thematic Meetings in Signal Processing (THEMES)

The IEEE Signal Processing Society is initiating a new technical series called IEEE Thematic Meetings in Signal Processing (THEMES). IEEE-THEMES is a one-day event and will be held for the first time in 2010 in conjunction with ICASSP in Dallas, Texas, USA. IEEE-THEMES is organized in a single track to cover one focus area intensively at each meeting. THEMES 2010 will focus on Signal and Information Processing for Social Networks.

Accepted papers will be published in an issue of the IEEE Journal of Selected Topics in Signal Processing (J-STSP). Authors submitting to IEEE-THEMES should follow the submission procedures of J-STSP. For more instructions please visit www.ieee-themes.org.

There are currently two major trends in social networks where signal and information processing play an increasing role:

Mobile sensors:
As pointed out in a recent Nature article, the single most important source of data is the ubiquitous mobile phone. Every time a person uses a mobile phone, a few bits of information can be collected, including geographic location and physical activity; the phone's signal processing hardware can even analyze the user's speaking patterns.

Internet-based social communities:
We are witnessing the emergence of large-scale social network communities such as Napster, Facebook, Twitter, and YouTube, where millions of users form a dynamically changing infrastructure to share content. The proliferation of web-based social networking is creating a technological revolution, not only for personal and entertainment purposes but also for many new government, school, industry and research applications that bring new experiences to users.

In both cases, the massive content production poses new challenges to the scalable and reliable sharing of (multimedia) content over large and heterogeneous networks. While demanding effective management of enormous amounts of unstructured content that users create, share, link and reuse, this also raises critical issues of intellectual property protection and privacy. In large-scale social networks, millions of users actively interact with each other, and such user dynamics not only influence each individual user but also affect the system performance. To provide a predictable and satisfactory level of service, it is important to analyze the impact of human factors on multimedia social networks, and to provide guidelines for the better design of multimedia systems. Similarly, economists are making progress toward understanding social learning, asking how networked agents can form a consensus in their estimates or actions given state measurements.
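
The consensus formation mentioned above can be made concrete with the classic DeGroot averaging model, sketched minimally below; the weight matrix and initial measurements are invented purely for illustration.

```python
import numpy as np

# DeGroot-style consensus sketch: each agent repeatedly replaces its
# estimate with a weighted average of its neighbours' estimates.
# W is row-stochastic (rows sum to 1); values below are illustrative.
W = np.array([
    [0.6, 0.3, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.4, 0.5],
])
x = np.array([1.0, 4.0, 7.0])  # initial state measurements, one per agent

for _ in range(50):
    x = W @ x  # each agent averages over its neighbours

print(x)  # all entries converge to a common consensus value
```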

The goal of THEMES is to encourage researchers from different areas (signal processing, information management, computer science, and psycho-sociology) to come together to explore and understand the impact of signal and information processing on social networks.

RELEVANT TOPICS INCLUDE BUT ARE NOT LIMITED TO:

Behavior modeling and analysis in social networks: to improve our understanding of (large-scale) human behavior in social networks and to model user dynamics to demonstrate how such understanding can help improve information and communication system performance.

Collective intelligence and consensus formation in large-scale networks which consist of both humans and machines.

Cognitive modeling, machine learning/understanding, and synthesizing social phenomena and events, including, e.g., game-theoretic and Bayesian social learning; analysis of social learning phenomena such as rational herding.

Security mechanisms and privacy in social networks: to understand how interactions in social networks may pose security/privacy issues and how to deal with them, and how to develop trust/belief models, evaluations and frameworks.

Applications of social networks: to understand how the concepts of social networks can help improve our modeling and analysis of traditional problems where interactions of multiple users and systems can be considered as social networks. Examples include:

  • Peer-to-peer streaming, signal processing, and communications
  • Multi-user information theory
  • Multi-user rate and resource allocation
  • Mobile, sensor, or human networks
  • Database and content retrieval

Orasis

Orasis is a biologically inspired image-processing application, developed by Vassilios Vonikakis and partially based on his PhD research. The main objective of Orasis is to make photographs look closer to the human perception of the scene, by compensating for high-dynamic-range conditions, color casts, reduced local contrast and noise. The software was formerly named "PhotoEnhancer", but the name was dropped due to copyright issues.


Orasis can enhance your images in the following four ways:

1. Enhancement of the under/overexposed image regions, without affecting the correctly exposed ones. The image captured by a camera and the image in our eyes are often dramatically different, especially when there are shadows or highlights in the same scene. In these cases our eyes can distinguish many details in the shadows or the highlights, while the image captured by the camera suffers from loss of visual information in these regions. Orasis employs characteristics of the ganglion cells of the Human Visual System, exhibiting improved local equalization of brightness and contrast.

2. Enhancement of the local contrast. The transmittance of a scene plays a very important role in the overall quality of the image. Smoke, fog or the atmosphere can decrease the local contrast of images, e.g. aerial photos, photos with extended use of zoom, etc. Orasis employs biologically inspired algorithms which compensate for this phenomenon and improve the overall clarity of the image.

3. Color correction. The Human Visual System exhibits some degree of color constancy. This means that the perception of colors is not affected considerably by the presence of unknown light sources in a scene. Cameras, on the other hand, are affected much more by the color of the scene's light: unless a proper white balance setting is selected, images taken under incandescent light will look yellowish, images taken at sunset will look reddish, and so on. Orasis employs many algorithms which correct the overall colors of the scene by automatically removing color casts. Among them there are also local color correction algorithms, which can compensate for the effects of multiple light sources in the scene.

4. Noise reduction. Enhancing underexposed image regions will inevitably make noise appear in these areas, and trying to reduce this noise globally affects the whole image, deteriorating its overall appearance. Until now, the only way to deal with this was the manual application of denoising filters to the specific image regions, which is quite time-consuming. Orasis employs noise reduction algorithms which affect only the underexposed image regions, leaving the correctly exposed ones intact. With the press of a button, the noise in the underexposed regions is removed without affecting other parts of the image.

By aiming at the above four directions, Orasis attempts to bridge the gap between "what you see" and "what the camera sees". The final result is much closer to the human perception of the scene than the originally captured image, revealing visual information that would otherwise not be available to the human observer.
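
As an illustration of the color-correction idea in point 3 above, here is a minimal sketch of the classic gray-world white-balance method. Orasis's own algorithms are more elaborate and include local corrections; this shows only the basic global idea, and the file names are assumptions.

```python
import numpy as np
from PIL import Image

# Gray-world color-cast removal sketch (NOT Orasis's actual algorithm):
# scale each channel so its mean matches the overall gray level.

def gray_world(path):
    img = np.asarray(Image.open(path).convert("RGB"), dtype=float)
    channel_means = img.reshape(-1, 3).mean(axis=0)  # mean R, G, B
    gray = channel_means.mean()                      # target gray level
    corrected = img * (gray / channel_means)         # per-channel gain
    return Image.fromarray(np.clip(corrected, 0, 255).astype(np.uint8))

gray_world("sunset.jpg").save("sunset_corrected.jpg")
```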

http://sites.google.com/site/vonikakis/software

Wednesday, October 7, 2009

Miruko wearable gaming eyeball robot turns the creep factor up significantly

Miruko is the creepiest gaming device we've seen in a while -- but it's also downright awesome. A robotic interface boasting WiFi and a built-in camera, it's designed to be worn and used in augmented-reality, real-life gaming situations, able to detect things -- like monsters -- that are invisible to the human eye. Once the robot detects the presence of said monster (or zombie), it fixes its gaze on the object, allowing the gamer to follow its line of sight and then... you know, destroy it -- using an iPhone camera. It's also capable of locating and locking in on specific objects and faces, making it really useful in hunting down whatever imaginary creatures have been following you lately. Check the coolness in the video after the break, but keep in mind -- we've been able to see the invisible monsters all along.

 

Read More

This Is a Photoshop and It Blew My Mind

PhotoSketch is an internet-based program that can take the rough, labeled sketch on the left and automagically turn it into the naff montage on the right. Seems unbelievable but—as the video shows—it works.

Very special thanks to Filipe Coelho

Monday, October 5, 2009

The 4th International Conference on Multimedia and Ubiquitous Engineering (MUE 2010)

Recent advances in pervasive computers, networks, telecommunication, and information technology, along with the proliferation of multimedia-capable mobile devices, such as laptops, portable media players, personal digital assistants, and cellular telephones, have stimulated the development of intelligent and pervasive multimedia applications in a ubiquitous environment. The new multimedia standards (for example, MPEG-21) facilitate the seamless integration of multiple modalities into interoperable multimedia frameworks, transforming the way people work and interact with multimedia data. These key technologies and multimedia solutions interact and collaborate with each other in increasingly effective ways, contributing to the multimedia revolution and having a significant impact across a wide spectrum of consumer, business, healthcare, education, and governmental domains. This conference provides an opportunity for academic and industry professionals to discuss recent progress in the area of multimedia and ubiquitous environments, including models and systems, new directions, and novel applications associated with the utilization and acceptance of ubiquitous computing devices and systems.

The goals of this conference are to provide complete coverage of the areas outlined and to bring together researchers from academia and industry as well as practitioners to share ideas, challenges, and solutions relating to the multifaceted aspects of this field.

The conference includes, but is not limited to, the areas listed below:

  • Track 1: Ubiquitous Computing and Beyond
  • Ubiquitous Computing and Technology
  • Context-Aware Ubiquitous Computing
  • Parallel/Distributed/Grid Computing
  • Novel Machine Architectures
  • Semantic Web and Knowledge Grid
  • Smart Home and Generic Interfaces
  • Track 2: Multimedia Modeling and Processing
  • AI and Soft Computing in Multimedia
  • Computer Graphics and Simulation
  • Multimedia Information Retrieval (images, videos, hypertexts, etc.)
  • Internet Multimedia Mining
  • Medical Image and Signal Processing
  • Multimedia Indexing and Compression
  • Virtual Reality and Game Technology
  • Current Challenges in Multimedia
  • Track 3: Ubiquitous Services and Applications
  • Protocols for Ubiquitous Services
  • Ubiquitous Database Methodologies
  • Ubiquitous Application Interfaces
  • IPv6 Foundations and Applications
  • Smart Home Network Middleware
  • Ubiquitous Sensor Networks / RFID
  • U-Commerce and Other Applications
  • Databases and Data Mining
  • Track 4: Multimedia Services and Applications
  • Multimedia RDBMS Platforms
  • Multimedia in Telemedicine
  • Multimedia Embedded Systems
  • Entertainment Industry
  • E-Commerce and E-Learning
  • Novel Multimedia Applications
  • Computer Graphics
  • Track 5: Multimedia and Ubiquitous Security
  • Security in Commerce and Industry
  • Security in Ubiquitous Databases
  • Key Management and Authentication
  • Privacy in Ubiquitous Environment
  • Sensor Networks and RFID Security
  • Multimedia Information Security
  • Forensics and Image Watermarking
  • Track 6: Other IT and Multimedia Applications

http://www.cis.uab.edu/kddm/MUE2010/callforpapers.html

iCanSee

iCanSee is the first application of its kind, designed to turn any iPhone into a magnifying glass. iCanSee's on-screen magnification controls even adapt to varied environmental situations, including low light. Once the application is launched, the iPhone magnifies anything held in front of it by as much as four times its original size, helping users read the fine print on labels, menus, contracts, books or other materials discreetly, without the need for separate reading glasses. iCanSee is now available through Apple's iTunes App Store.

 

“We’re extremely proud to launch iCanSee, a premier accessibility application for the iPhone,” explains Steven Diaz, founder of Espada Entertainment Inc. “Users have enjoyed not only the practicality but also the novelty of iCanSee, which has a quirky and offbeat appeal among younger users. Moving forward, we hope to reach many more untapped markets by fulfilling the need for both practical and entertainment-driven mobile phone applications.”

A graduate of Orlando's Full Sail University with honors in Software Engineering and Game Development, Mr. Diaz completed a rigorous academic curriculum and earned his Bachelor's degree in just two years. He credits this accomplishment to hard work and a passion for his field of study, along with the unyielding support of friends and family. Steven manages day-to-day operations at Espada Entertainment Inc. and is responsible for product development and technical direction.

Visit www.espadaentertainment.com to learn more about Espada Entertainment and the iCanSee iPhone application.

Friday, October 2, 2009

Seam carving for content aware image resizing: MATLAB implementation & tutorial

Article from http://danluong.com/2007/12/21/seam-carving-matlab-implementation-tutorial/

Having recently come across a paper by Shai Avidan and Ariel Shamir, and watching the totally awesome video about seam carving, I decided to try my hand at implementing their algorithm for myself. I’ve included a quick overview of what seam carving is, the MATLAB program I have created, a summary of the algorithm itself with some implementation details, as well as some notes on how to run my program, so without further ado…

The Program

You can download the zip file with all of my code here:

matlab-seam-carving.zip

If you would like to run this program on your copy of MATLAB, you will need to have the image processing toolbox. I will try to remove this dependency in a later update.

You are free to modify my code and add new features, etc, and redistribute it as you see fit, but I would like to hear about it, and be given credit where it is due.

Features and functionality

The program allows the user to resize an image by removing a continuous path of pixels (a seam) vertically or horizontally. A vertical seam is a continuous path of pixels running from the top to the bottom of the image, while a horizontal seam is a continuous path spanning from left to right. An example of a seam overlaid on an image is shown in Figure 1.

Figure 1: Image with vertical seam

The GUI of the completed program is shown in Figure 2. Its main functions, in order from top to bottom in the GUI, are:

  • opening an image file;
  • resetting the program to its initial state, before the image was resized in any way;
  • removing a single vertical seam from the image;
  • removing a single horizontal seam from the image;
  • input boxes for custom image resizing using repeated seam removals and/or insertions (the maximum image size is 2 x current size - 1 in the horizontal and vertical directions);
  • a listbox for choosing whether to display the color RGB image, the gradient image, or the energy-map image;
  • a checkbox to show the seam to be removed, overlaid on whichever of the three image types is chosen in the listbox.
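
For readers who want the gist without opening the MATLAB zip, below is a compact sketch of single vertical-seam removal in Python: gradient-magnitude energy plus the dynamic-programming minimum-energy seam of Avidan and Shamir. It mirrors the algorithm described above, not the author's MATLAB implementation, and the file names are assumptions.

```python
import numpy as np
from PIL import Image

# Remove one vertical seam: the 8-connected top-to-bottom path of
# minimum cumulative gradient-magnitude energy.

def remove_vertical_seam(img):
    gray = np.asarray(img.convert("L"), dtype=float)
    gy, gx = np.gradient(gray)
    energy = np.abs(gx) + np.abs(gy)

    h, w = energy.shape
    cost = energy.copy()
    for i in range(1, h):  # dynamic-programming cumulative cost
        left = np.r_[np.inf, cost[i - 1, :-1]]
        up = cost[i - 1]
        right = np.r_[cost[i - 1, 1:], np.inf]
        cost[i] += np.minimum(np.minimum(left, up), right)

    # Backtrack the minimal path from bottom to top.
    seam = np.empty(h, dtype=int)
    seam[-1] = int(np.argmin(cost[-1]))
    for i in range(h - 2, -1, -1):
        j = seam[i + 1]
        lo, hi = max(j - 1, 0), min(j + 2, w)
        seam[i] = lo + int(np.argmin(cost[i, lo:hi]))

    # Delete the seam pixel from every row of the color image.
    rgb = np.asarray(img.convert("RGB"))
    mask = np.ones((h, w), dtype=bool)
    mask[np.arange(h), seam] = False
    return Image.fromarray(rgb[mask].reshape(h, w - 1, 3))

img = Image.open("photo.jpg")
remove_vertical_seam(img).save("photo_narrower.jpg")
```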

http://danluong.com/2007/12/21/seam-carving-matlab-implementation-tutorial/

Image Retrieval Systems

1. CityU MIRROR
MIRROR (MPEG-7 Image Retrieval Refinement based On Relevance feedback) is a platform for content-based image retrieval (CBIR) research and development using MPEG-7 technologies.

    MIRROR supports several MPEG-7 visual descriptors:

    • Color Descriptors:
      • Dominant Color Descriptor (DCD)
      • Scalable Color Descriptor (SCD)
      • Color Layout Descriptor (CLD)
      • Color Structure Descriptor (CSD)
    • Texture Descriptors:
      • Edge Histogram Descriptor (EHD)
      • Homogeneous Texture Descriptor (HTD)

The system core is based on the MPEG-7 Experimentation Model (XM) with a web-based user interface for query-by-image-example retrieval. A new Merged Color Palette approach for DCD similarity measurement and relevance feedback are also developed in this system. The system is highly modularized; new algorithms, a new ground-truth set, and even a new image database can be added easily.
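
The flavor of descriptor-based matching in such a system can be sketched as follows. This is a toy stand-in (a coarse HSV histogram compared with L1 distance), not the actual MPEG-7 descriptors or MIRROR's XM-based matching, and the file names are assumptions.

```python
import numpy as np
from PIL import Image

# Toy color descriptor and ranking sketch: a coarse HSV histogram with
# L1 distance, standing in for MPEG-7 descriptors such as the SCD.

def color_descriptor(path, bins=(16, 4, 4)):
    hsv = np.asarray(Image.open(path).convert("HSV"), dtype=float)
    hist, _ = np.histogramdd(hsv.reshape(-1, 3), bins=bins,
                             range=((0, 255),) * 3)
    v = hist.ravel()
    return v / v.sum()

def l1_distance(a, b):
    return np.abs(a - b).sum()

query = color_descriptor("query.jpg")
db = {name: color_descriptor(name) for name in ["a.jpg", "b.jpg", "c.jpg"]}
ranked = sorted(db, key=lambda name: l1_distance(db[name], query))
print(ranked)  # most similar first
```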

 
2. IBM Research - Intelligent Information Management Department
The Intelligent Information Management Department at the IBM T. J. Watson Research Center is addressing technical challenges in database systems and information management. The department includes the Database Research Group and the Intelligent Information Analysis Group, and it explores novel techniques for indexing, analyzing, fusing, searching and exploiting structured data and unstructured information in various scientific and business contexts.
 
3. Document Image Retrieval System with Word Recognition I
This web site presents a Document Image Retrieval System (DIRS). The technique addresses the document retrieval problem with a word-matching procedure that operates directly on the document images, bypassing OCR and using word images as queries. The system consists of an offline and an online procedure. In the offline procedure, the document images are analyzed and the results are stored in a database; three main stages, preprocessing, word segmentation and feature extraction, constitute this procedure. A set of features capable of capturing the word shape while discarding detailed differences due to noise or font variations is used for the word-matching process. The online procedure consists of four components: creation of the query image, preprocessing, feature extraction, and finally the matching procedure.
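
A rough sketch of the query-by-word-image idea follows. The feature here (a resampled vertical projection profile) is an illustrative choice rather than DIRS's actual feature set, and the file names are assumptions.

```python
import numpy as np
from PIL import Image

# Describe each segmented word image by a fixed-length feature and rank
# stored word images by distance to the query word image (no OCR).

def word_profile(path, length=64):
    img = np.asarray(Image.open(path).convert("L"), dtype=float)
    ink = 1.0 - img / 255.0                # dark pixels carry the shape
    profile = ink.sum(axis=0)              # column-wise ink mass
    xs = np.linspace(0, len(profile) - 1, length)
    resampled = np.interp(xs, np.arange(len(profile)), profile)
    return resampled / (resampled.sum() + 1e-9)

def rank_words(query_path, word_paths):
    q = word_profile(query_path)
    return sorted(word_paths,
                  key=lambda p: np.linalg.norm(word_profile(p) - q))

print(rank_words("query_word.png", ["w1.png", "w2.png", "w3.png"]))
```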
 
4. Content-Based Image Retrieval - SIMPLIcity
This content-based image search engine was developed at Stanford University between 1999 and 2000. The line of research is ongoing at Penn State. Main publications include:
  • Jia Li, James Z. Wang, "Automatic linguistic indexing of pictures by a statistical modeling approach," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 9, pp. 1075-1088, 2003.
  • James Z. Wang, Jia Li, Gio Wiederhold, "SIMPLIcity: Semantics-sensitive Integrated Matching for Picture LIbraries," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 9, pp. 947-963, 2001.
The current database has about 200,000 images from the COREL CD-ROM collection.
 
5. Behold | Search High-Quality Flickr Images
Behold is a search engine for high-quality Flickr images. It aims to answer your queries based on what is inside the images -- at the pixel level. It offers a completely new way to search for images, using techniques from computer vision. It differs from standard image search engines, such as Flickr or Google, because those search through images using only image tags and filenames.

Behold looks for high quality images, so you don't have to sift through hundreds of poorly taken pictures to find a good one. Behold uses both aesthetic and technical quality indicators to find some of the best images available online.

Behold draws computational power from Amazon Elastic Compute Cloud (EC2) to handle large volumes of images.

 
6. ALIPR - Automatic Photo Tagging and Visual Image Search
ALIPR (pronounced a-lip-er), launched officially on November 1, 2006, is a machine-assisted image tagging and searching service being developed at Penn State by Professors Jia Li and James Z. Wang. They started this work as early as 1995, while they were both at Stanford University, attempting to develop computer systems to manage millions of images by their pixel content. In 1997 they developed the WIPE (TM) system, the first good-accuracy image-based parental control filter for Web images, and in 1999 the SIMPLIcity (TM) image similarity search engine, which handles millions of images in real time (note: this is unrelated to the later Picture Simplicity - Picasa work by Google).

7. alphaWorks: IBM Multimedia Analysis and Retrieval System
IBM Multimedia Analysis and Retrieval System (IMARS) is a powerful system that can be used to automatically index, classify, and search large collections of digital images and videos. IMARS works by applying computer-based algorithms that analyze visual features of the images and videos, and subsequently allows them to be automatically organized and searched based on their visual content. In addition to search and browse features, IMARS also:
  • Automatically identifies, and optionally removes, exact duplicates from large collections of images and videos
  • Automatically identifies near-duplicates
  • Automatically clusters images into groups of similar images based on visual content
  • Automatically classifies images and videos as belonging or not to a pre-defined set (hereafter called taxonomy) of semantic categories (such as ‘Landmark’, ‘Infant’, etc.)
  • Performs content-based retrieval to search for similar images based on one or more query images
  • Tags images to create user defined categories within the collection
  • Performs text-based and metadata-based searches.

8. retrievr - search by sketch / search by image
retrievr is an experimental service which lets you search and explore a selection of Flickr images by drawing a rough sketch.
Currently the index contains many of Flickr's most interesting images. If you'd like to have your images (or the images for a specific tag) added, please let me know. A submission interface is planned!

9. Pixolu - find what you imagine
pixolu is a prototype image search system combining keyword search with visual similarity search and semi-automatically learned inter-image relationships. Enter a search term and pixolu searches the image indexes of Yahoo and Flickr. Compared to other image search systems, pixolu retrieves more images in an initial phase. Thanks to a visually sorted display, up to several hundred images can be inspected easily, which in most cases is sufficient to get a good representation of the entire result set. The user can quickly identify images which are good candidates for the desired search result. In the next step, the selected candidate images are used to refine the result by filtering out visually non-similar images from an even larger result set. In addition, pixolu learns inter-image relationships from the candidate sets of different users, which helps it suggest other images that are semantically similar to the candidate images.

10. imprezzeo | find the right image
Imprezzeo is an image-based search technology company. Our technology allows users to search for images using other images as examples rather than textual search terms. Those images might be scenes, landmarks, objects, graphics, people or even personalities. Irrespective of the size of the collection, Imprezzeo Image Search helps you find the right image, fast.
By delivering a more satisfying search experience, Imprezzeo helps content providers reduce abandoned search sessions, increase usage and improve customer loyalty.
 
11. CoPhIR - COntent-based Photo Image Retrieval
The CoPhIR (Content-based Photo Image Retrieval) test collection has been developed to run significant tests on the scalability of the SAPIR project infrastructure (SAPIR: Search In Audio Visual Content Using Peer-to-peer IR) for similarity search.
CoPhIR is now available to the research community to try and compare different indexing technologies for similarity search, with scalability being the key issue.
Organizations (universities, research labs, etc.) interested in building experiments on it should sign the enclosed CoPhIR Access Agreement and the CoPhIR Access Registration Form, sending the original signed documents to us by mail. Please follow the instructions in the section "How to get CoPhIR Test Collection". You will then receive a login and password to download the required files.

12. img(Anaktisi)
This web site presents a new set of feature descriptors in a retrieval system. These descriptors have been designed with particular attention to their size and storage requirements, keeping them as small as possible without compromising their discriminating ability. They incorporate color and texture information into one histogram while keeping their sizes between 23 and 74 bytes per image. The site also introduces an Automatic Relevance Feedback (ARF) technique based on the proposed descriptors. The goal of the ARF algorithm is to optimally readjust the initial retrieval results based on user preferences: the user selects, from the first round of retrieved images, one or more as being relevant to his/her initial retrieval expectations, and information from these selected images is used to alter the initial query-image descriptor.
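
The idea of altering the query descriptor from user-selected relevant images can be illustrated with a classic Rocchio-style update. This is a generic sketch, not necessarily the exact ARF update used by img(Anaktisi), and the weights are illustrative.

```python
import numpy as np

# Rocchio-style query refinement: move the query descriptor toward the
# centroid of the descriptors the user marked relevant. alpha and beta
# are illustrative weights, not values from img(Anaktisi).

def refine_query(query, relevant, alpha=0.6, beta=0.4):
    """query: 1-D descriptor; relevant: list of 1-D descriptors."""
    centroid = np.mean(relevant, axis=0)
    updated = alpha * query + beta * centroid
    return updated / updated.sum()  # keep it a normalized histogram

q = np.array([0.5, 0.3, 0.2])
rel = [np.array([0.2, 0.5, 0.3]), np.array([0.1, 0.6, 0.3])]
print(refine_query(q, rel))
```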

13. Google Similar Images
Google Similar Images is an experimental service from Google Labs that lets you find images that are similar to an image you select.
"Similar Images allows you to search for images using pictures rather than words. Click the Similar images link under an image to find other images that look like it."

14. img(Rummager)
img(Rummager) brings into effect a number of new as well as state-of-the-art descriptors. The application can execute an image search based on a query image, either from XML-based index files or directly from a folder containing image files, extracting the comparison features in real time. In addition, img(Rummager) can execute a hybrid search of images from the application server, combining keyword information and visual similarity. img(Rummager) also supports easy retrieval evaluation based on the normalized modified retrieval rank (NMRR) and average precision (AP).
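
For reference, average precision for a single query can be computed as sketched below: the mean of the precision values at the rank of each relevant retrieved item. NMRR, the MPEG-7 normalized modified retrieval rank, is not shown here.

```python
# Average precision (AP) for one ranked result list and one query.

def average_precision(ranked_ids, relevant_ids):
    relevant_ids = set(relevant_ids)
    hits, precisions = 0, []
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant_ids:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(relevant_ids) if relevant_ids else 0.0

# Example: items 'a' and 'd' are the ground truth for this query.
print(average_precision(["a", "b", "c", "d"], ["a", "d"]))  # 0.75
```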

15. Lire Demo
The LIRE (Lucene Image REtrieval) library provides a simple way to retrieve images and photos based on their color and texture characteristics. LIRE creates a Lucene index of image features for content-based image retrieval (CBIR). Three of the available image features are taken from the MPEG-7 standard: ScalableColor, ColorLayout and EdgeHistogram; a fourth one, the Auto Color Correlogram, has been implemented based on recent research results. Furthermore, LIRE provides simple methods for searching the index and browsing results. The LIRE library and the LIRE Demo application, as well as all the source code, are available under the GNU GPL license.

WIAMIS 2010

The International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS) is one of the main international forums for the presentation and discussion of the latest technological advances in interactive multimedia services. The objective of the workshop is to bring together researchers and developers from academia and industry working in all areas of image, video and audio applications, with a special focus on analysis.

After Louvain (1997), Berlin (1999), Tampere (2001), London (2003), Lisbon (2004), Montreux (2005), Incheon (2006), Santorini (2007), Klagenfurt (2008) and London (2009), WIAMIS 2010 will be held in Desenzano del Garda, Italy.

Topics of interest include, but are not limited to:

  • Multimedia content analysis and understanding
  • Content-based browsing, indexing and retrieval of images, video and audio
  • Content-based copy detection
  • Emotional based content classification and organization
  • 2D/3D feature extraction
  • Advanced descriptors and similarity metrics for audio and video
  • Relevance feedback and learning systems
  • Segmentation of objects in 2D/3D image sequences
  • Motion analysis and tracking
  • Video analysis and event recognition
  • Analysis for coding efficiency and increased error resilience
  • Analysis and tools for content adaptation
  • Multimedia content adaptation tools, transcoding and transmoding
  • Content summarization and personalization strategies
  • End-to-end quality of service support for Universal Multimedia Access
  • Semantic mapping and ontologies
  • Multimedia analysis for new and emerging applications
  • Multimedia analysis hardware and middleware
  • Semantic web and social networks
  • Advanced interfaces for content analysis and relevance feedback
  • Applications

http://www.ing.unibs.it/wiamis2010/

ESO's GigaGalaxy Zoom: The Sky, from Eye to Telescope

Through three giant images, the GigaGalaxy Zoom project reveals the full sky as it appears with the unaided eye from one of the darkest deserts on Earth, then zooms in on a rich region of the Milky Way using a hobby telescope, and finally uses the power of a professional telescope to reveal the details of an iconic nebula.

In the framework of the International Year of Astronomy 2009 (IYA2009), ESO's GigaGalaxy Zoom project aimed at connecting the sky as seen by the unaided eye with that seen by hobby and professional astronomers. The project reveals three amazing, ultra-high-resolution images of the night sky that online stargazers can zoom in on and explore in an incredible level of detail.

The GigaGalaxy Zoom project thus illustrates the vision of IYA2009, which is to help people rediscover their place in the Universe through the day- and night-time sky.

Most of the photographs comprising the three images were taken from two of ESO's observing sites in Chile, La Silla and Paranal. The wonderful quality of the images is a testament to the splendour of the night sky at these ESO sites, which are the most productive astronomical observatories in the world.

The renowned astrophotographers Serge Brunier and Stéphane Guisard, who are members of The World at Night (TWAN) IYA2009 project, captured two of the GigaGalaxy Zoom images.

The first image by Brunier aims to present the sky as people have experienced it the world over, though in the far greater detail offered by top-notch stargazing conditions and with the view from both hemispheres. As such, the image provides a magnificent 800-million pixel panorama of the whole Milky Way.

http://www.gigagalaxyzoom.org

Flickroom

Flickroom is an Adobe AIR based application that provides the rich browsing experience Flickr users have long deserved. The dark theme ensures that your photographs look better than ever before!
You can now receive instant notifications for any activity on your photostream, upload photos by just drag-and-drop, add comments, mark faves, add notes, tweet about your photos and also chat with other Flickroom users.

http://www.flickroom.org/beta/