Monday, March 28, 2011
Sunday, March 27, 2011
"Microsoft Research has just published a scientific paper (PDF) and a video showing how the Kinect body tracking algorithm works — it's almost as impressive as some of the uses the Kinect has been put to. This article summarizes how Kinect does it. Quoting: '... What the team did next was to train a type of classifier called a decision forest, i.e. a collection of decision trees. Each tree was trained on a set of features on depth images that were pre-labeled with the target body parts. That is, the decision trees were modified until they gave the correct classification for a particular body part across the test set of images. Training just three trees using 1 million test images took about a day using a 1000-core cluster.'"
In the late 19th and early 20th century, enigmatic photographer T. Enami (1859-1929) captured a number of 3D stereoviews depicting life in Meiji-period Japan.
A stereoview consists of a pair of nearly identical images that appear three-dimensional when viewed through a stereoscope, because each eye sees a slightly different image. This illusion of depth can also be recreated with animated GIFs like the ones here, which were created from Flickr images posted by Okinawa Soba. Follow the links under each animation for the original stereoviews and background information.
Saturday, March 26, 2011
The 6th International Conference on Embedded and Multimedia Computing (EMC-11), technically co-sponsored by FTRA, will be held in Enshi, China on August 11-13, 2011.
EMC-11 will be the most comprehensive conference focused on the various aspects of advances in Embedded and Multimedia (EM) Computing.
EMC-11 will provide an opportunity for academic and industry professionals to discuss the latest issues and progress in the area of EM. In addition, the conference will publish high-quality papers closely related to the various theories and practical applications in EM.
EMC-11 is the next event in a series of highly successful International Conferences on Embedded and Multimedia Computing, previously held as EMC-10 (Cebu, Philippines, Aug. 2010), EM-Com 2009 (Korea, Dec. 2009), UMC-08 (Australia, Oct. 2008), ESO-08 (China, Dec. 2008), UMS-08 (Korea, Apr. 2008), UMS-07 (Singapore, Jan. 2007), ESO-07 (Taiwan, Dec. 2007), and ESO-06 (Korea, Aug. 2006).
Furthermore, we expect that the conference and its publications will spur further related research and technology improvements in this important subject. Each paper will be reviewed by at least three reviewers. The conference proceedings will be published by IEEE Press (IEEE eXpress Conference Publishing group), and all papers in the proceedings will be included in IEEE Xplore.
Friday, March 25, 2011
Two Ohio State University engineers inspect a lens that enables microscopes to capture 3-D images. Shown from left to right are inventors Lei Li, a postdoctoral researcher, and Allen Yi, associate professor of integrated systems engineering. Credit: Photo by Kevin Fitzsimons, courtesy of Ohio State University.
Engineers at Ohio State University have invented a lens that enables microscopic objects to be seen from nine different angles at once to create a 3D image.
Other 3D microscopes use multiple lenses or cameras that move around an object; the new lens is the first single, stationary lens to create microscopic 3D images by itself.
Allen Yi, associate professor of integrated systems engineering at Ohio State, and postdoctoral researcher Lei Li described the lens in a recent issue of the Journal of the Optical Society of America A.
Yi called the lens a proof of concept for manufacturers of microelectronics and medical devices, who currently use very complex machinery to view the tiny components that they assemble.
Though the engineers milled their prototype thermoplastic lens on a precision cutting machine, the same lens could be manufactured less expensively through traditional molding techniques, Yi said.
"Ultimately, we hope to help manufacturers reduce the number and sizes of equipment they need to miniaturize products," he added.
Researchers at Ohio State University have invented a 3-D microscope lens that gathers images of tiny objects from nine different angles at once. Here the lens captures a ballpoint pen tip that measures about 1 millimeter across. Credit: Image courtesy of Ohio State University.
The prototype lens, which is about the size of a fingernail, looks at first glance like a gem cut for a ring, with a flat top surrounded by eight facets. But while gemstones are cut for symmetry, this lens is not symmetric. The sizes and angles of the facets vary in minute ways that are hard to see with the naked eye.
"No matter which direction you look at this lens, you see a different shape," Yi explained. Such a lens is called a "freeform lens," a type of freeform optics.
Freeform optics have been in use for more than a decade. But Lei Li was able to write a computer program to design a freeform lens capable of imaging microscopic objects.
Then Yi and Li used a commercially available milling tool with a diamond blade to cut the shape from a piece of the common thermoplastic material polymethylmethacrylate, a transparent plastic that is sometimes called acrylic glass. The machine shaved bits of plastic from the lens in increments of 10 nanometers, or 10 billionths of a meter – a distance about 5,000 times smaller than the diameter of a human hair.
The final lens resembled a rhinestone, with a faceted top and a wide, flat bottom. They installed the lens on a microscope with a camera looking down through the faceted side, and centered tiny objects beneath the flat side.
Each facet captured an image of the objects from a different angle, which can be combined on a computer into a 3D image.
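The depth recovery that makes multi-view imaging possible can be illustrated with basic stereo triangulation: two views separated by a baseline see the same point at slightly shifted image positions, and that shift (disparity) maps to depth. This is a generic sketch of the pinhole-stereo relation, not the Ohio State reconstruction code; the focal length and baseline numbers are invented.

```python
def depth_from_disparity(focal_length_px, baseline_mm, disparity_px):
    """Classic pinhole stereo relation: depth = f * B / d.
    A larger disparity (shift between views) means a closer point."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_mm / disparity_px

# Hypothetical numbers: a 1000-pixel focal length, 5 mm baseline between views
near = depth_from_disparity(1000, 5.0, 50.0)  # large disparity -> near point
far = depth_from_disparity(1000, 5.0, 5.0)    # small disparity -> far point
```

With nine facets instead of two views, the same relation applies pairwise, and the redundant measurements make the recovered depth more robust.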
The engineers successfully recorded 3D images of the tip of a ballpoint pen – which has a diameter of about 1 millimeter – and a mini drill bit with a diameter of 0.2 millimeters.
A lens invented at Ohio State University enables microscopes to capture 3-D images of tiny objects. Credit: Photo by Kevin Fitzsimons, courtesy of Ohio State University.
"Using our lens is basically like putting several microscopes into one microscope," said Li. "For us, the most attractive part of this project is we will be able to see the real shape of micro-samples instead of just a two-dimensional projection."
In the future, Yi would like to develop the technology for manufacturers. He pointed to the medical testing industry, which is working to shrink devices that analyze fluid samples. Cutting tiny reservoirs and channels in plastic requires a clear view, and the depths must be carved with precision.
Computer-controlled machines – rather than humans – do the carving, and Yi says that the new lens can be placed in front of equipment that is already in use. It can also simplify the design of future machine vision equipment, since multiple lenses or moving cameras would no longer be necessary.
Other devices could use the tiny lens, and he and Li have since produced a grid-shaped array of lenses made to fit an optical sensor. Another dome-shaped lens is actually made of more than 1,000 tiny lenses, similar in appearance to an insect's eye.
With a few snapshots, you can build a detailed virtual replica.
Capturing an object in three dimensions needn't require the budget of Avatar. A new cell phone app developed by Microsoft researchers can be sufficient. The software uses overlapping snapshots to build a photo-realistic 3-D model that can be spun around and viewed from any angle.
"We want everybody with a cell phone or regular digital camera to be able to capture 3-D objects," says Eric Stollnitz, one of the Microsoft researchers who worked on the project.
To capture a car in 3-D, for example, a person needs to take a handful of photos from different viewpoints around it. The photos can be instantly sent to a cloud server for processing. The app then downloads a photo-realistic model of the object that can be smoothly navigated by sliding a finger over the screen. A detailed 360-degree view of a car-sized object needs around 40 photos; a smaller object like a birthday cake needs 25 or fewer.
If captured with a conventional camera instead of a cell phone, the photos have to be uploaded onto a computer for processing in order to view the results. The researchers have also developed a Web browser plug-in that can be used to view the 3-D models, enabling them to be shared online. "You could be selling an item online, taking a picture of a friend for fun, or recording something for insurance purposes," says Stollnitz. "These 3-D scans take up less bandwidth than a video because they are based on only a few images, and are also interactive."
To make a model from the initial snapshots, the software first compares the photos to work out where in 3-D space they were taken from. The same technology was used in a previous Microsoft research project, PhotoSynth, that gave a sense of a 3-D scene by jumping between different views (see video). However, PhotoSynth doesn't directly capture the 3-D information inside photos.
"We also have to calculate the actual depth of objects from the stereo effect," says Stollnitz, "comparing how they appear in different photos." His software uses what it learns through that process to break each image apart and spread what it captures through virtual 3-D space (see video, below). The pieces from different photos are stitched together on the fly as a person navigates around the virtual space to generate his current viewpoint, creating the same view that would be seen if he were walking around the object in physical space.
"This is an interesting piece of software," says Jason Hurst, a product manager with 3DMedia, which makes software that combines pairs of photos to capture a single 3-D view of a scene. However, using still photos does have its limitations, he points out. "Their method, like ours, is effectively time-lapse, so it can't deal with objects that are moving," he says.
3DMedia's technology is targeted at displays like 3-D TVs or Nintendo's new glasses-free 3-D handheld gaming device. But the 3-D information built up by the Microsoft software could be modified to display on such devices, too, says Hurst, because the models it builds contain enough information to create the different viewpoints for a person's eyes.
Hurst says that as more 3-D-capable hardware appears, people will need more tools that let them make 3-D content. "The push of 3-D to consumers has come from TV and computer device makers, but the content is lagging," says Hurst. "Enabling people to make their own is a good complement."
Sunday, March 20, 2011
Our cities are filled with buildings, roads, cars, buses, trains, bikes, parks and gardens. They are crisscrossed with power, water, sewage and transport systems. They are built by engineers, architects, planners, technologists, doctors, designers and artists. Our cities are shaped by our environment, our society and our culture. And each and every part is built on mathematics.
Join Marcus du Sautoy, mathematician, best-selling author, broadcaster and the Charles Simonyi Professor for the Public Understanding of Science, on a mathematical adventure in the city. Marcus and his team of mathemagicians are constructing walking tours of the city — but we need your help!
Enter our competition and shine a mathematical spotlight on your city. If you know a good location that tells a fun mathematical story — a piece of interesting architecture, a mathematical sculpture or the maths behind something more mundane, such as traffic lights — then enter our competition and tell us all about it.
Winning entries will become part of our virtual mathscape of cities around the world and will help Marcus and his team develop their walking tours. And of course, you can win great prizes, including:
- a year’s subscription to Nature, kindly provided by Nature Publishing Group;
- best-selling popular science books, including the “Last word” series kindly provided by New Scientist;
- going down in mathematical history by naming a mathematical object;
- and showcasing your entry with other finalists at an event in Oxford, and meeting Marcus and other mathematical explorers!
Monday, March 14, 2011
Change log 2011-03-15 1.0.4090
1. SURF descriptor is now supported in img(Rummager). The implementation is based on the EMGU CV cross-platform .NET wrapper.
2. KD-TREE for local descriptor matching
3. Visual words descriptor using SURF descriptor
4. Generate custom visual dictionaries
5. New fusion methods
- Nanson Rule
- Borda Count Min
- Borda Count MAX
- Fuzzy Rule-Based Fusion *
- Borda Pessimism **
- Borda Optimism **
- Borda Neutral **
6. Batch mode is up to 10 times faster
7. Fixed several bugs
8. Save batch mode results in TREC format, and use the TRECFiles Evaluator to evaluate the results
9. Relevance Feedback is now working again
10. img(Finder) is now working again
* A novel, simple, and efficient method for rank-based late fusion of retrieval result-lists. The approach taken is rule-based, employs a fuzzy system, and does not require training data.
** M. Zarghami. Soft computing of the borda count by fuzzy linguistic quantifiers. Appl. Soft Comput., 11(1):1067–1073, 2011.
*** V. S. Kalogeiton, D. P. Papadopoulos, S. A. Chatzichristofis and Y. S. Boutalis, "A Novel Video Summarization Method Based on Compact Composite Descriptors and Fuzzy Classifier", 1st International Conference for Undergraduate and Postgraduate Students in Computer Engineering, Informatics, related Technologies and Applications, pp. 237-246, October 14-15, 2010, Patra, Greece.
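The rank-based fusion rules in the change log can be illustrated with a minimal Borda count over two retrieval result lists. This is a generic sketch of the classic rule, not img(Rummager)'s implementation; the image names and lists are invented.

```python
def borda_fuse(ranked_lists):
    """Fuse several ranked result lists with the classic Borda count:
    an item at rank r in a list of length n earns n - r points, and the
    fused list orders items by total points (highest first)."""
    scores = {}
    for ranking in ranked_lists:
        n = len(ranking)
        for rank, item in enumerate(ranking):
            scores[item] = scores.get(item, 0) + (n - rank)
    return sorted(scores, key=lambda item: -scores[item])

# Two hypothetical result lists from different descriptors
list_a = ["img3", "img1", "img7", "img2"]
list_b = ["img1", "img3", "img2", "img7"]
fused = borda_fuse([list_a, list_b])
```

The Min/Max variants in the list presumably replace the sum with a per-item minimum or maximum over the lists, and the fuzzy variants weight the per-rank scores through linguistic quantifiers as in the Zarghami reference above.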
Coming Soon: SIFT descriptor
SIFT visual words
Saturday, March 12, 2011
Predicting Future Appearance: New Computer-Based Technique Ages Photographic Images of People's Faces
ScienceDaily (Mar. 11, 2011) — A Concordia graduate student has designed a promising computer program that could serve as a new tool in missing-child investigations and matters of national security. Khoa Luu has developed a more effective computer-based technique to age photographic images of people's faces -- an advance that could help to identify missing kids and criminals on the lam.
"Research into computer-based age estimation and face aging is a relatively young field," says Luu, a PhD candidate from Concordia's Department of Computer Science and Software Engineering whose master's thesis explores new and highly effective ways to estimate age and predict future appearance. His work is being supervised by professors Tien Dai Bui and Ching Suen.
Best recorded technique
"We pioneered a novel technique that combines two previous approaches, known as active appearance models (AAMs) and support vector regression (SVR)," says Luu. "This combination dramatically improves the accuracy of age-estimation. In tests, our method achieved the promising results of any published approach."
Most face-aged images are currently rendered by forensic artists. Although these artists are trained in the anatomy and geometry of faces, they rely on art rather than science. As a result, predicted faces drawn by artists can differ widely.
Face changes at different stages
"Our approach to computerized face aging relies on combining existing techniques," says Luu. "The human face changes in different ways at different stages of life. During the growth and development stage, the physical structure of the face changes, becoming longer and wider; in the adult aging phase, the primary changes to the face are in soft tissue. Wrinkles and lines form, and muscles begin to lose their tone."
All this information has to be incorporated into the computer algorithm. Since there are two periods with fundamentally different aging mechanisms, Luu had to construct two different 'aging functions' for this project.
To develop his face aging technique, Luu first used a combination of AAMs and SVR methods to interpret faces and "teach" the computer aging rules. Then, he input information from a database of facial characteristics of siblings and parents taken over an extended period. Using this data, the computer then predicts an individual's facial appearance at a future period.
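The age-estimation half of this pipeline is, at its core, a regression from appearance features to an age. The toy sketch below substitutes ordinary least squares for the SVR Luu used, and maps a single made-up "appearance parameter" to an age; it shows the shape of the learning problem, not his method. The two-stage design described above would correspond to fitting a separate function per life stage.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: age ~= slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x

# Hypothetical training data: one appearance parameter per face, known ages.
# Real systems would use AAM shape/texture parameters, many per face.
appearance = [0.1, 0.3, 0.5, 0.7, 0.9]
ages = [5.0, 15.0, 25.0, 35.0, 45.0]  # perfectly linear for the illustration
slope, intercept = fit_line(appearance, ages)
predicted_age = slope * 0.6 + intercept
```

SVR improves on this by tolerating small errors and handling nonlinear relationships through kernels, which matters because real aging is far from linear in appearance parameters.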
"Our research has applications in a whole range of areas," says Luu. "People in national security, law enforcement, tobacco control and even in the cosmetic industry can all benefit from this technology."
This study was supported by the Natural Sciences and Engineering Research Council of Canada and the Vietnamese Ministry of Education and Training.
Very special thanks to G. Sirakoulis
Friday, March 4, 2011
The 2nd Summer School on Social Media Retrieval (http://www.s3mr.eu/) will be held June 26 – July 1, 2011 in Antalya, Turkey. The event is co-organized by the EU PetaMedia Network of Excellence and EIT ICT LABS.
APPLICATION DEADLINE: March 22, 2011.
Multimedia content has become ubiquitous on the web, creating new challenges for indexing, access, search and retrieval. At the same time, much of this content is made available on content sharing websites such as YouTube or Flickr, or shared on social networks like Facebook. In such environments, the content is usually accompanied with metadata, tags, ratings, comments, information about the uploaders and their social network, etc. Analysis of these "social media" shows a great potential in improving the performance of traditional multimedia analysis, indexing and retrieval approaches by bridging the semantic gap between the "objective" multimedia content analysis and "subjective" users' needs and impressions. The integration of these aspects however is non-trivial and has created a vibrant, interdisciplinary field of research.
Based on the great success of its previous edition in Interlaken, Switzerland (http://www.s3mr.eu/2010/), the 2nd Summer School on Social Media Retrieval (S3MR) aims to bring together young researchers from neighboring disciplines, offering:
(1) Lectures delivered by experts from academia and industry providing a clear and in-depth summary of state-of-the-art research in social media retrieval,
(2) Collaborative projects in small groups providing hands-on experience on integrative work on selected problems from the field.
* Social data analysis
* Multimedia content analysis
* Automatic multimedia annotation/tagging
* Multimedia indexing/search/retrieval
* Implicit media tagging
* Collaborative tagging
We are offering a limited number of grants, covering complete summer school costs as well as accommodation in the 5 star summer school venue on a full board basis.
For more information and subscription, please visit our webpage:
This award will be presented every year to a researcher whose PhD thesis has made significant contributions to, and has the potential for very high impact in, multimedia computing, communication and applications. The goal is to evaluate contributions towards advances in multimedia, including multimedia processing, multimedia systems, multimedia network protocols and services, and multimedia applications and interfaces. The award recognizes members of the SIGMM community for the research contributions in their PhD theses, as well as for the potential impact of those theses on the multimedia area.
The selection committee will focus on candidates’ contributions as judged by innovative ideas and potential impact resulting from their PhD work. The award includes a $500 honorarium, an award certificate of recognition, and an invitation for the recipient to receive the award at a current year’s SIGMM-sponsored conference, the ACM International Conference on Multimedia (ACM Multimedia). A public citation for the award will be placed on the SIGMM website, in the SIGMM Records e-newsletter, as well as in the ACM e-newsletter.
The award honorarium, the award plaque of recognition and travel expenses to the ACM International Conference on Multimedia will be fully sponsored by the SIGMM budget.
Nominations will be solicited by the March 30, 2011 deadline, with decisions made by July 30, in time to allow the above recognition and award presentation at ACM Multimedia that fall (October/November). The PhD thesis nominated for the award must have been deposited at the nominee’s academic institution between January and December of the year preceding the nomination. Nominations for the award must include:
- A statement summarizing the candidate’s PhD thesis contributions and potential impact, and justification of the nomination (two pages maximum);
- PhD thesis (upload at: http://sigmm.utdallas.edu:8080/drupal/) (open early March 2011);
- Curriculum Vitae of the nominee;
- Three endorsement letters supporting the nomination, including the significant PhD thesis contributions of the candidate. Each endorsement should be no longer than 500 words, with clear specification of the nominee’s PhD thesis contributions and potential impact on the multimedia field;
- A concise statement (one sentence) of the PhD thesis contribution for which the award is being given. This statement will appear on the award certificate and on the website.
The PhD thesis should be uploaded to http://sigmm.utdallas.edu:8080/drupal/. You will receive a Paper ID after the submission. The other materials should be emailed to the committee chair with the title "[YourPaperID] PhD Thesis Award Submission".
The nomination rules are:
- The nominee can be any member of the scientific community.
- The nominator must be a SIGMM member.
- No self-nomination is allowed.
The submission of nominations will be preceded by a call for nominations. The call for nominations will be widely publicized by the SIGMM awards committee and by the SIGMM Executive Board at different SIGMM venues, such as the SIGMM premier ACM Multimedia conference (at the SIGMM Business Meeting), on the SIGMM web site, via the SIGMM mailing list, and via the SIGMM e-newsletter between September and December of the previous year.