Tuesday, February 10, 2009

Some Thoughts on Compound Video Coding

Article from: http://masterxinli.wordpress.com

While at Sharp, I worked on the compound image coding problem: a compound image consists of a mixture of photographic pictures, graphics, and text. DjVu and PDF have become the standard document image formats. In the past four years, especially with the increasing popularity of YouTube, more and more video clips have become available online. I have noticed that there seems to be a need for the study of compound video coding, the counterpart of compound image coding.

The compound nature of the video source is particularly evident in applications related to distance learning (a mixture of PPT slides and classroom footage), multimedia presentation (a mixture of text slides and graphics/motion pictures), and gaming (screen captures of video games). But the definition of a compound source can be generalized to incorporate a more traditional view: e.g., the foreman sequence is compound because it contains a mixture of slow and fast camera motion; flower-garden is compound in the sense of mixing objects at varying scene depths (layered representation is the essential idea underlying MRC, the Mixed Raster Content model adopted by the DjVu image coding algorithm). Of course, segmentation will likely be the main technical challenge again in compound video coding. But from a system perspective, unifying coding with analysis is desirable because it supports both higher coding efficiency and content-based retrieval.
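
To make the layered idea concrete, here is a minimal sketch of an MRC-style decomposition in Python. It is only an illustration, not the DjVu algorithm: the function names, the fixed darkness threshold used as a stand-in for real segmentation, and the fill values are all assumptions of mine. An image is split into a binary mask plus a foreground and a background layer, and the decoder rebuilds each pixel by letting the mask choose between the two layers.

```python
def mrc_decompose(image, threshold=64, fg_fill=0, bg_fill=255):
    """Split a grayscale image (list of rows) into mask/foreground/background.

    Toy segmentation (an assumption): pixels darker than `threshold` are
    treated as text and go to the foreground layer. Holes in each layer
    are filled with arbitrary constants.
    """
    mask = [[1 if p < threshold else 0 for p in row] for row in image]
    foreground = [[p if m else fg_fill for p, m in zip(row, mrow)]
                  for row, mrow in zip(image, mask)]
    background = [[bg_fill if m else p for p, m in zip(row, mrow)]
                  for row, mrow in zip(image, mask)]
    return mask, foreground, background


def mrc_compose(mask, foreground, background):
    """Rebuild the image: the mask selects foreground or background per pixel."""
    return [[fg if m else bg for m, fg, bg in zip(mrow, frow, brow)]
            for mrow, frow, brow in zip(mask, foreground, background)]


if __name__ == "__main__":
    image = [[200, 10, 210],
             [20, 220, 30]]
    mask, fg, bg = mrc_decompose(image)
    print(mask)
    print(mrc_compose(mask, fg, bg) == image)
```

In a real coder, each layer would then be compressed with a method suited to its content (a bi-level coder for the mask, a continuous-tone coder for the background), which is where the efficiency gain comes from. For video, the open question is how to obtain and track such a segmentation over time.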

