By Tony Tung and Takashi Matsuyama
This paper presents a novel approach to 3D video understanding. 3D video consists of a stream of 3D models of subjects in motion. Acquiring long sequences requires large storage space (2 GB for 1 min), and it is tedious to browse such data sets and extract meaningful information from them. We propose the topology dictionary to encode and describe 3D video content. The model consists of a dictionary of topology-based shape descriptors, which can be generated from either extracted patterns or training sequences. It relies on 1) topology description and classification using Reeb graphs, and 2) a Markov motion graph to represent topology change states. We show that Reeb graphs are a relevant choice of high-level topology descriptor: they allow the dictionary to model complex sequences automatically, whereas other strategies would require prior knowledge of the shape and topology of the captured subjects. Our approach serves to encode 3D video sequences, and can be applied to content-based description and summarization of such sequences. Furthermore, labeling topology classes during a learning process enables the system to perform content-based event recognition. Experiments were carried out on various 3D videos. We showcase an application of the topology dictionary to progressive summarization of 3D video.
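To make the second component more concrete, the following is a minimal sketch of how a Markov motion graph could be estimated once every frame has been assigned a topology class. It is an illustration under stated assumptions, not the authors' implementation: the per-frame labels are assumed to come from an earlier Reeb graph classification step that is not shown, transition probabilities are estimated by simple counting, and the names `build_motion_graph` and `run_length_encode` are hypothetical.

```python
from collections import Counter, defaultdict


def build_motion_graph(labels):
    """Estimate transition probabilities between consecutive topology classes.

    `labels` is a per-frame sequence of topology class identifiers
    (assumed to be produced by a separate classification step).
    Returns a nested dict: graph[src][dst] = P(next = dst | current = src).
    """
    counts = defaultdict(Counter)
    for src, dst in zip(labels, labels[1:]):
        counts[src][dst] += 1

    graph = {}
    for src, outgoing in counts.items():
        total = sum(outgoing.values())
        graph[src] = {dst: n / total for dst, n in outgoing.items()}
    return graph


def run_length_encode(labels):
    """Compress a label sequence into (class, duration in frames) pairs."""
    encoded = []
    for label in labels:
        if encoded and encoded[-1][0] == label:
            encoded[-1] = (label, encoded[-1][1] + 1)
        else:
            encoded.append((label, 1))
    return encoded


if __name__ == "__main__":
    # Hypothetical labels for a seven-frame sequence.
    frame_labels = ["A", "A", "B", "B", "B", "A", "C"]
    print(build_motion_graph(frame_labels))
    print(run_length_encode(frame_labels))
```

In this sketch, self-transitions (a class followed by itself) capture how long the subject tends to stay in one topology state, and the run-length form gives a compact per-class description of the sequence. A class that appears only in the final frame has no outgoing edges and is therefore absent from the graph's keys.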