Effective Descriptors for Human Action Retrieval from 3D Mesh Sequences
Journal article, Peer reviewed
Two novel methods for fully unsupervised human action retrieval from 3D mesh sequences are presented. The first achieves high accuracy but is suited to sequences of clean meshes, such as artificial sequences or highly post-processed real sequences. The second is robust and suited to noisy meshes, such as those that often result from unprocessed scanning or from 3D surface reconstruction errors. The first method uses a spatio-temporal descriptor based on the trajectories of six salient points of the human body (the centroid, the top of the head, and the ends of the two upper and two lower limbs), from which a set of kinematic features is extracted. These features are transformed with the wavelet transform at different scales, and a set of statistics is used to obtain the descriptor. An important characteristic of this descriptor is that its length is constant, independent of the number of frames in the sequence. The second descriptor consists of two complementary sub-descriptors: one based on the trajectory of the centroid of the human body across frames, and the other based on the Hybrid static shape descriptor adapted for mesh sequences. Its robustness derives from the robustness with which the centroid and the Hybrid sub-descriptors can be extracted. Performance figures on publicly available real and artificial datasets support these accuracy and robustness claims, and in most cases the results outperform the state of the art.
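The fixed-length property of the first descriptor can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the kinematic features, the wavelet family, the number of scales, or the statistics, so the choices below (speed and acceleration magnitude, a hand-written Haar transform, three scales, mean and standard deviation) and the function names `haar_details` and `trajectory_descriptor` are all assumptions made for illustration. The input is assumed to be an array of shape (frames, 6, 3) holding the 3D positions of the six salient points, with at least three frames.

```python
import numpy as np


def haar_details(signal, levels):
    """Return one array of Haar detail coefficients per scale (assumed wavelet)."""
    approx = np.asarray(signal, dtype=float)
    details = []
    for _ in range(levels):
        if approx.size < 2:
            # Signal too short for this scale: contribute a neutral coefficient.
            details.append(np.zeros(1))
            continue
        if approx.size % 2:
            # Pad odd-length signals by repeating the last sample.
            approx = np.append(approx, approx[-1])
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))
        approx = (even + odd) / np.sqrt(2.0)
    return details


def trajectory_descriptor(points, levels=3):
    """Map a (frames, 6, 3) trajectory array to a fixed-length vector.

    Hypothetical feature choice: per-point speed and acceleration magnitude.
    Hypothetical statistics: mean and standard deviation per wavelet scale.
    """
    points = np.asarray(points, dtype=float)
    velocity = np.diff(points, axis=0)                 # (frames - 1, 6, 3)
    speed = np.linalg.norm(velocity, axis=2)           # (frames - 1, 6)
    accel = np.linalg.norm(np.diff(velocity, axis=0), axis=2)  # (frames - 2, 6)

    stats = []
    for feature in (speed, accel):
        for point in range(feature.shape[1]):
            for coeffs in haar_details(feature[:, point], levels):
                stats.extend([coeffs.mean(), coeffs.std()])
    return np.array(stats)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    short_seq = rng.normal(size=(40, 6, 3))   # synthetic stand-in, 40 frames
    long_seq = rng.normal(size=(95, 6, 3))    # synthetic stand-in, 95 frames
    print(trajectory_descriptor(short_seq).shape)
    print(trajectory_descriptor(long_seq).shape)
```

Because each wavelet scale is reduced to a fixed number of statistics, the vector length here depends only on the number of features, points, scales and statistics (2 x 6 x 3 x 2 = 72 under these assumptions), so sequences of different durations can be compared directly with an ordinary vector distance.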