Human action recognition in videos using stable features
Abstract
Human action recognition remains a challenging problem, and researchers continue to investigate it with a variety of techniques. We propose a robust approach for human action recognition based on stable spatio-temporal features, namely the pairwise local binary pattern (P-LBP) and the scale invariant feature transform (SIFT). These features are used to train an MLP neural network during the training stage, and action classes are inferred from test videos during the testing stage. The proposed features capture both the motion of individuals and its consistency, yielding higher accuracy on a challenging dataset. The experimental evaluation is conducted on a benchmark dataset commonly used for human action recognition. In addition, we show that the combined features outperform each individual feature, i.e., using only the spatial or only the temporal feature.
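The pipeline summarized above (per-frame spatial and temporal descriptors, aggregated per clip and fed to an MLP classifier) can be sketched roughly as follows. This is a minimal illustration, not the authors' exact method: standard OpenCV SIFT and a plain LBP histogram from scikit-image stand in for the proposed P-LBP descriptor, scikit-learn's MLPClassifier stands in for the MLP, and mean pooling over frames is an assumed aggregation step.

```python
# Minimal sketch of the described pipeline. Assumptions: plain LBP as a
# stand-in for P-LBP, mean/std pooling of SIFT descriptors, mean pooling
# over frames. Frames are expected as grayscale uint8 arrays.
import cv2
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.neural_network import MLPClassifier

def frame_descriptor(gray):
    """Concatenate an LBP histogram (texture) with pooled SIFT statistics
    (appearance) for one grayscale frame."""
    # Uniform LBP with 8 neighbours, radius 1 -> values in [0, 9]
    lbp = local_binary_pattern(gray, P=8, R=1, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)

    # SIFT descriptors, pooled by mean/std so every frame yields a fixed-size vector
    sift = cv2.SIFT_create()
    _, desc = sift.detectAndCompute(gray, None)
    if desc is None:  # frame with no detected keypoints
        desc = np.zeros((1, 128), dtype=np.float32)
    sift_vec = np.concatenate([desc.mean(axis=0), desc.std(axis=0)])

    return np.concatenate([lbp_hist, sift_vec])

def video_descriptor(frames):
    """Average per-frame descriptors over time to get one vector per clip."""
    return np.mean([frame_descriptor(f) for f in frames], axis=0)

def train_and_predict(train_clips, train_labels, test_clips):
    """Train an MLP on training clips, then infer action classes for test clips."""
    X_train = np.stack([video_descriptor(c) for c in train_clips])
    X_test = np.stack([video_descriptor(c) for c in test_clips])
    clf = MLPClassifier(hidden_layer_sizes=(128,), max_iter=500)
    clf.fit(X_train, train_labels)
    return clf.predict(X_test)
```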