DISAM: Density Independent and Scale Aware Model for Crowd Counting and Localization
Journal article, Peer reviewed
Accepted version
Permanent link: http://hdl.handle.net/11250/2640870
Publication date: 2019
Original version: Proceedings of the IEEE International Conference on Image Processing (ICIP), 2019, pp. 4474-4478. DOI: 10.1109/ICIP.2019.8803409

Abstract
People counting in high-density crowds is emerging as a new frontier in crowd video surveillance. Crowd counting in high-density crowds faces many challenges, such as severe occlusions, few pixels per head, and large variations in head sizes. In this paper, we propose a novel Density Independent and Scale Aware model (DISAM), which works as a head detector and takes into account the scale variation of heads in images. Our model is based on the intuition that the head is the only consistently visible part of a person in high-density crowds. To deal with different scales, unlike off-the-shelf Convolutional Neural Network (CNN) based object detectors that feed generic object proposals to the CNN, we generate scale-aware head proposals based on a scale map. The scale-aware proposals are then fed to the CNN, which renders a response matrix of head probabilities. We then apply non-maximal suppression to obtain accurate head positions. We conduct comprehensive experiments on two benchmark datasets and compare the performance with other state-of-the-art methods. Our experiments show that the proposed DISAM outperforms the compared methods in both frame-level and pixel-level comparisons.
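To illustrate the final localization step described above, the sketch below shows a generic greedy non-maximal suppression over candidate head positions. This is not the authors' implementation; the function name, the fixed suppression `radius`, and the use of plain (x, y) candidates with probability scores are all illustrative assumptions (in DISAM the suppression could instead be tied to the local head scale from the scale map).

```python
import numpy as np

def point_nms(points, scores, radius):
    """Greedy non-maximal suppression on candidate head positions.

    Illustrative sketch, not the paper's code. points: (N, 2) array of
    (x, y) candidates; scores: (N,) head probabilities from the response
    matrix; radius: suppression distance (assumed fixed here).
    Returns indices of kept detections, highest probability first.
    """
    order = np.argsort(scores)[::-1]          # visit highest-probability heads first
    suppressed = np.zeros(len(points), dtype=bool)
    keep = []
    for i in order:
        if suppressed[i]:
            continue
        keep.append(i)
        # suppress every remaining candidate closer than `radius` to this head
        dists = np.linalg.norm(points - points[i], axis=1)
        suppressed |= dists < radius

    return keep

# Two nearby candidates and one distant one: the weaker nearby peak is removed.
pts = np.array([[10.0, 10.0], [12.0, 11.0], [50.0, 50.0]])
sc = np.array([0.9, 0.8, 0.7])
print(point_nms(pts, sc, radius=5.0))  # -> [0, 2]
```

A scale-aware variant would replace the constant `radius` with a per-candidate value read from the scale map, so that large foreground heads suppress a wider neighborhood than small background ones.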