Sparse coded handcrafted and deep features for colon capsule video summarization

Capsule endoscopy, which uses a wireless camera to take images of the digestive track, is emerging as an alternative to traditional wired colonoscopy. A single examination produces a sequence of approximately 50,000 frames. These sequences are manually reviewed, which is time consuming and typically takes about 45-90 minutes and requires the undivided concentration of the reviewer. In this paper, we propose a novel capsule video summarization framework using sparse coding and dictionary learning in feature space. Video frames are clustered into superframes based on power spectral density, and cluster representative frames are used for video summarization. Handcrafted and deep features that are extracted for representative frames are sparse coded using a learned dictionary. Sparse coded features are later used for training SVM classifier. The proposed method was compared with state-of-the-art methods based on sensitivity and specificity. The achieved results show that our proposed framework provides robust capsule video summarization without losing informative segments.

Utgiver

Institute of Electrical and Electronics Engineers (IEEE)

Tidsskrift

Computer-Based Medical Systems