Improving Automatic Polyp Detection Using CNN by Exploiting Temporal Dependency in Colonoscopy Video
Peer reviewed, Journal article
Original version: IEEE Journal of Biomedical and Health Informatics. 2019, 24 (1), 180-193. 10.1109/JBHI.2019.2907434
Automatic polyp detection has been shown to be difficult due to various polyp-like structures in the colon and high interclass variations in polyp size, color, shape, and texture. An efficient method should not only have a high correct detection rate (high sensitivity) but also a low false detection rate (high precision and specificity). State-of-the-art detection methods include convolutional neural networks (CNNs). However, CNNs have been shown to be vulnerable to small perturbations and noise; they sometimes miss the same polyp appearing in neighboring frames and produce a high number of false positives. We aim to tackle this problem and improve the overall performance of CNN-based object detectors for polyp detection in colonoscopy videos. Our method consists of two stages: a region of interest (RoI) proposal stage using CNN-based object detector networks and a false positive (FP) reduction unit. The FP reduction unit exploits the temporal dependencies among image frames in a video by integrating the bidirectional temporal information obtained from RoIs in a set of consecutive frames. This information is used to make the final decision. The experimental results show that the bidirectional temporal information is helpful in estimating polyp positions and accurately predicting the FPs. This provides an overall performance improvement in terms of sensitivity, precision, and specificity compared to the conventional false positive learning method, and thus achieves state-of-the-art results on the CVC-ClinicVideoDB video data set.
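The idea of using neighboring frames in both directions to reject false positives can be illustrated with a small sketch. This is not the authors' FP reduction unit, whose internals the abstract does not describe; it is a simple overlap-voting heuristic, and the names `iou`, `temporal_filter`, `window`, `iou_thr`, and `min_support` are hypothetical. It assumes a detector has already produced RoIs as `(x1, y1, x2, y2)` boxes for each frame, and it only removes unsupported detections; it does not estimate the positions of missed polyps.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical RoI format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def temporal_filter(
    frames: List[List[Box]],
    window: int = 2,        # assumed number of frames examined on each side
    iou_thr: float = 0.3,   # assumed overlap needed to count as the same object
    min_support: int = 2,   # assumed number of neighboring frames that must agree
) -> List[List[Box]]:
    """Keep a RoI only if enough past and future frames contain an overlapping RoI."""
    kept: List[List[Box]] = []
    for t, boxes in enumerate(frames):
        lo, hi = max(0, t - window), min(len(frames), t + window + 1)
        neighbours = [s for s in range(lo, hi) if s != t]  # both temporal directions
        kept_t = []
        for box in boxes:
            # Count neighboring frames with at least one sufficiently overlapping RoI.
            support = sum(
                any(iou(box, other) >= iou_thr for other in frames[s])
                for s in neighbours
            )
            if support >= min_support:
                kept_t.append(box)
        kept.append(kept_t)
    return kept
```

In this sketch, a detection that appears in a single frame with no overlapping RoI in the surrounding frames is treated as a likely false positive and dropped, while a detection that persists across the window is kept. The thresholds are illustrative and would need tuning on real data.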