Using Deep Convolutional Networks to Detect Roads in Aerial Images

This thesis presents a brief introduction to aerial road detection and semantic segmentation of images. Datasets based on aerial imagery are often generated automatically from existing map data, which introduces label noise. Supervised training on inconsistently labeled data penalizes the classifier for making correct predictions and can degrade the resulting performance. The thesis investigates different approaches to decreasing the impact of noisy labels on deep neural networks. These include the bootstrapping method, which modifies the loss function, and curriculum learning, which adjusts the training regime.

The bootstrapping method incorporates the predictions of the model into the cross-entropy loss function. The modified loss replaces each label target with a convex combination of the model's prediction and the label.
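The convex combination described above can be illustrated with a minimal sketch of a "soft" bootstrapping loss for binary road/background labels. The function name, the weight `beta` and its default value, and the clipping constant `eps` are assumptions made for illustration; they are not taken from the thesis.

```python
import math

def soft_bootstrap_loss(labels, preds, beta=0.8, eps=1e-7):
    """Mean binary cross-entropy against bootstrapped targets.

    labels: per-pixel labels in {0, 1}, possibly noisy
    preds:  per-pixel predicted road probabilities
    beta:   weight on the given label (hypothetical default);
            beta = 1 recovers the ordinary cross-entropy loss
    """
    total = 0.0
    for y, p in zip(labels, preds):
        # Clip the prediction so the logarithms stay finite.
        p = min(max(p, eps), 1.0 - eps)
        # Target is a convex combination of label and prediction.
        t = beta * y + (1.0 - beta) * p
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(labels)
```

With `beta` below 1, a confident prediction that disagrees with its label is penalized less than under ordinary cross-entropy, which is the intended source of robustness when the label itself is wrong.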

The thesis also investigates curriculum learning and its impact on classifier accuracy. A curriculum strategy is first defined, which estimates the difficulty of every example. The classifier is then trained by presenting "easier" examples at the beginning and gradually introducing "harder" examples to the training set. This thesis proposes a curriculum strategy based on estimating the inconsistency between a prediction made by a teacher model and the corresponding label.
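As a rough sketch of such a strategy, the code below scores each example by the disagreement between a teacher's prediction and the label, then builds training sets that grow from the most consistent examples to the full dataset. The mean absolute difference as the inconsistency measure, the equal-sized stages, and all names are assumptions chosen for illustration; the thesis may define them differently.

```python
def inconsistency(teacher_pred, label):
    """Mean absolute difference between teacher probabilities and labels."""
    return sum(abs(p - y) for p, y in zip(teacher_pred, label)) / len(label)

def curriculum_stages(examples, teacher_preds, n_stages=3):
    """Return growing training sets, most consistent examples first.

    examples:      list of (example_id, label) pairs, where label is a
                   flat list of 0/1 pixel values
    teacher_preds: dict mapping example_id to the teacher's flat list of
                   predicted probabilities
    n_stages:      number of curriculum stages (hypothetical default)
    """
    ranked = sorted(
        examples,
        key=lambda ex: inconsistency(teacher_preds[ex[0]], ex[1]),
    )
    stages = []
    for s in range(1, n_stages + 1):
        # Integer ceiling of len(ranked) * s / n_stages.
        cutoff = (len(ranked) * s + n_stages - 1) // n_stages
        stages.append(ranked[:cutoff])
    return stages
```

Each stage is a superset of the previous one, so the final stage trains on every example, including those whose labels the teacher disagrees with most.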

The results demonstrate that curriculum learning can improve generalization accuracy for the road detection task, and that a curriculum strategy based on estimating inconsistency is viable. Applying the bootstrapping loss function showed some robustness to the label noise present in aerial image datasets, although this result was not statistically significant.

Publisher

NTNU