Implementation and Adaptation of a System for Automatic Classification of Birdsong.

The background for this master's thesis is a collaboration between Able Magic and NTNU. At the request of Able Magic, NTNU conducted a feasibility study, which led to the development of a bird classification system. This system is based on identification of bird song and is aimed at 21 different Norwegian bird species. Able Magic wants to create a commercial version of this system, implemented as an application for the smartphone market.

The system is based on model-based segmentation, with each bird modeled as a Gaussian Mixture Model (GMM). The models are trained with the Expectation-Maximization (EM) algorithm on labeled training data and are then used to identify the bird song. The system is thus divided into two parts: training and identification.
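The training and identification steps described above can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes one scalar feature per frame, two mixture components per species, a deterministic initialization and a variance floor, and the species names and feature values are hypothetical toy data.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def train_gmm(data, n_components=2, n_iter=50, var_floor=1e-3):
    """Fit a 1-D GMM with EM; returns a list of (weight, mean, variance)."""
    data = sorted(data)
    n = len(data)
    # Assumed initialization: spread the means evenly over the sorted data.
    means = [data[(2 * k + 1) * n // (2 * n_components)] for k in range(n_components)]
    overall_mean = sum(data) / n
    overall_var = max(sum((x - overall_mean) ** 2 for x in data) / n, var_floor)
    variances = [overall_var] * n_components
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            p = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weights, means and variances.
        for k in range(n_components):
            nk = sum(r[k] for r in resp) or 1e-300
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var, var_floor)
    return list(zip(weights, means, variances))

def log_likelihood(gmm, frames):
    """Total log-likelihood of the frames under one species model."""
    return sum(
        math.log(sum(w * gaussian_pdf(x, m, v) for w, m, v in gmm) + 1e-300)
        for x in frames
    )

def identify(models, frames):
    """Return the species whose GMM gives the highest log-likelihood."""
    return max(models, key=lambda name: log_likelihood(models[name], frames))

# Hypothetical toy data: one list of scalar features per species.
models = {
    "species_a": train_gmm([0.9, 1.0, 1.1, 2.9, 3.0, 3.1]),
    "species_b": train_gmm([9.9, 10.0, 10.1, 11.9, 12.0, 12.1]),
}
print(identify(models, [1.05, 2.95]))
```

A real system would use multidimensional acoustic features and more components per model; the scalar case only shows how labeled data yields one model per species and how identification compares likelihoods across models.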

This thesis is divided into two main tasks. The first task is to convert the baseline system created by NTNU into a production model ready for commercial use. The initial system was built with three different development tools: the Hidden Markov Model Toolkit, Praat and Matlab. Because Matlab is not free, the scripts developed in Matlab had to be converted to a language that is. After the scripts were converted and restructured, they were combined into a complete system that forms the production model. The system now consists of three scripts: one handles the training process, one handles the identification process, and one trains a linear classifier used in the identification. All three scripts are implemented in Perl.

The second task aims to improve system performance through adaptation of the training data, that is, by improving the existing models with additional data. The experiment investigates whether this adaptation can be done automatically, without manual labeling. GMMs are first trained on a given amount of training data; more data is then introduced and used to train new GMMs. The new training data is labeled either automatically, using the system's output when identification is run with the initial models, or manually, using labels known to be correct. Comparing the results of the two methods shows whether automatic adaptation performs adequately.
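The comparison between automatic and manual labeling can be sketched as follows. This is a simplified illustration rather than the thesis code: each species is modeled by a single Gaussian over one scalar feature (a stand-in for a full GMM), and all function names, species names and data are hypothetical.

```python
import math

def fit_gaussian(values, var_floor=1e-3):
    """Estimate mean and (floored) variance of one species' features."""
    mean = sum(values) / len(values)
    var = max(sum((v - mean) ** 2 for v in values) / len(values), var_floor)
    return mean, var

def log_pdf(x, model):
    """Log-density of x under a (mean, variance) Gaussian."""
    mean, var = model
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def train(labeled):
    """labeled: list of (feature, species). Returns {species: (mean, var)}."""
    by_species = {}
    for x, species in labeled:
        by_species.setdefault(species, []).append(x)
    return {s: fit_gaussian(v) for s, v in by_species.items()}

def identify(models, x):
    """Pick the species whose model gives the highest log-density."""
    return max(models, key=lambda s: log_pdf(x, models[s]))

def error_rate(models, test_set):
    """Fraction of test items whose identified species is wrong."""
    wrong = sum(1 for x, species in test_set if identify(models, x) != species)
    return wrong / len(test_set)

def adapt(initial_data, adaptation_data, automatic):
    """Retrain on initial data plus adaptation data.

    automatic=True labels the adaptation data with the initial models'
    output; automatic=False uses the manual labels supplied with the data.
    """
    initial_models = train(initial_data)
    if automatic:
        new = [(x, identify(initial_models, x)) for x, _ in adaptation_data]
    else:
        new = list(adaptation_data)
    return train(initial_data + new)

# Hypothetical toy data.
initial = [(1.0, "species_a"), (2.0, "species_a"), (8.0, "species_b"), (9.0, "species_b")]
adaptation = [(1.5, "species_a"), (8.5, "species_b")]
test_set = [(1.2, "species_a"), (8.8, "species_b")]

auto_models = adapt(initial, adaptation, automatic=True)
manual_models = adapt(initial, adaptation, automatic=False)
print(error_rate(auto_models, test_set), error_rate(manual_models, test_set))
```

When the initial models mislabel adaptation data, the automatically adapted models are trained on those errors, which is the risk the experiment is designed to measure.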

The adaptation experiments show that the number of misclassified birds increases when the new models are generated from automatically labeled adaptation data. The error rate rises by about 2.5% with these models compared with the initial models trained on less data. This shows that it is difficult to perform the adaptation automatically with the current system performance and the amount of data available.

Publisher

NTNU