Comparison of Principal Component Analysis and Spectral Angle Mapping for Identification of Materials in Terahertz Transmission Measurements
The terahertz (THz) region of the electromagnetic spectrum spans 0.1 to 10 THz and has some unique properties that make it interesting for security applications. Many dangerous substances can be identified using THz radiation, because they feature characteristic absorption lines in this region. THz radiation also penetrates common barrier materials such as paper, plastic and cloth, which makes it possible to identify concealed substances. This thesis compares two methods, principal component analysis (PCA) and spectral angle mapping (SAM), for identifying different materials acting as simulants for dangerous substances. PCA transforms a number of correlated variables into a smaller number of uncorrelated variables, called principal components. The original data are projected onto these components, forming a new coordinate system in which the data are expressed in an optimal way using far fewer dimensions. SAM is a spectral recognition technique that treats an unknown spectrum and a reference spectrum as vectors and calculates the normalized dot product between them, which gives the angle separating the two. Measurements on samples containing tartaric acid, lactose and RDX (an explosive) were carried out using terahertz time-domain spectroscopy, and the resulting spectral fingerprints were used to train each algorithm. Two spectral characteristics were considered: the absorption spectrum itself and its derivative, each investigated for two different window widths. Four terahertz images were acquired for testing the algorithms: one with no barrier, and three with the samples covered by paper, plastic or a piece of cloth. The ability to recognize a material whose sample properties differ from those used for training was also tested, using four different tartaric acid samples. The algorithms were implemented in MATLAB and compared using ROC curves.
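As an illustration of the two techniques described above, the following is a minimal Python sketch and not the MATLAB implementation used in the thesis. The function names, the toy spectra and the use of power iteration to find only the first principal component are all assumptions made for this example; a real analysis would use measured absorption spectra and typically several components.

```python
import math

def spectral_angle(unknown, reference):
    """Angle in radians between two spectra treated as vectors (SAM)."""
    dot = sum(u * r for u, r in zip(unknown, reference))
    norm_u = math.sqrt(sum(u * u for u in unknown))
    norm_r = math.sqrt(sum(r * r for r in reference))
    cos_angle = dot / (norm_u * norm_r)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_angle)))

def classify_by_angle(unknown, references):
    """Return the label of the reference spectrum with the smallest angle."""
    return min(references, key=lambda name: spectral_angle(unknown, references[name]))

def first_principal_component(spectra, iterations=200):
    """Return (mean spectrum, unit vector of the first principal component).

    Uses power iteration on the sample covariance matrix. This is a
    simplification for illustration; it finds only the leading component.
    """
    n, d = len(spectra), len(spectra[0])
    mean = [sum(s[j] for s in spectra) / n for j in range(d)]
    centered = [[s[j] - mean[j] for j in range(d)] for s in spectra]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # arbitrary starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v

def project(spectrum, mean, component):
    """Score of a spectrum along a principal component."""
    return sum((x - m) * c for x, m, c in zip(spectrum, mean, component))

# Made-up toy "spectra" (not measured data), four frequency bins each.
references = {
    "lactose": [0.1, 0.9, 0.2, 0.1],
    "tartaric acid": [0.8, 0.1, 0.1, 0.3],
}
# Same shape as the lactose reference, but twice the amplitude.
unknown = [0.2, 1.8, 0.4, 0.2]
best_match = classify_by_angle(unknown, references)

# Made-up training set whose variation lies along a single direction.
training = [[1.0, 2.0, 0.0], [2.0, 4.0, 0.0], [3.0, 6.0, 0.0]]
mean, pc1 = first_principal_component(training)
score = project([3.0, 6.0, 0.0], mean, pc1)
```

Because the spectral angle depends only on the direction of the vectors, the scaled toy spectrum still matches its reference, which shows that the measure is insensitive to an overall change in amplitude.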
The PCA results showed that the number of principal components must be chosen carefully, and that the optimal number depends on the spectral characteristic used. In general, very good results were obtained when appropriate windowing was applied, and the best overall performance was achieved with the narrower window, for both PCA and SAM. A true positive rate above 0.9 with a false positive rate below 0.2 could be obtained regardless of barrier, including for the tartaric acid samples. For PCA, these results were obtained using the absorption spectrum, while for SAM they were obtained regardless of spectral characteristic. The paper and plastic barriers were not challenging for either algorithm, and in most cases yielded essentially the same results as no barrier. The differences in performance between PCA and SAM were small. The most challenging barrier was the cloth, for which classification using SAM with the absorption spectrum was slightly better than PCA.
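To make the true positive and false positive rates quoted above concrete, the following Python sketch shows how the points of a ROC curve can be computed from classifier scores. The function name, the scores and the labels are invented for illustration and are not results from the thesis. The sketch assumes that a higher score means a more likely match, so a spectral angle, where smaller is better, would be negated before use.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs.

    One pair is produced per distinct threshold, from the strictest
    threshold to the most permissive. A pixel counts as detected when
    its score is at or above the threshold. Labels are 1 for the target
    material and 0 otherwise.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

# Made-up scores for six pixels; the label marks whether the pixel
# really contains the target material.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels)
```

Each pair corresponds to one operating point of the classifier. A statement such as "true positive rate above 0.9 with a false positive rate below 0.2" means that at least one point on the curve lies in that region.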