Searching and Classifying non-textual information
This dissertation contains a set of contributions that deal with search or classification of non-textual information. Each contribution can be considered a solution to a specific problem, in an attempt to map out a common ground. The problems cover a wide range of research fields, including search in music, classifying digitally sampled music, visualization and navigation in search results, and classifying images and Internet sites.
On classification of digitally sampled music, a method for extracting the rhythmic tempo was presented. The method proved to work on a large variety of music types with a constant audible rhythm. Furthermore, these rhythmic properties proved useful in classifying songs into music groups or genres.
On search in music, a technique is presented that is based on rhythm and pitch correlation between the notes in a query theme and the notes in a set of songs. The scheme is based on a dynamic programming algorithm that attempts to minimize the error between a query theme and a song. This operation includes finding the best alignment, taking into account skipped notes and additional notes, use of different keys, tempo variations, and variances in pitch and time information.
On image classification, a system for classifying whole Internet sites based on their image content was proposed. The system was composed of two parts: an image classifier and a site classifier. The image classifier was based on skin detection, object segmentation, and shape, texture and color feature extraction, with a training scheme that used genetic algorithms. The image classification method was able to classify images with an accuracy of 90%. When classifying multi-image Internet sites, this accuracy was drastically increased by assuming that a site contains only one type of image. This assumption can be defended for most cases.
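To illustrate why the single-image-type assumption can raise site-level accuracy well above the 90% per-image figure, the following is a minimal sketch and not the dissertation's site classifier. It assumes a binary decision, a simple majority vote over the images of a site, and independent per-image errors. None of these assumptions is stated in the abstract, and the function name `majority_vote_accuracy` is hypothetical.

```python
from math import comb


def majority_vote_accuracy(p: float, n: int) -> float:
    """Probability that a strict majority of n images is classified correctly.

    Assumptions (not from the source): binary classes, every image on the
    site belongs to the same class, each image is classified correctly with
    probability p independently of the others, and ties count as errors.
    """
    # Sum the binomial probabilities for k correct images, k > n / 2.
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )


if __name__ == "__main__":
    # Site-level accuracy for a few site sizes at 90% per-image accuracy.
    for n in (1, 3, 5, 11, 21):
        print(f"{n:2d} images: {majority_vote_accuracy(0.9, n):.5f}")
```

Under these assumptions, a site with three images is already classified correctly with probability 0.972, and the figure keeps rising as more images are available. Correlated errors between images of the same site would reduce this gain.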
On search result visualization and navigation, a system was developed that combines a state-of-the-art search engine with a graphical front end to improve the user experience associated with search in unstructured data. With the help of entity extraction, both structured and unstructured data can be indexed in a modern search engine. Combining this with a multidimensional, heatmap-based visualization with navigation capabilities was shown to improve the data value and search experience compared with current search systems.
Has parts
Arentz, W.A.; Olstad, B. Classifying offensive sites. Journal of Computer Vision and Image Understanding. 94: 295-310, 2004.
Arentz, W.A.; Øhrn, A. Multidimensional Visualization and Navigation in Search Results. 8th International Conference on Knowledge Based Intelligent Information & Engineering Systems, 2004.