Classification of feature vectors using a KNN classifier.
The KNN class contains the classifier. Once it has been trained with the train() method, it can classify() new data points. The test() method classifies many vectors at once and returns the classifier's accuracy against a gold standard.
Author: Kjetil Valle <kjetilva@stud.ntnu.no>
K-nearest neighbors classifier.
Classifier for labeled data in feature-vector format. Supports k-nearest classification against trained data samples, and 1-nearest classification against class centroids.
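The two modes mentioned above can be illustrated with a minimal pure-Python sketch. This is not the KNN class's own code: the function names knn_predict and centroid_predict, the Euclidean distance, and the one-row-per-sample layout are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(samples, labels, query, k=3):
    """Majority vote among the k training samples closest to the query."""
    ranked = sorted(zip(samples, labels), key=lambda pair: euclidean(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def centroid_predict(samples, labels, query):
    """1-nearest classification against the mean vector of each class."""
    grouped = defaultdict(list)
    for sample, label in zip(samples, labels):
        grouped[label].append(sample)
    centroids = {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }
    return min(centroids, key=lambda label: euclidean(centroids[label], query))


# Hypothetical toy data: two samples per class, two features each.
samples = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]]
labels = ["a", "a", "b", "b"]
```

The centroid variant compares the query against one vector per class instead of every stored sample, so it is cheaper per query but loses detail about the shape of each class.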
Classifies a list of query cases.
When classifying, only the active features are used; all other features are ignored. The set of active features can be changed with set_active_features().
The feature matrix qs has the same layout as the one used in train(), i.e. an NxM matrix where N is the number of features and M the number of documents.
Returns the classification of each input case.
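As a rough illustration of the NxM layout and of ignoring inactive features, the sketch below turns a features-by-documents matrix into one vector per document, keeping only the active feature rows. The helper name active_columns and the nested-list representation are assumptions; the real class may store its data differently.

```python
def active_columns(matrix, active):
    """Return one vector per document (column), restricted to active feature rows.

    matrix is a list of N rows (features), each with M entries (documents).
    active is a list of feature indices to keep.
    """
    rows = [matrix[i] for i in active]
    return [list(column) for column in zip(*rows)]


# Hypothetical query matrix: 3 features (rows) x 2 documents (columns).
qs = [
    [1.0, 4.0],  # feature 0
    [2.0, 5.0],  # feature 1
    [3.0, 6.0],  # feature 2
]

# With features 0 and 2 active, feature 1 is dropped from every query vector.
queries = active_columns(qs, active=[0, 2])
```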
Changes the set of active features.
Takes a list of features to make active. This can be either a list of feature indices or a boolean list whose length equals the number of features, where True means active. If None, all features are activated.
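One way to handle the three accepted argument forms is to normalize them all to a list of indices. The helper below is a hypothetical sketch, not the method's actual implementation; in particular, raising ValueError for a boolean list of the wrong length is an assumption.

```python
def normalize_active(active, n_features):
    """Convert None, an index list, or a boolean mask into a list of indices."""
    if active is None:
        # None activates every feature.
        return list(range(n_features))
    active = list(active)
    if active and all(isinstance(flag, bool) for flag in active):
        # Boolean mask: must cover every feature exactly once.
        if len(active) != n_features:
            raise ValueError("boolean mask length must equal number of features")
        return [i for i, flag in enumerate(active) if flag]
    # Otherwise treat the entries as feature indices.
    return [int(i) for i in active]
```

Checking for bool explicitly matters in Python because bool is a subclass of int, so a mask such as [True, False] would otherwise be read as the indices [1, 0].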
Tests this classifier against a set of labeled data.
It is assumed that the classifier has been trained before this method is called.
features is an NxM (features x documents) feature matrix, and gold is a list of labels, one for each document in the feature matrix.
Returns the accuracy of the classifier over the supplied labeled data.
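Accuracy here is taken to mean the fraction of documents whose predicted label matches the gold label. The function below is a minimal sketch under that assumption; its name and its handling of empty input are illustrative choices, not taken from the class.

```python
def accuracy(predicted, gold):
    """Fraction of predictions that equal the corresponding gold label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0  # assumed convention for empty input
    correct = sum(1 for p, g in zip(predicted, gold) if p == g)
    return correct / len(gold)
```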
Trains the KNN on a set of data.
Uses an NxM feature matrix, features, holding M samples of N features each. See the output of data.read_files().
The list of labels corresponds to the M samples, one label per sample.
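Since a KNN classifier has no model to fit, training mostly amounts to storing the samples with their labels. The sketch below shows one plausible way to pair each column of an NxM matrix with its label and to check that the shapes agree; the helper name pair_samples and the nested-list input are assumptions, and the output format of data.read_files() is not reproduced here.

```python
def pair_samples(features, labels):
    """Pair each document column of an NxM matrix with its label."""
    n_docs = len(features[0]) if features else 0
    if any(len(row) != n_docs for row in features):
        raise ValueError("all feature rows must have the same number of documents")
    if len(labels) != n_docs:
        raise ValueError("need exactly one label per document")
    return [(list(column), label) for column, label in zip(zip(*features), labels)]


# Hypothetical training data: 2 features (rows) x 3 documents (columns).
features = [
    [1, 2, 3],
    [4, 5, 6],
]
labels = ["x", "y", "z"]
training_set = pair_samples(features, labels)
```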