Textual similarity functions for the KNN method.

Some of them can only be applied to IEText objects (or subclasses of IEText) because they require information stored in the tokens:

Cosine jcolibri.method.retrieve.KNNretrieval.similarity.local.textual.CosineCoefficient
Dice jcolibri.method.retrieve.KNNretrieval.similarity.local.textual.DiceCoefficient
Jaccard jcolibri.method.retrieve.KNNretrieval.similarity.local.textual.JaccardCoefficient
Overlap jcolibri.method.retrieve.KNNretrieval.similarity.local.textual.OverlapCoefficient
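As a rough illustration of what these four coefficients measure, the sketch below computes the standard set-based definitions over two token sets. It is not the jCOLIBRI code: the class name TokenSetCoefficients, its helper methods, and the whitespace tokenisation are assumptions made for this example, whereas the real classes work on the tokens stored in IEText objects.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper, for illustration only; not part of jCOLIBRI. */
class TokenSetCoefficients {

    /** Assumption: tokens are lower-cased, whitespace-separated words. */
    static Set<String> tokens(String text) {
        return new HashSet<>(Arrays.asList(text.toLowerCase().trim().split("\\s+")));
    }

    /** Number of tokens that appear in both sets. */
    static int common(Set<String> a, Set<String> b) {
        Set<String> shared = new HashSet<>(a);
        shared.retainAll(b);
        return shared.size();
    }

    /** Cosine: |A n B| / sqrt(|A| * |B|). */
    static double cosine(Set<String> a, Set<String> b) {
        if (a.isEmpty() || b.isEmpty()) return 0.0;
        return common(a, b) / Math.sqrt((double) a.size() * b.size());
    }

    /** Dice: 2 * |A n B| / (|A| + |B|). */
    static double dice(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) return 0.0;
        return 2.0 * common(a, b) / (a.size() + b.size());
    }

    /** Jaccard: |A n B| / |A u B|. */
    static double jaccard(Set<String> a, Set<String> b) {
        int shared = common(a, b);
        int union = a.size() + b.size() - shared;
        return union == 0 ? 0.0 : (double) shared / union;
    }

    /** Overlap: |A n B| / min(|A|, |B|). */
    static double overlap(Set<String> a, Set<String> b) {
        if (a.isEmpty() || b.isEmpty()) return 0.0;
        return (double) common(a, b) / Math.min(a.size(), b.size());
    }
}
```

For the texts "the cat sat on the mat" and "the cat lay on the rug", each token set has five distinct words and three are shared, so Dice, Cosine and Overlap all evaluate to 0.6 and Jaccard to 3/7.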

There is also a similarity function that uses Apache Lucene to compare texts. It can be applied to any Text subclass because it does not require any extracted information:

LuceneTextSimilarity jcolibri.method.retrieve.KNNretrieval.similarity.local.textual.LuceneTextSimilarity

Finally, the compressionbased package contains compression-based similarity measures implemented by Derek Bridge:

CompressionBased jcolibri.method.retrieve.KNNretrieval.similarity.local.textual.compressionbased.CompressionBased
NormalisedCompression jcolibri.method.retrieve.KNNretrieval.similarity.local.textual.compressionbased.NormalisedCompression
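The idea behind compression-based measures can be sketched with the normalised compression distance as it is commonly defined: NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed size. The code below is an assumption-laden illustration, not the code of the classes above: the class name CompressionDistanceSketch is hypothetical, and java.util.zip.Deflater is used only because it ships with the JDK, without any claim that the jCOLIBRI classes use the same compressor or formula.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

/** Hypothetical helper, for illustration only; not part of jCOLIBRI. */
class CompressionDistanceSketch {

    /** Size in bytes of the DEFLATE-compressed UTF-8 encoding of the text. */
    static int compressedSize(String text) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(text.getBytes(StandardCharsets.UTF_8));
        deflater.finish();
        byte[] buffer = new byte[1024];
        int total = 0;
        // Drain the compressor; only the byte count matters here.
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    /** Normalised compression distance: small for similar texts, larger for unrelated ones. */
    static double ncd(String x, String y) {
        int cx = compressedSize(x);
        int cy = compressedSize(y);
        int cxy = compressedSize(x + y);
        return (double) (cxy - Math.min(cx, cy)) / Math.max(cx, cy);
    }
}
```

Because this is a distance, a similarity value would have to be derived from it, for example as 1 - NCD; that conversion is also an assumption of this sketch. With real compressors the value is only approximately bounded by 0 and 1, and very short texts give unreliable results because the fixed compression overhead dominates.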

Tests 13 and 16 show how to use these similarity measures.