Some of them can only be applied to IEText objects (or subclasses of IEText) because they require information stored in the tokens:
There is also a similarity function that uses Apache Lucene to compare texts. This function can be applied to any Text subclass because it does not require any extracted information.
LuceneTextSimilarity | jcolibri.method.retrieve.KNNretrieval.similarity.local.textual.LuceneTextSimilarity |
Finally, the compressionbased package contains several compression-based similarity measures implemented by Derek Bridge.
Tests 13 and 16 show how to use these similarity measures.
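
The general idea behind compression-based similarity can be illustrated with the Normalized Compression Distance (NCD): two texts are similar if compressing them together costs little more than compressing one of them alone. The following is only a minimal sketch under stated assumptions, not the code in the compressionbased package. The class `CompressionSimilaritySketch` and its methods are hypothetical names, and `java.util.zip.Deflater` is used here as a stand-in compressor; the jCOLIBRI measures may use a different compressor and a different formula.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

// Hypothetical illustration class; it is not part of jCOLIBRI.
class CompressionSimilaritySketch {

    // Number of bytes DEFLATE produces for the UTF-8 encoding of the text.
    static int compressedSize(String text) {
        byte[] input = text.getBytes(StandardCharsets.UTF_8);
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[1024];
        int total = 0;
        // Drain the compressor; only the output length matters here.
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    // NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)).
    static double ncd(String x, String y) {
        int cx = compressedSize(x);
        int cy = compressedSize(y);
        int cxy = compressedSize(x + y);
        return (double) (cxy - Math.min(cx, cy)) / Math.max(cx, cy);
    }

    // Turn the distance into a similarity and clamp it to [0, 1],
    // since real compressors can push NCD slightly outside that range.
    static double similarity(String x, String y) {
        double s = 1.0 - ncd(x, y);
        return Math.max(0.0, Math.min(1.0, s));
    }

    public static void main(String[] args) {
        String a = "The hotel is close to the beach and offers a large swimming pool, "
                 + "free parking and a restaurant with local food.";
        String b = "Quarterly revenue grew by twelve percent, driven mainly by "
                 + "strong demand for cloud services in emerging markets.";
        System.out.println("similarity(a, a) = " + similarity(a, a));
        System.out.println("similarity(a, b) = " + similarity(a, b));
    }
}
```

Because DEFLATE adds a fixed overhead to every output, scores for very short texts are dominated by that overhead, so a sketch like this is more meaningful for longer texts.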