Package jcolibri.method.retrieve.NNretrieval.similarity.local.textual

Textual similarity functions for the KNN method.

Class Summary
CosineCoefficient Cosine Coefficient Similarity.
DiceCoefficient Dice Coefficient Similarity.
JaccardCoefficient Jaccard Coefficient Similarity.
LuceneTextSimilarity Computes the similarity between two texts using Lucene.
OverlapCoefficient Overlap Coefficient Similarity.
TextualSimUtils Utilities to compute textual similarities.
TextualSimUtils.WeightedString Represents a string with an associated weight.
 

Package jcolibri.method.retrieve.NNretrieval.similarity.local.textual Description

Textual similarity functions for the KNN method.

Some of them can only be applied to IEText objects (or subclasses of IEText) because they require information stored in the tokens:

Cosine jcolibri.method.retrieve.NNretrieval.similarity.local.textual.CosineCoefficient
Dice jcolibri.method.retrieve.NNretrieval.similarity.local.textual.DiceCoefficient
Jaccard jcolibri.method.retrieve.NNretrieval.similarity.local.textual.JaccardCoefficient
Overlap jcolibri.method.retrieve.NNretrieval.similarity.local.textual.OverlapCoefficient
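
For orientation, the four coefficients above are usually defined over the sets of tokens of the two texts. The following is a minimal sketch of those textbook set-based definitions, not the jCOLIBRI implementation: the class TokenSetSimilarity, its tokenizer and all method names are hypothetical, and the real classes work on the tokens stored in IEText objects rather than on raw strings.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper class, for illustration only; not part of jCOLIBRI.
public class TokenSetSimilarity {

    // Assumed tokenizer: lower-case the text and split on anything
    // that is not a letter or a digit.
    public static Set<String> tokens(String text) {
        Set<String> result = new HashSet<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                result.add(t);
            }
        }
        return result;
    }

    // Number of tokens that occur in both sets.
    public static int common(Set<String> a, Set<String> b) {
        Set<String> shared = new HashSet<>(a);
        shared.retainAll(b);
        return shared.size();
    }

    // Jaccard: |A n B| / |A u B|
    public static double jaccard(Set<String> a, Set<String> b) {
        int c = common(a, b);
        int union = a.size() + b.size() - c;
        return union == 0 ? 0.0 : (double) c / union;
    }

    // Dice: 2 |A n B| / (|A| + |B|)
    public static double dice(Set<String> a, Set<String> b) {
        int total = a.size() + b.size();
        return total == 0 ? 0.0 : 2.0 * common(a, b) / total;
    }

    // Cosine (set version): |A n B| / sqrt(|A| * |B|)
    public static double cosine(Set<String> a, Set<String> b) {
        double norm = Math.sqrt((double) a.size() * b.size());
        return norm == 0.0 ? 0.0 : common(a, b) / norm;
    }

    // Overlap: |A n B| / min(|A|, |B|)
    public static double overlap(Set<String> a, Set<String> b) {
        int smaller = Math.min(a.size(), b.size());
        return smaller == 0 ? 0.0 : (double) common(a, b) / smaller;
    }

    public static void main(String[] args) {
        Set<String> query = tokens("cheap hotel near the beach");
        Set<String> stored = tokens("hotel near the airport");
        System.out.println("Jaccard: " + jaccard(query, stored));
        System.out.println("Dice:    " + dice(query, stored));
        System.out.println("Cosine:  " + cosine(query, stored));
        System.out.println("Overlap: " + overlap(query, stored));
    }
}
```

All four measures share the same numerator (the number of common tokens) and differ only in how they normalise it, which is why Overlap is never smaller than the other three for the same pair of texts.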

There is also a similarity function that uses Apache Lucene to compare texts. It can be applied to any Text subclass because it does not require any extracted information:

LuceneTextSimilarity jcolibri.method.retrieve.NNretrieval.similarity.local.textual.LuceneTextSimilarity

Finally, the compressionbased subpackage contains compression-based similarity measures implemented by Derek Bridge:

CompressionBased jcolibri.method.retrieve.NNretrieval.similarity.local.textual.compressionbased.CompressionBased
NormalisedCompression jcolibri.method.retrieve.NNretrieval.similarity.local.textual.compressionbased.NormalisedCompression
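
To give an idea of how compression can serve as a similarity signal, the sketch below computes the textbook normalised compression distance, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length in bytes. It is an assumption that the classes above follow this general idea; the code is not Derek Bridge's implementation, the class CompressionDistanceSketch and its methods are hypothetical, and the choice of java.util.zip.Deflater as the compressor is made only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

// Hypothetical helper class, for illustration only; not part of jCOLIBRI.
public class CompressionDistanceSketch {

    // Length in bytes of the DEFLATE-compressed UTF-8 encoding of the text.
    public static int compressedLength(String text) {
        byte[] input = text.getBytes(StandardCharsets.UTF_8);
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[1024];
        int total = 0;
        // Only the number of bytes produced matters, so the buffer is reused.
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    // Normalised compression distance: small for texts that share content,
    // close to 1 for unrelated texts.
    public static double ncd(String x, String y) {
        int cx = compressedLength(x);
        int cy = compressedLength(y);
        int cxy = compressedLength(x + y);
        return (double) (cxy - Math.min(cx, cy)) / Math.max(cx, cy);
    }

    // One possible way to turn the distance into a similarity in [0, 1].
    public static double similarity(String x, String y) {
        return Math.max(0.0, Math.min(1.0, 1.0 - ncd(x, y)));
    }

    public static void main(String[] args) {
        String a = "Nearest neighbour retrieval ranks stored cases by their similarity to the query.";
        String b = "Compression algorithms exploit repeated byte patterns to shorten a file on disk.";
        System.out.println("NCD(a, a) = " + ncd(a, a));
        System.out.println("NCD(a, b) = " + ncd(a, b));
    }
}
```

The intuition is that if two texts share content, compressing their concatenation costs little more than compressing one of them alone, so the distance stays low without any tokenisation or extracted information.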

Tests 13 and 16 show how to use these similarity measures.


GAIA - Group for Artificial Intelligence Applications
http://gaia.fdi.ucm.es