jcolibri.extensions.textual.carrot2
Class CarrotClusterer

java.lang.Object
  extended by jcolibri.extensions.textual.carrot2.CarrotClusterer

public class CarrotClusterer
extends java.lang.Object

Clusters documents using the Carrot2 framework. The framework uses Lucene to index the documents and retrieve those relevant to a query, and then clusters the retrieved documents, assigning a descriptive label to each cluster.

To learn how to use this class see the TestCarrot example.
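The call pattern implied by the signatures on this page can be sketched as follows. This is a minimal illustration, not the TestCarrot example. So that the snippet compiles without jCOLIBRI on the classpath, LuceneIndex, CarrotClusteringResult and CarrotClusterer are replaced by local stand-ins that mirror only the constructors and methods listed here. Their bodies, the value -1 used for "not specified", the field names "title" and "description", and the query string are all assumptions made for illustration. How a real LuceneIndex is built and what CarrotClusteringResult exposes are not documented on this page.

```java
public class CarrotClustererSketch {

    // Stand-in for the real LuceneIndex; its construction is not documented here.
    static class LuceneIndex { }

    // Stand-in for CarrotClusteringResult. The real accessors are not documented
    // on this page, so this placeholder only records the arguments of the call.
    static class CarrotClusteringResult {
        final String query;
        final int maxResults; // -1 means "not specified" (assumption, not a real default)

        CarrotClusteringResult(String query, int maxResults) {
            this.query = query;
            this.maxResults = maxResults;
        }
    }

    // Stand-in exposing only the signatures listed on this page.
    static class CarrotClusterer {
        final LuceneIndex index;
        final String[] searchFields;
        final int maxclusters; // -1 when no limit was given (assumption, not a real default)

        CarrotClusterer(LuceneIndex index, String[] searchFields) {
            this(index, searchFields, -1);
        }

        CarrotClusterer(LuceneIndex index, String[] searchFields, int maxclusters) {
            this.index = index;
            this.searchFields = searchFields;
            this.maxclusters = maxclusters;
        }

        // Placeholder body: the real method queries Lucene and clusters the hits.
        CarrotClusteringResult cluster(String query) {
            return new CarrotClusteringResult(query, -1);
        }

        // Placeholder body: the real method also limits the documents retrieved.
        CarrotClusteringResult cluster(String query, int maxResults) {
            return new CarrotClusteringResult(query, maxResults);
        }
    }

    public static void main(String[] args) {
        LuceneIndex index = new LuceneIndex();

        // Hypothetical field names; use the fields that exist in your own index.
        String[] searchFields = {"title", "description"};

        // Ask for roughly ten clusters at most.
        CarrotClusterer clusterer = new CarrotClusterer(index, searchFields, 10);

        // Cluster everything retrieved for the query.
        CarrotClusteringResult all = clusterer.cluster("beach hotel");

        // Cluster only the first 50 documents retrieved from Lucene.
        CarrotClusteringResult capped = clusterer.cluster("beach hotel", 50);

        System.out.println(all.query + " / " + capped.maxResults);
    }
}
```

With the real library, the three stand-in classes would be dropped in favour of the jCOLIBRI classes of the same names, and the index would come from whatever code builds the Lucene index for the case base.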

Version:
1.0
Author:
Juan A. Recio-García
See Also:
LuceneIndex, TestCarrot

Constructor Summary
CarrotClusterer(LuceneIndex index, java.lang.String[] searchFields)
          Creates a Carrot Clusterer for the given Lucene Index.
CarrotClusterer(LuceneIndex index, java.lang.String[] searchFields, int maxclusters)
          Creates a Carrot Clusterer for the given Lucene Index that returns at most (approximately) the given number of clusters in each search.
 
Method Summary
 CarrotClusteringResult cluster(java.lang.String query)
          Clusters the documents for the given query.
 CarrotClusteringResult cluster(java.lang.String query, int maxResults)
          Clusters the documents for the given query, retrieving at most maxResults documents from Lucene.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

CarrotClusterer

public CarrotClusterer(LuceneIndex index,
                       java.lang.String[] searchFields)
Creates a Carrot Clusterer for the given Lucene Index.

Parameters:
index - Index of documents
searchFields - Fields of the documents to search. Each Lucene index is divided into several fields, and the search can be restricted to some of them.

CarrotClusterer

public CarrotClusterer(LuceneIndex index,
                       java.lang.String[] searchFields,
                       int maxclusters)
Creates a Carrot Clusterer for the given Lucene Index that returns at most (approximately) the given number of clusters in each search.

Parameters:
index - Index of documents
searchFields - Fields of the documents to search. Each Lucene index is divided into several fields, and the search can be restricted to some of them.
maxclusters - Approximate maximum number of clusters to return.

Method Detail

cluster

public CarrotClusteringResult cluster(java.lang.String query)
Clusters the documents for the given query.


cluster

public CarrotClusteringResult cluster(java.lang.String query,
                                      int maxResults)
Clusters the documents for the given query, retrieving at most maxResults documents from Lucene.


GAIA - Group for Artificial Intelligence Applications
http://gaia.fdi.ucm.es