Package jcolibri.extensions.textual.IE.representation

Represents a Text that will be processed to extract its information.


Class Summary
IEText Represents a Textual attribute that will be processed to extract information.
Paragraph Represents a paragraph of the text.
Sentence Represents a sentence of the text.
Token A token represents an elementary piece of text.
 

Package jcolibri.extensions.textual.IE.representation Description

Represents a Text that will be processed to extract its information.
A text is composed of paragraphs, paragraphs of sentences, and sentences of tokens.
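
The containment described above can be sketched as three nested levels of lists. This is only an illustration: the class names (SketchText, SketchParagraph, SketchSentence, SketchToken), their fields and the tokenCount() helper are hypothetical stand-ins, not the actual IEText, Paragraph, Sentence and Token API of this package.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the package's classes. Names and fields are
// assumptions made for illustration, not the real jCOLIBRI API.
public class TextHierarchySketch {

    /** An elementary piece of text, here a single word. */
    static class SketchToken {
        final String raw;
        SketchToken(String raw) { this.raw = raw; }
    }

    /** A sentence owns an ordered list of tokens. */
    static class SketchSentence {
        final List<SketchToken> tokens = new ArrayList<>();
    }

    /** A paragraph owns an ordered list of sentences. */
    static class SketchParagraph {
        final List<SketchSentence> sentences = new ArrayList<>();
    }

    /** A text owns an ordered list of paragraphs. */
    static class SketchText {
        final List<SketchParagraph> paragraphs = new ArrayList<>();

        /** Counts tokens by walking the three containment levels. */
        int tokenCount() {
            int count = 0;
            for (SketchParagraph p : paragraphs) {
                for (SketchSentence s : p.sentences) {
                    count += s.tokens.size();
                }
            }
            return count;
        }
    }

    /** Builds one paragraph with two sentences: "Hello world" and "Goodbye". */
    static SketchText example() {
        SketchSentence first = new SketchSentence();
        first.tokens.add(new SketchToken("Hello"));
        first.tokens.add(new SketchToken("world"));
        SketchSentence second = new SketchSentence();
        second.tokens.add(new SketchToken("Goodbye"));

        SketchParagraph paragraph = new SketchParagraph();
        paragraph.sentences.add(first);
        paragraph.sentences.add(second);

        SketchText text = new SketchText();
        text.paragraphs.add(paragraph);
        return text;
    }

    public static void main(String[] args) {
        System.out.println(example().tokenCount()); // prints 3
    }
}
```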

A token represents a word in the text. Token objects store information like:

The division of the text into paragraphs, sentences and tokens is performed by specific methods that depend on the chosen organization.
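
As a rough sketch of what one such organization could look like, the following splits a raw string under three assumed rules: blank lines separate paragraphs, a sentence ends at '.', '!' or '?' followed by whitespace, and tokens are separated by whitespace. The TextSplitSketch class and its organize method are hypothetical and do not reproduce the methods shipped with this package, which may use different rules or external tools.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical splitter. The splitting rules below are assumptions made for
// illustration, not the behaviour of the package's own methods.
public class TextSplitSketch {

    /** Returns paragraphs, each a list of sentences, each a list of tokens. */
    static List<List<List<String>>> organize(String text) {
        List<List<List<String>>> paragraphs = new ArrayList<>();
        // Assumed rule 1: one or more blank lines separate paragraphs.
        for (String para : text.split("\\n\\s*\\n")) {
            if (para.trim().isEmpty()) {
                continue;
            }
            List<List<String>> sentences = new ArrayList<>();
            // Assumed rule 2: a sentence ends at '.', '!' or '?' plus whitespace.
            for (String sent : para.trim().split("(?<=[.!?])\\s+")) {
                List<String> tokens = new ArrayList<>();
                // Assumed rule 3: tokens are separated by whitespace.
                for (String tok : sent.trim().split("\\s+")) {
                    if (!tok.isEmpty()) {
                        tokens.add(tok);
                    }
                }
                if (!tokens.isEmpty()) {
                    sentences.add(tokens);
                }
            }
            if (!sentences.isEmpty()) {
                paragraphs.add(sentences);
            }
        }
        return paragraphs;
    }

    public static void main(String[] args) {
        String raw = "First one. Second one!\n\nNew paragraph here.";
        List<List<List<String>>> result = organize(raw);
        System.out.println(result.size());        // prints 2
        System.out.println(result.get(0).size()); // prints 2
    }
}
```

Punctuation stays attached to the preceding word in this sketch (for example "one."); a real tokenizer would normally separate it.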

The information extracted from the text is stored in the IEText object. Different kinds of information are obtained by dedicated methods:

Phrases and Features are stored using the objects implemented in the info subpackage. That package provides three objects that aid in representing the extracted information:

The following picture illustrates the whole organization:


GAIA - Group for Artificial Intelligence Applications
http://gaia.fdi.ucm.es