The objects of this package store auxiliary information extracted from texts.
The extracted information is stored in the IEtext object. Dedicated methods
obtain the different kinds of information:
- Phrases identified in the text.
- Features: identifier-value pairs extracted from the text.
- Topics: a topic is a classification of the text. It is associated with a text by combining the phrases and features extracted from it.
Phrases and features are stored using the objects implemented in the info subpackage. That package
provides three objects that aid in representing the extracted information:
- PhraseInfo: stores extracted phrases.
- FeatureInfo: stores extracted features.
- WeightedRelation: represents a weighted relation between two tokens. These relations are found by the glossary and thesaurus methods.
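As a rough illustration of how these objects could fit together, here is a minimal Python sketch. Only the names IEtext, PhraseInfo, FeatureInfo and WeightedRelation come from this package; every field, the feature_values helper and the sample data are hypothetical and chosen for illustration, so the real classes may look quite different.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class PhraseInfo:
    """Hypothetical stand-in: one phrase identified in the text."""
    text: str
    start: int  # assumed field: character offset where the phrase begins


@dataclass(frozen=True)
class FeatureInfo:
    """Hypothetical stand-in: one identifier-value pair."""
    identifier: str
    value: str


@dataclass(frozen=True)
class WeightedRelation:
    """Hypothetical stand-in: a weighted relation between two tokens."""
    source: str
    target: str
    weight: float


@dataclass
class IEtext:
    """Hypothetical stand-in for the container of extracted information."""
    text: str
    phrases: List[PhraseInfo] = field(default_factory=list)
    features: List[FeatureInfo] = field(default_factory=list)
    topics: List[str] = field(default_factory=list)

    def feature_values(self, identifier: str) -> List[str]:
        # Assumed helper: collect every value stored under one identifier.
        return [f.value for f in self.features if f.identifier == identifier]


# Sample data invented for the example.
doc = IEtext("Aspirin relieves mild pain.")
doc.phrases.append(PhraseInfo("mild pain", start=17))
doc.features.append(FeatureInfo("drug", "Aspirin"))
doc.topics.append("pharmacology")
relation = WeightedRelation("pain", "ache", weight=0.8)
```

In this sketch the container owns the phrases, features and topics, while a weighted relation stands apart because it links two tokens rather than belonging to a single text.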
The following picture illustrates the whole organization: