Functions for creating frequency-based feature vectors from text.
The function of interest is text_to_vector(), which creates term frequency (TF) or term frequency-inverse document frequency (TF-IDF) vectors from lists of documents. Results are output in the form of a term-document matrix.
Author: Kjetil Valle <kjetilva@stud.ntnu.no>
Create dictionaries of term frequencies based on documents
Metric must be either FrequencyMetrics.TF or FrequencyMetrics.TF_IDF.
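
As a rough illustration of what per-document frequency dictionaries can look like under the two metrics, here is a minimal sketch. It is not the module's implementation: the FrequencyMetrics enum below is a stand-in for the module's own, frequency_dicts() is a hypothetical helper, and the whitespace tokenization, raw-count TF, and log(N / df) IDF are assumptions.

```python
import math
from collections import Counter
from enum import Enum


class FrequencyMetrics(Enum):
    # Stand-in for the module's FrequencyMetrics; the real definition may differ.
    TF = "tf"
    TF_IDF = "tf-idf"


def frequency_dicts(documents, metric=FrequencyMetrics.TF):
    """Return one {term: weight} dict per document (hypothetical helper)."""
    if metric not in (FrequencyMetrics.TF, FrequencyMetrics.TF_IDF):
        raise ValueError("metric must be FrequencyMetrics.TF or FrequencyMetrics.TF_IDF")
    # Naive lowercase whitespace tokenization; an assumption of this sketch.
    tokenized = [doc.lower().split() for doc in documents]
    # Raw term counts per document.
    dicts = [dict(Counter(tokens)) for tokens in tokenized]
    if metric is FrequencyMetrics.TF:
        return dicts
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(documents)
    # Weight each count by log(N / df); terms found in every document get 0.
    return [{t: c * math.log(n_docs / df[t]) for t, c in d.items()} for d in dicts]
```

With this weighting, a term that occurs in every document receives a TF-IDF weight of zero, which is the usual motivation for choosing TF-IDF over plain TF.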
Create frequency-based feature vectors from text
Metric must be either FrequencyMetrics.TF or FrequencyMetrics.TF_IDF.
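
To make the term-document matrix layout concrete, the following sketch arranges per-document frequency dictionaries into a matrix. The helper name to_term_document_matrix() is hypothetical, and the layout (one row per term in sorted order, one column per document in input order) is an assumption about the conventional form, not a statement of how text_to_vector() orders its output.

```python
def to_term_document_matrix(freq_dicts):
    """Arrange per-document {term: weight} dicts as a term-document matrix.

    Hypothetical helper: rows follow the sorted vocabulary, columns follow
    the order of the input documents.
    """
    # Vocabulary: every term seen in any document, in a stable order.
    terms = sorted({term for d in freq_dicts for term in d})
    # One row per term; terms missing from a document get weight 0.
    matrix = [[d.get(term, 0) for d in freq_dicts] for term in terms]
    return terms, matrix


terms, matrix = to_term_document_matrix([{"cat": 1, "the": 2}, {"dog": 1, "the": 1}])
for term, row in zip(terms, matrix):
    print(term, row)
```

Each column of the result is the feature vector of one document, so documents can be compared by comparing columns.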