Package org.carrot2.text.preprocessing
Text preprocessing components.
Class    Description
Performs basic preprocessing steps on the provided documents.
Performs a complete preprocessing on the provided documents.
Assigns document to label candidates.
Applies basic filtering to words and phrases to produce candidates for cluster labels.
Formats cluster labels for final rendering.
Extracts frequent phrases from the provided document.
Iterates over tokenized documents in PreprocessingContext.
Document preprocessing context provides low-level (usually integer-coded) data structures useful for further processing.
Information about all fields processed for the input documents.
Sparse array encoding utilities.