Class DocumentAssigner

java.lang.Object
org.carrot2.attrs.AttrComposite
org.carrot2.text.preprocessing.DocumentAssigner
All Implemented Interfaces:
AcceptingVisitor

public class DocumentAssigner extends AttrComposite
Assigns documents to label candidates. For each label candidate from PreprocessingContext.AllLabels.featureIndex, a BitSet marking the assigned documents is constructed. The assignment algorithm is simple: to be assigned to a label, a document must contain at least one occurrence of each non-stop word from the label.
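
The assignment rule described above can be illustrated with a minimal, self-contained sketch. It does not use the Carrot2 API: the class DocumentAssignmentSketch, the method assign, and the representation of documents as sets of words are all hypothetical and chosen only for illustration. The actual implementation works on the index structures held in PreprocessingContext.

```java
import java.util.BitSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not part of Carrot2.
final class DocumentAssignmentSketch {

    /**
     * Returns a BitSet in which bit i is set when document i contains
     * every non-stop word of the label. Documents are modelled as plain
     * word sets (an assumption made for this sketch).
     */
    static BitSet assign(List<String> labelWords,
                         Set<String> stopWords,
                         List<Set<String>> documents) {
        BitSet assigned = new BitSet(documents.size());
        for (int doc = 0; doc < documents.size(); doc++) {
            boolean containsAll = true;
            for (String word : labelWords) {
                if (stopWords.contains(word)) {
                    continue; // stop words do not have to occur in the document
                }
                if (!documents.get(doc).contains(word)) {
                    containsAll = false;
                    break;
                }
            }
            if (containsAll) {
                assigned.set(doc);
            }
        }
        return assigned;
    }

    public static void main(String[] args) {
        Set<String> stopWords = Set.of("of", "the");
        List<Set<String>> documents = List.of(
            Set.of("data", "mining", "of", "text"), // document 0
            Set.of("data", "base"),                 // document 1
            Set.of("mining", "the", "data"));       // document 2

        // Label "mining of data": the non-stop words are "mining" and "data".
        BitSet assigned = assign(List.of("mining", "of", "data"), stopWords, documents);
        System.out.println(assigned); // prints {0, 2}
    }
}
```

Document 1 is not assigned because it lacks "mining". The stop word "of" is ignored, so document 2 is assigned even though it does not contain it.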

This class saves the following results to the PreprocessingContext:

This class requires that InputTokenizer, CaseNormalizer, StopListMarker, PhraseExtractor and LabelFilterProcessor be invoked first.