Class BasicPreprocessingPipeline

java.lang.Object
org.carrot2.attrs.AttrComposite
org.carrot2.text.preprocessing.BasicPreprocessingPipeline
All Implemented Interfaces:
AcceptingVisitor, ContextPreprocessor

public class BasicPreprocessingPipeline extends AttrComposite implements ContextPreprocessor
Performs basic preprocessing steps on the provided documents. The preprocessing consists of the following steps:
  1. InputTokenizer
  2. CaseNormalizer
  3. LanguageModelStemmer
  4. StopListMarker
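
The four steps above can be pictured with a small, self-contained sketch. It does not use the Carrot2 API: `PipelineSketch`, `Token`, `STOP_WORDS` and every method name are hypothetical, the lowercasing and suffix-stripping logic are toy stand-ins for the real components, and the stop list is invented for illustration. It only shows the order in which each stage would enrich the tokens.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical illustration only; none of these names come from Carrot2. */
final class PipelineSketch {

  /** One token plus the attributes each step fills in. */
  static final class Token {
    final String image;  // text as produced by the tokenizer
    String normalized;   // case-normalized form
    String stem;         // stemmed form
    boolean stopWord;    // true if the word is on the stop list

    Token(String image) {
      this.image = image;
    }
  }

  // Toy stop list, assumed for this example.
  static final Set<String> STOP_WORDS = Set.of("the", "a", "of", "and");

  // Step 1 (stand-in for InputTokenizer): split on non-letter, non-digit runs.
  static List<Token> tokenize(String text) {
    List<Token> tokens = new ArrayList<>();
    for (String part : text.split("[^\\p{L}\\p{N}]+")) {
      if (!part.isEmpty()) {
        tokens.add(new Token(part));
      }
    }
    return tokens;
  }

  // Step 2 (stand-in for CaseNormalizer): plain lowercasing.
  static void normalizeCase(List<Token> tokens) {
    for (Token t : tokens) {
      t.normalized = t.image.toLowerCase(Locale.ROOT);
    }
  }

  // Step 3 (stand-in for LanguageModelStemmer): strip a trailing "s"
  // from words longer than three characters.
  static void stem(List<Token> tokens) {
    for (Token t : tokens) {
      String w = t.normalized;
      t.stem = (w.length() > 3 && w.endsWith("s")) ? w.substring(0, w.length() - 1) : w;
    }
  }

  // Step 4 (stand-in for StopListMarker): flag words found on the stop list.
  static void markStopWords(List<Token> tokens) {
    for (Token t : tokens) {
      t.stopWord = STOP_WORDS.contains(t.normalized);
    }
  }

  // Runs the four steps in the order listed above.
  static List<Token> preprocess(String text) {
    List<Token> tokens = tokenize(text);
    normalizeCase(tokens);
    stem(tokens);
    markStopWords(tokens);
    return tokens;
  }
}
```

Calling `PipelineSketch.preprocess("The Clusters of Documents")` in this sketch yields four tokens, with "The" and "of" flagged as stop words. Each later step reads only what earlier steps produced, which is why the order matters.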