Class PreprocessingContext.AllPhrases

java.lang.Object
org.carrot2.text.preprocessing.PreprocessingContext.AllPhrases
Enclosing class:
PreprocessingContext

public class PreprocessingContext.AllPhrases extends Object
Information about all frequently appearing sequences of words found in the input documents. Each entry in each array corresponds to one sequence.

All arrays in this class have the same length, and values stored at the same index in different arrays describe the same sequence.
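The parallel-array layout described above can be illustrated with a minimal, stand-alone sketch. It does not use the Carrot2 classes: the arrays are hard-coded stand-ins for the public fields, and the `wordImages` array and the `phraseText` and `describe` helpers are hypothetical names introduced only for this example. In a real context the word strings would be looked up through PreprocessingContext.AllWords.

```java
import java.util.ArrayList;
import java.util.List;

public class AllPhrasesLayoutSketch {
    /** Joins the words of one phrase, looking each word up by its index. */
    public static String phraseText(int[] phraseWordIndices, String[] wordImages) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < phraseWordIndices.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(wordImages[phraseWordIndices[i]]);
        }
        return sb.toString();
    }

    /** Formats every phrase together with the tf value stored at the same index. */
    public static List<String> describe(int[][] wordIndices, int[] tf, String[] wordImages) {
        List<String> out = new ArrayList<>();
        // Index p identifies one phrase across all parallel arrays.
        for (int p = 0; p < wordIndices.length; p++) {
            out.add(phraseText(wordIndices[p], wordImages) + " (tf=" + tf[p] + ")");
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical data: four words and two phrases built from them.
        String[] wordImages = {"data", "mining", "web", "search"};
        int[][] wordIndices = {{0, 1}, {2, 3}};
        int[] tf = {5, 3};
        describe(wordIndices, tf, wordImages).forEach(System.out::println);
    }
}
```

Because every array is indexed by the same phrase number, a single loop counter is enough to read all information about one phrase.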

  • Field Details

    • wordIndices

      public int[][] wordIndices
      Indices into PreprocessingContext.AllWords, one for each word in the phrase sequence.

      This array is produced by PhraseExtractor.

    • tf

      public int[] tf
      Term frequency of the phrase.

      This array is produced by PhraseExtractor.

    • tfByDocument

      public int[][] tfByDocument
      Term frequency of the phrase in each document. This array is encoded like PreprocessingContext.AllWords.tfByDocument: as consecutive (document index, frequency) pairs.

      This array is produced by PhraseExtractor. The order of documents in this array is not defined.
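The pair encoding of tfByDocument can be decoded as in the following minimal sketch. It is a stand-alone illustration, not part of the Carrot2 API: the `TfByDocumentSketch` class and its `decode` helper are hypothetical, and the sample array is invented. The sketch assumes each inner array holds an even number of elements laid out as document index followed by frequency, as described above.

```java
import java.util.Map;
import java.util.TreeMap;

public class TfByDocumentSketch {
    /**
     * Decodes one phrase's flat array of the form
     * [docIndex, frequency, docIndex, frequency, ...]
     * into a map from document index to frequency.
     */
    public static Map<Integer, Integer> decode(int[] encoded) {
        // TreeMap sorts by document index, since the source order is not defined.
        Map<Integer, Integer> byDocument = new TreeMap<>();
        for (int i = 0; i + 1 < encoded.length; i += 2) {
            byDocument.put(encoded[i], encoded[i + 1]);
        }
        return byDocument;
    }

    public static void main(String[] args) {
        // Hypothetical entry: the phrase occurs twice in document 7 and once in document 3.
        int[] encoded = {7, 2, 3, 1};
        System.out.println(decode(encoded));
    }
}
```

Collecting the pairs into a sorted map is one way to obtain a stable iteration order, which callers cannot otherwise rely on because the order of documents in the array is not defined.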

  • Constructor Details

    • AllPhrases

      public AllPhrases()
  • Method Details