Package org.carrot2.clustering.lingo
Class ClusterBuilder
java.lang.Object
org.carrot2.attrs.AttrComposite
org.carrot2.clustering.lingo.ClusterBuilder
- All Implemented Interfaces:
AcceptingVisitor
Builds cluster labels based on the reduced term-document matrix and assigns documents to the
labels.
-
Field Summary
Fields
Field: Description
clusterMergingThreshold: Percentage of overlap between two clusters' document sets at which to merge the clusters.
labelAssigner: The method of assigning documents to labels when forming clusters.
phraseLabelBoost: Weight of multi-word labels relative to one-word labels.
phraseLengthPenaltyStart: Phrase length at which the overlong multi-word labels should start to be penalized.
phraseLengthPenaltyStop: Phrase length at which the overlong multi-word labels should be removed completely.
Fields inherited from class org.carrot2.attrs.AttrComposite
attributes
-
Constructor Summary
Constructors
ClusterBuilder()
-
Method Summary
Methods inherited from class org.carrot2.attrs.AttrComposite
accept
-
Field Details
-
phraseLabelBoost
Weight of multi-word labels relative to one-word labels. Low values result in more one-word labels being produced; higher values favor multi-word labels.
-
phraseLengthPenaltyStart
Phrase length at which the overlong multi-word labels should start to be penalized. Phrases shorter than phraseLengthPenaltyStart will not be penalized.
-
phraseLengthPenaltyStop
Phrase length at which the overlong multi-word labels should be removed completely. Phrases longer than phraseLengthPenaltyStop will be removed.
-
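The interaction between phraseLengthPenaltyStart and phraseLengthPenaltyStop can be pictured with a small sketch. This is not the code ClusterBuilder runs: the class name PhraseLengthPenaltySketch, the method weight, and in particular the linear ramp between the two thresholds are assumptions made for illustration only. What the documentation above does state is that phrases shorter than phraseLengthPenaltyStart are not penalized and phrases longer than phraseLengthPenaltyStop are removed.

```java
/**
 * Hypothetical illustration of a phrase-length penalty.
 * Not part of Carrot2; the linear ramp is an assumed shape.
 */
final class PhraseLengthPenaltySketch {
    private PhraseLengthPenaltySketch() {}

    /**
     * Returns a multiplier in [0, 1] applied to a label's score.
     *
     * @param words        number of words in the candidate label
     * @param penaltyStart plays the role of phraseLengthPenaltyStart
     * @param penaltyStop  plays the role of phraseLengthPenaltyStop
     */
    static double weight(int words, int penaltyStart, int penaltyStop) {
        if (penaltyStart > penaltyStop) {
            throw new IllegalArgumentException("penaltyStart must not exceed penaltyStop");
        }
        if (words < penaltyStart) {
            return 1.0; // shorter than the start threshold: no penalty
        }
        if (words > penaltyStop) {
            return 0.0; // longer than the stop threshold: label is dropped
        }
        // Assumed linear ramp: the multiplier decreases with every word
        // from penaltyStart up to penaltyStop and reaches 0 just past penaltyStop.
        int steps = penaltyStop - penaltyStart + 2;
        return 1.0 - (double) (words - penaltyStart + 1) / steps;
    }
}
```

Under these assumptions, with a start of 4 and a stop of 7, a three-word label keeps its full score, labels of four to seven words are progressively down-weighted, and an eight-word label is discarded.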
clusterMergingThreshold
Percentage of overlap between two clusters' document sets at which to merge the clusters. Low values will result in more aggressive merging, which may lead to irrelevant documents in clusters. High values will result in fewer clusters being merged, which may lead to very similar or duplicated clusters.
-
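To make the effect of clusterMergingThreshold concrete, the following sketch shows one plausible way an overlap test between two clusters' document sets could work. It is an illustration under stated assumptions, not the library's implementation: the class ClusterMergeSketch, its methods, and the choice to measure overlap relative to the smaller of the two sets are all hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

/**
 * Hypothetical illustration of threshold-based cluster merging.
 * Not part of Carrot2; the overlap measure is an assumption.
 */
final class ClusterMergeSketch {
    private ClusterMergeSketch() {}

    /**
     * Fraction of shared documents, relative to the smaller set
     * (assumed definition of "overlap").
     */
    static double overlap(Set<Integer> docsA, Set<Integer> docsB) {
        if (docsA.isEmpty() || docsB.isEmpty()) {
            return 0.0;
        }
        Set<Integer> shared = new HashSet<>(docsA);
        shared.retainAll(docsB); // keep only document ids present in both sets
        return (double) shared.size() / Math.min(docsA.size(), docsB.size());
    }

    /** Two clusters are merged when their overlap reaches the threshold. */
    static boolean shouldMerge(Set<Integer> docsA, Set<Integer> docsB, double threshold) {
        return overlap(docsA, docsB) >= threshold;
    }
}
```

With document sets {1, 2, 3, 4} and {3, 4, 5}, two of the three documents in the smaller set are shared. Under the assumed measure, a threshold of 0.5 merges the two clusters, while a threshold of 0.7 keeps them separate, which matches the documented trade-off: lower values merge more aggressively, higher values leave near-duplicate clusters in place.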
labelAssigner
The method of assigning documents to labels when forming clusters.
-
-
Constructor Details
-
ClusterBuilder
public ClusterBuilder()
-