Adaptive Feature Classifier Properties Window

You must retrain the project before any changes to these settings take effect.

Text Filtering

This group has the following settings:

Use digits

This setting controls whether the classifier uses digits as features or ignores them during text filtering. (Default: Cleared)

Min. word length

All words that are shorter than this value are ignored during text filtering. Regardless of word length, features with a very low or very high frequency are also ignored. (Default: 3)
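The effect of the two Text Filtering settings can be illustrated with a minimal sketch. This is not the classifier's actual implementation: the function name `filter_words`, the tokenization rule, and the lowercasing step are assumptions made for illustration only, and the frequency-based filtering mentioned above is not modeled.

```python
import re


def filter_words(text, use_digits=False, min_word_length=3):
    """Return the words kept by a simple text filter (illustrative only)."""
    # Assumption: a token is a run of letters or a run of digits.
    tokens = re.findall(r"[A-Za-z]+|[0-9]+", text)
    kept = []
    for token in tokens:
        if token.isdigit() and not use_digits:
            continue  # digits are ignored unless "Use digits" is selected
        if len(token) < min_word_length:
            continue  # shorter than "Min. word length"
        kept.append(token.lower())
    return kept


print(filter_words("Invoice no 12345 is due on 01 May"))
print(filter_words("Invoice no 12345 is due on 01 May", use_digits=True))
```

With the default settings, short words such as "no" and "is" and all digit runs are dropped. With Use digits selected, "12345" is kept, while "01" is still dropped because it is shorter than the minimum word length.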

Training

This group has the following settings:

Max. number of features

Limits the number of internally generated features per class. (Default: 5000)

Min. feature length

Specifies the minimum number of characters that should be used for a feature. This value cannot be smaller than the Min. word length. (Default: 3)

Max. feature length

Specifies the maximum number of characters that are used for a feature. This value should not exceed 64 characters. (Default: 50)
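The constraints on the two feature length settings can be summarized in a short check. This is only a sketch: the function `check_feature_lengths` is hypothetical and is not part of the product, and the window itself may enforce these limits differently.

```python
def check_feature_lengths(min_word_length=3, min_feature_length=3,
                          max_feature_length=50):
    """Return a list of problems with the length settings (illustrative only)."""
    problems = []
    # Min. feature length cannot be smaller than Min. word length.
    if min_feature_length < min_word_length:
        problems.append("Min. feature length is smaller than Min. word length")
    # Max. feature length should not be larger than 64 characters.
    if max_feature_length > 64:
        problems.append("Max. feature length is larger than 64")
    return problems


print(check_feature_lengths())                        # defaults
print(check_feature_lengths(min_feature_length=2))    # below Min. word length
```

The default values (3, 3, and 50) satisfy both constraints, so the first call reports no problems.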

Automatic selection of Min. feature frequency

Enables the Min. feature frequency to be set automatically. If this setting is selected, you cannot manually assign a Min. feature frequency value. (Default: Cleared)

Min. feature frequency

Specifies how often a substring must occur in the training set of a class to be used as a feature for content classification. (Default: 2)
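As a rough illustration of a minimum frequency threshold, the following sketch counts how often each candidate substring occurs in the sample documents of one class and keeps only those that reach the threshold. The function `frequent_features` is hypothetical, and counting total occurrences across all sample documents (rather than, for example, the number of documents that contain the substring) is an assumption.

```python
from collections import Counter


def frequent_features(class_documents, candidates, min_frequency=2):
    """Keep candidates that occur at least min_frequency times (illustrative only)."""
    counts = Counter()
    for document in class_documents:
        for feature in candidates:
            # str.count returns the number of non-overlapping occurrences.
            counts[feature] += document.count(feature)
    return {feature for feature, n in counts.items() if n >= min_frequency}


samples = ["invoice total", "invoice date"]
print(frequent_features(samples, ["invoice", "total"]))
```

Here "invoice" occurs twice in the training samples and is kept, while "total" occurs only once and is dropped at the default threshold of 2.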

Start features at beginning of words

Specifies that a feature substring must start at the beginning of a word. If this setting is cleared, the substring can start anywhere. (Default: Selected)

Max. words per feature (0-n)

Limits the number of words per feature. A value of zero means unlimited words, although the total number of characters of the words per feature cannot exceed the "Max. feature length" property. (Default: 2)
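To show how the length, word count, and word-start settings interact, the following sketch enumerates candidate substrings from a list of filtered words. It is an assumption-laden illustration, not the classifier's algorithm: the function `candidate_features`, the use of a single space to join words, and the rule that discards candidates ending in a space are all hypothetical.

```python
def candidate_features(words, min_len=3, max_len=50, max_words=2,
                       start_at_word_beginning=True):
    """Enumerate candidate feature substrings (illustrative only)."""
    features = set()
    for i, first_word in enumerate(words):
        # A max_words value of 0 means the number of words is unlimited.
        window = words[i:] if max_words == 0 else words[i:i + max_words]
        text = " ".join(window)
        # Start only at the word boundary, or at any character of the first word.
        starts = [0] if start_at_word_beginning else range(len(first_word))
        for start in starts:
            # The total character count never exceeds max_len.
            longest = min(len(text), start + max_len)
            for end in range(start + min_len, longest + 1):
                if text[end - 1] != " ":  # assumption: skip trailing spaces
                    features.add(text[start:end])
    return features


print(sorted(candidate_features(["due", "may"])))
print(sorted(candidate_features(["due", "may"], max_words=1)))
```

With the default limit of two words per feature, the candidates include substrings that span both words, such as "due may". With a limit of one word, only "due" and "may" remain.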

Use fuzzy string match

Enables fuzzy string matching, at the cost of slower classification performance. (Default: Cleared)

Fuzzy length (5-10)

Configures the fuzzy string comparison. (Default: 5)

Automatic selection of Min. class entropy

Enables the Min. class entropy to be set automatically. If this setting is selected, you cannot manually assign a Min. class entropy value. (Default: Cleared)

Min. class entropy (0.0 - 1.0)

Controls the importance of a feature, depending on the number of classes in which it occurs. A value of 1.0 requires that a feature occurs only in the sample documents of a single class; otherwise, it is not used for classification. The lower the value, the more classes can contain the feature in the training set. (Default: 0.600)
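This topic does not give the formula behind the class entropy value. One measure that behaves as described is one minus the normalized Shannon entropy of a feature's distribution across classes: it equals 1.0 when the feature occurs in a single class and falls toward 0.0 as the feature spreads evenly over all classes. The sketch below uses that measure purely as an assumption; the function names `class_entropy_score` and `is_feature_used` are hypothetical.

```python
import math


def class_entropy_score(class_counts):
    """Score in [0.0, 1.0]; 1.0 means the feature occurs in one class only.

    class_counts maps every class name to the number of times the feature
    occurs in that class's sample documents (illustrative only).
    """
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    if total == 0:
        return 0.0  # the feature does not occur at all
    if n_classes < 2:
        return 1.0  # with a single class there is nothing to distinguish
    entropy = -sum((n / total) * math.log(n / total)
                   for n in class_counts.values() if n > 0)
    # Normalize by the largest possible entropy for this number of classes.
    return 1.0 - entropy / math.log(n_classes)


def is_feature_used(class_counts, min_class_entropy=0.6):
    """Assumed rule: keep the feature if its score reaches the threshold."""
    return class_entropy_score(class_counts) >= min_class_entropy


print(class_entropy_score({"invoice": 10, "order": 0, "letter": 0}))
print(is_feature_used({"invoice": 9, "order": 1}))
```

Under this assumed measure, a feature that occurs nine times in one class and once in another scores about 0.53, so it would be rejected at the default threshold of 0.600 but accepted at 0.500.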

The following buttons are available at the bottom of this window:

Button    Description

OK        Closes the window and saves your changes.

Cancel    Closes the window without saving your changes.

Apply     Applies your changes without closing the window.
