General tab - Properties of Trainable Evaluator window

If you want quick, good generic extraction results with a relatively small training set, the highly optimizable Trainable Evaluator is recommended. This evaluator compares alternatives from other locators to determine which of those alternatives match a specific set of criteria. It relies on alternatives from the input locator and also learns from false alternatives to improve training.

During training, the Trainable Evaluator learns the correct alternative from the input locator for each subfield. This training information is applied during extraction: for each subfield, the best alternative is chosen from its respective input locator.
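The selection step described above can be pictured with a minimal Python sketch. This is a conceptual illustration only, not the product's implementation or API: the `Alternative` class, its `score` field, and the `pick_best` function are hypothetical names introduced here, and the score stands in for whatever confidence the evaluator derives from its training.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Alternative:
    """One value delivered by an input locator (hypothetical structure)."""
    text: str
    score: float  # assumed confidence assigned by the evaluator


def pick_best(
    alternatives_by_subfield: Dict[str, List[Alternative]]
) -> Dict[str, Optional[Alternative]]:
    """For each subfield, choose the highest-scoring alternative.

    A subfield whose input locator delivered nothing maps to None.
    """
    result: Dict[str, Optional[Alternative]] = {}
    for subfield, alternatives in alternatives_by_subfield.items():
        if alternatives:
            result[subfield] = max(alternatives, key=lambda a: a.score)
        else:
            result[subfield] = None
    return result
```

Each subfield is evaluated independently, so two subfields can draw on the same input locator and still receive different results.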

During configuration, it is important to add representative training documents for each subfield.

The input locators should not be so restrictive that they deliver only the correct values. The Trainable Evaluator also needs false alternatives to enhance the extraction results. You can use the same input locator for different subfields, which can also enhance the extraction results.
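To illustrate why false alternatives matter, the following Python sketch splits the alternatives from an input locator into positive and negative training examples. It is a simplified assumption about how such training data could be organized, not a description of the product's internals. The function names `label_alternatives` and `has_false_alternatives` are hypothetical.

```python
from typing import List, Tuple


def label_alternatives(
    alternatives: List[str], trained_value: str
) -> Tuple[List[str], List[str]]:
    """Split alternatives into (positives, negatives) for one subfield.

    Alternatives equal to the trained value are positive examples;
    all others are false alternatives, used as negative examples.
    """
    positives = [a for a in alternatives if a == trained_value]
    negatives = [a for a in alternatives if a != trained_value]
    return positives, negatives


def has_false_alternatives(alternatives: List[str], trained_value: str) -> bool:
    """Return True if the locator delivered at least one false alternative."""
    _, negatives = label_alternatives(alternatives, trained_value)
    return len(negatives) > 0
```

In this picture, an input locator that returns only the trained value produces no negative examples, which leaves nothing to contrast the correct value against.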

The table has the following columns:

Name

Add a descriptive name for the subfield.

Input Locator

Select the input locator from the list. The list includes only existing locators and evaluators, and these must be defined above this Trainable Evaluator in the project hierarchy.

Micro Layout

This setting indicates where this evaluator searches for useful keywords in relation to a candidate. During training, the candidate is the trained value. At runtime, the candidate refers to each of the input locator alternatives as they are evaluated.

The following values are available for this setting:

  • Table based. (Default: Selected)

    Keywords are expected at the top or to the left of the candidate (N and W).

    Use this value when searching for content located in tables.

  • Left top oriented.

    Keywords are expected at the top and left of the candidate (N, NW, and W).

    For example, if the keyword is located in the middle of the page, the candidate should be found below or to the right of it, in the bottom right quadrant of the document. Use this value when searching for content located in captions or before colons.

  • Symmetric.

    Keywords are expected in all directions around the candidate.

    Use this value when searching for content that is located in running text.

  • Line oriented.

    Keywords are expected to the left and to the right of the candidate (W and E).

    Use this value when searching for content in the same line.
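The four Micro Layout values above can be summarized as sets of compass directions. The following Python sketch is a conceptual model only: the names `MicroLayout`, `ALLOWED_DIRECTIONS`, `direction`, and `keyword_is_considered`, the point-based geometry, and the `tolerance` parameter are all assumptions made for illustration and do not reflect how the product measures positions.

```python
from enum import Enum
from typing import Tuple

Point = Tuple[float, float]  # (x, y) with y growing downward, as on a page


class MicroLayout(Enum):
    TABLE_BASED = "Table based"
    LEFT_TOP_ORIENTED = "Left top oriented"
    SYMMETRIC = "Symmetric"
    LINE_ORIENTED = "Line oriented"


# Directions in which keywords are expected, relative to the candidate.
ALLOWED_DIRECTIONS = {
    MicroLayout.TABLE_BASED: {"N", "W"},
    MicroLayout.LEFT_TOP_ORIENTED: {"N", "NW", "W"},
    MicroLayout.SYMMETRIC: {"N", "NE", "E", "SE", "S", "SW", "W", "NW"},
    MicroLayout.LINE_ORIENTED: {"W", "E"},
}


def direction(candidate: Point, keyword: Point, tolerance: float = 5.0) -> str:
    """Compass direction of the keyword as seen from the candidate.

    Offsets within `tolerance` count as aligned on that axis. Returns an
    empty string if the two points coincide within the tolerance.
    """
    dx = keyword[0] - candidate[0]
    dy = keyword[1] - candidate[1]
    vertical = "" if abs(dy) <= tolerance else ("N" if dy < 0 else "S")
    horizontal = "" if abs(dx) <= tolerance else ("W" if dx < 0 else "E")
    return vertical + horizontal


def keyword_is_considered(
    layout: MicroLayout, candidate: Point, keyword: Point, tolerance: float = 5.0
) -> bool:
    """Return True if the keyword lies in a direction the layout searches."""
    return direction(candidate, keyword, tolerance) in ALLOWED_DIRECTIONS[layout]
```

Under this model, a keyword diagonally above and to the left of the candidate (NW) counts for Left top oriented and Symmetric, but not for Table based or Line oriented.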

If you change the value for the Micro Layout for a subfield that has already been trained, you must train the project again to ensure that the knowledge base has the correct information.

Use the following buttons to manage your subfields:

Button

Description

Add

Adds a new subfield row to the list where you can configure it as needed.

Delete

Deletes the selected subfield from the list.

Up

Moves the selected subfield up one position in the list.

This button and the Down button are for organizational purposes only. The order of subfields does not affect the output of this evaluator.

Down

Moves the selected subfield down one position in the list.

The following buttons are available at the bottom of this window:

Button

Description

Close

Closes the window and saves your changes.

Test

Tests the locator settings. The results appear on the Test Results tab, which opens automatically when you click this button.

Depending on the locator method, this button may have additional modes if the locator uses other locators as input.

Help

Displays the help for the open window.
