Evaluation Settings tab - Properties of Format Locator window
The format locator works with format definitions such as pattern matching (regular expressions and simple expressions) and advanced algorithms (Levenshtein and trigrams). The format definitions, together with dictionaries and keywords, are used to extract data from documents without the need to define zones. The locator runs on a full or partial page read of the document and extracts the data using searches that are specific to the data, not the document layout. The locator evaluates the found alternatives and the data output.
Use this tab to configure format evaluations.
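The matching methods named above can be illustrated with a short Python sketch. This is not the locator's implementation; it only shows what a regular expression, a Levenshtein distance, and a trigram comparison compute. The `INVOICE_NO` pattern, the padding scheme, and the Jaccard-style similarity are assumptions chosen for illustration.

```python
import re

# Hypothetical format definition: two digits, a hyphen, four digits.
INVOICE_NO = re.compile(r"\b\d{2}-\d{4}\b")


def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def trigrams(text):
    """Set of three-character windows over the padded, lowercased text."""
    padded = "  " + text.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def trigram_similarity(a, b):
    """Shared trigrams divided by all trigrams (1.0 means identical sets)."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


# A recognition error such as "l" read instead of "I" costs one edit,
# so a fuzzy comparison can still treat the two words as close.
print(levenshtein("Invoice", "lnvoice"))
print(trigram_similarity("Invoice", "lnvoice"))
print(INVOICE_NO.findall("Invoice No: 12-3456"))
```

Exact pattern matching accepts or rejects a string outright, whereas the two similarity measures return a graded value, which is why they tolerate recognition errors.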
- Keywords
-
Use the following buttons to manage the keywords:
| Button | Description |
|---|---|
| Delete | Removes the selected keyword from the keywords table. |
| Add | Creates a new keyword using the specified keyword settings and adds it to the table below. |
| Modify | Changes the selected keyword and updates the keywords table with the new settings. |
| Clear | Removes each of the settings from the selected keyword so you can start again. |
This group has the following settings:
- Ignore these characters
-
The characters typed here are ignored when searching for keywords. The default values are !/().:, and ; for this setting.
This setting does not work when a dictionary is inserted in the same format locator.
- Keyword
-
Type a word or phrase that is likely to be found on a document. You can also select a dictionary if the list of keywords is long and in database format. The presence or absence of this keyword is used to rank the results that match the formats.
- Match each word exactly (not fuzzy)
-
Enable this setting if you want to match a keyword phrase exactly. At least one of the words in the phrase should exactly match words on the document. Any recognition errors can lower the confidence of the desired alternative. (Default: Cleared)
If you are processing PDF documents, no recognition is performed, and the embedded text is used instead. This means that you can use this setting without fear of your alternatives receiving lower confidences because there is no chance of recognition errors.
- Match all words as a phrase
-
Enable this setting if you want to search a document for each of the words included in a phrase. If the phrase contains recognition errors, it can be included in the result, but with a lower confidence. (Default: Cleared)
- Weight
-
Type a value or use the slider to specify a weight. A positive number indicates that a keyword is near the desired value, and a negative number indicates that any matches near that keyword should be excluded from the results. (Default: 100)
- Distance
-
Select one of the three distance settings, Near, Medium, or Far, to indicate how far the keyword is located from the desired result. (Default: Near)
- Keyword in relation to match
-
Select the direction of the keyword in relation to the desired result. The eight directions are W, NW, N, NE, E, SE, S, and SW. The W, NW, and N directions are selected by default.
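How the keyword settings above could combine into a ranking contribution is sketched below in Python. Everything here is an assumption made for illustration: the radii behind Near, Medium, and Far, the page coordinates, the `KeywordRule` class, and the scoring rule itself are hypothetical and do not describe the product's internal algorithm. The ignore list is the default read from the setting description, assuming the comma is one of the characters.

```python
import math
from dataclasses import dataclass

# Hypothetical radii in page units; the real thresholds are not documented here.
DISTANCE_RADIUS = {"Near": 100.0, "Medium": 300.0, "Far": 1000.0}
# Compass sectors listed counter-clockwise, starting at east.
COMPASS = ["E", "NE", "N", "NW", "W", "SW", "S", "SE"]


@dataclass
class KeywordRule:
    """Hypothetical container mirroring the settings on this tab."""
    text: str
    weight: int = 100
    distance: str = "Near"
    directions: frozenset = frozenset({"W", "NW", "N"})


def normalize(text, ignore="!/().:,;"):
    """Drop the ignored characters before comparing keywords."""
    return "".join(ch for ch in text if ch not in ignore).strip().lower()


def direction_of(keyword_xy, match_xy):
    """Compass direction of the keyword as seen from the match."""
    dx = keyword_xy[0] - match_xy[0]
    dy = match_xy[1] - keyword_xy[1]  # page y grows downward, so north is smaller y
    angle = math.degrees(math.atan2(dy, dx)) % 360
    return COMPASS[int((angle + 22.5) // 45) % 8]


def keyword_score(rule, keyword_xy, match_xy):
    """Return the rule's weight if the keyword lies within the configured
    distance and in an allowed direction; otherwise contribute nothing."""
    if math.dist(keyword_xy, match_xy) > DISTANCE_RADIUS[rule.distance]:
        return 0
    if direction_of(keyword_xy, match_xy) not in rule.directions:
        return 0
    return rule.weight


# A keyword directly to the left of a candidate, within the Near radius.
rule = KeywordRule(text=normalize("Invoice No.:"))
print(rule.text, keyword_score(rule, (100, 200), (180, 200)))
```

In this sketch a negative weight lowers a candidate's rank in the same way a positive weight raises it, which mirrors the exclusion behavior described for the Weight setting.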
The following buttons are available at the bottom of this window:
| Button | Description |
|---|---|
| Close | Closes the window and saves your changes. |
| Test | Tests the locator settings. The results are displayed on the Test Results tab that is displayed automatically when you click this button. Depending on the locator method, this button may have additional modes if the locator uses other locators as input. |
| Help | Displays the help for the open window. |
Related topics:
- More information on configuring a Format Locator