Advanced tab - Properties of Text Content Locator window
This locator method finds data in unstructured documents that have no consistent layout. This enables you to extract data from
contracts, correspondence, or even essays and manuscripts. This locator works best for semi-structured documents, and is designed
for documents that contain unstructured text made up of sentences. You can extract data from unstructured documents with moderate success,
and the results should improve as you add more training documents.
Use this tab to configure training for online learning. The following settings are available:
- Collect online documents for training
  Select this setting if you want sample documents to be used to train this locator. (Default: Selected)
  If you already have around 500 training documents, clear this setting for the best results. Using more than 500 training documents may not improve extraction results, but it does increase the time it takes to train your project. Any slight gain in extraction results is outweighed by the additional training time.
- Maximum number of training documents
  Type a value or use the arrows to select the maximum number of training documents that are collected for this locator. (Default: 500)
  For the best results, do not increase this number unless you can regularly dedicate a lot of time to training your project.
The following buttons are available at the bottom of this window:
| Button | Description |
|---|---|
| Close | Closes the window and saves your changes. |
| Test | Tests the locator settings. The results are shown on the Test Results tab, which opens automatically when you click this button. Depending on the locator method, this button may have additional modes if the locator uses other locators as input. |
| Help | Displays the help for the open window. |
Related topics:
- More information about configuring a Text Content Locator