Notes for Reuse in Project Builder Topics
To save time selecting individual documents, click Select All or press Ctrl + A to select all documents.
After changing the properties of your classifiers, or after adding or deleting documents from your training set, you must retrain your project.
If a classification rule is applied to a document, a special icon is displayed next to the document in the Classification Results window. A tooltip for the icon explains the applicable rule.
As long as a child item within a class is selected, the class selection is implied. If you select a locator for a class and then open your Classification Training document set, any classification training documents for the class of the locator are displayed.
The selected Classification behavior setting in the Content Classification group on the Project Settings - Classification tab directly affects how instructions are processed for your project. If you select the default Classify first page only value, your configured instructions are applied to the first page only. Similarly, if you select the Do not use content classification value, your instructions are not processed at all.
The only fields that you can configure for Correction are fields that are populated by an Advanced Zone Locator or an OCR Voting Evaluator.
Only a Table Locator, an Advanced Table Locator, a Predefined Document Type Locator, a Line Item Matching Locator, and a Standard Evaluator support table fields.
This shortcut key only works outside the Script Code window.
When you want to generate a classification or separation benchmark, first store the assigned class structure on disk using the Sort Documents on Disk by Class setting from the Benchmark Sets shortcut menu.
If you overwrite an inherited locator locally in a child class, make sure that the local locator has the same subfields as the locator in the parent class. Otherwise, an inherited assignment to a field might not be resolved. This can lead to an extraction error.
For example, an Advanced Zone Locator defined in a parent class creates two subfields, S_A and S_B, which are assigned to the fields F_A and F_B, respectively. The parent class has a child class in which the locator is locally overwritten with a format locator called “FormatLoc.” The format locator is assigned to field F_B. If you extract a document that is classified as the child class, you see an error because field F_A inherits a locator assignment (to S_A), but the local format locator no longer provides this subfield.
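The inheritance problem described above can be sketched as a simple consistency check. This is only an illustration in Python, not product script code: the function name unresolved_assignments and the dictionary layout are hypothetical, and the sketch assumes that each field maps to exactly one subfield name.

```python
def unresolved_assignments(inherited, local_subfields, locally_reassigned):
    """Return the fields whose inherited subfield assignment cannot be resolved.

    inherited          -- dict: field name -> subfield name assigned in the parent class
    local_subfields    -- set of subfield names the local (overwriting) locator provides
    locally_reassigned -- set of fields that received a new assignment in the child class
    """
    return sorted(
        field
        for field, subfield in inherited.items()
        if field not in locally_reassigned and subfield not in local_subfields
    )


# Parent class: the Advanced Zone Locator provides S_A and S_B.
parent_assignments = {"F_A": "S_A", "F_B": "S_B"}

# Child class: FormatLoc provides no subfields and is assigned to F_B only.
broken = unresolved_assignments(parent_assignments, set(), {"F_B"})
print(broken)
```

With the values from the example, only F_A is reported, because it still points to a subfield that the local locator does not provide. Giving the local locator the same subfields as the parent locator makes the list empty.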
To use the automatic table detection to find data on invoices, use the default global columns. The automatic algorithm cannot calculate with user-defined columns.
When applying localization, select a font that can properly display special characters, such as Chinese or Cyrillic characters.
This setting is available only when the selected locator is an Advanced Zone Locator.
If the inserted value exceeds the number of documents for a class, all documents for that class are copied to the first subset and the second subset does not contain any of those documents.
Documents that have no class assigned because they are not classified yet or cannot be classified are considered in the same way. That means they belong to the same "class."
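The splitting behavior in the two notes above can be illustrated with a short Python sketch. It assumes that the inserted value is a document count per class for the first subset and that documents are taken in their listed order; the actual selection order in the product may differ. The function name split_by_class and the sample file names are hypothetical.

```python
from collections import defaultdict


def split_by_class(documents, count_per_class):
    """Split (name, class) pairs into two subsets, class by class.

    Documents without a class (None) are grouped together like any other class.
    """
    groups = defaultdict(list)
    for name, doc_class in documents:
        groups[doc_class].append(name)

    first, second = [], []
    for names in groups.values():
        # If count_per_class exceeds len(names), the slice takes everything
        # and nothing is left for the second subset.
        first.extend(names[:count_per_class])
        second.extend(names[count_per_class:])
    return first, second


docs = [
    ("a.tif", "Invoice"), ("b.tif", "Invoice"), ("c.tif", "Invoice"),
    ("d.tif", "Order"),
    ("e.tif", None), ("f.tif", None),  # unclassified documents share one group
]
first, second = split_by_class(docs, 2)
print(first)
print(second)
```

With a value of 2, the single Order document and both unclassified documents all go to the first subset, and only the third Invoice document goes to the second subset.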
When using FineReader 12.4, your production machine that runs the Transformation Server module is limited to a maximum of 40 cores.
This column is displayed for the training document sets only.
The A2iA Zone Locator does not support Unicode characters for fields. For example, Japanese and Chinese characters are not supported in the project path or in A2iA field names.
If you are using document separation, you cannot use dynamic classifiers, so classification online learning can only collect documents automatically. The collected documents improve classification only after the training documents are imported and your project is retrained and published.
This type of format definition does not support Arabic or any other right-to-left text.
You require a separate license to use this recognition engine.
Communication between the Tungsten Clarity recognition profile and its server uses port 443. This port must be open in your firewall settings.
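One way to confirm that outbound traffic on port 443 is not blocked is a plain TCP connection test, sketched below in Python. The function name can_connect is hypothetical, and the server host name is not given in this note, so it must be supplied by you. The check only shows that a TCP connection can be opened; it does not verify proxy settings, TLS, or licensing.

```python
import socket


def can_connect(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example call (replace the placeholder with the actual server host name):
# can_connect("your-server-host-name")
```

If the function returns False from the machine that runs recognition, check the firewall rules for outbound connections on port 443.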
If you perform recognition using Tungsten Clarity during testing in the Transformation Designer or at runtime, one volume license per page is consumed each time an internet connection is made. If you want to avoid consuming too many runtime licenses during project configuration and testing, Tungsten Automation recommends that you test Tungsten Clarity on selected images only, and that you perform OCR on other test images or training images using another recognition engine.
During configuration, the best practice is to run this recognition engine a few times without a fallback recognition engine configured. This ensures that everything is working as expected and that the proper internet access is available.
The FineReader recognition engine will be deprecated in the next release of Tungsten TotalAgility. As a result, Tungsten Automation recommends that you use the OmniPage recognition profile for both page and zone recognition for all new projects. If you have existing projects that use one or more FineReader profiles, it is also recommended that you modify those projects to use a comparable OmniPage profile.
The RecoStar recognition engine will be deprecated in the next release of Tungsten TotalAgility. As a result, Tungsten Automation recommends that you use the OmniPage recognition profile for both page and zone recognition for all new projects. If you have existing projects that use one or more RecoStar profiles, it is also recommended that you modify those projects to use a comparable OmniPage profile.
The RecoStar and FineReader recognition engines will be deprecated in the next release of Tungsten TotalAgility. As a result, Tungsten Automation recommends that you use the OmniPage recognition profile for both page and zone recognition for all new projects. If you have existing projects that use one or more RecoStar or FineReader profiles, it is also recommended that you modify those projects to use a comparable OmniPage profile.
For the best results, train your project before converting the training documents to bitonal format. This ensures that any quality lost during conversion does not negatively affect the training results.
Similarly, ensure that all configuration and testing are complete before converting any Test Sets. This ensures that you are using the best quality documents to configure and test your extraction results.