Inferred or derived fields
These are fields whose values are not printed on the document, but that the locator may be able to derive from the information that is printed. Because the extracted value is not printed on the document, the highlighted result does not necessarily correspond to a useful location on the document.
Extracting inferred fields does not always work. Auto extraction is designed to extract data based only on the recognition results from the document. The more inference a field needs, the less obvious the data is in the document, and the lower the chance that the value can be derived successfully.
For example, you have a utility bill and you want to know the utility type. This information is not necessarily printed on the document, but it may be inferred from what is printed. Words such as power, wattage, and sewage can help determine the utility type.
Subfield Name: Utility Type
Description: The utility type of the bill. This can be Gas, Electric, or Water. If no value is found, return Unknown.
This does not work if a utility bill includes more than one utility type, because a field cannot return two different pieces of information. The first utility type listed on the form is always returned and any subsequent utility types are ignored.
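Because the description limits the result to a fixed set of values, it can be useful to check the returned value before using it downstream. The following is a minimal sketch of such a check, not part of the locator itself. The function name `normalize_utility_type` and the idea of running a post-extraction check are assumptions made for illustration.

```python
# Hypothetical post-extraction check; not part of the product.
# The allowed values mirror the example description above.
ALLOWED_UTILITY_TYPES = ("Gas", "Electric", "Water")


def normalize_utility_type(raw):
    """Map a returned value onto Gas, Electric, Water, or Unknown."""
    if raw is None:
        return "Unknown"
    cleaned = raw.strip().lower()
    for allowed in ALLOWED_UTILITY_TYPES:
        if cleaned == allowed.lower():
            return allowed
    # Anything outside the described set is treated as Unknown.
    return "Unknown"
```

A value such as `" electric "` is mapped to `Electric`, while an unexpected value such as `"Gas and Water"` falls back to `Unknown`, which makes it easy to flag for review.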
Other examples of inferred fields include:
- The language of a document.
- The country, if an address is printed on a document without the country. You can be as specific as you want here. For example, if you want to use the 3-character ISO country code, specify that in your description.
- The currency, which is similar to the country. This can be the symbol or the code. Specify exactly what you want in the description.
- The time zone of the individual the document references.
For example, you want to extract the amount that is outstanding on a document and include the currency symbol in the output. Neither the symbol nor the country is printed on the document, but this information may be derived from the information that is printed.
Subfield Name: Owed Amount
Description: The amount owed on the document. Return the currency symbol for the country where the bill recipient lives and include two decimal places in the result.
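Since the description asks for a specific output format, a simple format check can help when testing different wordings. The sketch below is an illustration only. The function name `split_owed_amount` is hypothetical, and the pattern assumes a leading currency symbol, an optional comma as the thousands separator, and a period as the decimal separator, so results from locales that use a decimal comma would need a different pattern.

```python
import re

# Assumed format: currency symbol, digits with optional comma grouping,
# then a period and exactly two decimal places (for example "$1,234.50").
OWED_AMOUNT = re.compile(
    r"(?P<symbol>[^\d\s.,]+)\s?(?P<number>\d+(?:,\d{3})*\.\d{2})"
)


def split_owed_amount(value):
    """Return (symbol, number) if the value has the expected format, else None."""
    match = OWED_AMOUNT.fullmatch(value.strip())
    if match is None:
        return None
    return match.group("symbol"), match.group("number")
```

A result without a currency symbol, or with only one decimal place, returns `None`, which suggests that the description wording needs adjusting.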
The following are examples of how you can use a description and inferred information for formatting:
- The 3-character country code of the country of the account holder.
- The total tax amount paid on this bill. The result should include the currency symbol of the country where the bill recipient lives and there should be three decimal places in the result.
- The time zone of the bill recipient. Use the full name of the time zone with the code in parentheses after the full name.
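When experimenting with descriptions like these, it can help to confirm that results have the requested shape. The following sketch checks only the shape of the 3-character country code and of the time zone format described above. The helper names are hypothetical, and the check does not verify that a code is a real ISO country code or a real time zone.

```python
import re

# Shape checks only: three uppercase letters, and "Full Name (CODE)".
COUNTRY_CODE = re.compile(r"[A-Z]{3}")
TIME_ZONE = re.compile(
    r"(?P<name>[A-Za-z][A-Za-z /'-]*[A-Za-z]) \((?P<code>[A-Z]{2,5})\)"
)


def looks_like_country_code(value):
    """True if the value is exactly three uppercase letters."""
    return COUNTRY_CODE.fullmatch(value.strip()) is not None


def split_time_zone(value):
    """Return (full name, code) if the value matches "Full Name (CODE)", else None."""
    match = TIME_ZONE.fullmatch(value.strip())
    if match is None:
        return None
    return match.group("name"), match.group("code")
```

For instance, `"Central European Time (CET)"` splits into the full name and the code, while a bare `"CET"` returns `None`.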
As with all descriptions, experiment with your wording through testing.