Representations
A representation is the result of an OCR conversion of the source file. More than one full-text conversion is possible, and each one can be stored under representations. A representation object may have the following properties:
- Pages
Each page in the representation is listed as created from the original source document. Normally there is a 1:1 relationship between XDoc pages and CDoc pages.
- TextLines
Text lines consisting of Words generated from OCR results.
- Blocks
Information found when extracting tables in combination with knowledge bases.
- Boxes
Information about handwritten and typewritten text found when using the Mixed Print recognition engine.
- Barcodes
Details about any recognized barcodes.
- Words
Indexed list of Words with characters, bounding rectangles and additional properties that are generated from character recognition.
- Paragraphs
An index of the recognized paragraphs, listed only if this is enabled in the project.
- Tables
Details about all tables found in a document, including table header information and row, column, and cell data.
- KeyValuePredictions
If key-value pairs are enabled, this representation shows all recognized keys and their values. On the image, keys are highlighted in yellow and their values in blue.
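
To make the relationship between these properties more concrete, the following is a minimal Python sketch of how a document might hold several representations, each with an indexed list of Words and TextLines that refer to those Words. It is not the product's API: every class, field, and function name here (Rect, Word, TextLine, Representation, Document, line_text, build_example) is a hypothetical stand-in chosen for illustration, and only a few of the properties listed above are modeled.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Rect:
    """Bounding rectangle of a word (hypothetical pixel coordinates)."""
    left: int
    top: int
    width: int
    height: int


@dataclass
class Word:
    """One recognized word with its page and bounding rectangle."""
    text: str
    page_index: int
    bounds: Rect


@dataclass
class TextLine:
    """A text line, stored as indices into the representation's word list."""
    word_indices: List[int] = field(default_factory=list)


@dataclass
class Representation:
    """One full-text (OCR) conversion of a source file."""
    name: str
    page_count: int = 0
    words: List[Word] = field(default_factory=list)
    text_lines: List[TextLine] = field(default_factory=list)

    def line_text(self, line_index: int) -> str:
        """Join the words of one text line with single spaces."""
        line = self.text_lines[line_index]
        return " ".join(self.words[i].text for i in line.word_indices)


@dataclass
class Document:
    """A document that can store more than one representation."""
    representations: List[Representation] = field(default_factory=list)


def build_example() -> Document:
    """Build a one-page document with a single made-up representation."""
    rep = Representation(name="ocr-pass-1", page_count=1)
    rep.words = [
        Word("Invoice", 0, Rect(100, 50, 180, 40)),
        Word("Number", 0, Rect(290, 50, 170, 40)),
        Word("12345", 0, Rect(100, 110, 120, 40)),
    ]
    # Each text line points at words by index rather than copying them.
    rep.text_lines = [TextLine([0, 1]), TextLine([2])]
    return Document(representations=[rep])


if __name__ == "__main__":
    doc = build_example()
    first = doc.representations[0]
    for i in range(len(first.text_lines)):
        print(first.line_text(i))
```

Storing text lines as indices into the word list is one possible design; it keeps a single indexed list of Words, in line with the Words property described above, while letting other structures refer to the same entries.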