Representations

A representation is the result of an OCR conversion of the source file. More than one full-text conversion is possible, and each can be stored as its own representation. A representation may have the following properties:

Pages

Lists each page of the representation as it was created from the original source document. Normally, there is a 1:1 relationship between XDoc pages and CDoc pages.

TextLines

Text lines consisting of Words generated from OCR results.

Blocks

Information found when extracting tables in combination with knowledge bases.

Boxes

Information about handwritten and typewritten text found when using the Mixed Print recognition engine.

Barcodes

Details about any recognized barcodes.

Words

An indexed list of Words with their characters, bounding rectangles, and additional properties generated by character recognition.

Paragraphs

An index of the recognized paragraphs, listed only if enabled in the project.

Tables

Details about all tables found on a document, including table header information, row, column, and cell data.

KeyValuePredictions

If key-value pairs are enabled, this property lists all recognized keys and their values. On the image, the keys are highlighted in yellow and their values in blue.
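
The properties listed above can be pictured as a nested data structure. The following Python sketch is an illustration only and is not the product's API. The class names (Rect, Word, TextLine, Representation), their fields, and the helper methods are all hypothetical. The sketch also assumes that a text line refers to its words by their position in the indexed Words list, which the descriptions above suggest but do not state.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Rect:
    """Bounding rectangle in page coordinates (assumed layout)."""
    left: int
    top: int
    width: int
    height: int


@dataclass
class Word:
    """One recognized word: its text and where it sits on the page."""
    text: str
    bounds: Rect


@dataclass
class TextLine:
    """Assumption: a line stores indexes into Representation.words."""
    word_indexes: List[int] = field(default_factory=list)


@dataclass
class Representation:
    """Hypothetical container mirroring a few of the listed properties."""
    name: str
    words: List[Word] = field(default_factory=list)
    text_lines: List[TextLine] = field(default_factory=list)

    def line_text(self, line_index: int) -> str:
        """Join the words of one text line with single spaces."""
        line = self.text_lines[line_index]
        return " ".join(self.words[i].text for i in line.word_indexes)

    def line_bounds(self, line_index: int) -> Rect:
        """Smallest rectangle that encloses every word of one text line."""
        line = self.text_lines[line_index]
        rects = [self.words[i].bounds for i in line.word_indexes]
        left = min(r.left for r in rects)
        top = min(r.top for r in rects)
        right = max(r.left + r.width for r in rects)
        bottom = max(r.top + r.height for r in rects)
        return Rect(left, top, right - left, bottom - top)


# Made-up sample data: two words that form a single text line.
rep = Representation(
    name="example",
    words=[
        Word("Invoice", Rect(100, 50, 120, 30)),
        Word("Number", Rect(230, 52, 110, 28)),
    ],
    text_lines=[TextLine(word_indexes=[0, 1])],
)
print(rep.line_text(0))
print(rep.line_bounds(0))
```

Storing word indexes in the line, rather than copies of the words, keeps a single indexed Words list as the source of truth, so other properties such as paragraphs or table cells could refer to the same words in the same way.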