Extract paragraphs sample script
Out of the box, KTM and KTA don’t have a locator that can extract a paragraph from a document, and extraction is especially challenging with two-column documents. This document explains how to create a script locator that extracts paragraphs containing a specific keyword, or keywords, in the first line. The example used in this document finds and extracts the paragraphs that define the “effective date” and “term” of an agreement.
The included sample script extracts paragraphs from single-column and two-column documents. It draws a rectangle that spans the space between two anchors and the width of the text column, and returns the text in that rectangle. The keyword that identifies the paragraph must appear in the first line of that paragraph. The first line of the paragraph may be indented. The paragraph may be numbered (as “X.” or “X.Y.”), or it may instead have a header that a locator can find.
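The rectangle approach described above can be illustrated with a minimal Python sketch. This is not the sample script itself and does not use the KTM or KTA scripting API. The Word class, the extract_paragraph function, the pixel coordinates, and the choice of the next numbered line as the bottom anchor are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass

@dataclass
class Word:
    """Hypothetical OCR word with its bounding box in pixels."""
    text: str
    left: int
    top: int
    width: int
    height: int = 10

# Paragraph numbering of the form "X." or "X.Y."
NUMBERING = re.compile(r"^\d+\.(\d+\.)?$")

def group_lines(words, tolerance=5):
    """Group words into text lines (similar top), ordered top to bottom, left to right."""
    lines = []
    for w in sorted(words, key=lambda w: (w.top, w.left)):
        if lines and abs(lines[-1][0].top - w.top) <= tolerance:
            lines[-1].append(w)
        else:
            lines.append([w])
    for line in lines:
        line.sort(key=lambda w: w.left)
    return lines

def extract_paragraph(words, keyword, col_left, col_right):
    """Return the text inside the rectangle between two anchors and the column edges."""
    # Keep only words that lie fully inside the text column.
    in_column = [w for w in words
                 if col_left <= w.left and w.left + w.width <= col_right]
    lines = group_lines(in_column)

    # Top anchor: the first line that contains the keyword (assumed to be
    # the first line of the paragraph).
    start = next((i for i, line in enumerate(lines)
                  if keyword.lower() in " ".join(w.text for w in line).lower()), None)
    if start is None:
        return ""
    top = min(w.top for w in lines[start])

    # Bottom anchor (assumption): the next line that starts with a paragraph
    # number, or the end of the column if there is none.
    bottom = max(w.top + w.height for w in in_column) + 1
    for line in lines[start + 1:]:
        if NUMBERING.match(line[0].text):
            bottom = line[0].top
            break

    # Collect the words inside the rectangle in reading order.
    inside = [w for w in in_column if top <= w.top < bottom]
    return " ".join(w.text for line in group_lines(inside) for w in line)

# Two-column sample page: left column x = 0..100, right column x = 120..220.
page = [
    Word("1.", 0, 10, 10), Word("Intro", 15, 10, 30), Word("text", 0, 30, 20),
    Word("2.1.", 125, 10, 15), Word("Effective", 142, 10, 35),
    Word("Date", 180, 10, 18), Word("means", 200, 10, 20),
    Word("the", 120, 30, 12), Word("signing", 135, 30, 30), Word("date.", 168, 30, 20),
    Word("2.2.", 125, 50, 15), Word("Term", 142, 50, 20), Word("means", 165, 50, 20),
    Word("one", 120, 70, 12), Word("year.", 135, 70, 20),
]

print(extract_paragraph(page, "effective date", 120, 220))
# prints: 2.1. Effective Date means the signing date.
```

Restricting the search to the column boundaries is what keeps the left column's text out of the result on a two-column page. In this sketch the first line may be indented because the column test uses the column edges, not the left edge of the first word.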
Limitations: The script does not capture the second part of a paragraph if the paragraph continues in the next column or on the next page. Also make sure that the document isn’t skewed (if necessary, use VRS to correct skewed text), because skewed text leads to words being extracted in the wrong order.
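To see why skew can scramble the word order, consider the following simplified Python sketch. It is only a model: the reading_order function, the fixed line height, and the coordinates are hypothetical, and the way an actual OCR engine or KTM orders words may differ.

```python
def reading_order(words, line_height=10):
    """Naive reading order: bucket words into rows by top coordinate, then sort left to right.

    Each word is a (text, left, top) tuple with coordinates in pixels.
    """
    ordered = sorted(words, key=lambda w: (w[2] // line_height, w[1]))
    return " ".join(text for text, left, top in ordered)

# Two straight lines of text: every word on a line has the same top coordinate.
straight = [("The", 0, 0), ("term", 20, 0), ("is", 40, 0),
            ("one", 0, 10), ("year", 20, 10), ("only", 40, 10)]

# The same text on a skewed scan: each word sits 6 pixels lower than the previous one.
skewed = [("The", 0, 0), ("term", 20, 6), ("is", 40, 12),
          ("one", 0, 10), ("year", 20, 16), ("only", 40, 22)]

print(reading_order(straight))  # prints: The term is one year only
print(reading_order(skewed))    # prints: The term one year is only
```

On the skewed page the last word of the first line drops into the same row as the start of the second line, so the two lines interleave. Deskewing the image before extraction avoids this.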
This makes it possible to catalog contracts and other document types in which paragraphs provide a definition.
Script code for the extraction in KTA, KTM, and RPA.
The extracted paragraphs are delivered to fields.
This is a documented sample script, provided with no formal support or guarantees of any kind.