Features

The included sample script extracts paragraphs from single-column and two-column documents. It draws a rectangle bounded by two anchors and the width of the text column, and returns the text inside that rectangle. The keyword that identifies the paragraph must appear in the first line of that paragraph. The first line of the paragraph may be indented. The paragraph may be numbered (as “X.” or “X.Y.”), or it may instead have a header that a locator can find.
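
The rectangle approach described above can be illustrated with a minimal sketch. This is not the sample script itself, which runs inside KTA, KTM, or RPA; it is a standalone Python illustration that assumes the OCR words are already available with bounding boxes. The names Word, group_lines, and extract_paragraph are hypothetical, the column boundaries are passed in by the caller, and the closing anchor is assumed to be the next numbered paragraph in the same column.

```python
import re
from dataclasses import dataclass

@dataclass
class Word:
    # Hypothetical OCR word with its bounding box in page coordinates.
    text: str
    left: float
    top: float
    right: float
    bottom: float

# Matches a paragraph number such as "3." or "3.2." as the first word of a line.
NUMBER_RE = re.compile(r"^\d+\.(\d+\.)?$")

def group_lines(words, tolerance=3.0):
    """Group words into text lines by similar top coordinate, each sorted left to right."""
    lines = []
    for w in sorted(words, key=lambda w: (w.top, w.left)):
        if lines and abs(lines[-1][0].top - w.top) <= tolerance:
            lines[-1].append(w)
        else:
            lines.append([w])
    return [sorted(line, key=lambda w: w.left) for line in lines]

def extract_paragraph(words, keyword, column_left, column_right):
    """Return the text between the keyword line and the next numbered line in one column."""
    # Keep only words whose horizontal center lies inside the text column.
    column = [w for w in words
              if column_left <= (w.left + w.right) / 2 <= column_right]
    lines = group_lines(column)

    start = end = None
    for i, line in enumerate(lines):
        if start is None:
            # Opening anchor: the first line that contains the keyword.
            if keyword.lower() in " ".join(w.text for w in line).lower():
                start = i
        elif NUMBER_RE.match(line[0].text):
            # Closing anchor (assumption): the next numbered paragraph.
            end = i
            break
    if start is None:
        return ""

    # The rectangle runs from the top of the opening anchor line to the top of
    # the closing anchor line, or to the end of the column if there is none.
    rect_top = min(w.top for w in lines[start])
    if end is not None:
        rect_bottom = min(w.top for w in lines[end])
    else:
        rect_bottom = max(w.bottom for w in column) + 1

    inside = [w for w in column if rect_top <= w.top < rect_bottom]
    return "\n".join(" ".join(w.text for w in line)
                     for line in group_lines(inside))
```

Filtering by column first is what keeps a two-column page from mixing text of the neighboring column into the result. Like the script described here, the sketch stops at the bottom of the column and does not follow a paragraph into the next column or page.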

Limitations: The script does not capture the continuation of a paragraph that runs on into the next column or the next page. Also, make sure that the document isn’t skewed (if necessary, use VRS to correct the skew), because skewed text leads to words being extracted in the wrong order.


Benefits

Catalog contracts and other document types in which paragraphs provide a definition.


Resources

Script code for the extraction in KTA, KTM, and RPA.

Inputs

n/a


Outputs

The extracted paragraphs, delivered to fields.


Additional Information

This is a documented sample script, provided with no formal support or guarantees of any kind.


Required Software / Applications

KTA, KTM, or RPA/DTS

Geographic Availability

Tested with US English contracts only.

Language Availability

Tested with US English contracts only; it should work with other languages.