Guidelines for Using the Extract Content Activity When Using the PaperVision Capture Integration with Sys.tm


Version: Sys.tm: All Versions, Sys.tm: Flows

Article ID: SYS00001


Description

This article clarifies how OCR data is handled depending on whether text recognition occurs in PaperVision Capture or within Sys.tm, ensuring the AI Query activity can properly access the text layer.

Summary

The following applies only when documents are pushed from PaperVision Capture to Sys.tm and pulled back into PaperVision Capture.

When to Include the Extract Content Activity in a Sys.tm Flow:
When files are processed through PaperVision Capture’s Open Text or Nuance Full-Text OCR, those OCR steps should occur before the Sys.tm Flow custom code step. This ensures that PaperVision Capture’s OCR data is pushed to Sys.tm, so no separate activity is needed to create it. For the AI Query activity to access that text layer, however, the text must first be extracted with the Extract Content activity.

When the Extract Content Activity Is Not Needed in a Sys.tm Flow:
If Sys.tm is responsible for creating the searchable documents, the Text Recognition activity should be the first step in the Flow. This configuration stores the text layer on the backend automatically, allowing the AI Query activity to access it through the FileIds variable. Within the PaperVision Capture job, the Metadata tab of the custom code step would then use that variable to retrieve the OCR data directly.
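
The two cases above can be summarized as a small decision sketch. This is an illustration only, not Sys.tm or PaperVision Capture code: the OcrSource enum and the plan_flow_activities function are hypothetical names introduced here, and the returned strings are simply the activity names used in this article.

```python
from enum import Enum
from typing import List


class OcrSource(Enum):
    """Where text recognition happens (hypothetical names, for illustration only)."""
    PAPERVISION_CAPTURE = "PaperVision Capture"  # OCR step runs before the Sys.tm Flow custom code step
    SYSTM = "Sys.tm"                             # the Flow itself creates the searchable documents


def plan_flow_activities(ocr_source: OcrSource) -> List[str]:
    """Return the Flow activities, in order, needed for AI Query to read the text layer."""
    if ocr_source is OcrSource.PAPERVISION_CAPTURE:
        # OCR data is pushed from PaperVision Capture, so it must be extracted first.
        return ["Extract Content", "AI Query"]
    # Sys.tm performs OCR; the text layer is stored on the backend automatically.
    return ["Text Recognition", "AI Query"]


if __name__ == "__main__":
    for source in OcrSource:
        print(source.value, "->", plan_flow_activities(source))
```

In short, Extract Content is needed only when the OCR data originates in PaperVision Capture; when Text Recognition runs in the Flow, it takes that place.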