Understanding Xtractor

Xtractor is the barcode and OCR recognition engine for the DocuPhase document management platform. Its core function is auto-indexing: Xtractor automatically indexes documents in the DocuPhase database that were previously converted from paper to electronic format.

Xtractor searches DocuPhase’s Application tables for new, unprocessed files that have the .tiff (image) extension or are image-based PDFs*. When Xtractor encounters a TIFF file with the proper predefined attributes, it loads the first page of that file.
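The selection step described above can be sketched in a few lines of Python. This is only an illustration of the filtering logic, not Xtractor's actual implementation: the `FileRecord` shape, the `processed` and `image_based_pdf` flags, and the function names are all hypothetical, since the schema of the Application tables is not described in this topic.

```python
from dataclasses import dataclass
from pathlib import PurePath


@dataclass
class FileRecord:
    """Hypothetical stand-in for one row of an Application table."""
    file_name: str
    processed: bool
    image_based_pdf: bool = False  # assumed flag: PDF made of scanned images


def is_candidate(record: FileRecord) -> bool:
    """Return True for new, unprocessed .tiff files or image-based PDFs."""
    if record.processed:
        return False
    suffix = PurePath(record.file_name).suffix.lower()
    if suffix == ".tiff":
        return True
    return suffix == ".pdf" and record.image_based_pdf


def select_candidates(records):
    """Keep only the records that qualify for auto-indexing."""
    return [r for r in records if is_candidate(r)]


records = [
    FileRecord("invoice.tiff", processed=False),
    FileRecord("old_scan.tiff", processed=True),
    FileRecord("scan.pdf", processed=False, image_based_pdf=True),
    FileRecord("report.pdf", processed=False, image_based_pdf=False),
]
candidates = select_candidates(records)
print([r.file_name for r in candidates])
```

In this sketch, already-processed files and text-based PDFs are skipped, and only the remaining records would go on to have their first page loaded.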

Xtractor includes the following components and features:

  • A designer component used to establish “zones” within the pages of TIFF documents where OCR/barcode technology can be applied to read essential information. This information identifies the document type and the index fields of the document being processed, using a feature known as “Zone Recognition”.
  • Capture features that include image enhancement controls, correction tools, and color image support to ensure the quality of scanned documents and improve the accuracy of returned results. The enhancement functions also minimize the file size of scanned documents, increasing the number of documents that fit in the available storage capacity.
  • XtractorService, which runs in the background (“behind the scenes”) to process TIFF documents by applying the Xtractor definitions and OCR/barcode technology. It extracts data and automatically updates the indexes of TIFF-image documents with the data it extracts.
  • Processing of single-page and multi-page TIFF documents, as well as splitting of compound TIFF documents into multiple separate documents, each with its own set of index values
  • All features are accessible through an icon-driven user interface
  • Updates index values for the documents it processes by reading values from the first page of the document image itself. Sub-page processing allows zones to be read from other pages.
  • Automated Indexing Rules include format restrictions; exceptions are routed to the appropriate department for review and correction. Verification capabilities let operators move easily from one document to another, make corrections at any stage, and move cumbersome forms to research queues for future review.
  • Sub-page Processing that allows you to split (i.e., divide) compound TIFF documents
  • Cover Page Handling Options (i.e., cover page retention, auto-removal of cover page, and automatic movement of cover page to end of document)
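As an illustration of the Automated Indexing Rules mentioned in the list above, the following Python sketch checks extracted zone values against format restrictions and sets aside the fields that fail for review. It is a minimal sketch under stated assumptions: the `IndexRule` and `IndexResult` types, the field names, and the use of regular expressions to express format restrictions are hypothetical and are not taken from Xtractor itself.

```python
import re
from dataclasses import dataclass, field


@dataclass
class IndexRule:
    """Hypothetical rule: the value read from a zone must match a format."""
    field_name: str
    pattern: str  # assumed: format restriction written as a regular expression


@dataclass
class IndexResult:
    index_values: dict = field(default_factory=dict)  # values accepted as indexes
    exceptions: list = field(default_factory=list)    # fields needing review


def apply_rules(zone_values: dict, rules: list) -> IndexResult:
    """Accept values that satisfy their rule; flag the rest as exceptions."""
    result = IndexResult()
    for rule in rules:
        value = zone_values.get(rule.field_name, "").strip()
        if re.fullmatch(rule.pattern, value):
            result.index_values[rule.field_name] = value
        else:
            result.exceptions.append(rule.field_name)
    return result


# Example values as they might be read from zones on the first page.
rules = [
    IndexRule("invoice_number", r"INV-\d{6}"),
    IndexRule("invoice_date", r"\d{2}/\d{2}/\d{4}"),
]
zone_values = {"invoice_number": "INV-004217", "invoice_date": "13-02-2024"}

result = apply_rules(zone_values, rules)
print(result.index_values)
print(result.exceptions)
```

Here the invoice number satisfies its format restriction and becomes an index value, while the date uses the wrong separator and is listed as an exception, which in the workflow described above would be routed to the appropriate department for review and correction.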

*Throughout this topic, image files are referred to as TIFF-type images. However, the files that Xtractor processes also include image-based PDF files.
