Unstructured Document Searching

Remember that an unstructured document has document identifiers only on the first page of each logical document; subsequent pages can contain arbitrary content. Because of this, care must be taken when searching for the next document definition after an unstructured document definition node.

If you select a document definition node that defines an unstructured document type, another document definition must be defined after it. Because iDox assumes that an undefined page following an unstructured document belongs to that document, it will not prompt you to add a new document definition for the undefined page.

Instead, it continues searching until it reaches either the end of the sample file or a page that matches an existing document definition.

Unstructured Document Searching Example

Situation:

You have created an unstructured document definition that defines a mortgage application document type.  There are 1000 pages within the sample PDF file, the first 50 pages of which belong to the mortgage document type.  The remaining pages belong to different undefined document types.

Problem:

You select the mortgage document definition node (which displays page 1 of the sample PDF file) and click the Next Document Definition button.  Starting at page 2, iDox checks each page against all document definition nodes and attempts to find a match.  Since the selected document definition node is an unstructured document and no other document definition nodes have been created, iDox assumes that all pages within the sample PDF file belong to the mortgage document definition.  As a result, iDox will not stop at “page 51” and prompt you to add a new document definition.

Solution Approach:

To resolve this, you must ensure that at least one document definition is defined for a page (in this case, “page 51”) after a selected unstructured document definition.

This will allow iDox to stop the search because it will recognize a page that belongs to something else.
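
The behavior described in this example can be sketched in a few lines of Python. This is only a simplified model for illustration, not the way iDox is actually implemented: the names DocumentDefinition and next_document_definition are hypothetical, and a definition is assumed to "match" a page simply when it was defined on that page number (iDox itself matches on document identifiers).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class DocumentDefinition:
    # Hypothetical stand-in for a document definition node.
    name: str
    first_page: int      # page the definition was created on
    unstructured: bool   # True for an unstructured document type


def next_document_definition(
    selected: DocumentDefinition,
    definitions: List[DocumentDefinition],
    total_pages: int,
) -> Tuple[str, Optional[int]]:
    """Model a forward search that starts on the page after `selected`."""
    # Assumption: a page matches a definition when their page numbers agree.
    defined_pages = {d.first_page for d in definitions}
    for page in range(selected.first_page + 1, total_pages + 1):
        if page in defined_pages:
            # A defined page ends the search and would be selected.
            return ("found", page)
        if not selected.unstructured:
            # Structured document: an undefined page leads to a prompt.
            return ("prompt", page)
        # Unstructured document: the undefined page is assumed to belong
        # to the selected document, so the search simply continues.
    return ("end_of_file", None)


mortgage = DocumentDefinition("mortgage", first_page=1, unstructured=True)

# Problem: no other definition exists, so the search runs to the end.
print(next_document_definition(mortgage, [mortgage], total_pages=1000))

# Solution: a definition on page 51 gives the search a place to stop.
other = DocumentDefinition("other", first_page=51, unstructured=False)
print(next_document_definition(mortgage, [mortgage, other], total_pages=1000))
```

In this model, the first call runs through all 1000 pages without ever prompting, while the second call stops at page 51 because a definition exists there.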

Solution with iDox:

If the mortgage document definition had been defined on “page 20” and you clicked Previous Document Definition, normal processing would occur.

If no matching document definition node is found, you are prompted whether to add one.  If a document definition node is found, it is automatically selected and the identified page is displayed.

 

Stopping a Search

To stop a document definition search that is in progress, click the “Stop Search” toolbar button.