Document Identifier Properties Form

 

The Document Identifier Properties Form contains the following fields:

Literal
An exact word or phrase located within the user-drawn zone that populates the Value field. The literal is populated automatically if a zone (or “inner zone”) was drawn around a valid literal value.

You can also enter a literal value manually. As you type, iDox checks the contents of the zone to ensure that the literal value exists within the zone. If it does not, an error message displays in the status bar.

The Extract button prompts iDox to re-extract a literal value from the zone. If you entered a mask value and decided to use a literal instead, clicking the Extract button will retrieve the literal value from the zone without requiring you to exit the Properties form and redraw the zone.

Mask
Instructs iDox to recognize a specific pattern within the user-drawn zone and translate the pattern into an identifier value.

Mask values (symbols) include:

.         Represents any alpha/numeric character.

@         Represents a single alpha character.

#         Represents a single number.

@(n)      Represents n alpha characters.

#(n)      Represents n numbers.

@*        Represents a continuous series of alpha characters until the first non-alpha character.

#*        Represents a continuous series of numbers until the first non-numeric character.

.*        Represents a continuous series of alpha/numeric characters until the end.

@^        Uses the @ sign as a literal or within a regular expression. Example: [^@^] matches any character that is not an @ sign.

#^        Uses the # sign as a literal or within a regular expression. Example: [^#^] matches any character that is not a # sign.

Note: Multiple mask patterns may be entered, separated by a ^ character.
 
The two mask patterns shown above instruct iDox to extract either any three alpha characters or any three-digit number for use as a document identifier value.

 

As mask symbols are entered, iDox will search the zone and extract a translated value.

 

If iDox cannot find a unique value that matches the entered mask symbols, an error message displays in the status bar.

 

In addition to mask symbols, a combination of literal values and mask symbols can be entered.
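
The mask symbols listed above can be pictured as a translation into an ordinary regular expression. The following Python sketch is illustrative only and does not describe how iDox is implemented. The function names mask_to_regex and extract are hypothetical, and the sketch assumes that "alpha" means A-Z and a-z, that "numeric" means 0-9, that the * forms require at least one character, and that any other character in the mask is a literal. It handles a single mask pattern (not several patterns separated by ^) and returns the first match rather than checking that the match is unique.

```python
import re

# Assumed character classes for the single-character mask symbols.
_SINGLE = {".": "[A-Za-z0-9]", "@": "[A-Za-z]", "#": "[0-9]"}


def mask_to_regex(mask: str) -> str:
    """Translate one mask pattern into an equivalent regular expression."""
    out = []
    i = 0
    while i < len(mask):
        ch = mask[i]
        if ch in _SINGLE:
            nxt = mask[i + 1] if i + 1 < len(mask) else ""
            if nxt == "^" and ch in "@#":
                # @^ or #^ : the symbol itself, taken literally
                out.append(re.escape(ch))
                i += 2
                continue
            if nxt == "*":
                # @*, #*, .* : a continuous run of that character class
                out.append(_SINGLE[ch] + "+")
                i += 2
                continue
            count = re.match(r"\((\d+)\)", mask[i + 1:])
            if count and ch in "@#":
                # @(n) or #(n) : exactly n characters of that class
                out.append(f"{_SINGLE[ch]}{{{count.group(1)}}}")
                i += 1 + count.end()
                continue
            # Plain ., @ or # : exactly one character of that class
            out.append(_SINGLE[ch])
            i += 1
        else:
            # Anything else is treated as a literal character
            out.append(re.escape(ch))
            i += 1
    return "".join(out)


def extract(mask: str, zone_text: str):
    """Return the first substring of zone_text that matches the mask, or None."""
    found = re.search(mask_to_regex(mask), zone_text)
    return found.group(0) if found else None


# A literal prefix combined with a mask symbol:
print(extract("INV-#(4)", "Ref INV-2023 due"))  # INV-2023
```

In this sketch a None result corresponds to the situation in which iDox would display an error in the status bar.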

 

 

Value
This is the resulting identifier value from the Literal field or the Mask field that will be displayed on the document identifier tree node.

Required
Selecting this option indicates that this document identifier must be present within the defined zone in order for this page to be associated with the document definition.

Required Option Rules:

1.  If a document definition contains only one document identifier, that identifier is required by default and this option has no effect.

2.  If a document definition contains more than one document identifier, selecting this option will create an “and/or” condition in determining whether a page should be associated with the document definition.

3.  All document identifiers that are marked required must be on a page (within the defined zones) for that page to be associated with a document definition.

4.  If there are one or more document identifiers that are not required, then at least one of those identifiers must be on a page (within the defined zones) for that page to be associated with a document definition.

 

 

Examples of Required Option Rules

Document Definition “Invoice”

Identifier value “A” is required.

Identifier value “B” is required.

Identifier value “C” is required.

In order for a page to be associated with “Invoice”, it must have document identifier values “A” and “B” and “C” within the defined zones (Rule #3).

Document Definition “Purchase Order”

Identifier value “A” is required.

Identifier value “B” is required.

Identifier value “C” is not required.

In order for a page to be associated with “Purchase Order”, it must have document identifier values “A” and “B” and “C” within the defined zones. Even though value “C” is not marked as required, at least one non-required identifier must be present, and “C” is the only one, which effectively makes it required (Rule #4).

 

Document Definition “Mortgage”

Identifier value “A” is required.

Identifier value “B” is not required.

Identifier value “C” is not required.

In order for a page to be associated with “Mortgage”, it must have document identifier value “A” and (“B” or “C”) within the defined zones (Rule #4).

Document Definition “Letter”

Identifier value “A” is not required.

Identifier value “B” is not required.

Identifier value “C” is not required.

In order for a page to be associated with “Letter”, it must have document identifier value “A” or “B” or “C” within the defined zones (Rule #4).
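
The Required Option Rules and the four examples above can be summarized in a few lines of logic. The following Python sketch is illustrative only and is not the iDox implementation. The function name page_matches and its data layout (a list of value/required pairs, plus the set of values found within the defined zones on a page) are assumptions made for this example.

```python
def page_matches(identifiers, found):
    """Decide whether a page is associated with a document definition.

    identifiers: list of (value, required) pairs for the definition.
    found: set of identifier values found within the defined zones on the page.
    """
    required = [value for value, is_required in identifiers if is_required]
    optional = [value for value, is_required in identifiers if not is_required]

    # Rule 3: every identifier marked required must be on the page.
    if not all(value in found for value in required):
        return False

    # Rule 4: if any identifiers are not required, at least one must be on the page.
    # With a single identifier this also yields Rule 1: it is required either way.
    if optional and not any(value in found for value in optional):
        return False

    return True


mortgage = [("A", True), ("B", False), ("C", False)]
print(page_matches(mortgage, {"A", "C"}))  # True
print(page_matches(mortgage, {"B", "C"}))  # False
```

The “Mortgage” definition matches a page containing “A” and “C”, but not a page containing only “B” and “C”, because the required value “A” is missing.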

 

 

The following table presents a PDF processing Situation, Problem and Solution example.

PDF Processing Situation, Problem & Solution

Situation:

A PDF file to be processed contains delinquency letters written in English and in Spanish. Both versions contain the literal “Account Number” in English within the same identifier zone. The remaining content of each letter is written in either English or Spanish. In addition to the delinquency letters, the PDF file contains other document types that also have “Account Number” within the same identifier zone.

Problem:

How do you 1) isolate the delinquency letters from other document types that have “Account Number” within the same identifier zone and 2) instruct iDox to associate both the English and Spanish letters with the same document definition, “Delinquency Letter”?

Solution Approach:

To distinguish the delinquency letters from other document types that also use “Account Number” as a document identifier, additional document identifiers must be added to the Delinquency Letter document definition. Because the remaining content of each delinquency letter is written in either English or Spanish, at least two document identifiers are added to the document definition (one in English and one in Spanish).

Solution Example:

Document Definition “Delinquency Letter”

Document Identifier Value “Account Number” is required.

Document Identifier Value “Your account is overdue” is not required.

Document Identifier Value “Su cuenta es atrasada” is not required.

In order for a page to be associated with “Delinquency Letter”, it must have document identifier value “Account Number” and (“Your account is overdue” or “Su cuenta es atrasada”) within the defined zones.

Because other document types also have the document identifier “Account Number”, it is important that the “Delinquency Letter” document definition node and its related Binder Document node be placed above the other document definition nodes. This is necessary because iDox searches the document definitions in order, from the first one defined to the last. The first document definition whose document identifiers match the page is the definition that iDox associates with the page.
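
The effect of node order can be illustrated with a short first-match search. The following Python sketch is illustrative only and is not the iDox implementation. The function names, the data layout, and the second document definition (“Statement”) are hypothetical and were chosen only to show why “Delinquency Letter” must come first.

```python
def matches(identifiers, found):
    """Apply the Required Option Rules to one document definition."""
    required = [value for value, is_required in identifiers if is_required]
    optional = [value for value, is_required in identifiers if not is_required]
    return all(value in found for value in required) and (
        not optional or any(value in found for value in optional)
    )


def first_matching_definition(definitions, found):
    """Search definitions in order and return the name of the first match, or None."""
    for name, identifiers in definitions:
        if matches(identifiers, found):
            return name
    return None


definitions = [
    ("Delinquency Letter", [
        ("Account Number", True),
        ("Your account is overdue", False),
        ("Su cuenta es atrasada", False),
    ]),
    # Hypothetical second definition that also uses "Account Number".
    ("Statement", [("Account Number", True)]),
]

spanish_letter = {"Account Number", "Su cuenta es atrasada"}
print(first_matching_definition(definitions, spanish_letter))        # Delinquency Letter
print(first_matching_definition(definitions[::-1], spanish_letter))  # Statement
```

With the order reversed, the Spanish delinquency letter is claimed by the more general definition, which is the outcome that placing “Delinquency Letter” above the other nodes avoids.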

See section: Document Definition Filtering.