OCR Settings
To edit OCR settings
- Open the Import Agent Configuration Utility.
- Select Profile from the menu, and then select Options.
- Select the General tab.
- Under Laserfiche Distributed Computing Cluster, select the Enable checkbox if you want to connect the Import Agent service to a Laserfiche Distributed Computing Cluster (DCC) installation. When enabled, Import Agent sends OCR requests to the specified DCC Scheduler instead of processing OCR on the Import Engine computer. Using a Distributed Computing Cluster can improve the speed at which OCR is performed.
  - Scheduler: Specify the name of the computer where the Laserfiche Distributed Computing Cluster Scheduler is installed.
  - Port: Specify the port to use when communicating with the Distributed Computing Cluster Scheduler. By default, DCC uses port 8108.
- Under OCR, select or clear the following options.
  - Language: Select a language to help optimize the character recognition.
  - Decolumnize text: Select this option to convert multiple columns of generated text into a single column. Clearing the checkbox will preserve column formatting in the OCRed text, even if that separates words and sentences.
  - Perform image enhancement: Images will be temporarily enhanced prior to OCR processing to optimize the processing. Click Configure to open the Image Clean-up Options dialog box and configure the desired temporary image enhancements.
    - Deskew image: Straightens crooked images.
    - Despeckle image: Removes undesired noise from an image.
      - Specify the maximum size of noise to be removed (in pixels): Size is specified as both width and height. For example, setting this option to 2 will remove all noise that is equal to or smaller than a 2 x 2 pixel square.
    - Rotate image: Rotates images to an orientation that is appropriate for OCR processing. After the OCR process is performed, the image will return to its original orientation.
      - Automatically: The direction in which text flows on the page will be detected. The image will be rotated so the text flows horizontally (left to right).
      - By this amount: The amount by which the image will be rotated. An image can be rotated by 90, 180, or 270 degrees.
    - Line removal: Removes lines from an image.
      - Horizontal: Removes horizontal lines from the image. Characters that are damaged due to the line removal will be repaired.
      - Vertical: Removes vertical lines from the image. Characters that are damaged due to the line removal will be repaired.
  - Optimization priority: Choose one of the following:
    - Speed: Reduces the amount of time it takes to perform OCR. Generated text may be less accurate. Best for documents with clear text.
    - Balance: Strikes a balance between speed and accuracy. Best for documents with average text.
    - Accuracy: Increases OCR quality. Processing time will also be increased. Best for documents with less clear text.
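The size rule described for the Despeckle image option can be illustrated with a short Python sketch. This is not Laserfiche's implementation, which is not documented here. It assumes a binary image stored as a list of rows (1 for a dark pixel, 0 for background), treats diagonally touching pixels as one blob, and uses a hypothetical function name, despeckle. A blob is cleared when its bounding box is no wider and no taller than the configured size.

```python
from collections import deque


def despeckle(image, max_size):
    """Return a copy of a binary image with small blobs cleared.

    Illustrative only: a blob is removed when its bounding box fits
    inside a max_size x max_size square (1 = dark pixel, 0 = background).
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    result = [row[:] for row in image]
    seen = [[False] * cols for _ in range(rows)]

    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first search collects one 8-connected blob
            # (assumption: diagonal neighbors belong to the same blob).
            blob = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1
                                and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Size is measured as both width and height of the blob.
            height = max(y for y, _ in blob) - min(y for y, _ in blob) + 1
            width = max(x for _, x in blob) - min(x for _, x in blob) + 1
            if height <= max_size and width <= max_size:
                for y, x in blob:
                    result[y][x] = 0
    return result
```

With a size of 2, a 2 x 2 speck would be cleared while a mark that is 3 pixels wide would be kept, which matches the example given for the option.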
Note: These settings apply to all profiles.
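Before enabling the Distributed Computing Cluster option, it can help to confirm that the Scheduler computer accepts connections on the configured port. The Python sketch below is a generic TCP reachability check, not a Laserfiche tool. The function name scheduler_reachable and the host name in the usage example are hypothetical, and the default port of 8108 comes from the Port setting described above. A successful connection only shows that something is listening on that port; it does not prove that the listener is the DCC Scheduler.

```python
import socket


def scheduler_reachable(host: str, port: int = 8108, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Generic connectivity check only; it does not speak the DCC protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # "dcc-scheduler01" is a placeholder; use the value from the Scheduler field.
    print(scheduler_reachable("dcc-scheduler01"))
```

If the check fails, the usual things to verify are the computer name, the port value, and any firewall between the Import Agent computer and the Scheduler.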