Retrieving Text from an Electronic File

Each electronic document is associated with an electronic file, which can consist of text, images, and/or other data. Laserfiche can recognize the text stream in an electronic file and store it as text pages in the electronic document. Note, however, that if text appears as part of a graphic, the text will not be retrieved. (If this is the case, you can use Laserfiche Snapshot in the Laserfiche Windows client to create image pages of the document and then OCR those pages.) Once text has been saved, you can index it and perform text searches on the content of the electronic file.

Note: If an electronic document is indexed but has not had text extracted from it, and if an IFilter for that electronic file type exists on the computer hosting the Laserfiche Full-Text Indexing and Search Service, the document will be text searchable, but no text will display when you open the electronic document as pages in Laserfiche. We recommend extracting text from such documents if possible, as it will improve search performance.

Note: When text is extracted directly from a PDF form, only the standardized form text is included; text that users entered in the form's fields is not. To capture the field text, OCR the image pages created from the PDF instead of extracting text from the electronic document.

To retrieve text from electronic files

  1. In the folder browser, select the electronic document(s) from which you want to retrieve text.
  2. From the toolbar, click Generate Text, or select Generate Searchable Text from the Tasks drop-down menu (this will open the Generate Searchable Text dialog box).
  3. Select the OCR / Extract Text and Index entire document checkboxes.
  4. Click OK.

To retrieve text from electronic files

  1. In the folder browser, select one or more electronic documents.
  2. Select Generate Searchable Text from the Tasks menu to open the Generate Searchable Text dialog box.
  3. If the electronic document already has image pages, choose how you want those pages to be handled.
  4. Click OK.

Note: Text can only be retrieved from certain types of electronic files. To view a list of them, click More Info in the Generate Searchable Text dialog box. This opens the OCR and Text Extraction Information dialog box; look under the Electronic Documents and Text Extraction section for the types of electronic files that can have their text retrieved. If your file type is not included, contact your administrator to see if an IFilter for your document type can be installed. For more information, see IFilter.
