Search Engine Text Extraction

In some cases, you may have electronic files from which text can be extracted but never has been. They may be legacy files that never had the Extract Text process run on them, or they may have been imported without text extraction. If the computer on which the Laserfiche Full-Text Indexing and Search Service is installed has the proper IFilter for these documents, the Laserfiche Search Engine will automatically extract their text at the time of indexing. This does not create a text file in the Laserfiche repository for the document. Instead, search engine text extraction saves the document's text in the index files and uses it only to make the document full-text searchable.

To create text files associated with the documents, you need to either extract text from them or Snapshot and OCR them. To take advantage of search engine text extraction, however, simply make sure that an IFilter for the file type is installed on the Laserfiche Full-Text Indexing and Search Engine computer, and then index the electronic document.

The search engine will use the text file associated with a document if one exists. Search engine text extraction is used only if no such file exists. In most cases, we recommend generating text for electronic documents in client applications rather than relying on the search engine, because search engine text extraction can slow indexing and must be redone whenever the document is reindexed.
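The precedence described above can be sketched in a few lines of Python. This is only an illustration of the decision order, not Laserfiche code: the Document class, its fields, and the text_source function are hypothetical names, and the set of extensions stands in for whatever IFilters happen to be installed on the indexing computer.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Document:
    """Hypothetical stand-in for a repository entry; not a Laserfiche type."""
    name: str
    extension: str = ""               # electronic file extension, e.g. ".docx"
    text_file: Optional[str] = None   # text already stored with the document
    is_imaged: bool = False           # scanned pages rather than an electronic file


def text_source(doc: Document, ifilter_extensions: Set[str]) -> str:
    """Describe where the searchable text for a document would come from."""
    # A stored text file always wins; the search engine never re-extracts it.
    if doc.text_file is not None:
        return "stored text file"
    # Search engine text extraction has no effect on imaged documents.
    if doc.is_imaged:
        return "none"
    # Otherwise extraction happens only if an IFilter covers the file type.
    if doc.extension.lower() in ifilter_extensions:
        return "search engine text extraction"
    return "none"


if __name__ == "__main__":
    installed = {".docx", ".pdf"}  # assumed set of file types with an IFilter
    for doc in (
        Document("report", ".docx", text_file="already extracted"),
        Document("memo", ".DOCX"),
        Document("scan", is_imaged=True),
    ):
        print(doc.name, "->", text_source(doc, installed))
```

Only the second case, an electronic document with no stored text and a matching IFilter, has its text extracted at indexing time, and that is the work that must be repeated on every reindex.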

Note: Search engine text extraction has no effect on imaged documents.

Information about IFilters is available in Knowledge Base article 1011194, INFO: Retrieving Text from Electronic Files with IFilters. Note that you will need to install the correct IFilters on the computer on which the Laserfiche Full-Text Indexing and Search Service is installed. Search engine text extraction works only for files that contain text. Files that have no text (for instance, images, or PDFs containing no text) cannot be made searchable in this way.
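As a rough way to see whether Windows has an IFilter registered for a given extension, the following Python sketch reads the standard filter-handler registration from the registry. It rests on several assumptions: it follows only the direct path from the extension's PersistentHandler key to the handler's PersistentAddinsRegistered entry (some file types are registered through a ProgID or CLSID instead, which this sketch does not follow), the function names are illustrative, and a positive result does not by itself guarantee that the Laserfiche Search Engine will use that IFilter. It must be run on the computer in question.

```python
try:
    import winreg  # Windows-only standard library module
except ImportError:  # lets the file be imported on other platforms
    winreg = None

# Interface ID of IFilter; it appears as a subkey name under a persistent handler.
IID_IFILTER = "{89BCB740-6119-101A-BCB7-00DD010655AF}"


def _read_default(subkey):
    """Return the default value of a key under HKEY_CLASSES_ROOT, or None."""
    if winreg is None:
        return None
    try:
        with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, subkey) as key:
            value, _value_type = winreg.QueryValueEx(key, "")
            return value or None
    except OSError:
        # Key or default value is missing.
        return None


def find_ifilter_clsid(extension, read_default=_read_default):
    """Return the CLSID of the IFilter registered for an extension, or None.

    read_default is injectable so the lookup can be exercised without a
    real registry.
    """
    ext = extension if extension.startswith(".") else "." + extension
    # Step 1: the extension names its persistent handler.
    handler = read_default(ext + "\\PersistentHandler")
    if not handler:
        return None
    # Step 2: the persistent handler names the IFilter implementation.
    path = "\\".join(["CLSID", handler, "PersistentAddinsRegistered", IID_IFILTER])
    return read_default(path)


if __name__ == "__main__":
    for ext in (".pdf", ".docx", ".txt"):
        print(ext, "->", find_ifilter_clsid(ext) or "no IFilter found")
```

A result of "no IFilter found" for a file type you need suggests that the corresponding IFilter still has to be installed, or that it is registered through a path this sketch does not check.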

Note: Before determining which IFilters you might need, we recommend consulting the Electronic Text Extraction list in the Laserfiche desktop or Web Administration Console to see which file types the Laserfiche Full-Text Indexer can extract text from by default. If a file type is on the list, the Laserfiche Full-Text Indexer can extract text from and index files of that type.

Note: If you are using Laserfiche QuickReindex to index your documents, IFilters must be installed on both the Laserfiche Full-Text Indexing and Search Engine computer and the Laserfiche Server computer.