Indexed Searches and Indexing

Searches performed by the Laserfiche Full-Text Indexing and Search Service, using its search engine, are referred to as indexed, or full-text, searches. (Non-indexed searches are performed using the search functionality of your repository's database management system.) Because indexed searches are performed using the search engine, they support features such as root word search, fuzzy search, and search context hits. The distinction is invisible to most users, who can simply perform their searches in the Search Pane without needing to know whether a given search is indexed or non-indexed.

The following information will be searched using indexed searching by default:

Note: In the Laserfiche client applications, entry name, extension information, and digital signature certificate users are indexed, but will be searched using SQL search by default; users can use advanced search syntax to search these fields via indexed searching as well.

Users can choose to perform any of these searches (other than document text searches) as either indexed or SQL searches using Advanced Search Syntax. See Advanced Search in the Laserfiche User Guide for more information.

Note: Unlike the other indexed search types, document text search must be performed as an indexed search. Users cannot choose to perform document text searches as non-indexed searches using Advanced Search Syntax, as they can for other search types.

To perform an indexed search for a document, the document must contain text, and the text must be indexed. Text can be generated by OCRing an imaged document, or by extracting text from an electronic document. (In some situations, the Laserfiche Full-Text Indexing and Search Service can perform text extraction on electronic documents for search purposes even if the Generate Text process has not been run on the document. See Search Engine Text Extraction for more information.) When a document is indexed, certain other text content is indexed along with it, such as metadata and annotation comments; the text from sticky note, callout, and text box annotations; and indexed fields.

The indexing process keeps track of the words contained in each document and the location of each word on the page. This allows the search engine to efficiently locate search terms, and also allows each instance of a word or phrase to be highlighted when viewing a particular search result. Indexing can be configured to happen automatically for all new documents, or can be performed on a per-document basis from the Laserfiche web or Windows client or on a larger scale from the administration console.

Indexing is managed and performed by the Laserfiche Full-Text Indexing and Search Service. It receives requests to index documents, such as those sent by Laserfiche client applications or by the Laserfiche desktop or Web Administration Console. Requests from the Laserfiche web or Windows client are handled in the order in which they are received. The manner in which requests from the Laserfiche Administration Console are handled depends on the type of indexing that will be performed. If you are indexing a volume, all qualified documents will be added to the end of the index queue. If you are reindexing the repository, the index queue will be cleared and all qualified documents will be added to the index queue. Additionally, reindexing a repository will clear all previous index information.
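The queue behavior described above can be pictured with a small model. This is only an illustrative sketch, not the Laserfiche implementation or API: the class IndexQueue, its method names, and the document IDs are all hypothetical, and the "index" here is just a record of which documents have been processed.

```python
from collections import deque


class IndexQueue:
    """Toy model of the index queue (hypothetical; not Laserfiche code)."""

    def __init__(self):
        self.pending = deque()  # document IDs waiting to be indexed
        self.index = {}         # document ID -> placeholder index entry

    def request_from_client(self, doc_id):
        # Client requests are handled in the order they are received,
        # so each one simply joins the end of the queue.
        self.pending.append(doc_id)

    def index_volume(self, qualified_doc_ids):
        # Indexing a volume adds all qualified documents to the END of
        # the queue; anything already waiting keeps its place.
        self.pending.extend(qualified_doc_ids)

    def reindex_repository(self, qualified_doc_ids):
        # Reindexing clears the queue AND all previous index information,
        # then queues every qualified document again.
        self.pending.clear()
        self.index.clear()
        self.pending.extend(qualified_doc_ids)

    def process_next(self):
        # Index the document at the front of the queue, if there is one.
        if not self.pending:
            return None
        doc_id = self.pending.popleft()
        self.index[doc_id] = "indexed"
        return doc_id


q = IndexQueue()
q.request_from_client("doc-7")
q.index_volume(["doc-1", "doc-2"])
print(list(q.pending))  # ['doc-7', 'doc-1', 'doc-2']

q.process_next()        # indexes doc-7
q.reindex_repository(["doc-1", "doc-2", "doc-7"])
print(list(q.pending))  # ['doc-1', 'doc-2', 'doc-7']
print(q.index)          # {}
```

The last two lines show why a repository reindex is the more disruptive operation: the work already done for doc-7 is discarded along with the old queue.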

A search cannot be performed while documents are being indexed. However, this does not mean that the search function will be unavailable for long periods of time. When a request to search the repository is received, the Laserfiche Full-Text Indexing and Search Engine will finish indexing the document that it is currently working on. Then it will perform the requested full-text search on the documents that have already been indexed. Once the search has been completed, Laserfiche will resume indexing the remaining documents. The results for a full-text search performed during indexing will not include documents that have not been indexed. A full-text search is therefore much more likely to return incomplete results while the entire repository is being reindexed. For this reason, it is a good idea to schedule your indexing for off-peak hours.
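The interaction between searching and indexing can be sketched as follows. This is a simplified model under stated assumptions, not how the search engine is actually built: ToyIndexer, index_next, and search are hypothetical names, and matching is reduced to exact whole-word lookup.

```python
from collections import deque


class ToyIndexer:
    """Toy model: searches only see documents that are already indexed."""

    def __init__(self, documents):
        self.documents = documents       # doc ID -> text
        self.pending = deque(documents)  # doc IDs still waiting, in order
        self.indexed = {}                # doc ID -> set of lowercase words

    def index_next(self):
        # Index exactly one document. A search request is served only
        # between calls, i.e. after the current document is finished.
        if not self.pending:
            return None
        doc_id = self.pending.popleft()
        self.indexed[doc_id] = set(self.documents[doc_id].lower().split())
        return doc_id

    def search(self, term):
        # Only documents that have already been indexed can match.
        term = term.lower()
        return sorted(d for d, words in self.indexed.items() if term in words)


docs = {
    "a": "Invoice for March",
    "b": "March report",
    "c": "April invoice",
}
ix = ToyIndexer(docs)

ix.index_next()            # finishes document "a"
print(ix.search("march"))  # ['a']  ("b" is not indexed yet)

while ix.index_next():     # indexing resumes after the search
    pass
print(ix.search("march"))  # ['a', 'b']
```

The first search misses document "b" even though it contains the term, which is the incomplete-results situation that off-peak scheduling is meant to avoid.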

Once you have indexed a document, it will automatically be re-added to the indexing queue if its contents change. For instance, if you index a document and then modify its text files or add a version comment, it will automatically be reindexed. The same is true if you add new pages and OCR those pages. If you index a document that contains no text and then add text, the document will also be automatically reindexed. This is why documents can be indexed even if they do not contain indexable content: the indexed document will be automatically reindexed when content is added to make it searchable.
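The automatic reindexing rule can be illustrated with a minimal sketch. Everything here is hypothetical (ToyRepository and its methods are not Laserfiche APIs). The model also assumes that a document that has never been indexed is not queued when its text changes, which the text above does not state either way and which ignores any automatic indexing of new documents.

```python
class ToyRepository:
    """Toy model of automatic reindexing (hypothetical; not Laserfiche code)."""

    def __init__(self):
        self.text = {}             # doc ID -> current text
        self.ever_indexed = set()  # docs indexed at least once
        self.queue = []            # doc IDs waiting to be reindexed

    def add_document(self, doc_id, text=""):
        self.text[doc_id] = text

    def index(self, doc_id):
        # Indexing is allowed even when the document has no text yet.
        self.ever_indexed.add(doc_id)

    def set_text(self, doc_id, new_text):
        changed = self.text.get(doc_id) != new_text
        self.text[doc_id] = new_text
        # A previously indexed document whose content changed is
        # re-added to the queue (once) so it can be reindexed.
        if changed and doc_id in self.ever_indexed and doc_id not in self.queue:
            self.queue.append(doc_id)


repo = ToyRepository()
repo.add_document("memo")             # no text yet
repo.index("memo")                    # indexed while empty
repo.set_text("memo", "Budget memo")  # text added later
print(repo.queue)                     # ['memo']

repo.add_document("draft")
repo.set_text("draft", "Rough notes")  # never indexed: not queued (assumption)
print(repo.queue)                      # ['memo']
```

Indexing the empty "memo" document up front is what makes it pick up its text automatically later, which mirrors the reason given above for indexing documents that have no indexable content yet.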