Search and Indexing

Laserfiche allows users to search for entries based on almost any property of the entry, from entry names and document text to metadata and creation date. For users, all searches are available in a single cohesive search interface. On the back end, however, the Laserfiche Server performs different types of searches differently. Understanding how these searches are performed helps you administer your repository effectively, because different features and functionality are available for the different search methods.

Searches performed by the Laserfiche Full-Text Indexing and Search Service are referred to as indexed, or full-text, searches. Because indexed searches are performed using the Laserfiche Full-Text Indexing and Search Engine, they support features such as root word search, fuzzy search, and search context hits. The distinction between indexed and non-indexed searches is invisible to most users, who can simply perform their searches in the Search Pane without needing to know which kind of search is being run.

Indexed Searches

The following information will be searched using indexed searching by default:

  • Document text, generated by OCRing or extracting text from documents
  • Attachment annotation text
  • Text box, callout text box, and sticky note annotation text
  • Annotation comments
  • Link group comments
  • Version comments
  • Digital signature signing reasons
  • Tag comments
  • Fields that have been marked as indexed by an administrator (see Indexing Fields, below, for more information)

All other search types are non-indexed, or database, searches.

Note: In Laserfiche client applications, entry name, extension information, and digital signature certificate users are indexed, but will be searched using SQL search by default; users can use advanced search syntax to search these fields via indexed searching as well.

Users can choose to perform any of these searches (other than document text searches) as either indexed or SQL searches using Advanced Search Syntax. See Advanced Search in the Laserfiche User Guide for more information.

Note: Unlike the other indexed search types, document text search must be performed as an indexed search. Users cannot choose to perform document text searches as non-indexed searches using Advanced Search Syntax, as they can for other search types.
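The defaults and overrides described above can be pictured with a small lookup sketch. This is only an illustration of the rules in this section: the search-type labels, the `DEFAULT_METHOD` table, and the `resolve_search_method` function are hypothetical names, not Laserfiche Advanced Search Syntax or a Laserfiche API, and the table lists only a few of the search types named above.

```python
INDEXED = "indexed"
SQL = "sql"

# Hypothetical labels for a few of the search types described in this section.
DEFAULT_METHOD = {
    "document text": INDEXED,
    "annotation comments": INDEXED,
    "version comments": INDEXED,
    "tag comments": INDEXED,
    "entry name": SQL,             # indexed, but searched with SQL by default
    "extension information": SQL,  # indexed, but searched with SQL by default
}

# Document text search must always be performed as an indexed search.
INDEXED_ONLY = {"document text"}


def resolve_search_method(search_type, requested=None):
    """Return the method used for a search type, honoring a user override."""
    default = DEFAULT_METHOD[search_type]
    if requested is None:
        return default
    if search_type in INDEXED_ONLY and requested != INDEXED:
        raise ValueError("document text can only be searched as an indexed search")
    return requested


print(resolve_search_method("entry name"))
print(resolve_search_method("entry name", INDEXED))
```

In this sketch, an override stands in for what a user would express through Advanced Search Syntax; the only combination it rejects is a non-indexed document text search.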

In order to perform an indexed search for a document, the document must contain text, and the text must be indexed. Text can be generated by OCRing an imaged document, or by extracting text from an electronic document.

The indexing process keeps track of the words contained in each document and the location of each word on the page. This allows the search engine to locate search terms efficiently, and also allows each instance of a word or phrase to be highlighted when viewing a particular search result. Indexing can be configured to happen automatically for all new documents, or it can be performed on a per-document basis from the Laserfiche web or Windows client, or on a larger scale from the Administration Console.

Indexing is managed and performed by the Laserfiche Full-Text Indexing and Search Service. It receives requests to index documents, such as those made by Laserfiche client applications or by the Laserfiche desktop or Web Administration Console. Requests from the Laserfiche web or Windows client are handled in the order in which they are received. The manner in which requests from the Laserfiche Administration Console are handled depends on the type of indexing that will be performed. If you are indexing a volume, all qualified documents will be added to the end of the index queue. If you are reindexing the repository, the index queue will be cleared, all previous index information will be cleared, and all qualified documents will be added to the index queue.
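The queue handling described above can be pictured with a small conceptual model. This is a sketch for reasoning about the behavior only: the `IndexQueueModel` class and its method names are hypothetical and do not correspond to any Laserfiche API or to the service's actual implementation.

```python
from collections import deque


class IndexQueueModel:
    """Toy model of the index queue; not a Laserfiche API."""

    def __init__(self):
        self.queue = deque()  # documents waiting to be indexed, oldest first
        self.indexed = set()  # documents that currently have index information

    def request_from_client(self, doc_id):
        # Client requests are handled in the order in which they are received.
        self.queue.append(doc_id)

    def index_volume(self, qualified_docs):
        # Volume indexing adds qualified documents to the end of the queue.
        self.queue.extend(qualified_docs)

    def reindex_repository(self, qualified_docs):
        # Reindexing clears the queue and all previous index information,
        # then queues every qualified document.
        self.queue.clear()
        self.indexed.clear()
        self.queue.extend(qualified_docs)

    def index_next(self):
        # Index the document at the front of the queue, if there is one.
        if not self.queue:
            return None
        doc_id = self.queue.popleft()
        self.indexed.add(doc_id)
        return doc_id


model = IndexQueueModel()
model.request_from_client("memo-1")
model.index_volume(["vol-a", "vol-b"])
model.index_next()  # indexes "memo-1"
model.reindex_repository(["memo-1", "vol-a", "vol-b", "old-doc"])
print(list(model.queue), model.indexed)
```

After the simulated repository reindex, the model holds no index information at all until the queue is worked through again, which is why a repository reindex has a larger effect on search results than a volume index.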

A search cannot be performed while documents are being indexed. However, this does not mean that the search function will not be available for long periods of time. When a request to search the repository is received, the Laserfiche Full-Text Indexing and Search Engine will finish indexing the document that it is currently working on. Then it will perform the requested full-text search on those documents that have already been indexed. Once the search has been completed, Laserfiche will resume indexing the remaining documents. The results for a full-text search performed during indexing will not include documents that have not been indexed. Therefore, the possibility that a full-text search may not include all possible search results dramatically increases when the entire repository is being reindexed. It is a good idea to schedule your indexing for off-peak hours.
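To make the consequence concrete, the following sketch shows why a full-text search issued during indexing can miss documents. It is a simplified illustration under stated assumptions: the `search_during_indexing` function and the sample data are hypothetical, and matching is reduced to whole-word comparison rather than anything the search engine actually does.

```python
def search_during_indexing(indexed_text, pending_ids, term):
    """Return (hits, may_be_incomplete) for a search issued mid-indexing.

    indexed_text: dict of doc id -> text for documents already indexed.
    pending_ids:  ids still waiting in the index queue (never searched).
    """
    term = term.lower()
    hits = sorted(
        doc_id
        for doc_id, text in indexed_text.items()
        if term in text.lower().split()
    )
    # Any document still waiting to be indexed could be a missed result.
    return hits, bool(pending_ids)


indexed_text = {
    "doc-1": "Quarterly budget report",
    "doc-2": "Meeting notes",
}
pending_ids = ["doc-3"]  # may also mention "budget", but is not indexed yet

hits, may_be_incomplete = search_during_indexing(indexed_text, pending_ids, "budget")
print(hits, may_be_incomplete)
```

The more documents remain in the queue, as during a full repository reindex, the more likely it is that results are incomplete.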

Once you have indexed a document, it will automatically be re-added to the indexing queue if its contents change. For instance, if you index a document and then modify its text files or add a version comment, it will automatically be reindexed. The same is true if you add new pages and OCR those pages. If you index a document that contains no text and then add text, the document will also be automatically reindexed. This is why documents can be indexed even if they do not contain indexable content: the indexed document will be automatically reindexed when content is added, making that content searchable.

Non-Indexed Searches

All searches other than those listed as indexed searches are non-indexed searches. For example, creation date, volume, and tag searches are non-indexed searches. Searches on fields that have not been marked as indexed by an administrator are also non-indexed searches. Non-indexed searches are performed by the SQL search functionality of your database management system.

Because non-indexed searches do not use the Laserfiche Full-Text Indexing and Search Engine, they can be performed even when the search engine is offline, or when the search catalog is not present. For the same reason, they cannot take advantage of search engine features such as fuzzy search, root word search, context hits, and relevancy rankings.

Although document text search is always an indexed search, all other indexed searches may also be performed as non-indexed searches using Laserfiche Advanced Search Syntax. See Advanced Search in the Laserfiche User Guide for more information.

Indexing Repository Content

Documents must be indexed in order to be full-text searchable, and in order to perform indexed searches for document metadata. There are several ways to index documents and document content in Laserfiche:

  • Automatically indexing all new documents. Automatic document indexing ensures that all repository content containing text is full-text searchable. When this option is enabled, all new documents will be indexed, and users will not be able to choose not to index a document. This option is enabled by default, and in most cases, this is the best way to ensure that your repository content is searchable.

    To determine whether documents will be automatically indexed upon creation

    1. Start the Laserfiche Administration Console, select the desired Laserfiche repository, and sign in as a user who has been granted the Configure Search/Index privilege.
    2. Select the Index node.
    3. From the Action menu, point to All Tasks, and select Properties. The Indexing Properties dialog box will appear. Make sure the General tab is visible.
    4. Select or clear Always index on document creation.
    5. Click OK to save your changes.
  • Indexing individual documents. In Laserfiche client applications, you can manually indicate that a document should be indexed when it is created in the repository, either in the New Document or the Import to Laserfiche dialog box. You can also choose to index a particular document or set of documents through the Generate Searchable Text dialog box.
  • Indexing an entire repository or volume. This option is useful if you have many documents that need to be indexed, or if you need to reindex your repository. See Reindexing a Repository or Volume for more information.
  • Indexing a field. Fields can be indexed to allow them to be searched with indexed searches. When a field is marked as indexed, all instances of that field in the repository will be indexed as well, as long as the document itself is indexed. See Indexing Fields for more information on when to index a field.

Once a document has been indexed, it will remain indexed. If its contents change, it will automatically be re-added to the indexing queue to keep the search index up to date.

In addition to document text, the following information is indexed when a document is indexed: annotation and metadata comments; the text of callout text, text box, and sticky note annotations; and the text of indexed fields. If a document contains this information but is not indexed, this information will not be searchable with indexed searches.

Reindexing a Repository or Volume

A repository can be reindexed at any time, regardless of the number of documents waiting to be indexed. Reindexing a repository is useful when the search index configuration has been modified or if the repository contains many documents that have never been indexed.

To reindex a repository

To reindex a volume

  1. Start the Laserfiche Administration Console.
  2. In the console tree, expand the desired Laserfiche Server.
  3. Select the desired Laserfiche repository.
  4. If security has been enabled on the selected repository, log in as any user who has been granted the Manage Volumes privilege for the specified repository.
  5. Select the Volumes node.
  6. Select the volume to reindex.
  7. Right-click the volume or open the Action menu and point to All Tasks, then Reindex Volume. Select one of the following:
    • Reindex whole volume: The whole volume will be indexed, regardless of whether the documents were previously indexed.
    • Only previously indexed documents: Only the documents in the volume that have already been indexed will be reindexed.
  8. Volume indexing will begin.
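The two options in step 7 differ only in which documents are selected for the index queue. The following sketch illustrates that difference; the `select_for_volume_reindex` function and the sample document names are hypothetical and are not part of the Administration Console or any Laserfiche API.

```python
def select_for_volume_reindex(documents, whole_volume):
    """Pick the documents a volume reindex would queue.

    documents:    dict of doc id -> True if the document was previously indexed.
    whole_volume: True for "Reindex whole volume",
                  False for "Only previously indexed documents".
    """
    if whole_volume:
        # Every document is selected, whether or not it was indexed before.
        return list(documents)
    # Only documents that already have index information are selected.
    return [doc_id for doc_id, was_indexed in documents.items() if was_indexed]


volume = {"contract.tif": True, "scan-042.tif": False, "letter.pdf": True}
print(select_for_volume_reindex(volume, whole_volume=True))
print(select_for_volume_reindex(volume, whole_volume=False))
```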

To view your search engine's indexing status

To pause or resume indexing

Search and Indexing Architecture

The Full-Text Indexing and Search Service is an independent service that handles full-text searches of the repository. See Services for more information.

The Laserfiche Full-Text Indexing and Search Service can be installed on the same computer as the Laserfiche Server or on a different computer. You can only use one Full-Text Indexing and Search Service per repository, but you can install separate services for each repository. The service must run as a user with sufficient rights to access the Laserfiche Server, and it must be able to communicate with the Laserfiche Server using the specified port; see Firewalls for more information.

Search Catalog

The search catalog contains search index files, which enable full-text searches and contain index configuration information. Search index files are stored on the same computer as your Laserfiche Full-Text Indexing and Search Service, in a folder you specify during repository or catalog creation. By default, the folder is named SEARCH and stored under the repository directory.

When you create a repository, a search catalog will be automatically created for you. You may need to create a search catalog if you are restoring from backup and did not back up your index catalog, or if you deleted your catalog for troubleshooting purposes.

To create a new search index catalog

In general, it should not be necessary to delete your search catalog, but you may wish to do so if you want to change your stop words and completely reindex, or for troubleshooting purposes.

Warning: When your search catalog is deleted, users will not be able to run full-text searches on your repository. You will need to create a new catalog, and reindex your repository. See Reindexing a Repository or Volume for more information.

To delete your search index catalog 

You can attach or detach a search catalog from the repository. In general, this is done to move a repository or search service from one location to another, or to reattach a catalog when restoring from backup. If you need to relocate your search catalog files, either to another location on the same computer or because you are moving your search engine to a new computer, it is recommended that you detach them, move them, and then reattach them.

Warning: When your index catalog is detached, users will not be able to run full-text searches on your repository. You must attach it before these searches will function; it is not necessary to reindex in this case.

To detach your search index catalog

To attach your search index catalog

Indexing Fields

There are two ways to perform field searches: as non-indexed searches (in which case the database management system's SQL search is used) or as indexed searches (in which case the Laserfiche Full-Text Indexing and Search Engine is used). By default, all fields are non-indexed, but in some cases you may choose to index fields.

You can choose to index a field when you create or modify the field.

When to Index Fields

In general, fields should be indexed in the following circumstances:

  • Fields that will contain long, nonstandardized content, such as notes or commentary, should be indexed to take advantage of search stemming, fuzzy search, and context hits.
  • Fields with a width greater than 100 characters should be indexed, as the full-text search engine can handle these fields more efficiently than non-indexed search.
  • If your repository contains more than one million total field values (fields applied to entries and containing a value), indexing commonly-searched text fields can improve search performance.
  • Number, date, time, and date/time fields should be indexed if they will primarily be searched in conjunction with a text field that is also indexed.

Fields should not be indexed in the following situations:

  • If the field will generally contain relatively short, consistent values, such as names, department names, or document types, it is more efficient to leave the field unindexed.
  • If a number, date, time, or date/time field will not primarily be searched in conjunction with a text field that is also indexed, it should not be indexed.
  • If you want to take advantage of case and accent sensitivity options for a field, you should not index the field, as these options do not apply to indexed searches.

Note: Using fuzzy search and search stemming features with field searches can dramatically increase the number of results for a search. Larger numbers of search results can lead to slower performance.
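The guidelines above can be condensed into a rough decision helper. This is a sketch only: the `FieldProfile` class, its attribute names, and the `should_index_field` function are hypothetical, and the order in which the rules are checked is an assumption, since the guidelines do not say which one takes precedence when they conflict.

```python
from dataclasses import dataclass

NON_TEXT_TYPES = {"number", "date", "time", "date/time"}


@dataclass
class FieldProfile:
    """Hypothetical description of a field, used only for this sketch."""
    field_type: str                                  # "text", "number", "date", ...
    width: int = 0                                   # field width in characters
    long_freeform_content: bool = False              # notes, commentary, and so on
    commonly_searched: bool = False
    searched_with_indexed_text_field: bool = False
    needs_case_or_accent_sensitivity: bool = False


def should_index_field(field, total_field_values):
    # Case and accent sensitivity options do not apply to indexed searches.
    if field.needs_case_or_accent_sensitivity:
        return False
    # Number, date, time, and date/time fields: index only when they are
    # primarily searched together with an indexed text field.
    if field.field_type in NON_TEXT_TYPES:
        return field.searched_with_indexed_text_field
    # Long, nonstandardized content or a width over 100 characters.
    if field.long_freeform_content or field.width > 100:
        return True
    # Large repositories: commonly searched text fields can benefit.
    if total_field_values > 1_000_000 and field.commonly_searched:
        return True
    # Short, consistent values are more efficient left unindexed.
    return False


notes = FieldProfile("text", width=4000, long_freeform_content=True)
department = FieldProfile("text", width=40)
print(should_index_field(notes, total_field_values=50_000))
print(should_index_field(department, total_field_values=50_000))
```

Treat the result as a starting point for discussion rather than a rule; actual search patterns in your repository should drive the final choice.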

Scheduled Indexing

Indexing can be a resource-intensive process; in some cases, especially with high-load repositories, users may experience lower performance during indexing. If that is the case, you may wish to schedule your indexing to exclude certain peak business hours. For more information, see Scheduled Indexing.

Stop Words

Prior to indexing, you can determine which words will be excluded from full-text search results. These excluded words are known as stop words. Stop words are everyday words that do not add meaning to the search being performed, such as "the", "a", and "and". Searching for the word "the", for example, would not produce meaningful results.

Stop words are categorized by language. Each language will have a different set of default stop words, and the lists can be modified independently of one another.

Note: If the only search term in a full-text search is a stop word, the word will still be used to search. If there is more than one search term, no stop words will be searched, even if all the terms are stop words.
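The rule in the note above can be expressed as a short function. This is a minimal sketch, assuming search terms arrive as a list of words; the `effective_search_terms` function and the sample stop word list are hypothetical and do not reflect how the search engine parses queries.

```python
def effective_search_terms(terms, stop_words):
    """Return the terms that would actually be searched under the stop word rule."""
    stop = {word.lower() for word in stop_words}
    if len(terms) == 1:
        # A lone search term is always used, even if it is a stop word.
        return list(terms)
    # With more than one term, every stop word is dropped,
    # even if that leaves nothing to search for.
    return [term for term in terms if term.lower() not in stop]


STOP_WORDS = ["the", "a", "and"]
print(effective_search_terms(["the"], STOP_WORDS))
print(effective_search_terms(["the", "budget"], STOP_WORDS))
print(effective_search_terms(["the", "and"], STOP_WORDS))
```

The last call shows the edge case from the note: a search made up entirely of stop words ends up with no terms left to search.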

To configure stop words