Previous Next  Up  Table of Contents  EagleFiler Home

8.10   How does indexing in EagleFiler work?

EagleFiler maintains indexes in order to quickly search the contents of your library. EagleFiler can search thousands of records in a fraction of a second since it only needs to read the optimized index files, not the files for each individual record. The types of searches and their syntax are described in the Searching section. Normally, indexing is not something that you need to worry about, since EagleFiler handles it automatically. This section describes how indexing works in case you need to customize it or fix a problem.

What’s Indexed

EagleFiler indexes the text contents of all the files, e-mail messages, and notes in your library. This includes text that Spotlight wouldn’t see, such as files attached to e-mail messages, invisible Web content, PDF annotations, and Skim notes attached to PDF files. EagleFiler has built-in support for indexing common file formats. It also supports custom file formats via Spotlight importer plug-ins.

Index Storage

As shown in the Library Folders section, each EagleFiler library has a .eflibrary package. The indexes for the library are stored inside this package. This means that:

You may prefer to exclude EagleFiler’s indexes from backups in order to save disk space or bandwidth. To exclude the indexes from Time Machine backups, use the ExcludeIndexesFromBackup option in the Esoteric Preferences. To exclude the indexes from other types of backups, set your backup software to skip files whose names end with .efindex or .efmailtoc.
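Before changing a backup configuration, it can help to see which files those two suffixes would match. The following is a minimal sketch, not an EagleFiler feature: the function name list_index_files and the library path in the usage note are hypothetical, and it relies only on the standard find command.

```shell
#!/bin/sh
# Hypothetical helper: list every file under a folder whose name ends in
# .efindex or .efmailtoc, i.e. the files a backup could be set to skip.
list_index_files() {
    # $1 is the folder to search (for example, your library folder).
    find "$1" -type f \( -name '*.efindex' -o -name '*.efmailtoc' \)
}
```

For example, list_index_files "$HOME/Documents/MyLibrary" prints one matching path per line. If your backup tool is rsync, the equivalent exclusions would be --exclude='*.efindex' --exclude='*.efmailtoc'.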

Updating Indexes

In order for your searches to be accurate, EagleFiler needs to make sure that the index is updated whenever the contents of a file change:

Each time you open a library, EagleFiler scans all of the files to make sure they are up-to-date in the index. This allows EagleFiler to detect files that were modified outside of EagleFiler and also complete any indexing work that was in progress when you closed the library. Usually, most of the files were already indexed, so this scan does not take very long.

Mailbox files are not editable, so EagleFiler knows that the messages in them never change. Once a mailbox has been completely indexed, EagleFiler marks the mailbox as done. This speeds up opening the library, since EagleFiler will not have to check whether each message is up-to-date in the index. Additionally, the index file is treated as read-only so that it will not need to be recopied during a backup or sync.
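The idea of finding files that changed after they were last indexed can be illustrated with modification times. This is only a sketch under an assumption: this section does not say how EagleFiler decides that a file is out of date, and the function name files_newer_than and the marker file are hypothetical.

```shell
#!/bin/sh
# Hypothetical illustration: list files modified more recently than a
# marker file. The marker stands in for "the time of the last indexing
# pass"; EagleFiler's real bookkeeping may work differently.
files_newer_than() {
    dir=$1      # folder to scan
    marker=$2   # file whose modification time marks the last pass
    find "$dir" -type f -newer "$marker"
}
```

A file that was edited outside of EagleFiler after the marker was written would show up in this list, while untouched files would not.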

Word vs. Phrase Indexing

When rebuilding an index (see below), you can choose, per library, whether EagleFiler should index words or phrases. Indexing for word searches is much faster, but searching for multiple words will find every document that contains those words anywhere in the document. Indexing for phrase searches lets you search for a group of words that appear near each other in a document; however, indexing will take longer and the index files will be several times as large. Phrase indexing is the default.
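The difference between the two kinds of search can be seen with grep on a plain text file. This is an analogy only: it does not use EagleFiler's indexes, the function names are hypothetical, and "adjacent on the same line" is used as a simple stand-in for "near each other".

```shell
#!/bin/sh
# Word-style match: succeeds if both words occur anywhere in the file,
# in any order and any distance apart.
word_match() {
    grep -qiw "$2" "$1" && grep -qiw "$3" "$1"
}

# Phrase-style match: succeeds only if the two words appear next to each
# other, in this order, on the same line.
phrase_match() {
    grep -qi "$2 $3" "$1"
}
```

Given a file that mentions "index" in one sentence and "file" in another, word_match succeeds for the pair "index" and "file", while phrase_match does not.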

Rebuilding Indexes

If you hold down the Command and Option keys when opening a library, EagleFiler will show the Rebuild Indexes dialog. When you rebuild an index, EagleFiler deletes the old index file and builds a new one from scratch. This may take a long time, but it can be useful because:

Damaged Indexes

If EagleFiler is unable to open an index, it treats it as damaged. It will move the old index file to the Damaged folder inside the Indexes folder in the .eflibrary package and log the error to Console. Then it will create a new index file and begin rebuilding it.

If you believe that the index file was not actually damaged, you can close the library in EagleFiler and move the index file back out of the Damaged folder. Otherwise, you can delete the files in the Damaged folder to save disk space.

Excluding Files From Indexing

Occasionally, a file may be unreadable by EagleFiler or damaged such that it causes the indexer to hang or crash. To prevent such a file from causing problems for the rest of your library, you can exclude it from indexing by assigning it the ef_noindex tag.

You can exclude files attached to e-mail messages from indexing by entering their names using Terminal. For example, the command:

defaults write com.c-command.EagleFiler UnindexedAttachmentNames -array "smime.p7s" "PGP.sig"

will tell EagleFiler not to index files named smime.p7s or PGP.sig. This can also speed up indexing if you know that you will never want to search the contents of attachments with those names.
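Because defaults write with -array replaces the whole list, it can be useful to inspect the current value first, or to append a single name. The sketch below wraps standard subcommands of the macOS defaults tool; the function names are hypothetical, and the assumption that deleting the key returns EagleFiler to its built-in behavior is not confirmed by this section.

```shell
#!/bin/sh
# Hypothetical wrappers around the macOS "defaults" tool for the
# UnindexedAttachmentNames setting described above.

# Print the current list (defaults reports an error if it was never set).
show_unindexed_names() {
    defaults read com.c-command.EagleFiler UnindexedAttachmentNames
}

# Append one attachment name without replacing the existing entries.
add_unindexed_name() {
    defaults write com.c-command.EagleFiler UnindexedAttachmentNames -array-add "$1"
}

# Remove the custom list entirely (assumed to restore the default behavior).
clear_unindexed_names() {
    defaults delete com.c-command.EagleFiler UnindexedAttachmentNames
}
```

For example, add_unindexed_name "signature.asc" would add that name to whatever is already in the list.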

Troubleshooting

If EagleFiler hangs during indexing, you can use Activity Monitor to record what EagleFiler and its helper processes are doing. The eftexttool process is in charge of reading files to extract their text content. The efindextool process is in charge of updating the index files for your library. Please see Sending in a “Sample” Report for more information. The information in the sample report will show what EagleFiler was doing and perhaps which file it was working on.

If EagleFiler seems to be reindexing your files and you don’t know why, it may help to enable debug logging to Console. For example, the log may show that certain files had never been indexed or that they were in the index but had since been modified. It can also reveal when a particular file, or type of file, is taking a long time to index. To enable logging for file indexing, click here; to disable it, click here. To enable logging for message indexing, click here; to disable it, click here.

If you want to entirely disable indexing for all files, click here; to re-enable it, click here.
