Contents  EagleFiler Manual  Translate  Technical Support

8.10   How does indexing in EagleFiler work?

EagleFiler maintains indexes in order to quickly search the contents of your library. EagleFiler can search thousands of records in a fraction of a second since it only needs to read the optimized index files, not the files for each individual record. The types of searches and their syntax are described in the Searching section. Normally, indexing is not something that you need to worry about, since EagleFiler handles it automatically. This section describes how indexing works in case you need to customize it or fix a problem.

What’s Indexed

EagleFiler indexes the text contents of all the files, e-mail messages, and notes in your library. This includes text that Spotlight wouldn’t see, such as files attached to e-mail messages, invisible Web content, PDF annotations, and Skim notes attached to PDF files. EagleFiler has built-in support for indexing common file formats. It also supports custom file formats via Spotlight importer plug-ins.

Index Storage

As shown in the Library Folders section, each EagleFiler library has a .eflibrary package. The indexes for the library are stored inside this package. This means that:

You may prefer to exclude EagleFiler’s indexes from backups in order to save disk space or bandwidth. To exclude the indexes from Time Machine backups, use the ExcludeIndexesFromBackup option in the Esoteric Preferences. To exclude the indexes from other types of backups, set your backup software to skip files whose names end with .efindex or .efmailtoc.
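For backup tools that are driven from the command line, the exclusion described above might look like the following sketch. It uses `tar` purely as a stand-in for "other backup software"; the function name `backup_without_indexes` and the archive layout are illustrative assumptions, and only the `.efindex` and `.efmailtoc` suffixes come from this manual.

```shell
#!/bin/sh
# Archive a library folder while skipping EagleFiler's index files.
# Hypothetical helper: the name and the choice of tar are assumptions.
backup_without_indexes() {
    src=$1      # path to the library folder
    archive=$2  # path of the tar archive to create (outside the library)
    tar -cf "$archive" \
        --exclude='*.efindex' \
        --exclude='*.efmailtoc' \
        -C "$(dirname "$src")" "$(basename "$src")"
}

# Example (paths are placeholders):
#   backup_without_indexes "$HOME/Documents/MyLibrary" /Volumes/Backup/MyLibrary.tar
```

If the indexes are skipped, a restored library has to be reindexed, so this trades backup size for indexing time after a restore.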

Updating Indexes

In order for your searches to be accurate, EagleFiler needs to make sure that the index is updated whenever the contents of a file change:

Each time you open a library, EagleFiler scans all of the files to make sure they are up-to-date in the index. This allows EagleFiler to detect files that were modified outside of EagleFiler and also complete any indexing work that was in progress when you closed the library. Usually, most of the files were already indexed, so this scan does not take very long.

Mailbox files are not editable, so EagleFiler knows that the messages in them never change. Once a mailbox has been completely indexed, EagleFiler marks the mailbox as done. This speeds up opening the library, since EagleFiler will not have to check whether each message is up-to-date in the index. Additionally, the index file is treated as read-only so that it will not need to be recopied during a backup or sync.

Word vs. Phrase Indexing

When rebuilding an index (see below), you can choose, per library, whether EagleFiler should index words or phrases. Indexing for word searches is much faster, but a search for multiple words will find every document that contains those words anywhere in the document. Indexing for phrase searches lets you search for a group of words that appear near each other in a document; however, indexing will take longer and the index files will be several times as large. Phrase indexing is the default.
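The difference between the two kinds of search can be illustrated with a toy sketch that uses `grep` on a plain text file. This is not how EagleFiler's indexes work internally; the function names are made up for illustration, and "phrase" is simplified here to mean the words appearing together, in order, on one line.

```shell
#!/bin/sh
# Toy illustration only: plain-text matching with grep, not EagleFiler's index.

# Word-style match: succeeds if every given word occurs somewhere in the file.
matches_all_words() {
    file=$1; shift
    for word in "$@"; do
        grep -qiwF -- "$word" "$file" || return 1
    done
    return 0
}

# Phrase-style match: succeeds only if the words occur together, in order.
matches_phrase() {
    grep -qiF -- "$2" "$1"
}
```

For a file that mentions "quick" in its first line and "dog" in its last, `matches_all_words file quick dog` succeeds while `matches_phrase file "quick dog"` fails, which is the kind of false hit that phrase indexing lets you avoid.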

Rebuilding Indexes

If you hold down the Command and Option keys when opening a library, EagleFiler will show the Rebuild Indexes dialog. When you rebuild an index, EagleFiler deletes the old index file and builds a new one from scratch. This may take a long time, but it can be useful because:

Damaged Indexes

If EagleFiler is unable to open an index, it treats it as damaged. It will move the old index file to the Damaged folder inside the Indexes folder in the .eflibrary package and log the error to Console. Then it will create a new index file and begin rebuilding it.

If you believe that the index file was not actually damaged, you can close the library in EagleFiler and move the index file back out of the Damaged folder. Otherwise, you can delete the files in the Damaged folder to save disk space.
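To see whether the Damaged folder holds anything worth cleaning up, a small Terminal helper along these lines may be convenient. It is a sketch that assumes the folder sits at Indexes/Damaged inside the .eflibrary package, as described above; the function name is hypothetical, and it only lists files rather than deleting them.

```shell
#!/bin/sh
# List the contents of a library's Damaged index folder and its total size.
# Hypothetical helper; $1 is the path to the .eflibrary package.
list_damaged_indexes() {
    damaged="$1/Indexes/Damaged"
    if [ -d "$damaged" ]; then
        ls -A "$damaged"
        du -sh "$damaged"
    else
        echo "No Damaged folder at $damaged"
    fi
}

# Example (path is a placeholder):
#   list_damaged_indexes "$HOME/Documents/MyLibrary/MyLibrary.eflibrary"
```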

Excluding Files From Indexing

Occasionally, a file may be unreadable by EagleFiler or damaged such that it causes the indexer to hang or crash. To prevent such files from causing problems for the rest of your library, you can exclude them from indexing. You can exclude a file from indexing by assigning it the ef_noindex tag.

It is also possible to exclude certain filenames and extensions from indexing, to work around buggy Spotlight importers. For example, this command will exclude the file foo.xls:

defaults write com.c-command.EagleFiler SpotlightImporterSkippedNames -array-add foo.xls

and this command will exclude all .xls files:

defaults write com.c-command.EagleFiler SpotlightImporterSkippedExtensions -array-add xls

You can exclude files attached to e-mail messages from indexing by entering their names using Terminal. For example, the command:

defaults write com.c-command.EagleFiler UnindexedAttachmentNames -array "smime.p7s" "PGP.sig"

will tell EagleFiler not to index files named smime.p7s or PGP.sig. The command:

defaults write com.c-command.EagleFiler UnindexedAttachmentExtensions -array dwg

will tell EagleFiler not to index files with the extension .dwg. This can be useful for working around a slow or buggy Spotlight importer plug-in.

Either can also speed up indexing if you know that you will never want to search the contents of attachments with those names.
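Note that in the commands above, `-array-add` appends to any list that is already stored, whereas `-array` replaces the whole list. To check what is currently stored, or to undo one of these settings, the standard `defaults read` and `defaults delete` subcommands can be used. The wrapper functions below are only a convenience sketch; their names are made up, and the assumption that deleting a key returns EagleFiler to its built-in behavior is based on how macOS preferences normally work rather than on this manual.

```shell
#!/bin/sh
EF_DOMAIN=com.c-command.EagleFiler

# Print the current value of one preference key
# (defaults reports an error if the key has never been set).
ef_show() {
    defaults read "$EF_DOMAIN" "$1"
}

# Delete one preference key (assumed to restore the built-in default).
ef_reset() {
    defaults delete "$EF_DOMAIN" "$1"
}

# Examples:
#   ef_show  SpotlightImporterSkippedExtensions
#   ef_reset UnindexedAttachmentNames
```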

Troubleshooting

Hangs During Indexing

If EagleFiler hangs during indexing, you can use Activity Monitor to record what EagleFiler and its helper processes are doing. The eftexttool process is in charge of reading files to extract their text content. The efindextool process is in charge of updating the index files for your library. (These may appear in Activity Monitor as WashFramework.) Please see Sending in a “Sample” Report for more information. The information in the sample report will show what EagleFiler was doing and perhaps which file it was working on.
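As a possible alternative to Activity Monitor, macOS also ships a `sample` command-line tool that can record a process by name. The sketch below assumes the helper processes are visible under the names given above (they may instead appear under another name, as noted); the function name, the 10-second duration, and the output location are arbitrary choices for illustration.

```shell
#!/bin/sh
# Sample one helper process for 10 seconds and save the report to a file.
# Hypothetical helper; pass eftexttool or efindextool as the first argument.
sample_helper() {
    name=$1
    out=${2:-"$HOME/Desktop/$name-sample.txt"}
    sample "$name" 10 -file "$out"
}

# Example:
#   sample_helper eftexttool
#   sample_helper efindextool
```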

Reindexing Due to New Files Being Created

One case to be aware of: if you copy your entire library (or sync it to a new Mac with Dropbox, or restore it from backup), the filesystem will show that the files have been touched (the “attribute modification date” or “ctime” has changed), so EagleFiler will reindex all of the files. However, the files remain fully searchable during this reindexing because the old index entries stay in place until they are updated.
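The effect of copying on the two dates can be observed in Terminal. The helper below is a small sketch (the name `file_times` is made up) that prints a file's content modification time and its attribute modification time; it tries the GNU form of `stat` first and falls back to the BSD form that ships with macOS.

```shell
#!/bin/sh
# Print a file's content modification time (mtime) and attribute
# modification time (ctime), both as seconds since the epoch.
file_times() {
    # GNU stat uses -c; the BSD stat shipped with macOS uses -f.
    stat -c '%Y %Z' "$1" 2>/dev/null || stat -f '%m %c' "$1"
}

# Example:
#   touch -t 200001010000 old.txt   # give a file an old mtime
#   cp -p old.txt copy.txt          # copy it, preserving the mtime
#   file_times old.txt
#   file_times copy.txt
```

In the example, the copy keeps the original's first number (mtime) but gets a fresh second number (ctime), because the ctime cannot be preserved by a copy. That fresh ctime is what makes a copied or restored library look modified.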

To change the default behavior so that EagleFiler ignores the “attribute modification date” for indexing purposes, click here; or click here to restore it.

Unexpected Reindexing

If EagleFiler seems to be reindexing your files and you don’t know why, it may help to enable debug logging to Console. For example, the log may show that certain files had never been indexed or that they were in the index but had since been modified. It can also reveal when a particular file, or type of file, is taking a long time to index. To enable logging for file indexing, click here; to disable it, click here. To enable logging for message indexing, click here; to disable it, click here.

Disabling Indexing

If you want to entirely disable indexing for all files, click here; to re-enable it, click here.

Slow Searches or Indexing

Indexed searches in EagleFiler are normally very fast, but they can be slowed down by large numbers of mailboxes, certain unusual queries, or fragmented index files.

Not Finding Files

If an indexed search is not finding a file that you think it should, first create a new test library and import the file into it. If the file is findable in the test library, you probably just need to rebuild the indexes of your main library. Otherwise, you can report the test library to technical support.
