
Thread: Index Issues

  1. #1
    New Member
    Join Date
    Nov 2014
    Posts
    1

    Index Issues

    Hi. I've been importing a lot of email from Apple Mail and have several indexing questions.

    I've read the manual (and other threads here) about the indexing options. My initial imports were set to the default phrase indexing option, and after about 200,000 messages the indexing during import began to take hours. I then switched to word index and noticed a change from an estimate of 14 hours to import the next file to 5. Several files later the imports were estimated at 4 hours yet took 5-6 each time. This file now has ~663,500 messages and the .eflibrary file is 5.06GB.

    I started a second file for a separate group of Mail folders and immediately set it to word indexing before any import was done. This file continues to import at a much faster speed than my initial file. The latest folder of ~14,000 messages took ~15 minutes to index. This file currently has ~219,000 records and its .eflibrary file is 175MB.

    According to the manual:

    If you hold down the Command and Option keys when opening a library, EagleFiler will show the Rebuild Indexes dialog. When you rebuild an index, EagleFiler deletes the old index file and builds a new one from scratch.
    I followed these instructions for both files yet I'm unconvinced all records in the first file were actually included in the new index type of word searching. Does the total number of messages in a file impact the time required to index? If not, then imports of similar size should take approximately the same time, and that's not what I'm experiencing.

    I just started another reindex of the first file, selecting Records, Notes and Messages to be reindexed by Word. The .eflibrary file reset as expected, which assures me that it is building a new index of all. However, EF estimates that this process will require ~20 hours to complete. Does that make sense to you? If it can import ~14,000 per 15 minutes (as done in the second file) then shouldn't it take ~11 hours to complete 663,500?
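The extrapolation above can be checked with a short calculation. This is only a sketch: it assumes indexing time scales linearly with message count, and the function name is made up for illustration.

```python
def linear_estimate_hours(total_messages, sample_messages, sample_minutes):
    """Extrapolate indexing time, assuming constant throughput."""
    minutes_per_message = sample_minutes / sample_messages
    return total_messages * minutes_per_message / 60


# Figures from the post: ~14,000 messages took ~15 minutes in the second file,
# and the first file holds ~663,500 messages.
estimate = linear_estimate_hours(663_500, 14_000, 15)
print(f"{estimate:.1f} hours")  # prints "11.8 hours"
```

At that rate the reindex would take a little under 12 hours, so the ~11 hour figure is in the right range if throughput really were constant.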

    Is there a practical limit to the number of email messages per file?

    Suggestions?

  2. #2


    Quote: Originally posted by Ken555
    I've read the manual (and other threads here) about the indexing options. My initial imports were set to the default phrase indexing option, and after about 200,000 messages the indexing during import began to take hours. I then switched to word index and noticed a change from an estimate of 14 hours to import the next file to 5. Several files later the imports were estimated at 4 hours yet took 5-6 each time. This file now has ~663,500 messages and the .eflibrary file is 5.06GB.
    Those sound like abnormally long indexing times. What kind of Mac are you using?

    Quote: Originally posted by Ken555
    I followed these instructions for both files yet I'm unconvinced all records in the first file were actually included in the new index type of word searching.
    If you contact me via e-mail, I can send you a test version of EagleFiler that will log some information such as which type of indexing is being used for each file and how many documents are in the index.

    Quote: Originally posted by Ken555
    Does the total number of messages in a file impact the time required to index?
    Yes, and especially for phrase indexing. It’s more work to update an index file that has more records, and file fragmentation can also play a role.

    Quote: Originally posted by Ken555
    However, EF estimates that this process will require ~20 hours to complete. Does that make sense to you? If it can import ~14,000 per 15 minutes (as done in the second file) then shouldn't it take ~11 hours to complete 663,500?
    The estimate may not be accurate. When starting from scratch, it’s based on the assumption that each mailbox will take the same amount of time to index as the mailbox that’s currently being indexed. However, 20 hours does seem like too long for that number of messages.
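To illustrate why an estimate built on that assumption can be far off, here is a minimal sketch. It is not EagleFiler's actual code: the function name and the timings are hypothetical, chosen only to show the effect of mailboxes that differ a lot in size.

```python
def naive_total_estimate(current_mailbox_seconds, mailbox_count):
    """Assume every mailbox takes as long as the one being indexed now."""
    return current_mailbox_seconds * mailbox_count


# Hypothetical per-mailbox indexing times in seconds; the first is much larger.
timings = [3600, 300, 300, 600]

estimate_on_first = naive_total_estimate(timings[0], len(timings))
actual = sum(timings)

print(f"estimate while on first mailbox: {estimate_on_first / 3600:.1f} h")
print(f"actual total: {actual / 3600:.1f} h")
```

With these made-up numbers, the estimate made while the largest mailbox is being indexed is three times the actual total, and it would be too low if the current mailbox happened to be a small one.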

    Quote: Originally posted by Ken555
    Is there a practical limit to the number of email messages per file?
    Per library, no. Per mailbox, indexing does get slower if you have more than 100,000 or so messages per index (especially with phrase indexing). On the other hand, once the index is built it will never need to be updated, and searching will be faster than with multiple smaller indexes.
