EF as PDF organizer, and small feature request: Search on filename

Looking through the documentation for EF, it appears that it isn’t possible at this point to search in the filename field. Would this be easy to add?

I’m currently trying out a library as a collection of my myriad journal articles (PDFs), and in the process of organizing them initially, it would be quite useful to be able to filter on certain filenames (since I have a kind of naming convention I have generally used) to help me get things into their folders properly. I suspect that search anywhere might cover the filename field, but it would be better for the specific thing I’m trying to do to be able to search solely on the filename.
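As a rough illustration of the filename-only filtering described above, here is a minimal Python sketch that runs outside EagleFiler. The function name `filter_by_filename`, the example file names, and the pattern are all hypothetical. It matches against each file's name only, never its folder or contents.

```python
from fnmatch import fnmatch
from pathlib import Path


def filter_by_filename(paths, pattern):
    """Return the paths whose file name (not folder or contents) matches pattern.

    Matching is case-insensitive, so 'smith*' also finds 'Smith_2004.pdf'.
    """
    pattern = pattern.lower()
    return [p for p in paths if fnmatch(Path(p).name.lower(), pattern)]


if __name__ == "__main__":
    # Hypothetical names following an author_year_topic convention.
    files = [
        "articles/Smith_2004_syntax.pdf",
        "articles/Jones_1999_phonology.pdf",
        "smith/notes.txt",
    ]
    print(filter_by_filename(files, "smith*.pdf"))
```

Note that `smith/notes.txt` is not returned: the folder name contains "smith", but only the file name is tested.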

One of the big draws for me in using EF this way is the fact that it automatically detects if the library already contains the PDF file I’m trying to add, so I don’t have to think, I can just dump my folders of collected PDFs on it and, generally, get no serious duplication. I look forward to integrating this into my normal routines, once I have this initial sorting done.
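The thread does not say how EagleFiler decides that a file is already in the library. One common way to get this kind of duplicate detection, shown here purely as an assumption, is to compare a hash of each file's contents. The names `file_digest` and `import_without_duplicates` are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Hash a file's bytes in chunks so large PDFs are not loaded whole."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def import_without_duplicates(candidates, library_digests):
    """Return the candidate paths whose contents are not already in the library.

    library_digests is a set of digests for files already imported. It is
    updated in place, so duplicates within the same batch are skipped too.
    """
    added = []
    for path in candidates:
        digest = file_digest(path)
        if digest in library_digests:
            continue  # same bytes already present, even under another name
        library_digests.add(digest)
        added.append(Path(path))
    return added
```

Because the comparison is on contents, the same article saved under two different filenames would be imported only once.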

(I have about 4K PDF files, around 5GB, in there at the moment, and I am finding that it moves relatively slowly as I sort things, however – when I move something into a folder, I often get a minute-long beachball while it reindexes. Is there any way that this can be threaded out more aggressively? I know that the indexing happens as a background process [efindexer, I can see it in top], but I nevertheless am still getting beachballs, so perhaps there’s still a single thread waiting for responses from the background thread? I’m using a 2GHz iMac G5, so it should be powerful enough to do the indexing quickly. I know responsiveness is one of the areas already planned for future development, so I’m sticking it out now with a hopeful eye toward the future.)

Yes, an Anywhere search will include the filename. However, I agree that it would be nice in some circumstances to be able to search just the filename.

There are two phases to indexing: extracting the text from the document and adding that text to the index. The latter is what efindextool does; this takes the majority of the time, and it happens completely in the background. The former also happens in a separate thread, but it has some locking, e.g. to protect against the file being moved out from under the extractor. This is one of the areas that has some room for optimization.
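The two phases can be pictured with a small Python sketch. This is not EagleFiler's code: in EagleFiler the second phase runs in a separate process, whereas here both phases are threads for simplicity, and the function `index_documents`, the single `file_lock`, and the word-splitting index are all assumptions made for illustration.

```python
import queue
import threading
from collections import defaultdict


def index_documents(documents):
    """Index (name, read_text) pairs using one extractor and one indexer thread."""
    file_lock = threading.Lock()   # stand-in for the lock guarding a file against moves
    text_queue = queue.Queue()     # carries extracted text from phase 1 to phase 2
    index = defaultdict(set)       # word -> names of documents containing it
    done = object()                # sentinel that tells the indexer to stop

    def extract():
        # Phase 1: read each document's text while holding the file lock.
        for name, read_text in documents:
            with file_lock:
                text = read_text()
            text_queue.put((name, text))
        text_queue.put(done)

    def add_to_index():
        # Phase 2: add the text to the index; no file lock is needed here.
        while True:
            item = text_queue.get()
            if item is done:
                break
            name, text = item
            for word in text.lower().split():
                index[word].add(name)

    threads = [threading.Thread(target=extract), threading.Thread(target=add_to_index)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dict(index)
```

Only the extraction step holds the lock, so anything that wants to move a file waits for at most one extraction rather than for the whole indexing run.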

I don’t think that moving a document will cause it to be reindexed unless the filename changes as a result of the moving. Please contact me via e-mail if you can reproduce a situation where moving a document triggers an actual re-indexing (i.e. you see efindextool in top) rather than just a momentary item in the Activity Viewer as it checks to see whether the document needs indexing.

I haven’t explored this thoroughly yet, but I did just make a discovery about where things seem to be hanging up a bit.

I have a number of big folders of PDFs I am capturing from the Finder, but the capturing process is agonizingly slow – it does a file, pauses for quite a while with a beachball, then does the next. However, I tried to do something while it was importing, and discovered that if I click in the menu bar to drop down a menu, everything speeds way up – Growl can’t even keep up. My guess is that there’s some update that is done to the display window after each file is imported, and by short-circuiting that update (by holding the menu open), I avoid the beachballs and my files fly in. This might explain why it seemed like moving files was slow too, since it presumably does this display update after each file is moved as well.

I don’t think I have any other utilities that hook into the display code that might be causing a conflict, but it’s possible. For example, perhaps something might be passing through WindowShadeX – but I can’t think offhand of anything else I have installed that might be relevant. I have gotten a couple of crashes along the way that blamed something about the Window Server; this usually happened when I was dragging a bunch of records from one folder into another. Maybe that’s related. (I’ll try to explore further, since if I’m the only one seeing this, then it may not be exactly EF that’s doing it but something about how EF interacts with something else.)

I realize that this is probably a bit too vague to be useful. I’ll see if I can figure out more specifics, but if it suggests something, I’d be interested to hear ideas.

Please contact me via e-mail so that I can look into what might be causing this.

By holding down the menu, you are postponing saving EagleFiler’s database as well as updates to the interface. Depending on what you had displayed, you could alternatively speed up the interface updates (to a lesser extent) by having nothing selected in the source list.

I don’t think that explains the slow moves, because EagleFiler moves all the files atomically and then does one update.
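The difference between updating after every item and doing one update for a whole batch can be sketched as follows. This is a toy model, not EagleFiler's implementation: the `Library` class, its `update_count` counter, and the `deferred_updates` context manager are hypothetical names, and "update" stands in for saving the database and refreshing the interface.

```python
from contextlib import contextmanager


class Library:
    """Toy library that counts how often it saves and redraws."""

    def __init__(self):
        self.records = []
        self.update_count = 0
        self._deferred = False
        self._dirty = False

    def _update(self):
        # Stand-in for saving the database and refreshing the window.
        if self._deferred:
            self._dirty = True    # remember that an update is owed
        else:
            self.update_count += 1

    def import_file(self, name):
        self.records.append(name)
        self._update()

    @contextmanager
    def deferred_updates(self):
        """Postpone updates until the block exits, then do a single one."""
        self._deferred = True
        try:
            yield self
        finally:
            self._deferred = False
            if self._dirty:
                self._dirty = False
                self._update()
```

Importing three files directly triggers three updates, while importing the same three inside `with library.deferred_updates():` triggers one, which is roughly the effect described for holding a menu open during an import.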

I’ve added filename search in EagleFiler 1.1.6.

One of the big draws for me in using EF this way is the fact that it automatically detects if the library already contains the PDF file I’m trying to add, so I don’t have to think, I can just dump my folders of collected PDFs on it and, generally, get no serious duplication.

I also use EF to manage my PDF library, and now many other kinds of files as well, along with direct captures from the web. Avoiding duplication is a significant advantage. Initially it was slow when I made certain kinds of changes, but since recent updates there is very little lag. About the only facility I miss is the ability to copy from one library to another, which is promised for a future update.

Filename search
It is very nice that filename search has been added in 1.1.6. However, the default search is Title, which is almost useless, at least for the PDFs I am dealing with. It would be nice to have a preference to let the user set the default search.

I don’t think a preference is necessary. If you set the search scope to Filename, EagleFiler will remember it for that window, and it will become the default for new windows (for that library as well as new libraries). So, basically you just have to change it for the windows/libraries that you’ve already opened, since EagleFiler was remembering the last-used scope, which for those was (I guess) Title.