Duplicate files continued
—Quote (Originally by tansey)—
- There are still about 400 duplicates sitting at the root level of the File folder. This is after I deleted over 150 files individually. Auditing is tedious because the real files should be filed within the sub folder structure.
—End Quote—
I see, so the duplicate files are not in the same folders as the originals? Did you initially import the files at the top level (the “Files” folder) and then move them into subfolders? Or did you capture them directly into the subfolders?
—End Quote (Originally by tsai)—
Documents were captured into the top level and then tagged and moved to folders. The properly named duplicates appear to be at the root level. I cannot find the location of the generic Learn2Grow Article Template files. Searching the home folder produces only one hit, when there are 100+ duplicates.
Actually, I didn’t know you could capture to subfolders.
In the cases where there are multiple duplicates of the same file, could you give an example of which folders they are in?
—End Quote—
Named duplicates are at the root level of the Files folder. I don’t see any at the subfolder level. The generic Learn2Grow Article Template files are a mystery to me.
—Quote (Originally by tansey)—
This means I would have had to manually go into hundreds of sub folders in the library and copy files into a location to be scanned. I clearly did not do any such thing.
—End Quote—
I’m not saying that you did that, but it seems incontrovertible that the files ended up there and so EagleFiler detected and imported them. It didn’t conjure them out of nothing.
—End Quote (Originally by tsai)—
Actually it looks like something conjured them out of nothing, but I am not saying that EF did it. I love EF.
—Quote (Originally by tansey)—
It is clear, however, that EF did not identify any of these as duplicates, even though they are exact duplicates in many cases.
—End Quote (Originally by tsai)—
That’s to be expected. When EagleFiler detects that a file has been added to a folder that it manages, it does not check for duplicates. It assumes that the file was purposely put there and so deleting it would be rude.
It does check for duplicates when capturing or importing via drag and drop or the “To Import” folder.
—End Quote (Originally by tsai)—
So the interesting question is what it was scanning. I have no idea at this time. Are there log files I can check or provide? All the files I imported into this library came in via the capture key.
—Quote (Originally by tansey)—
Bottom line: What is the quickest and easiest way to delete the duplicates?
—End Quote—
Are there any non-duplicates at the top level of the “Files” folder? Are there any duplicates in subfolders?
—End Quote (Originally by tsai)—
Subfolders don’t appear to have duplicates, and the top level is non-duplicated, but I still don’t see where the generic Learn2Grow Article Template files are located.
Sampling is time-consuming, but so far the root-level files appear to be duplicates of subfolder articles. The problem is looking at the root and then confirming among hundreds of subfolders. The generic files are harder still, because the Finder (in my case Path Finder) does not show the generic Learn2Grow Article Template files.
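For anyone facing the same audit, the root-versus-subfolder check could be automated rather than sampled by hand. The following is only a rough sketch, assuming Python 3.9 or later is available; it is not an EagleFiler feature, and the function names are made up for illustration. It hashes every file inside the subfolders, then reports which top-level files have a byte-identical copy somewhere below. It only reports and never deletes anything, and a root-level file that differs by even one byte from its filed copy will not be matched.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_root_duplicates(files_dir: Path) -> dict[Path, list[Path]]:
    """Map each top-level file to the subfolder files with identical contents."""
    # Index every file that lives inside a subfolder by its content hash.
    by_hash: dict[str, list[Path]] = {}
    for sub in sorted(files_dir.iterdir()):
        if sub.is_dir():
            for path in sorted(sub.rglob("*")):
                if path.is_file():
                    by_hash.setdefault(sha256_of(path), []).append(path)

    # Look up each visible top-level file in that index.
    matches: dict[Path, list[Path]] = {}
    for path in sorted(files_dir.iterdir()):
        if path.is_file() and not path.name.startswith("."):
            copies = by_hash.get(sha256_of(path))
            if copies:
                matches[path] = copies
    return matches


# Example call (the path is an assumption; point it at the library's Files folder):
# for root_file, copies in find_root_duplicates(Path("/path/to/library/Files")).items():
#     print(root_file.name, "->", [str(c) for c in copies])
```

Top-level files that do not show up in the result have no exact copy in any subfolder, so those are the ones worth inspecting by hand before removing anything.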
—Quote (Originally by tansey)—
What can be done to avoid this type of corruption again?
—End Quote—
In order to prevent it, we would need to know for sure what caused it. Right now I have two theories:
- Maybe some other software added the duplicates at the top level after EagleFiler had moved them into subfolders. Where is your library stored? Any differences compared with your other libraries that are not exhibiting this problem? Are you using any syncing or file sharing software, e.g. Dropbox?
- Maybe there was a permissions or other filesystem problem such that when you filed into the subfolders the files were copied instead of moved, i.e. the originals were left behind at the top level and later detected as duplicates. If so, you could test this by capturing a new file, moving it, and seeing whether there’s still a file at the top level.
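With hundreds of files already sitting at the top level, spotting one leftover by eye is hard, so the second test could be done by comparing two snapshots of the top-level file names. This is a minimal sketch, assuming Python 3.9 or later; the function names are hypothetical and nothing here touches EagleFiler itself.

```python
from pathlib import Path


def top_level_files(files_dir: Path) -> set[str]:
    """Return the names of regular, non-hidden files directly inside files_dir."""
    return {
        p.name
        for p in files_dir.iterdir()
        if p.is_file() and not p.name.startswith(".")
    }


def left_behind(before: set[str], after: set[str]) -> set[str]:
    """Names at the top level now that were absent from the earlier snapshot."""
    return after - before
```

Take one snapshot with `top_level_files` before capturing, capture a new file and move it into a subfolder, then take a second snapshot. If `left_behind(before, after)` is not empty, something stayed at the top level after the move, which would be consistent with a copy instead of a move.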
—End Quote (Originally by tsai)—
The library is in Documents/Eaglefiler/L2G Articles. Most of my other EF libraries reside there, but some are in other locations.
I don’t know why some other program might have reached into EF’s files, duplicated them, and then put them in a place to be scanned for addition. It is also unclear how the metadata was pulled out to rename the files, and how hundreds were duplicated with the same file name.
Other libraries seem to be OK. Yes, I am using Dropbox and MobileMe, but I don’t see any relationship to the duplicates, since the duplicate files are long gone from Dropbox, where they originated, and even those that arrived through Dropbox had correct metadata.
—Quote (Originally by tansey)—
I do use the Import folder on other libraries so it would be helpful to be able to turn scanning on again.
—End Quote—
Turning off “Scan for New Files” does not affect the “To Import” folder. They’re separate features.
—End Quote (Originally by tsai)—
I didn’t understand that; I need to read the documentation to understand this feature. How do I turn it back on? The provided page turns it off, but I don’t know whether I have to use Terminal or some other script to turn it back on.
Michael, I know you are great at ferreting out problems and supporting your users, so I am more than willing to try anything you suggest. Besides, I couldn’t live without SpamSieve!
Note! I am having real problems with posting. I log in and then post a reply. When I go to submit, it acts as if I am not logged in. The reply is lost and I have to start over. Since this has happened multiple times, I now copy my response before hitting Submit Reply. It shows me as logged in before I hit Reply and then asks me to log in. Is this a timeout issue?
Thanks Frank