"ERROR Unable to Import", Timeouts, Throttling and TaskFailedException

I am having a great deal of trouble trying to import bookmarks into EagleFiler.
I admit there are a lot - almost 8000, but my understanding is that EF should be able to handle this.
I also see some of the same problems with imports as small as 10 or 12 bookmarks.

What I do:
I have the bookmarks broken up into three folders (with sub-hierarchies) of about 2500-3000 each.
I simply drag and drop a folder from Safari bookmarks into “records” in the EF window.
Is there a different or better way to import bookmarks as webarchives?

What happens:
In the activity window I see numerous imports happen.
In the Error window I begin to see errors related to “timeouts”. Occasionally I also get a “TaskFailedException 10” error.
Out of about 8000 bookmarks, I got about 2000 timeout errors.

Why I don’t think the timeouts are legit:
If I select them in the error window, they usually come up fine.
Also, I prescreened all the bookmarks through a program called BookDog, which does a good job of verifying that every URL works.

Why I don’t think it’s my network/computer but related specifically to EagleFiler:
a. I have run ISP speed tests right around the same time as these errors and found my cable broadband connection to be 6kB/s down and 360kB/s up.
b. Also I have run “BookDog”, which verifies every URL. I realize that at different times of day there may be downtimes, but I don’t think this accounts for the degree of timeouts I’m seeing. Could it be that a server is trying to respond but cannot get a message through because so many downloads are occurring at the same time? Or that EagleFiler is busy processing those downloads?
c. I also ran some of these tests on a different ISP/network (at a cafe, over a wireless connection). Results were similar.

  1. Could there be a need to provide a throttling rate? Other tools that queue many requests like this seem to do so; for example, “BookDog” when it verifies URLs.

  2. Have other people had similar problems with getting “timeouts”? Can you suggest any other ways to troubleshoot this? How can I isolate the problem? If it is isolated to something outside of EagleFiler, is there some utility you might suggest to debug this?

Workaround Option 1. I thought perhaps I could import the same folder again, and it would simply warn me about duplicates while it went ahead and loaded many if not all of the previous ones that had failed due to timeouts. **This didn’t work** because apparently duplicate checking only works on text files - not on webarchives or URLs. It seems like it would be very useful to **allow an option to check for duplicate URLs (not content)** upon import and prevent them. Why load it again if I already have it? I understand the user would have to accept the responsibility that two URLs may be identical but generate different content depending on, for example, when they were accessed. [Update] Perhaps duplicate checking does work on webarchives in certain cases, or only when imported from OmniWeb (see #6 below)?

Workaround Option 2. Copy the error log file into a TextEdit file. Manually invoke each of the 2000 URLs in Safari and capture using F1. This is not practical. Maybe it could be scripted, but again there might be a throttling issue. Even so, there are other problems with this approach. The error log file doesn’t include the folder hierarchy to which the bookmark belonged - even if I could get the bookmark into EF. Another problem with the error window (for me) is that the only way to see what caused the error is to click on the error; then the status bar shows it as “timeout” or “host not found”, etc. The problem here is that to save this information I have to select all in the window and copy to a TextEdit window, so I lose all of the “reason codes” as well. All I get is “Could not Import URL: http://www.apple.com”.

Workaround Option 3. Provide a way to go back and re-try failed imports. Requires EagleFiler enhancements. Possible approaches:
a. Add option to create a record even if there is a timeout. The record would retain the proper folder hierarchy from the original bookmarks and it
would have the URL. The user would at least have his sources retained. He would be no worse off (no loss of information). He would have to
manually try to re-import each of the URLs as time permits or as he needs them.
b. Add option to keep a specific log of failed imports and capability to re-try these at the user’s convenience.

Workaround Option 4. Try importing from OmniWeb instead of Safari. I decided to try this quickly. I was surprised by the results.
a. Drag and Drop from OmniWeb does not work as well as from Safari. Specifically, I could not drag a folder to import. I could only select groups
of bookmarks and drag them. So you lose all hierarchy information.
b. There DOES seem to be some kind of throttling of webarchives going on by EagleFiler and it appears to be different when importing from OmniWeb.
When I imported the same URLs from Safari and watched the Activity window, I saw 6 simultaneous webarchives being processed. However,
with OmniWeb I only see 4 being processed. This seems consistent with my informal observations that I get fewer timeout errors using OmniWeb;
but I still get some occasionally, as well as the dreaded TaskFailedException.
c. I also noticed I received many more duplicate errors for this set of URLs when I imported them more than once - although still just a fraction of the total - whereas with Safari, for the same set, I get no duplicates detected.

Conclusion:
Even if the timeouts are not at all related to EagleFiler (probably true, since no one else seems to have this problem), I was hoping that it would support some way to do re-tries (multiple passes). Also, I’d like to resolve the **TaskFailedException**. Ultimately I just want to get all my bookmarks loaded.
Sorry for the long message.
And just to clarify something … I am using the trial version of the software.
I think it’s great software! I really do! … I’m just a little frustrated trying to get my information into it.
(and I still need to tackle email)

I’m sorry to hear that, but thanks for the very detailed report. I would expect this type of bulk import to work, barring network issues. I just tried dragging a folder of several hundred Safari bookmarks into EagleFiler, and it worked with no errors, so I’ll need some more information to figure out why it isn’t working for you.

What kind of Mac do you have, and how many processor cores does it have? Which version of Mac OS X are you using?

The above should work. Other options would be:

  1. Dragging bookmarks (rather than bookmark folders). Internally, EagleFiler treats this differently.
  2. Copying the bookmark URLs to the clipboard and then pasting them into EagleFiler’s “Import URL(s)” window.

Please e-mail me your log file:

/Users/<username>/Library/Logs/EagleFiler/EagleFiler.log

so that I can take a closer look at the errors.

Are you sure those numbers are right? That’s only slightly faster than modem speed going down, and cable is generally faster down than up.

It’s possible, although I have not seen that happen myself.

EagleFiler does throttle the number of simultaneous requests, although it’s possible that there’s a bug so that this isn’t working properly in your situation.

This is the first I’ve heard of this problem.

EagleFiler’s duplicate checking works on file contents. Sometimes this will “work” on Web archives, sometimes not. Some Web pages are dynamic such that the page will be slightly different if you download it twice in succession. There’s no way for EagleFiler to know if the differences are significant, so it has to treat them as not duplicates.

Many URLs have content that changes, so there are all sorts of cases where it’s desirable to allow multiple Web archives with the same URL. However, I have a feature request logged to make it an option to check for duplicates by URL.

That should work. You could import one URL at a time, and the “import” script command will wait until it’s finished importing before moving on to the next one.
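
Roughly like this, for example - this is only an untested sketch, and the URL is a placeholder (the “import” command is described in EagleFiler’s scripting dictionary, which you can open in Script Editor):

    -- Untested sketch: import a single URL via EagleFiler's scripting
    -- "import" command, which waits for the download to finish before
    -- returning. The URL below is only a placeholder.
    tell application "EagleFiler"
        import URLs {"http://www.apple.com/"}
    end tell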

The issue here is that OmniWeb does not support dragging bookmark folders.

This is because OmniWeb supplies URLs to EagleFiler, whereas Safari supplies bookmark files, and these take slightly different code paths in EagleFiler. However, from the above it sounds like the throttling is working if you see a maximum of 4 or 6 simultaneous imports. I would not expect to see timeouts with that number.

I think that’s probably a coincidence. In both cases, EagleFiler is trying to import the same URL.

Please click this link to reduce the number of simultaneous imports and see if that helps. (You can click this link to restore it to the original setting.)

Thanks for sending the log. In the next version I will make some changes so that EagleFiler can retry if the connection times out. In the meantime, does reducing the number of simultaneous imports help?

Hardware Overview:
Model Name: MacBook Pro 15"
Model Identifier: MacBookPro1,1
Processor Name: Intel Core Duo
Processor Speed: 1.83 GHz
Number Of Processors: 1
Total Number Of Cores: 2
L2 Cache: 2 MB
Memory: 1 GB
Bus Speed: 667 MHz
System Software Overview:
System Version: Mac OS X 10.5.4 (9E17)
Kernel Version: Darwin 9.4.0

Built-in Ethernet:
Type: Ethernet
Hardware: Ethernet
BSD Device Name: en0
IPv4 Addresses: 192.168.0.4
IPv4:
Addresses: 192.168.0.4
Configuration Method: DHCP
Interface Name: en0
NetworkSignature: IPv4.Router=192.168.0.1;IPv4.RouterHardwareAddress=00:09:5b:9a:8a:c4
Router: 192.168.0.1
Subnet Masks: 255.255.255.0
IPv6:
Configuration Method: Automatic
DNS:
Server Addresses: 65.175.128.46, 65.175.128.47
DHCP Server Responses:
Domain Name Servers: 65.175.128.46,65.175.128.47
Lease Duration (seconds): 0
DHCP Message Type: 0x05
Routers: 192.168.0.1
Server Identifier: 192.168.0.1
Subnet Mask: 255.255.255.0
Proxies:
FTP Proxy Enabled: No
FTP Passive Mode: Yes
Gopher Proxy Enabled: No
HTTP Proxy Enabled: No
HTTPS Proxy Enabled: No
RTSP Proxy Enabled: No
SOCKS Proxy Enabled: No
Ethernet:
MAC Address: 00:16:cb:8a:2a:3f
Media Options: Full Duplex, flow-control
Media Subtype: 100baseTX

Sorry. That was a typo. 6 Mbps down and 360 kbps up.

I suspected the same thing. So I loaded a simple page with no ads on it and very little content … Google.com and a few others. Then re-loaded them. It did not detect the duplicates. The next thing to try might be a couple of static pages that one knows to be identical.

If I might be so bold … would this be a simple enough script for you to provide me? I am not fluent in AppleScript and wouldn’t know where to begin. But I’m concerned that it would become tricky to somehow read one at a time from Safari’s bookmark hierarchy. Would I lose all hierarchy information? Ugh.
(Plus it would probably take forever to import.)

I am suspicious that it may be more than coincidence; perhaps OmniWeb is picking up more duplicates than Safari does. As you mentioned, due to differences in the way imports are handled, they take different paths through the code. Just my gut feeling after watching many tests and seeing some consistency.

By The Way …
To enter this multi-quoted message, I had to guess at the mark-up and hand edit it because the option for multi-quoted reply did not work!

Don

  1. I still have timeouts. It may have “helped”, but that’s hard to quantify, because when I import the same folder back-to-back I can get quite different numbers of timeouts.

  2. Interestingly, it seems that you might have been expecting only one import at a time. I saw two simultaneous imports consistently.

Please let me know if you have any other ideas for ways I can work around this.
Or isolate it.

[Update]
I tried this again. This time I did the import from OmniWeb (rather than Safari).

I received a TaskFailedException. I sent you a log file.

I received NO timeouts; however, I don’t know how far it got due to the exception.
(and the fact that I was importing the same set of URLs into the same library
I had just used for the Safari test. So I can’t just count records. And EagleFiler
doesn’t tell me any stats like how many records it imported etc.)

I also noticed 4 simultaneous imports instead of the one there should be.

OK, that’s much faster than my connection, so I doubt it’s related to the timeouts.

Web archives of a static page are not guaranteed to be identical. For example, the Web archive might include the HTTP headers from the Web server’s response, and the “Date” header contains a timestamp. If you create two Web archives in quick succession (or from Safari, using the same cached content), the dates will be identical, and so the duplicate will be detected. However, in general the timestamps will be different, so the Web archives will be different.
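
You can see the timestamp effect yourself - as a rough sketch, assuming curl is installed (it is by default on Mac OS X) - by requesting the headers for the same URL twice:

    -- Rough sketch: fetch the response headers for the same URL twice,
    -- a second apart. The Date header differs between the two requests,
    -- so an archive that stores the server's headers along with the page
    -- will not be byte-identical.
    set firstDate to do shell script "curl -sI http://www.apple.com/ | grep -i '^Date:'"
    delay 1
    set secondDate to do shell script "curl -sI http://www.apple.com/ | grep -i '^Date:'"
    log firstDate & return & secondDate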

I don’t think Safari lets scripts access the bookmarks, anyway. So writing a script would presume that you had the URLs in some other format already.
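
For example, if you had the failed URLs in a plain text file (one per line, perhaps copied out of the error window), something along these lines could work. This is only an untested sketch - the file path is a placeholder, and it would not preserve the folder hierarchy from the original bookmarks:

    -- Untested sketch: read URLs from a plain text file (one per line)
    -- and import them one at a time. The file path is a placeholder, and
    -- the original bookmark folder hierarchy is not preserved.
    set urlFile to POSIX file "/Users/me/Desktop/failed-urls.txt"
    set urlList to paragraphs of (read urlFile as «class utf8»)
    tell application "EagleFiler"
        repeat with theURL in urlList
            if (contents of theURL) is not "" then
                import URLs {contents of theURL}
            end if
        end repeat
    end tell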

OmniWeb isn’t doing anything. I think it’s just a matter of the timing. The sequence of URLs from OmniWeb may cause EagleFiler to fetch them in a particular order such that more of the timestamps end up the same.

The multi-quote button is for replying to multiple posts, not multiple parts of the same post.

I’m guessing it’s a network problem, then.

You have a dual-core processor, so I’d expect to see two.

I’ll send you a pre-release version of EagleFiler that retries when it encounters a timeout.

I tried the Pre-release version.

I still get timeout errors.
I’ll send you a log file.

Just as a follow-up, and to potentially help other people who might read this…
My problem has been resolved, thanks mostly to Michael.

I found in one of my log files the following:
9/11/08 7:10:55 AM com.apple.SystemStarter[29] Stopping Cisco Systems VPN Driver
9/11/08 7:10:56 AM com.apple.SystemStarter[29] kextunload: unload kext /System/Library/Extensions/CiscoVPN.kext succeeded

The CiscoVPN entry made me worry that maybe it was interfering with the networking queue, so I uninstalled it (with MacZapper).

Also, Michael gave me a new build which limits the number of simultaneous imports to one (default is 6).

I was able to import about 8000 webarchives in 3 separate sessions with fewer than 10 timeout errors combined.
I used EagleFiler 1.3.8-DR2.
I used preferences set to:
allow duplicates;
10x retries (this is a preference option he added to my build to re-try if there is a timeout);
turn off Spotlight comments.

My Conclusion:
The source of the excessive timeouts (and perhaps the internal error) was a combination of:

  1. too many simultaneous imports
  2. CiscoVPN client installed on my machine

Solving either one of these alone did not seem to fix the problem.

EagleFiler 1.4 includes some changes to reduce the number of simultaneous downloads and also to retry if there’s an error downloading a Web page.