Posted to user@manifoldcf.apache.org by Tom Rees <tr...@chiliad.com> on 2014/04/25 20:23:25 UTC

Update on "Two simultaneous web crawls hang"

In an email on April 4, 2014, I reported that whenever I run two
simultaneous web crawls in ManifoldCF, both crawls simply stop
progressing after a short period of time. When I look at the thread stack
traces, I see many fetcher threads waiting forever in wait() calls.

I am still having the same problem, and I have since tried different
configurations. First, the tests in the previous email used Postgres 9.3.2.
I tried Postgres 9.1.0, but the web crawls hang in the same way, although
they seemed to take slightly longer to stop making progress. I have also
tried crawling different web sites, and I tried using a custom output
connector that only saves the downloaded files to the local file system.
The problem persists. Another developer at our site has the same problem
whenever he runs multiple web crawls. Is this a known issue with the web
crawler?

Thanks,
Tom Rees