Posted to user@manifoldcf.apache.org by Karl Wright <da...@gmail.com> on 2014/05/01 02:02:42 UTC

Re: Update on "Two simultaneous web crawls hang"

Site will be updated in a few hours.
Karl


On Wed, Apr 30, 2014 at 4:56 PM, Ahmet Arslan <io...@yahoo.com> wrote:

> Hi Tom,
>
> Vote has just ended. Karl is now pushing artifacts.
>
> Ahmet
>   On Wednesday, April 30, 2014 11:08 PM, Tom Rees <tr...@chiliad.com>
> wrote:
>  Karl,
>
> Thank you for your help. I would like to upgrade to ManifoldCF 1.6. Do you
> know when this will be available?
>
> thanks,
> Tom
>
>
> On Fri, Apr 25, 2014 at 3:14 PM, Karl Wright <da...@gmail.com> wrote:
>
> Hi Tom,
>
> If this was ManifoldCF 1.5, then your problem is almost certainly the
> bandwidth throttling.  That release has a serious bug that throttles
> bandwidth to a rate 1000 times slower than configured.  Either adjust the
> throttle, apply the patch, or upgrade to ManifoldCF 1.6.
>
> Thanks,
> Karl
>
> Sent from my Windows Phone
> ------------------------------
> From: Tom Rees
> Sent: 4/25/2014 2:23 PM
> To: user; Vasant Kumar; Steven Bennett; Amy Jocefczyk-Papa
> Subject: Update on "Two simultaneous web crawls hang"
>
>  In an email on April 4, 2014, I reported that whenever I run two
> simultaneous web crawls in ManifoldCF, both crawls simply stop
> progressing after a short period of time. When I look at the thread stack
> traces, I see many fetcher threads waiting forever in wait() calls.
>
> I am still having the same problem, though I have since tried different
> configurations. The tests in my previous email used Postgres 9.3.2.
> With Postgres 9.1.0 the web crawls hang in the same way, although they
> seem to take slightly longer to stop making progress. I have also tried
> crawling different web sites and using a custom output connector that
> only saves the downloaded files to the local file system. The problem
> persists. Another developer at our site sees the same problem whenever
> he runs multiple web crawls. Is this a known issue with the web crawler?
>
> thanks,
> Tom Rees
>
>
>
>
>
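To make the scale of the throttling bug in the April 25 reply concrete, here is a rough arithmetic sketch. It is not ManifoldCF code: the function names, the 64,000 bytes/second setting, and the 1 MiB page size are all hypothetical, and the only figure taken from the thread is the factor of 1000. It shows why a crawl throttled this way can look hung rather than merely slow.

```python
# Illustrative arithmetic only; none of these names come from ManifoldCF.
SLOWDOWN_FACTOR = 1000  # "1000 times too slow", per the reply in this thread


def effective_rate(configured_bytes_per_sec, slowdown=SLOWDOWN_FACTOR):
    """Rate actually enforced if the throttle is `slowdown` times too strict."""
    return configured_bytes_per_sec / slowdown


def transfer_seconds(num_bytes, bytes_per_sec):
    """Time needed to move `num_bytes` at a steady `bytes_per_sec`."""
    return num_bytes / bytes_per_sec


if __name__ == "__main__":
    configured = 64_000      # hypothetical throttle setting, bytes/second
    page_size = 1_048_576    # hypothetical 1 MiB document

    intended = transfer_seconds(page_size, configured)
    actual = transfer_seconds(page_size, effective_rate(configured))

    print(f"intended: {intended:.1f} s per document")
    print(f"actual:   {actual / 3600:.2f} h per document")
```

Under these assumed numbers, a fetch meant to take about 16 seconds would take several hours, so fetcher threads would spend nearly all their time waiting. That is consistent with the wait() calls seen in the stack traces, though the thread itself does not confirm that this is the exact mechanism.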