Posted to user@nutch.apache.org by Greg Sellek <gs...@yahoo.com> on 2005/03/28 23:32:38 UTC

http.max.delays again

I know this has been asked before, but the Apache site doesn't let you search the archives, and SourceForge now shows a permission-denied error when I try to search the archives there.

Anyway, I'm seeing 30-50% of fetches fail with the http.max.delays error. I saw a post a while back about lowering my server delay and using fewer threads, but that doesn't seem to help. The page loads in about 5 sec when viewed from a browser, and my http.timeout is set to 20000.

Any ideas?

Thanks,
Greg

Re: http.max.delays again

Posted by Doug Cutting <cu...@nutch.org>.
Greg Sellek wrote:
> I know this has been asked before, but the Apache site doesn't let you search the archives, and SourceForge now shows a permission-denied error when I try to search the archives there.
> 
> Anyway, I'm seeing 30-50% of fetches fail with the http.max.delays error. I saw a post a while back about lowering my server delay and using fewer threads, but that doesn't seem to help. The page loads in about 5 sec when viewed from a browser, and my http.timeout is set to 20000.

How many hosts are you crawling?  How many threads are you using?

Try increasing http.max.delays and further decreasing the number of 
threads.  Or try setting fetcher.threads.per.host to something greater 
than one.

Doug
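
For reference, the settings Doug mentions would normally be overridden in nutch-site.xml. The fragment below is a minimal sketch, not a tested configuration: the values are illustrative only, the <nutch-conf> root element is assumed from the Nutch releases of that period, and fetcher.threads.fetch and fetcher.server.delay are assumed to be the property names behind the "threads" and "server delay" settings discussed in this thread.

```xml
<?xml version="1.0"?>
<!-- nutch-site.xml: site-specific overrides.
     All values are illustrative assumptions, not recommendations. -->
<nutch-conf>

  <!-- How many times a fetcher thread waits on a busy host before
       giving up on the page with the http.max.delays error. -->
  <property>
    <name>http.max.delays</name>
    <value>20</value>
  </property>

  <!-- Seconds to wait between requests to the same host
       (assumed property name for the "server delay"). -->
  <property>
    <name>fetcher.server.delay</name>
    <value>5.0</value>
  </property>

  <!-- Total number of fetcher threads (assumed property name);
       fewer threads means less contention for a single host. -->
  <property>
    <name>fetcher.threads.fetch</name>
    <value>5</value>
  </property>

  <!-- Number of threads allowed to access one host at the same time. -->
  <property>
    <name>fetcher.threads.per.host</name>
    <value>2</value>
  </property>

</nutch-conf>
```

Assuming a thread that finds its host busy waits one server delay per attempt and gives up after http.max.delays attempts, these values allow roughly 20 x 5 = 100 seconds of waiting per page before the error is reported. When most URLs point at a single slow host, that waiting budget matters more than http.timeout.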

RE: http.max.delays again

Posted by Steve Follmer <sf...@meer.net>.
I could be out of line here, but I had a paranoid suspicion that indexing
wasn't honoring nutch-site.xml, so, against best practices, I just
modified nutch-conf.xml directly. Then again, I'm not sure it helped;
I still have a similar problem.

Steve

PS Ben Stiller should star in every movie.
http://www.thebestpageintheuniverse.net/c.cgi?u=ben_stiller_should_star_in_every_movie



-----Original Message-----
From: Greg Sellek [mailto:gsellek@yahoo.com] 
Sent: Tuesday, March 29, 2005 5:33 AM
To: nutch-user@incubator.apache.org
Subject: http.max.delays again



I know this has been asked before, but the Apache site doesn't let you
search the archives, and SourceForge now shows a permission-denied error
when I try to search the archives there.

Anyway, I'm seeing 30-50% of fetches fail with the http.max.delays error.
I saw a post a while back about lowering my server delay and using fewer
threads, but that doesn't seem to help. The page loads in about 5 sec
when viewed from a browser, and my http.timeout is set to 20000.

Any ideas?

Thanks,
Greg