Posted to user@nutch.apache.org by "Insurance Squared Inc." <gc...@insurancesquared.com> on 2006/03/06 00:23:07 UTC

Normal search speeds

Asking again for the patience of the list: we're still working on speed.
What I need to know is whether we still have a 'problem' or whether the
following search speeds are normal for Nutch.

query: 'term life insurance'; first search 25 seconds, second search 6 seconds
query: 'stratford bed and breakfast'; first search 8 seconds, second search 2 seconds
query: 'mortgage broker'; first search 6 seconds, second search 5 seconds

Is this the type of speed you'd expect from a Nutch install?  I keep
feeling that it should be far faster than what we're seeing.

Specs: Nutch 0.7.1, merged index.  Dedicated server, 4 million pages
indexed, dual Xeon, 8 GB RAM, 3 SCSI HDs in RAID 0.
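
The gap between the first and second search of each query can be measured the same way every time with a small timing harness. The following is only a minimal sketch: `run_query` is a hypothetical stand-in for whatever actually issues the search (for example an HTTP request to the search page), and both function names are made up for illustration.

```python
import time
from typing import Callable, List


def time_repeated(run_query: Callable[[str], object], query: str,
                  repeats: int = 3) -> List[float]:
    """Run the same query `repeats` times; return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query(query)  # hypothetical callable that performs the search
        durations.append(time.perf_counter() - start)
    return durations


def cold_warm_ratio(durations: List[float]) -> float:
    """Ratio of the first (cold) run to the fastest later (warm) run."""
    if len(durations) < 2:
        raise ValueError("need at least two runs")
    warm = min(durations[1:])
    return durations[0] / warm if warm > 0 else float("inf")
```

With the figures from the post, `cold_warm_ratio([25.0, 6.0])` is about 4.2 for 'term life insurance' and `cold_warm_ratio([6.0, 5.0])` is 1.2 for 'mortgage broker'. A large ratio suggests the first search pays a one-off cost that later searches do not.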

nutch 0.7.0 search performance measurement

Posted by Stefan Groschupf <sg...@media-style.com>.
Hi,
for those who found this interesting: I have published some
measurements that I took a long time ago.
http://www.find23.net/Web-Site/blog/A712F01B-4BB1-4FC6-AE95-E64988FBCC79.html
All time-related values are in milliseconds.
Don't take the values too seriously, but at least they give an idea.

Stefan



---------------------------------------------
blog: http://www.find23.org
company: http://www.media-style.com



Re: Normal search speeds

Posted by Howie Wang <ho...@hotmail.com>.
If you want to narrow down whether it's a Tomcat issue, maybe you
could try running Nutch on another app server like Resin to see if
there's a difference. It's been a while since I used Tomcat, but I
did find the performance to be kind of slow. I think things are
supposed to be better now, but many claim that Resin is still faster.
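
One way to make such a comparison concrete is to time the same queries against each app server and compare the medians. This is a minimal sketch only: the function names are made up for illustration, and any numbers passed in would have to come from your own measurements.

```python
import statistics
from typing import Dict, Sequence


def median_latency(timings: Dict[str, Sequence[float]]) -> Dict[str, float]:
    """Map each server label to the median of its measured latencies (seconds)."""
    return {label: statistics.median(samples) for label, samples in timings.items()}


def fastest(timings: Dict[str, Sequence[float]]) -> str:
    """Return the label of the server with the lowest median latency."""
    medians = median_latency(timings)
    return min(medians, key=medians.get)
```

The median is used rather than the mean so that a single slow first search does not dominate the comparison.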

Howie




Re: Normal search speeds

Posted by "Insurance Squared Inc." <gc...@insurancesquared.com>.
That's correct, we're not using NDFS.  As far as I know it's an
out-of-the-box installation of Mandrake 2006, Tomcat, and Nutch.

Byron's suggestion of merging to one index cut search times by about 1/3
to 1/2.  I think we've already looked at the Tomcat memory settings, but
I'll ask our developer to look deeper.  I suspect that something's
cycling somewhere; it's hard for me to imagine a regular process taking
25 seconds when CPU and memory show nothing really happening.  (I also
suspect that the problem is not with Nutch, but instead with something
at the OS or Tomcat level, or with another system process that Nutch is
using.)





Re: Normal search speeds

Posted by Stefan Groschupf <sg...@media-style.com>.
This is very slow!
In my experience you can expect results in less than a second.
+ Check the memory settings of Tomcat.
+ You are not using NDFS, right?
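
For the memory check, Tomcat's startup scripts pass JVM options from the JAVA_OPTS and CATALINA_OPTS environment variables, and the heap limits are the standard -Xms and -Xmx flags. If -Xmx is missing, the JVM falls back to its default maximum heap, which can be far below the RAM installed in the machine. A minimal sketch for reading those flags out of an options string; the helper name `heap_flags` is made up for illustration:

```python
import re
from typing import Dict, Optional

# Multipliers for the size suffixes accepted by -Xms/-Xmx.
_UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def heap_flags(opts: str) -> Dict[str, Optional[int]]:
    """Return the -Xms/-Xmx sizes in bytes found in a JVM options string.

    A value of None means the flag is absent and the JVM default applies.
    """
    result: Dict[str, Optional[int]] = {"Xms": None, "Xmx": None}
    for name, number, unit in re.findall(r"-(Xms|Xmx)(\d+)([kKmMgG]?)", opts):
        # No suffix means the number is already in bytes.
        result[name] = int(number) * _UNITS.get(unit.lower(), 1)
    return result
```

For example, `heap_flags("-server -Xms512m -Xmx2g")` reports a 512 MB initial heap and a 2 GB maximum, while `heap_flags("-server")` returns None for both, meaning no explicit heap size was set.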


