Posted to user@nutch.apache.org by "K.A.Hussain Ali" <Hu...@photoninfotech.com> on 2005/12/31 07:17:59 UTC

uses of 'io.sort.mb' and ' io.sort.factor' in nutch-default.xml

Hi all,

Could anyone explain what 'io.sort.mb' and 'io.sort.factor' in nutch-default.xml are used for?

Does minimizing these values have any effect on performance?
Are they in any way related to the "Out of Memory Error"? If so, what should the minimum value for those entries be?

Any suggestions would be a great help.
Thanks in advance,
Regards,
-Hussain.





Re: uses of 'io.sort.mb' and ' io.sort.factor' in nutch-default.xml

Posted by Doug Cutting <cu...@nutch.org>.
K.A.Hussain Ali wrote:
> I did change io.sort.mb and io.sort.factor; both have default values of
> 100.

I am surprised that the default for io.sort.factor is 100.  I thought it 
was 10!  Please try setting this to a lower value.
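
For reference, a lower value could be set by overriding the property in
the site configuration rather than editing nutch-default.xml. This is
only a minimal sketch: it assumes the usual Nutch property format and an
override file such as conf/nutch-site.xml, and it uses 10 simply because
that is the value mentioned above, not because it has been tested here.

```xml
<!-- Hypothetical override; the file location (conf/nutch-site.xml) is an assumption. -->
<!-- 10 is the value mentioned in this thread, not a tuned recommendation. -->
<property>
  <name>io.sort.factor</name>
  <value>10</value>
</property>
```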

> I still get the "Out of memory error" while updating.

Can you please provide more details, like the last part of the output 
before this is thrown?  That should give some indication about where 
this is occurring.  Thanks.

Doug

Re: uses of 'io.sort.mb' and ' io.sort.factor' in nutch-default.xml

Posted by "K.A.Hussain Ali" <Hu...@photoninfotech.com>.
Hi Piotr,

My configuration is:

Windows XP, 512 MB RAM, Nutch 0.7, and JDK 1.4.2.
For -Xmx I pass 512m.

My WebDB contains only about 15,000 URLs.
I did change io.sort.mb and io.sort.factor; both have default values of 100.


I can crawl using the Nutch example, but I filter the URLs while indexing
using some custom plugins.

Are there any options I have to change?
Is the RAM enough for my usage?

I still get the "Out of memory error" while updating.
Any help would be greatly appreciated.
Thanks in advance,
Regards,
-Hussain.


----- Original Message ----- 
From: "Piotr Kosiorowski" <pk...@gmail.com>
To: <nu...@lucene.apache.org>
Sent: Monday, January 02, 2006 5:23 PM
Subject: Re: uses of 'io.sort.mb' and ' io.sort.factor' in nutch-default.xml


> Hi Hussain,
> I had no such problems during WebDB update.
> Please report your OS, Nutch version, and the JDK version you use.
> What value do you pass for the JVM's -Xmx parameter (or do you use the
> Nutch default)?
> Have you changed the default values of the io.sort.* properties?
> How big is your WebDB?
> In your previous email it looked like you were updating the WebDB with a
> very small segment - am I right?
>
> Have you ever succeeded in updating this particular WebDB?
> Can you run the example from the Nutch tutorial successfully in your
> environment?
> Regards,
> Piotr


Re: uses of 'io.sort.mb' and ' io.sort.factor' in nutch-default.xml

Posted by Piotr Kosiorowski <pk...@gmail.com>.
Hi Hussain,
I had no such problems during WebDB update.
Please report your OS, Nutch version, and the JDK version you use.
What value do you pass for the JVM's -Xmx parameter (or do you use the
Nutch default)?
Have you changed the default values of the io.sort.* properties?
How big is your WebDB?
In your previous email it looked like you were updating the WebDB with a
very small segment - am I right?

Have you ever succeeded in updating this particular WebDB?
Can you run the example from the Nutch tutorial successfully in your
environment?
Regards,
Piotr

