Posted to solr-user@lucene.apache.org by William Pierce <ev...@hotmail.com> on 2009/10/27 19:32:54 UTC

DIH out of memory exception

Folks:

My db contains approximately 6M records, each approximately 1 KB on average.  When I use the DIH, I reliably get an OOM exception.  The machine has 4 GB of RAM, and my Tomcat is set to a max heap of 2 GB.

The option of increasing memory is not tenable, because as the number of documents grows I will be back in the same situation.

Is there a way to batch the documents?  I tried setting the batchSize parameter to 500 on the <dataSource> tag where I specify the JDBC parameters.  This had no effect.
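
A minimal sketch of the relevant part of my data-config.xml (the driver,
URL, credentials, and query below are placeholders, not my real values):

    <dataConfig>
      <dataSource type="JdbcDataSource"
                  driver="com.mysql.jdbc.Driver"
                  url="jdbc:mysql://localhost/mydb"
                  user="user" password="password"
                  batchSize="500"/>
      <document>
        <entity name="record" query="select id, body from records">
          <field column="id" name="id"/>
          <field column="body" name="body"/>
        </entity>
      </document>
    </dataConfig>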

Best,

- Bill

Re: DIH out of memory exception

Posted by Constantijn Visinescu <ba...@gmail.com>.
Does this help?

http://wiki.apache.org/solr/DataImportHandlerFaq#I.27m_using_DataImportHandler_with_a_MySQL_database._My_table_is_huge_and_DataImportHandler_is_going_out_of_memory._Why_does_DataImportHandler_bring_everything_to_memory.3F


Re: DIH out of memory exception

Posted by William Pierce <ev...@hotmail.com>.
Hi, Gilbert:

Thanks for your tip!  I just tried it.  Unfortunately, it does not work for 
me.  I still get the OOM exception.

How large was your dataset?  And what were your machine specs?

Cheers,

- Bill


Re: DIH out of memory exception

Posted by Gilbert Boyreau <gb...@andevsol.com>.
Hi,

I had the same problem using DIH with a large dataset in a MySQL database.

Following
http://dev.mysql.com/doc/refman/5.1/en/connector-j-reference-implementation-notes.html
and looking at the Java code, it appears that DIH uses a PreparedStatement
in the JdbcDataSource.
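
For what it's worth, the streaming setup that page describes looks roughly
like the sketch below. This is the plain Connector/J idiom, not DIH's
actual code, and the connection details and query are placeholders:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class StreamingQuery {
        public static void main(String[] args) throws Exception {
            // Connection details are illustrative placeholders.
            Connection conn = DriverManager.getConnection(
                    "jdbc:mysql://localhost/mydb", "user", "password");

            // Connector/J only streams when the statement is forward-only,
            // read-only, and the fetch size is Integer.MIN_VALUE.
            PreparedStatement stmt = conn.prepareStatement(
                    "select id, body from records",
                    ResultSet.TYPE_FORWARD_ONLY,
                    ResultSet.CONCUR_READ_ONLY);
            stmt.setFetchSize(Integer.MIN_VALUE);

            ResultSet rs = stmt.executeQuery();
            while (rs.next()) {
                // Each row is fetched from the server as the cursor
                // advances, so the full result set never sits in memory.
                System.out.println(rs.getString("id"));
            }
            rs.close();
            stmt.close();
            conn.close();
        }
    }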

I set the batchSize parameter to -1 and it solved my problem.
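
Concretely, in data-config.xml that means (connection details are
placeholders for your own):

    <dataSource type="JdbcDataSource"
                driver="com.mysql.jdbc.Driver"
                url="jdbc:mysql://localhost/mydb"
                user="user" password="password"
                batchSize="-1"/>

As far as I can tell from the code, JdbcDataSource turns batchSize=-1 into
a fetch size of Integer.MIN_VALUE, which is what switches the driver into
the streaming mode sketched above.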

Regards.
Gilbert.
