Posted to solr-user@lucene.apache.org by Harald Kirsch <Ha...@raytion.com> on 2013/08/16 16:09:17 UTC
Shard splitting at 23 million documents -> OOM
Hi all.
Using the example setup of solr-4.4.0, I was able to easily feed 23
million documents from ClueWeb09.
Then I tried to split the one shard into two. The size on disk is:
% du -sh collection1
118G collection1
I started Solr with 8GB for the JVM:
java -Xmx8000m -DzkRun -DnumShards=2
-Dbootstrap_confdir=./solr/collection1/conf
-Dcollection.configName=myconf -jar start.jar
Then I asked for the split
http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=collection1&shard=shard1
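For reference, the same request can also be built and sent from a script. This is only a minimal sketch and not part of the setup described here: the helper names splitshard_url and send_split are hypothetical, and the one-hour timeout is an arbitrary assumption, chosen because the split runs for a while before it returns.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def splitshard_url(base, collection, shard):
    """Build the Collections API URL for a SPLITSHARD request.

    base is the Solr root URL, e.g. http://localhost:8983/solr
    """
    params = urlencode({
        "action": "SPLITSHARD",
        "collection": collection,
        "shard": shard,
    })
    return f"{base}/admin/collections?{params}"


def send_split(url, timeout_seconds=3600):
    """Send the request and return the raw response body as text.

    The timeout is an assumption; the call blocks until Solr answers.
    """
    with urlopen(url, timeout=timeout_seconds) as response:
        return response.read().decode("utf-8")


url = splitshard_url("http://localhost:8983/solr", "collection1", "shard1")
print(url)
# To actually trigger the split against a running Solr instance:
# print(send_split(url))
```

Building the URL is kept separate from sending it, so the request can be inspected before anything is triggered on the server.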
After a while I got the OOM in the logs:
841168 [qtp614872954-17] ERROR
org.apache.solr.servlet.SolrDispatchFilter –
null:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
My question: is it to be expected that the split needs huge amounts of
RAM or is there a chance that some configuration or procedure change
could get me past this?
Regards,
Harald.
--
Harald Kirsch
Raytion GmbH
Kaiser-Friedrich-Ring 74
40547 Duesseldorf
Fon +49-211-550266-0
Fax +49-211-550266-19
http://www.raytion.com
Re: Shard splitting at 23 million documents -> OOM
Posted by Bastian Mathes <ba...@raytion.com>.
Hi Greg,
I am a colleague of Harald's and had a look at his experiments last week.
You are right: unpacking a fresh Solr 4.4, feeding a small number of
documents (in my case 144) and trying to split the shard does not work.
I get the same error message ("maxValue must be non-negative") that was
discussed on this list on August 13th. The outcome of that discussion seems
to have been that there is no workaround until 4.5.
I then downloaded the 4.5 nightly build and tried the same, but got a
NullPointerException (OverseerCollectionProcessor.java:494). As it is
not a release but just a nightly build, I guess that may happen. However,
I think we have to conclude that a useful feature is growing here (and
it may already work under certain circumstances), but that it is not ready
for use yet (maybe in 4.5/4.6, maybe 5.x).
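For anyone who wants to repeat the small-scale test, the feeding step could look roughly like the sketch below. It is not the script used for the experiment above. The helper names make_docs and update_request are hypothetical, the documents carry only an "id" field (assuming the schema requires nothing else), and the JSON update endpoint under /update is assumed to be enabled, as in the stock example configuration.

```python
import json
from urllib.request import Request, urlopen


def make_docs(count):
    """Create minimal placeholder documents with only an 'id' field."""
    return [{"id": f"doc-{i}"} for i in range(count)]


def update_request(base, collection, docs):
    """Return URL, body and headers for a JSON update with an immediate commit."""
    url = f"{base}/{collection}/update?commit=true"
    body = json.dumps(docs).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return url, body, headers


def feed(base, collection, docs):
    """POST the documents to Solr and return the response body as text."""
    url, body, headers = update_request(base, collection, docs)
    with urlopen(Request(url, data=body, headers=headers)) as response:
        return response.read().decode("utf-8")


docs = make_docs(144)
url, body, headers = update_request("http://localhost:8983/solr", "collection1", docs)
print(url, len(docs))
# To actually index the documents on a running instance:
# print(feed("http://localhost:8983/solr", "collection1", docs))
```

After the documents are indexed, the SPLITSHARD request from the original post can be issued against the same collection.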
Best regards,
Bastian
On 08/16/2013 06:31 PM, Greg Preston wrote:
> Have you tried it with a smaller number of documents? I haven't been able
> to successfully split a shard with 4.4.0 with even a handful of docs.
>
>
> -Greg
--
Bastian Mathes
Raytion GmbH
Kaiser-Friedrich-Ring 74
40547 Duesseldorf
Fon +49-211-550266-0
Fax +49-211-550266-19
http://www.raytion.com
Re: Shard splitting at 23 million documents -> OOM
Posted by Greg Preston <gp...@marinsoftware.com>.
Have you tried it with a smaller number of documents? I haven't been able
to successfully split a shard with 4.4.0 with even a handful of docs.
-Greg