Posted to solr-user@lucene.apache.org by shreejay <sh...@gmail.com> on 2012/11/05 01:11:52 UTC

Re: Solr4.0 / SolrCloud queries

Thanks Everyone. 

As Shawn mentioned, it was a memory issue. I reduced the amount allocated to
Java to 6 GB, and it's been working pretty well.
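
(For reference, the heap cap is just a startup flag. A minimal sketch, assuming
the stock Solr 4.0 Jetty example is used to start each node:

    java -Xms6g -Xmx6g -jar start.jar

Everything the JVM does not claim is then left for the OS page cache, which
Lucene leans on heavily for search performance.)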

I am re-indexing one of the SolrCloud clusters. I was having trouble
optimizing the data when I indexed last time.

I am hoping optimizing will not be an issue this time due to the memory
changes. I will post more info once I am done. 

Thanks once again. 

--Shreejay





Re: Solr4.0 / SolrCloud queries

Posted by shreejay <sh...@gmail.com>.
Hi all , 

I have managed to successfully index around 6 million documents, but while
indexing (and even now, after the indexing has stopped), I am running into a
bunch of errors.

The most common error I see is:
"null:org.apache.solr.common.SolrException:
org.apache.solr.client.solrj.SolrServerException: Server refused connection
at: http://ABC:8983/solr/xyzabc"

I have made sure that the servers are able to communicate with each other
using the same names. 
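
For example, a check along these lines from each box (host and core name are
the ones from the error above; this assumes the stock /admin/ping handler from
the example solrconfig.xml is enabled):

    curl "http://ABC:8983/solr/xyzabc/admin/ping"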

Another error I keep getting is that the leader stops recovering and goes
"red" / recovery failed:
"Error while trying to recover.
core=ABC123:org.apache.solr.common.SolrException: We are not the leader"


The servers intermittently go offline, taking down one of the shards and, in
turn, stopping all search queries.
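
One way to see which node ZooKeeper currently lists as leader for each shard
is to dump the cluster state with the ZooKeeper CLI (a sketch, assuming a
ZooKeeper instance at localhost:2181; SolrCloud keeps this state in
/clusterstate.json):

    zkCli.sh -server localhost:2181 get /clusterstate.json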

The configuration I have:

Shard1:
Server1 - Memory: 22 GB, JVM heap: 8 GB
Server2 - Memory: 22 GB, JVM heap: 10 GB (this one is in "recovery failed"
status, but still acting as a leader)

Shard2:
Server1 - Memory: 22 GB, JVM heap: 8 GB (this one is in "recovery failed"
status, but still acting as a leader)
Server2 - Memory: 22 GB, JVM heap: 8 GB

Shard3:
Server1 - Memory: 22 GB, JVM heap: 10 GB
Server2 - Memory: 22 GB, JVM heap: 8 GB

While typing this post I did a "Reload" from the Core Admin page, and both
servers (Shard1-Server2 and Shard2-Server1) came back up again.
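
For reference, the same reload can also be triggered without the UI through
the CoreAdmin API (host and core name here are placeholders matching the ones
above):

    curl "http://ABC:8983/solr/admin/cores?action=RELOAD&core=xyzabc"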

Has anyone else encountered these issues? Any steps to prevent these? 

Thanks. 


--Shreejay







Re: Solr4.0 / SolrCloud queries

Posted by shreejay <sh...@gmail.com>.
Thanks Mark. I meant ConcurrentMergeScheduler and ramBufferSizeMB (not
maxBuffer). These are my merge settings:

    <ramBufferSizeMB>960</ramBufferSizeMB>
    <mergeFactor>40</mergeFactor>
    <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>



--Shreejay


Mark Miller-3 wrote:
> On Nov 9, 2012, at 1:20 PM, shreejay <shreejayn@...> wrote:
> 
>> Instead of doing an optimize, I have now changed the Merge settings by
>> keeping a maxBuffer = 960, a merge Factor = 40 and ConcurrentMergePolicy. 
> 
> Don't you mean ConcurrentMergeScheduler?
> 
> Keep in mind that if you use the default TieredMergePolicy, mergeFactor
> will have no effect. You need to use maxMergeAtOnce and segmentsPerTier
> as sub args to the merge policy config (see the commented out example in
> solrconfig.xml). 
> 
> Also, it's probably best to avoid using maxBufferedDocs at all.
> 
> - Mark








Re: Solr4.0 / SolrCloud queries

Posted by Mark Miller <ma...@gmail.com>.
On Nov 9, 2012, at 1:20 PM, shreejay <sh...@gmail.com> wrote:

> Instead of doing an optimize, I have now changed the Merge settings by
> keeping a maxBuffer = 960, a merge Factor = 40 and ConcurrentMergePolicy. 

Don't you mean ConcurrentMergeScheduler?

Keep in mind that if you use the default TieredMergePolicy, mergeFactor will have no effect. You need to use maxMergeAtOnce and segmentsPerTier as sub args to the merge policy config (see the commented-out example in solrconfig.xml).
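
For reference, that commented-out example has roughly this shape (the values
shown are the defaults from the stock solrconfig.xml, not a recommendation):

    <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
      <int name="maxMergeAtOnce">10</int>
      <int name="segmentsPerTier">10</int>
    </mergePolicy>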

Also, it's probably best to avoid using maxBufferedDocs at all.

- Mark

Re: Solr4.0 / SolrCloud queries

Posted by shreejay <sh...@gmail.com>.
Thanks Erick. I will try optimizing after indexing everything. I was doing it
after every batch; it was taking way too long to optimize (which was
expected), but it was not finishing the merge down to a smaller number of
segments (one segment).

Instead of doing an optimize, I have now changed the Merge settings by
keeping a maxBuffer = 960, a merge Factor = 40 and ConcurrentMergePolicy. 

I am also going to look at the <infoStream> option so I can see how the
indexing is progressing.
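
If I am reading the example solrconfig.xml right, that is a one-line toggle
inside <indexConfig>, along these lines (file name as in the stock example):

    <infoStream file="INFOSTREAM.txt">true</infoStream>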


Thanks for your inputs. 

--Shreejay





Re: Solr4.0 / SolrCloud queries

Posted by Erick Erickson <er...@gmail.com>.
You really should be careful about optimizes; they're generally not needed.
And optimizing is almost always wrong when done after every N documents in
a batch process. Do it at the very end or not at all. Optimize essentially
re-writes the entire index into a single segment, so you're copying around
a lot of data.
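
To be concrete, by "optimize" I mean an explicit call to the update handler,
e.g. something like this (URL adjusted for your core):

    curl "http://localhost:8983/solr/collection1/update?optimize=true"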

And the operations that get done during optimize, which are mainly purging
information associated with deleted documents, get done anyway upon
segment merging. Despite its name, optimize usually has a marginal effect,
if any, on search speed.

Or did you mean commit?

FWIW,
Erick


On Thu, Nov 8, 2012 at 4:05 PM, shreejay <sh...@gmail.com> wrote:

> I managed to re-index my data without issues. I indexed around 2 million
> documents in one of the clouds. I did an optimize after every 500k
> documents.
>
> I also changed the memory settings and assigned only 6 GB for Java, and
> kept 10 GB for the OS.
>
> This seems to be working fine as of now. I am not seeing random leader
> elections or server drops.
>
> Thanks everyone for your inputs.
>
> --Shreejay

Re: Solr4.0 / SolrCloud queries

Posted by shreejay <sh...@gmail.com>.
I managed to re-index my data without issues. I indexed around 2 million
documents in one of the clouds. I did an optimize after every 500k
documents. 

I also changed the memory settings and assigned only 6 GB for Java, and kept
10 GB for the OS.

This seems to be working fine as of now. I am not seeing random leader
elections or server drops.

Thanks everyone for your inputs. 

--Shreejay



