Posted to solr-user@lucene.apache.org by nabil Kouici <ko...@yahoo.fr> on 2014/10/12 20:46:28 UTC

Shard not accessible after restarting

Hi All,
I'm evaluating Solr performance. I've created an implicit collection with 2 shards on different servers. The first shard contains 100 million documents (30GB); the second contains one million documents. When I restart the second Solr instance, its shard becomes available immediately. However, when I restart the first Solr instance, the shard with 100 million documents takes a very long time to become available for search. Is this normal? In the Cloud interface, the shard is green (Active). My servers have 28GB of RAM.
I'm using the default solrconfig.xml.
Any help?
Regards,
Nabil.

Re: Shard not accessible after restarting

Posted by Shawn Heisey <ap...@elyograg.org>.
On 10/13/2014 11:43 AM, nabil Kouici wrote:
> Thank you for this reply. I don't understand why Solr loads my shard into physical memory before it becomes available. Is it related to LRU cache management?

Just like with the optimize, this also sounds like something that Solr
would never do.  If you have free memory, then the operating system will
automatically cache any data that gets read from the disk, but nothing
in Solr will pre-load the entire index into memory.  The automatic
caching is how ALL modern operating systems work, because using extra
memory for disk cache can make a computer run many times faster.
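As a rough illustration of why the disk cache matters for a 30GB index on a 28GB server, the arithmetic can be sketched as below. This is only a back-of-the-envelope sketch: the function name is made up for this example, and the 4GB heap is an assumption, since the thread never states the heap size.

```python
def page_cache_shortfall_gb(total_ram_gb, jvm_heap_gb, index_size_gb):
    """Return how many GB of the index cannot fit in the OS disk cache.

    A result of 0 means the whole index could be cached, in principle.
    Memory used by the OS itself and by other processes is ignored.
    """
    available_for_cache = max(total_ram_gb - jvm_heap_gb, 0)
    return max(index_size_gb - available_for_cache, 0)


# Figures from this thread: 28GB of RAM and a 30GB index.
# The 4GB heap is a hypothetical value, not something reported here.
shortfall = page_cache_shortfall_gb(total_ram_gb=28, jvm_heap_gb=4, index_size_gb=30)
print(f"Index data that cannot be cached: {shortfall} GB")
```

Whatever the real heap size is, a 30GB index cannot be fully cached on a machine with 28GB of RAM, so some reads will always go to disk.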

If you are seeing these things happen, then it is something in your
setup that's doing it, not Solr itself.

Look over everything on this wiki page.  I linked to it before, but it
was to a specific section near the end of the page.  The rest of the
page has good information on performance problems:

http://wiki.apache.org/solr/SolrPerformanceProblems

Thanks,
Shawn


Re: Shard not accessible after restarting

Posted by nabil Kouici <ko...@yahoo.fr>.
Thank you for this reply. I don't understand why Solr loads my shard into physical memory before it becomes available. Is it related to LRU cache management?
Regards,
Nabil.

Re: Shard not accessible after restarting

Posted by Shawn Heisey <ap...@elyograg.org>.
On 10/13/2014 1:15 AM, nabil Kouici wrote:
> This gives me a good understanding. However, I think the slow start-up in my case is due to segmentation. Before restarting, my index contained 42 segments totalling 30GB. After restarting, the index is in one segment of 22GB. So I think the shard unavailability comes from an optimization process that runs when we restart Solr.
> 
> Is there any option in Solr to avoid this implicit optimization on startup?
> 
> What I learned from this experience is that we need to run an explicit optimization at regular intervals to avoid a huge number of segments. Is that a best practice in Solr?

Solr will *never* do an optimize unless you tell it to.

During indexing, Solr does merge segments, but it is exceptionally rare
for an automatic merge to create only one segment, and as far as I know,
merging will never happen on startup, only when you index.

If an optimization is happening on startup, then something somewhere is
asking Solr to do it ... but searches should still be possible, even
during an optimize.
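For reference, an explicit optimize is normally requested through the update handler with the `optimize=true` parameter, optionally with `maxSegments`. The sketch below only builds such a URL so it is clear what a request "telling Solr to optimize" looks like. The host, port, and collection name are placeholders, and `build_optimize_url` is a helper invented for this example.

```python
from urllib.parse import urlencode


def build_optimize_url(base_url, collection, max_segments=1):
    """Build the URL of an explicit optimize request to the update handler."""
    params = urlencode({"optimize": "true", "maxSegments": max_segments})
    return f"{base_url.rstrip('/')}/{collection}/update?{params}"


# Placeholder host and collection name; adjust to the real setup.
url = build_optimize_url("http://localhost:8983/solr", "collection1")
print(url)
```

If a request of this shape appears in the Solr logs around restart time, some client or script is asking for the optimize.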

Startup should not be affected very much by an index that has a lot of
segments.  It will be a little bit slower, but the difference should be
small.
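One way to confirm whether the segment count really changes across a restart is to read it from the Luke request handler before and after. The sketch below assumes the JSON response contains an `index.segmentCount` field. The sample response is hand-written for illustration, not captured from a real server, and the helper names are invented for this example.

```python
import json


def luke_url(base_url, core):
    """Build a Luke handler URL that skips per-field term statistics."""
    return f"{base_url.rstrip('/')}/{core}/admin/luke?numTerms=0&wt=json"


def segment_count(luke_json_text):
    """Extract the segment count from a Luke handler JSON response."""
    data = json.loads(luke_json_text)
    return data["index"]["segmentCount"]


# Hand-written sample mimicking the assumed response shape (not real output).
sample_response = '{"index": {"numDocs": 100000000, "segmentCount": 42}}'
print(luke_url("http://localhost:8983/solr", "collection1"))
print(segment_count(sample_response))
```

Comparing the value before shutdown with the value right after startup shows whether anything merged the index in between.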

Thanks,
Shawn


Re: Shard not accessible after restarting

Posted by nabil Kouici <ko...@yahoo.fr>.
Hi Shawn,

This gives me a good understanding. However, I think the slow start-up in my case is due to segmentation. Before restarting, my index contained 42 segments totalling 30GB. After restarting, the index is in one segment of 22GB. So I think the shard unavailability comes from an optimization process that runs when we restart Solr.

Is there any option in Solr to avoid this implicit optimization on startup?

What I learned from this experience is that we need to run an explicit optimization at regular intervals to avoid a huge number of segments. Is that a best practice in Solr?

Regards,
Nabil.


Re: Shard not accessible after restarting

Posted by Shawn Heisey <ap...@elyograg.org>.
On 10/12/2014 12:46 PM, nabil Kouici wrote:
> I'm evaluating Solr performance. I've created an implicit collection with 2 shards on different servers. The first shard contains 100 million documents (30GB); the second contains one million documents. When I restart the second Solr instance, its shard becomes available immediately. However, when I restart the first Solr instance, the shard with 100 million documents takes a very long time to become available for search. Is this normal? In the Cloud interface, the shard is green (Active). My servers have 28GB of RAM.
> I'm using the default solrconfig.xml.

Does the following URL describe the problem you're running into?

http://wiki.apache.org/solr/SolrPerformanceProblems#Slow_startup
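That wiki section, as far as I recall, names large transaction logs that must be replayed at startup as one common cause of slow startup. A quick way to check is to measure the size of the core's tlog directory before restarting. The sketch below is a minimal example of that check. The directory path is hypothetical and depends on where the core's data directory lives.

```python
import os


def directory_size_bytes(path):
    """Sum the sizes of all regular files under the given directory."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


if __name__ == "__main__":
    # Hypothetical location; the tlog directory sits under the core's data dir.
    tlog_dir = "/var/solr/collection1/data/tlog"
    if os.path.isdir(tlog_dir):
        size_mb = directory_size_bytes(tlog_dir) / (1024 * 1024)
        print(f"Transaction log size: {size_mb:.1f} MB")
    else:
        print(f"Directory not found: {tlog_dir}")
```

A tlog directory measured in gigabytes would point toward log replay rather than segment merging as the reason for the delay.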

Thanks,
Shawn