Posted to solr-user@lucene.apache.org by KNitin <ni...@gmail.com> on 2014/02/25 01:12:58 UTC

SolrCloud Startup

Hi

I have a 4-node SolrCloud cluster with more than 50 collections with 4
shards each. Every time I want to make a schema change, I upload the configs
to ZooKeeper and then restart all nodes. However, restarting each node is
very slow and takes about 20-30 minutes.

Is it recommended to set loadOnStartup=false and let SolrCloud lazy
load the cores? Is there a way to make schema changes without restarting
SolrCloud?


Thanks

Re: SolrCloud Startup

Posted by Otis Gospodnetic <ot...@gmail.com>.
Hi,

Slow startup.... could it be your transaction logs are being replayed?  Are
they very big?  Do you see lots of disk reading during those 20-30 minutes?
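
One rough way to answer the "are they very big?" question is to total the
bytes under each core's tlog directory. This is only a sketch: it assumes
the default layout where each core keeps its transaction logs in a directory
named tlog beneath its data directory, and the function name is made up
for illustration.

```python
from pathlib import Path


def tlog_sizes(solr_home):
    """Return {tlog_dir: total_bytes} for every 'tlog' directory under solr_home.

    Assumes transaction logs live in directories literally named 'tlog'
    (the default <core>/data/tlog layout); adjust if the update log
    directory has been customized.
    """
    sizes = {}
    for tlog_dir in Path(solr_home).rglob("tlog"):
        if tlog_dir.is_dir():
            sizes[str(tlog_dir)] = sum(
                f.stat().st_size for f in tlog_dir.iterdir() if f.is_file()
            )
    return sizes
```

Calling tlog_sizes() on the solr home and sorting the result by size would
show quickly whether any core has logs large enough to explain a long replay.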

Shawn was referring to http://wiki.apache.org/solr/SolrPerformanceProblems

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/



Re: SolrCloud Startup

Posted by KNitin <ni...@gmail.com>.
Thanks a lot, Shawn! I was missing an ICU jar as part of my original
setup. I then copied the analysis jars into solr/lib and removed all
references to them in solrconfig.xml, and it worked like a charm.

The PermGen usage also seems to have dropped significantly.

Thanks
Nitin


>

Re: SolrCloud Startup

Posted by Shawn Heisey <so...@elyograg.org>.
On 3/4/2014 3:09 PM, KNitin wrote:
> I did what you suggested. I have a lib dir under /mnt/solr/
> (this is the solr.solr.home dir) and moved all my jars into it. I do not have
> any sharedLib or lib references in my solr.xml or solrconfig.xml files.
>
> The jars are not getting loaded for a few custom analyzers I have in the
> schema.
>
> Do I need to specify somewhere that /mnt/solr/lib/ is the lib path to use?

The solr home is where solr.xml lives.  So if /mnt/solr is that 
location, then that would be where you want solr home to point.  
Generally your core directories are also in solr.home, but if you've 
customized the locations, that may not be true.

In 4.3 and later, the lib directory under the solr home is automatically 
added to the classpath.  Also in that version, if you *do* explicitly 
include it, it won't work right -- which is what prompted me to file 
SOLR-4852.  The symptoms were that the jars would get loaded (twice, 
actually), but the classes were not actually available.

Thanks,
Shawn


Re: SolrCloud Startup

Posted by KNitin <ni...@gmail.com>.
I did what you suggested. I have a lib dir under /mnt/solr/
(this is the solr.solr.home dir) and moved all my jars into it. I do not have
any sharedLib or lib references in my solr.xml or solrconfig.xml files.

The jars are not getting loaded for a few custom analyzers I have in the
schema.

Do I need to specify somewhere that /mnt/solr/lib/ is the lib path to use?

- Nitin



Re: SolrCloud Startup

Posted by KNitin <ni...@gmail.com>.
Thanks, Shawn.  Right now solr.solr.home is not being passed to the
Java runtime.

Let's say /mnt/solr/ is my Solr root. I can add all jars to /mnt/solr/lib/
and use -Dsolr.solr.home=/mnt/solr/, and that should do it, right?

Thanks
Nitin



Re: SolrCloud Startup

Posted by Shawn Heisey <so...@elyograg.org>.
On 3/3/2014 3:30 PM, KNitin wrote:
> A quick ping on this. To give more stats, I have hundreds of collections on
> every node. The time it takes for one collection to boot up (load on startup)
> is around 10-20 seconds (and sometimes even 1 minute). I do not have any
> query autowarming etc. On a per-collection basis I load a bunch of
> libraries (for custom analyzer plugins) to compute the classpath. That
> might be a reason for the high boot-up time.
>
> My solrconfig.xml entry is as follows:
>
>    <lib dir="/mnt/solr/lib/" regex=".*\.jar" />
>
> Every core that boots up seems to be loading all jars over and over again.
> Is there a way to ask Solr to load all jars only once?

Three steps:

1) Get rid of all your <lib> directives in solrconfig.xml entirely.
2) Copy all the extra jars that you need into ${solr.solr.home}/lib.
3) Remove any "sharedLib" parameter from your solr.xml file.

Step 3 is required because you are on 4.3.1 (or later if you have 
already upgraded).
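
The three steps above could be scripted roughly as follows. This is a sketch
under assumptions, not a tested migration tool: the function names are
invented, the check is a plain text match (so commented-out directives are
reported too), and it only sees config files that exist on local disk. In
SolrCloud the live solrconfig.xml is in ZooKeeper, so point the check at the
local copy of the configs before uploading them.

```python
import re
import shutil
from pathlib import Path

LIB_DIRECTIVE = re.compile(r"<lib\b")
SHARED_LIB = re.compile(r"sharedLib")


def copy_jars(src_dir, solr_home):
    """Step 2: copy every .jar from src_dir into <solr_home>/lib."""
    lib_dir = Path(solr_home) / "lib"
    lib_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for jar in sorted(Path(src_dir).glob("*.jar")):
        shutil.copy2(jar, lib_dir / jar.name)
        copied.append(jar.name)
    return copied


def leftover_references(solr_home):
    """Steps 1 and 3: list config files that still mention <lib> or sharedLib."""
    home = Path(solr_home)
    hits = []
    # Step 1: any solrconfig.xml below solr home that still has a <lib> directive.
    for cfg in home.rglob("solrconfig.xml"):
        if LIB_DIRECTIVE.search(cfg.read_text(encoding="utf-8")):
            hits.append((str(cfg), "<lib> directive"))
    # Step 3: a sharedLib setting left in solr.xml.
    solr_xml = home / "solr.xml"
    if solr_xml.is_file() and SHARED_LIB.search(solr_xml.read_text(encoding="utf-8")):
        hits.append((str(solr_xml), "sharedLib setting"))
    return hits
```

An empty list from leftover_references() means no local config file mentions
either setting.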

The final comment on the following issue summarizes issues that I ran 
into while migrating this approach from 4.2.1 to later releases:

https://issues.apache.org/jira/browse/SOLR-4852

Thanks,
Shawn


Re: SolrCloud Startup

Posted by KNitin <ni...@gmail.com>.
A quick ping on this. To give more stats, I have hundreds of collections on
every node. The time it takes for one collection to boot up (load on startup)
is around 10-20 seconds (and sometimes even 1 minute). I do not have any
query autowarming etc. On a per-collection basis I load a bunch of
libraries (for custom analyzer plugins) to compute the classpath. That
might be a reason for the high boot-up time.

My solrconfig.xml entry is as follows:

  <lib dir="/mnt/solr/lib/" regex=".*\.jar" />

Every core that boots up seems to be loading all jars over and over again.
Is there a way to ask Solr to load all jars only once?

Thanks
- Nitin



Re: SolrCloud Startup

Posted by KNitin <ni...@gmail.com>.
Thanks, Shawn. I will try to upgrade Solr soon.

Re firstSearcher: I think it does nothing right now. It is configured to use
ExternalFileLoader, but the external file has no contents. Most of the
queries hitting the collection are expensive tail queries. What would you
recommend for warming the firstSearcher/newSearcher?

Thanks
Nitin



Re: SolrCloud Startup

Posted by Shawn Heisey <so...@elyograg.org>.
On 2/25/2014 4:30 PM, KNitin wrote:
> Jeff: Thanks. I have tried reload before but it is not reliable (at least
> in 4.3.1). A few cores get initialized and a few don't (they show as
> recovering or down), so I had to move away from it. Is this a known issue
> in 4.3.1?

With Solr 4.3.1, you are running into this bug with reloads under SolrCloud:

https://issues.apache.org/jira/browse/SOLR-4805

The only way to recover from this bug is to restart Solr. The bug is
fixed in 4.4.0 and later.

> Shawn, Otis, Erick:
>
> Yes, I have reviewed the page before and have given 1/4 of my memory to the
> JVM and the rest to the OS cache (15 GB heap and 45 GB for the rest, on a
> 60 GB machine). I have also reviewed the tlog files and they are on the
> order of KB (4-10 or 30). I have SSDs and the reads are hardly noticeable
> (on the order of 100 KB during that time frame). I have also disabled swap
> on all machines.
>
> Regarding firstSearcher, it is currently set to externalFileLoader. What is
> the use of firstSearcher? I haven't played around with it.

I don't think it's a good idea to have extensive warming queries.  I do 
exactly one query in firstSearcher and newSearcher: a query for all 
documents with zero rows, sorted on our most common sort field.  This is 
designed purely to preload the sort data into the FieldCache.
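
The single warming query described above can be written down as request
parameters. This is a sketch only: the function name and the post_date field
are hypothetical, and in solrconfig.xml these parameters would normally go
in the query list of a QuerySenderListener registered for the firstSearcher
and newSearcher events.

```python
from urllib.parse import urlencode


def warming_query(sort_field, direction="asc"):
    """Build parameters for a match-all query that returns zero rows
    but sorts on sort_field, so the sort data gets loaded."""
    return {"q": "*:*", "rows": 0, "sort": f"{sort_field} {direction}"}


# 'post_date' stands in for whatever the most common sort field is.
print(urlencode(warming_query("post_date")))
```

Because rows is 0, no documents are fetched; the cost is almost entirely the
sort data load, which is the point of the query.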

Thanks,
Shawn


Re: SolrCloud Startup

Posted by KNitin <ni...@gmail.com>.
Erick: My autocommit is set to trigger every 30 seconds with
openSearcher=false. Soft autocommits are disabled.



Re: SolrCloud Startup

Posted by KNitin <ni...@gmail.com>.
Jeff: Thanks. I have tried reload before but it is not reliable (at least
in 4.3.1). A few cores get initialized and a few don't (they show as
recovering or down), so I had to move away from it. Is this a known issue
in 4.3.1?

Shawn, Otis, Erick:

Yes, I have reviewed the page before and have given 1/4 of my memory to the
JVM and the rest to the OS cache (15 GB heap and 45 GB for the rest, on a
60 GB machine). I have also reviewed the tlog files and they are on the
order of KB (4-10 or 30). I have SSDs and the reads are hardly noticeable
(on the order of 100 KB during that time frame). I have also disabled swap
on all machines.

Regarding firstSearcher, it is currently set to externalFileLoader. What is
the use of firstSearcher? I haven't played around with it.

Thanks
Nitin






Re: SolrCloud Startup

Posted by Erick Erickson <er...@gmail.com>.
What is your firstSearcher set to in solrconfig.xml? If you're
doing something really crazy there that might be an issue.

But I think Otis' suggestion is a lot more probable. What
are your autocommits configured to?

Best,
Erick



Re: SolrCloud Startup

Posted by Shawn Heisey <so...@elyograg.org>.
> Hi,
>
> I have a 4-node SolrCloud cluster with more than 50 collections with 4
> shards each. Every time I want to make a schema change, I upload the configs
> to ZooKeeper and then restart all nodes. However, restarting each node is
> very slow and takes about 20-30 minutes.
>
> Is it recommended to set loadOnStartup=false and let SolrCloud lazy
> load the cores? Is there a way to make schema changes without restarting
> SolrCloud?

I'm on my phone so getting a URL for you is hard. Search the wiki for
SolrPerformanceProblems. There's a section there on slow startup.

If that's not it, it's probably not enough RAM for the OS disk cache. That
is also discussed on that wiki page.

Thanks,
Shawn




Re: SolrCloud Startup

Posted by Jeff Wartes <jw...@whitepages.com>.
There is a RELOAD collection command you might try:
https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api2


I think you'll find this a lot faster than restarting your whole JVM.
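
As a rough illustration, the RELOAD request is an HTTP GET against the
Collections API. The sketch below only builds the URL; the host, port, and
collection name are placeholders, and the helper name is made up.

```python
from urllib.parse import urlencode


def reload_url(base_url, collection):
    """Return the Collections API URL that reloads one collection."""
    query = urlencode({"action": "RELOAD", "name": collection})
    return f"{base_url.rstrip('/')}/admin/collections?{query}"


# Hypothetical host and collection name.
print(reload_url("http://localhost:8983/solr", "collection1"))
```

The resulting URL can be requested with curl or urllib.request.urlopen once
the new configs have been uploaded to ZooKeeper.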

