Posted to solr-user@lucene.apache.org by Doss <it...@gmail.com> on 2014/12/02 13:30:46 UTC

Re: SOLR not starting after restart 2 node cloud setup

Dear Erick,

Thanks for your thoughts; they helped me a lot. In my instances, no Solr logs
are written to catalina.out.

I have now put a log4j.properties file in place, so Solr logs are captured in
solr.log. With its help I found the cause of the issue.
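
A minimal sketch of one way to put the file in place, for anyone following
along. The message does not say where the file went; this assumes the layout
of the Solr 4.x binary distribution (example/lib/ext and example/resources)
and copies the logging jars plus the sample log4j.properties into Tomcat's
lib directory. The function name install_solr_logging and both directory
arguments are hypothetical, and Tomcat still needs a restart afterwards.

```shell
# Hypothetical helper: copy Solr's logging jars and sample log4j.properties
# into Tomcat's lib directory so they end up on the container classpath.
# Assumes the Solr 4.x distribution layout; adjust the paths if yours differs.
install_solr_logging() {
    solr_dist="$1"      # unpacked Solr distribution directory (assumed)
    catalina_home="$2"  # Tomcat installation directory (assumed)

    # Logging jars shipped with the Solr example
    cp "$solr_dist"/example/lib/ext/*.jar "$catalina_home/lib/" || return 1

    # Sample logging configuration; edit the copy to change log location or level
    cp "$solr_dist/example/resources/log4j.properties" "$catalina_home/lib/" || return 1
}
```

Usage would look like: install_solr_logging /path/to/solr-4.9.0 "$CATALINA_HOME"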

I was starting Tomcat with the option -Dbootstrap_conf=true, which made Solr
look for core configuration files in the wrong directory. After removing
this option, it started without any issues.
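
As a rough illustration of the change, the startup options might look like
the sketch below once the flag is removed. It assumes the options are set in
Tomcat's bin/setenv.sh; the message does not say where they were actually
configured. The Solr home path is hypothetical, and the ZooKeeper address and
heap size are simply the values that appear elsewhere in this thread.

```shell
# Sketch of $CATALINA_BASE/bin/setenv.sh, which catalina.sh sources at startup
# if the file exists. All values below are assumptions for illustration.
JAVA_OPTS="${JAVA_OPTS:-} -Xmx5g"                         # 5 GB heap, as mentioned in the thread
JAVA_OPTS="$JAVA_OPTS -Dsolr.solr.home=/opt/solr/home"    # hypothetical Solr home
JAVA_OPTS="$JAVA_OPTS -DzkHost=10.236.149.28:2181"        # ZooKeeper address seen in the Tomcat log

# -Dbootstrap_conf=true is deliberately absent: it asks Solr to upload each
# core's configuration to ZooKeeper at startup, which is meant for initial
# bootstrapping rather than for every restart.
export JAVA_OPTS
```

With an existing cluster, the configuration already stored in ZooKeeper is
used on restart, so the flag should not be needed.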

I also commented out the suggester component, which made Solr load faster.

Thanks,
Doss.




On Thu, Nov 20, 2014 at 9:47 PM, Erick Erickson <er...@gmail.com>
wrote:

> Doss:
>
> Tomcat often puts things in "catalina.out"; you might check there.
> I've often seen logging information from Solr go there by
> default.
>
> Without having some idea what kinds of problems Solr is
> reporting when you see this situation, it's really hard to say.
>
> Some things I'd check first though, in order of what
> I _guess_ is most likely.
>
> - There have been anecdotal reports (in fact, I'm trying
> to understand the why of it right now) of the suggester
> taking a long time to initialize, even if you don't use it!
> So if you're not using the suggest component, try
> commenting out those sections in solrconfig.xml for
> the cores in question. I like this explanation since it
> fits with your symptoms, but I don't like it since the
> index you are using isn't all that big. So it's something
> of a shot in the dark. I expect that the core will
> _eventually_ come up, but I've seen reports of 10-15
> minutes being required, far beyond my patience! That
> said, this would also explain why deleting the index
> works.
>
> - OutOfMemory errors. You might be able to attach
> JConsole (part of the standard Java stuff) to the process
> and monitor the memory usage. If it's being pushed near
> the 5G limit, that's the first thing I'd suspect.
>
> - If you're using the default setups, then the Zookeeper
> timeout may be too low. I think the default (not sure
> whether it's been changed in 4.9) is 15 seconds; 30-60
> is usually much better.
>
> Best,
> Erick
>
>
> On Thu, Nov 20, 2014 at 3:47 AM, Doss <it...@gmail.com> wrote:
> > Dear Erick,
> >
> > Forgive my ignorance.
> >
> > Please find some of the details you required.
> >
> > *have you looked at the solr logs?*
> >
> > Sorry, I haven't defined the log4j.properties file, so I don't have Solr
> > logs. Since it requires a Tomcat restart, I am planning to do it at the
> > next restart.
> >
> > But I found the following in the Tomcat log:
> >
> > 18-Nov-2014 11:27:29.028 WARNING [localhost-startStop-2]
> > org.apache.catalina.loader.WebappClassLoader.clearReferencesThreads The web
> > application [/mima] appears to have started a thread named
> > [localhost-startStop-1-SendThread(10.236.149.28:2181)] but has failed to
> > stop it. This is very likely to create a memory leak. Stack trace of thread:
> >  sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
> >  sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
> >  sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
> >  sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
> >  sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
> >  org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:349)
> >  org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
> >
> >
> > *How big are the cores?*
> >
> > We have 16 cores, of which only 5 are big ones. The total size of all 16
> > cores is 10+ GB.
> >
> > *How many docs in the cores when the problem happens?*
> >
> > 1 core with 163 fields and 3,300,000 documents (index size 2+ GB)
> > 4 cores with 3 fields and approx. 15,000,000 documents each (1.2 to 1.5 GB)
> > The remaining cores have 100,000 to 4,000,000 documents
> >
> > *How much memory are you allocating the JVM? *
> >
> > 5 GB for the JVM. Total RAM available in the systems is 30 GB.
> >
> > *can you restart Tomcat without a problem?*
> >
> > This problem is occurring in production, so I have never tried.
> >
> >
> > Thanks,
> > Doss.
> >
> >
> > On Wed, Nov 19, 2014 at 7:55 PM, Erick Erickson <erickerickson@gmail.com>
> > wrote:
> >
> >> You've really got to provide details for us to say much
> >> of anything. There are about a zillion things that it could be.
> >>
> >> In particular, have you looked at the solr logs? Are there
> >> any interesting things in them? How big are the cores?
> >> How much memory are you allocating the JVM? How
> >> many docs in the cores when the problem happens?
> >> Before the nodes stop responding, can you restart
> >> Tomcat without a problem?
> >>
> >> You might review:
> >> http://wiki.apache.org/solr/UsingMailingLists
> >>
> >> Best,
> >> Erick
> >>
> >>
> >> On Wed, Nov 19, 2014 at 1:04 AM, Doss <it...@gmail.com> wrote:
> >> > I have a two-node SOLR (4.9.0) cloud with Tomcat (8) and Zookeeper. At
> >> > times SOLR on Node 1 stops responding. To fix the issue I restart Tomcat
> >> > on Node 1, but SOLR does not start up. If I remove the Solr cores on both
> >> > nodes and restart, it starts working, but then I have to reindex the
> >> > whole data set again. We are using this setup in production, and because
> >> > of this issue we are having 1 to 1.5 hours of service downtime. Any
> >> > suggestions would be greatly appreciated.
> >> >
> >> > Thanks,
> >> > Doss.
> >>
>

Re: SOLR not starting after restart 2 node cloud setup

Posted by Erick Erickson <er...@gmail.com>.
Glad you found a solution!

Best,
Erick
