Posted to users@solr.apache.org by Reej Nayagam <re...@gmail.com> on 2022/01/26 12:41:28 UTC

Zookeeper client Timeout - takes the default 30000 ms

Hi All,

I need to handle ZooKeeper (zk) failure, so I am monitoring the zk ensemble;
if the majority of the zk nodes fail, we'll activate HA and point search to a DB.

So, to check whether each zk node is alive, we connect as below:

    SolrZkClient zkClient = new SolrZkClient(zkAddress, 10000);
    return zkClient.getSolrZooKeeper().getState().isAlive();

But I noticed that it still waits for the default 30,000 ms timeout instead
of the 10,000 ms passed in.

Is there a way to override the ZooKeeper timeout? We have 3 zk nodes, and if
all 3 are down, we have to wait 30 seconds for each one to get its status.
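
One way to avoid paying the wait once per node is to probe all nodes at the same time and give up at a shared deadline. The following is only a minimal sketch using the JDK: the class and method names (ParallelZkProbe, canConnect, probeAll) are hypothetical, and a plain TCP connect only shows that the port accepts connections, not that ZooKeeper is healthy.

```java
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Solr or ZooKeeper.
final class ParallelZkProbe {

    /** True if a TCP connection to host:port opens within connectTimeoutMs. */
    static boolean canConnect(String host, int port, int connectTimeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), connectTimeoutMs);
            return true;
        } catch (Exception e) {
            return false; // refused, timed out, unknown host, ...
        }
    }

    /**
     * Probes every "host:port" address concurrently. Any probe that has not
     * finished when deadlineMs elapses is cancelled and reported as down.
     */
    static Map<String, Boolean> probeAll(List<String> addresses, int connectTimeoutMs, long deadlineMs)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, addresses.size()));
        try {
            List<Callable<Boolean>> tasks = new ArrayList<>();
            for (String address : addresses) {
                int colon = address.lastIndexOf(':');
                String host = address.substring(0, colon);
                int port = Integer.parseInt(address.substring(colon + 1));
                tasks.add(() -> canConnect(host, port, connectTimeoutMs));
            }
            // invokeAll returns the futures in the same order as the tasks.
            List<Future<Boolean>> futures = pool.invokeAll(tasks, deadlineMs, TimeUnit.MILLISECONDS);
            Map<String, Boolean> result = new LinkedHashMap<>();
            for (int i = 0; i < addresses.size(); i++) {
                Future<Boolean> future = futures.get(i);
                boolean up;
                try {
                    up = !future.isCancelled() && future.get();
                } catch (Exception e) {
                    up = false;
                }
                result.put(addresses.get(i), up);
            }
            return result;
        } finally {
            pool.shutdownNow();
        }
    }
}
```

With three nodes, a call such as probeAll(List.of("zk1:2181", "zk2:2181", "zk3:2181"), 2000, 3000) (host names are placeholders) would return after roughly 3 seconds at worst, rather than three timeouts back to back.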

Kindly advise if any of you have handled this. Thank you !

*Thanks,*
*Reej*

Re: Zookeeper client Timeout - takes the default 30000 ms

Posted by matthew sporleder <ms...@gmail.com>.
The errors are unusual, but the znode_count is normal.

On Fri, Jan 28, 2022 at 9:12 PM Reej Nayagam <re...@gmail.com> wrote:
>
> Hi All,
>
> As suggested by the group, I tried this API call,
> /solr/admin/zookeeper/status, to get the zk status.
> When I try it in my browser, one time I get status 0 along with the zk
> ensemble details; after a while, when I try again, I get:
> status: 500
> error: msg: "java.net.SocketException: Connection reset"
> trace: java.io.UncheckedIOException:
> java.net.SocketException: Connection reset
>
> Can I ignore the socket exception? If I try again immediately, the
> status is OK with no errors. Kindly advise.
>
> Also, in the Solr admin UI I can see the below for all the ZooKeepers. Is
> this normal? What is zk_znode_count?
> zk_znode_count 1852
> zk_approximate_data_size 7853679
>
> *Thanks,*
> *Reej*

Re: Zookeeper client Timeout - takes the default 30000 ms

Posted by Reej Nayagam <re...@gmail.com>.
Hi All,

As suggested by the group, I tried this API call,
/solr/admin/zookeeper/status, to get the zk status.
When I try it in my browser, one time I get status 0 along with the zk
ensemble details; after a while, when I try again, I get:
status: 500
error: msg: "java.net.SocketException: Connection reset"
trace: java.io.UncheckedIOException:
java.net.SocketException: Connection reset

Can I ignore the socket exception? If I try again immediately, the
status is OK with no errors. Kindly advise.

Also, in the Solr admin UI I can see the below for all the ZooKeepers. Is
this normal? What is zk_znode_count?
zk_znode_count 1852
zk_approximate_data_size 7853679

*Thanks,*
*Reej*


On Thu, Jan 27, 2022 at 4:22 PM Reej Nayagam <re...@gmail.com> wrote:

> Hi Vinay,
>
> We connect using CloudSolrClient, passing the zk host, so if zk is
> down, the connection to Solr won't happen either.
>
> *Thanks,*
> *Reej*

Re: Zookeeper client Timeout - takes the default 30000 ms

Posted by Reej Nayagam <re...@gmail.com>.
Hi Vinay,

We connect using CloudSolrClient, passing the zk host, so if zk is
down, the connection to Solr won't happen either.

*Thanks,*
*Reej*


On Thu, Jan 27, 2022 at 12:35 PM Vinay Rajput <vi...@gmail.com>
wrote:

> From your requirement, it also looks like you want to disable Solr
> search and activate DB search if the ZooKeeper cluster fails.
>
> That is NOT needed. Solr search is not impacted when the zk cluster is
> down; only indexing is impacted. We had a situation where all our zk nodes
> were down for a few minutes, and there was still no impact on search.
>
> Thanks,
> Vinay

Re: Zookeeper client Timeout - takes the default 30000 ms

Posted by Vinay Rajput <vi...@gmail.com>.
From your requirement, it also looks like you want to disable Solr
search and activate DB search if the ZooKeeper cluster fails.

That is NOT needed. Solr search is not impacted when the zk cluster is
down; only indexing is impacted. We had a situation where all our zk nodes
were down for a few minutes, and there was still no impact on search.

Thanks,
Vinay

On Wed, 26 Jan 2022 at 9:12 PM, Walter Underwood <wu...@wunderwood.org>
wrote:

> You can check the status of each ZooKeeper node with the “ruok” command.
> This is one of the “four letter words” admin commands.
>
> https://zookeeper.apache.org/doc/r3.4.8/zookeeperAdmin.html#sc_zkCommands
>
> This is how it works from a command line.
>
> $ echo ruok | nc zoo-shared-1.test.search.cheggnet.com 2181
> imok
>
> wunder
> Walter Underwood
> wunder@wunderwood.org
> http://observer.wunderwood.org/  (my blog)

Re: Zookeeper client Timeout - takes the default 30000 ms

Posted by Walter Underwood <wu...@wunderwood.org>.
You can check the status of each ZooKeeper node with the “ruok” command.
This is one of the “four letter words” admin commands.

https://zookeeper.apache.org/doc/r3.4.8/zookeeperAdmin.html#sc_zkCommands

This is how it works from a command line.

$ echo ruok | nc zoo-shared-1.test.search.cheggnet.com 2181
imok
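
The same check can be sketched in Java with a plain socket, which lets the caller set both a connect timeout and a read timeout. This is only an illustration: the class name ZkRuok and the method isOk are hypothetical, and on newer ZooKeeper releases the ruok command may first have to be enabled on the server through the 4lw.commands.whitelist setting.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Solr or ZooKeeper.
final class ZkRuok {

    /** Sends "ruok" and returns true only if the node answers "imok" in time. */
    static boolean isOk(String host, int port, int connectTimeoutMs, int readTimeoutMs) {
        try (Socket socket = new Socket()) {
            // Bounds how long we wait for the TCP connection itself.
            socket.connect(new InetSocketAddress(host, port), connectTimeoutMs);
            // Bounds how long each read may block once connected.
            socket.setSoTimeout(readTimeoutMs);

            OutputStream out = socket.getOutputStream();
            out.write("ruok".getBytes(StandardCharsets.US_ASCII));
            out.flush();

            InputStream in = socket.getInputStream();
            byte[] reply = in.readNBytes(4); // "imok" is four bytes
            return "imok".equals(new String(reply, StandardCharsets.US_ASCII));
        } catch (Exception e) {
            return false; // refused, timed out, reset, ...
        }
    }
}
```

A call such as ZkRuok.isOk("zk1.example.com", 2181, 2000, 2000) (the host name is a placeholder) returns false within a few seconds when the node is unreachable.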

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Jan 26, 2022, at 5:53 AM, Reej Nayagam <re...@gmail.com> wrote:
> 
> The scenario is solr servers are up, but majority of the zk is down,
> so we need to tell the issue is with the zookeeper. I don’t find a way on
> how to identify the zookeeper status without waiting for the timeout to
> happen after 30 seconds.


Re: Zookeeper client Timeout - takes the default 30000 ms

Posted by Reej Nayagam <re...@gmail.com>.
Thank you Shawn, I will try this way and see if it helps.

*Thanks,*
*Reej*


On Wed, Jan 26, 2022 at 10:59 PM Shawn Heisey <ap...@elyograg.org> wrote:

> On 1/26/22 06:53, Reej Nayagam wrote:
> > The scenario is solr servers are up, but majority of the zk is down,
> > so we need to tell the issue is with the zookeeper. I don’t find a way on
> > how to identify the zookeeper status without waiting for the timeout to
> > happen after 30 seconds.
>
> I think the issue here is that the SolrZkClient is created with a
> constructor that does not specify the connection timeout.  That timeout
> defaults to 30 seconds.  You'll want to create it with both the client
> timeout and the connect timeout specified. Because in your situation
> with ZK being down you can't connect, it will be the connect timeout
> that applies, not the client timeout.
>
> Thanks,
> Shawn
>
>

Re: Zookeeper client Timeout - takes the default 30000 ms

Posted by Shawn Heisey <ap...@elyograg.org>.
On 1/26/22 06:53, Reej Nayagam wrote:
> The scenario is solr servers are up, but majority of the zk is down,
> so we need to tell the issue is with the zookeeper. I don’t find a way on
> how to identify the zookeeper status without waiting for the timeout to
> happen after 30 seconds.

I think the issue here is that the SolrZkClient is created with a 
constructor that does not specify the connection timeout.  That timeout
defaults to 30 seconds.  You'll want to create it with both the client 
timeout and the connect timeout specified. Because in your situation 
with ZK being down you can't connect, it will be the connect timeout 
that applies, not the client timeout.

Thanks,
Shawn


Re: Zookeeper client Timeout - takes the default 30000 ms

Posted by Reej Nayagam <re...@gmail.com>.
Thanks Matthew, I have a better idea now.
Either this way or Shawn's approach of passing the zk connect timeout
will work out for me.


*Thanks,*
*Reej*


On Wed, Jan 26, 2022 at 10:37 PM matthew sporleder <ms...@gmail.com>
wrote:

> Check this out:
> curl 'localhost:8983/solr/admin/zookeeper/status'

Re: Zookeeper client Timeout - takes the default 30000 ms

Posted by matthew sporleder <ms...@gmail.com>.
Check this out:
curl 'localhost:8983/solr/admin/zookeeper/status'
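
For a monitoring job written in Java, the same request could be made with the JDK's built-in HTTP client (Java 11 or later), with explicit timeouts so a dead node does not stall the check. This is a rough sketch: the class name SolrZkStatus and its methods are hypothetical, and treating anything other than HTTP 200 as a failure is an assumption, not something the endpoint requires.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.Optional;

// Hypothetical helper, not part of SolrJ.
final class SolrZkStatus {

    /** Builds the status URL used in the curl command above. */
    static URI statusUri(String solrHost, int solrPort) {
        return URI.create("http://" + solrHost + ":" + solrPort + "/solr/admin/zookeeper/status");
    }

    /** Returns the response body on HTTP 200, or empty on any error, timeout, or other status. */
    static Optional<String> fetch(URI uri, Duration connectTimeout, Duration requestTimeout) {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(connectTimeout)
                .build();
        HttpRequest request = HttpRequest.newBuilder(uri)
                .timeout(requestTimeout)
                .GET()
                .build();
        try {
            HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
            return response.statusCode() == 200 ? Optional.of(response.body()) : Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return Optional.empty();
        } catch (Exception e) {
            return Optional.empty();
        }
    }
}
```

The body is returned as a raw string; parsing it is left to whichever JSON library the application already uses.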

On Wed, Jan 26, 2022 at 8:54 AM Reej Nayagam <re...@gmail.com> wrote:
>
> The scenario is solr servers are up, but majority of the zk is down,
> so we need to tell the issue is with the zookeeper. I don’t find a way on
> how to identify the zookeeper status without waiting for the timeout to
> happen after 30 seconds.

Re: Zookeeper client Timeout - takes the default 30000 ms

Posted by Reej Nayagam <re...@gmail.com>.
The scenario is that the Solr servers are up but the majority of the zk
nodes are down, so we need to tell that the issue is with ZooKeeper. I can't
find a way to identify the ZooKeeper status without waiting for the
30-second timeout to happen.
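
To make "majority of the zk is down" concrete: an ensemble of N nodes keeps quorum only while at least N/2 + 1 nodes (integer division) are up, so a 3-node ensemble tolerates one failure. A small sketch of that decision follows; the class name ZkQuorum is hypothetical, and the per-node up/down map is assumed to come from whatever liveness check is in use.

```java
import java.util.Map;

// Hypothetical helper, not part of Solr or ZooKeeper.
final class ZkQuorum {

    /** Smallest number of nodes that forms a strict majority of the ensemble. */
    static int majority(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    /** True if the nodes marked as up still form a majority of all listed nodes. */
    static boolean hasQuorum(Map<String, Boolean> nodeUp) {
        long alive = nodeUp.values().stream().filter(Boolean::booleanValue).count();
        return alive >= majority(nodeUp.size());
    }
}
```

The failover to DB search would then be triggered only when hasQuorum returns false, not when a single node is unreachable.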

On Wed, 26 Jan 2022 at 9:39 PM, matthew sporleder <ms...@gmail.com>
wrote:

> I don't understand your approach --
>
> For checking solr health I would probably use the ping endpoint or a
> very fast query with a low timeout (q=*:*&timeAllowed=100&rows=0).
>
> IIRC zookeeper health (as seen by solr) is in the CLUSTERSTATUS admin
> api command?  It's somewhere near there if not in CLUSTERSTATUS.
>
> For interacting with zookeeper itself I would probably just use zk
> clients directly.
-- 
*Thanks,*
*Reej*

Re: Zookeeper client Timeout - takes the default 30000 ms

Posted by matthew sporleder <ms...@gmail.com>.
I don't understand your approach --

For checking solr health I would probably use the ping endpoint or a
very fast query with a low timeout (q=*:*&timeAllowed=100&rows=0).

IIRC zookeeper health (as seen by solr) is in the CLUSTERSTATUS admin
api command?  It's somewhere near there if not in CLUSTERSTATUS.

For interacting with zookeeper itself I would probably just use zk
clients directly.



On Wed, Jan 26, 2022 at 7:41 AM Reej Nayagam <re...@gmail.com> wrote:
>
> Hi All,
>
> I need to handle zk failure and so monitoring the zk ensemble, and if the
> majority of the zk fails we'll activate the HA to point to a DB search.
>
> So to check if each of the zk is alive , we are connecting as below,
>
> *zkClient = solrZkClient(zkaddress,10000),*
> *return zkclient.getSolrZookeeper().getState(),isAlive*
>
> But I noticed, it still takes the default 30,000 ms timeout instead of 10k
> milliseconds passed in.
>
> Is there a way we can override zookeeper timeout, because we have 3 zk's
> and if suppose all the 3 are down, to get the status of each we need to
> wait for 30 seconds each.
>
> Kindly advise if any of you have handled this. Thank you !
>
> *Thanks,*
> *Reej*