Posted to dev@solr.apache.org by Mike Drob <md...@apache.org> on 2021/09/14 20:09:06 UTC

Discuss SIP-14 Embedded Zookeeper

Devs,

We've previously discussed maintaining ZK as being an operational hurdle
for some groups getting started or migrating to SolrCloud from non-ZK cloud
mode. I'd like to discuss the idea of embedding ZK in our own process
control.

Please see the SIP at
https://cwiki.apache.org/confluence/display/SOLR/SIP-14+Embedded+Zookeeper

Thank you,
Mike

Re: Discuss SIP-14 Embedded Zookeeper

Posted by Eric Pugh <ep...@opensourceconnections.com.INVALID>.
Agreed (though SIP-14 helps towards that end)…

Thoughts on what the commands might look like? What do you think the roadblocks will be? Do we have any JIRA tickets that lay out the path forward?

In the SIP there is some discussion of starting with a ZooKeeperServerEmbedded, and there is even a PR by Mike Drob (https://github.com/apache/solr/pull/522) for that and for migrating the tests over. It was last touched in January 2022, however.

I’m wondering whether we still think that is the way to start. Is our use of Zookeeper still so fragile that we need to redo that plumbing first, or could we instead start by enabling Zookeeper based on the role of the server?

To be clear, I’m not volunteering to lead the implementation of this SIP as well. I’m still pretty focused on wrapping up some things with the Solr CLI for the next few months ;-).





> On Sep 11, 2023, at 9:34 AM, Ilan Ginzburg <il...@gmail.com> wrote:
> 
> Getting rid of standalone Solr mode is not part of SIP-14; I suggest we
> discuss it separately from the effort to make SolrCloud easier to deploy or
> the default.

_______________________
Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com <http://www.opensourceconnections.com/> | My Free/Busy <http://tinyurl.com/eric-cal>  
Co-Author: Apache Solr Enterprise Search Server, 3rd Ed <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>	
This e-mail and all contents, including attachments, is considered to be Company Confidential unless explicitly stated otherwise, regardless of whether attachments are marked as such.


Re: Discuss SIP-14 Embedded Zookeeper

Posted by Ilan Ginzburg <il...@gmail.com>.
Getting rid of standalone Solr mode is not part of SIP-14; I suggest we
discuss it separately from the effort to make SolrCloud easier to deploy or
the default.

On Sat, Sep 9, 2023 at 4:11 PM Ishan Chattopadhyaya <
ichattopadhyaya@gmail.com> wrote:

> +1 fully support this, also support the move for us to get rid of the
> standalone Solr mode.

Re: Discuss SIP-14 Embedded Zookeeper

Posted by Ishan Chattopadhyaya <ic...@gmail.com>.
+1 fully support this, also support the move for us to get rid of the
standalone Solr mode.

On Sat, 9 Sept, 2023, 5:12 pm Eric Pugh,
<ep...@opensourceconnections.com.invalid> wrote:

> The baby step that Jan is proposing for a ‘zookeeper’ node-role makes
> sense to me, for those who are only deploying a very small Solr setup.

Re: Discuss SIP-14 Embedded Zookeeper

Posted by Eric Pugh <ep...@opensourceconnections.com.INVALID>.
The baby step that Jan is proposing for a ‘zookeeper’ node-role makes sense to me, for those who are only deploying a very small Solr setup.   

What would that look like? Would you start Solr like this? I looked a bit at the Ref Guide page for Nodes, and I’m gathering it might look like:

One Solr with what we call Embedded ZK today:

bin/solr start -c

The -c switch and no -z parameter means it has an implicit Zookeeper Role and therefore fires up an embedded ZK.

The Same Thing Explicitly:

bin/solr start -c  -solr.node.roles=zookeeper:on -z localhost:9983


So, what about two Solr nodes?

bin/solr start -c  -solr.node.roles=zookeeper:on -z localhost:9983 -p 8983
bin/solr start -c -z localhost:9983 -p 7574

And Four Solr Nodes with 3 enabled for ZK?

bin/solr start -c  -solr.node.roles=zookeeper:on -z localhost:9983,localhost:8574,localhost:9984 -p 8983
bin/solr start -c  -solr.node.roles=zookeeper:on -z localhost:9983,localhost:8574,localhost:9984 -p 7574
bin/solr start -c  -solr.node.roles=zookeeper:on -z localhost:9983,localhost:8574,localhost:9984 -p 8984
bin/solr start -c  -z localhost:9983,localhost:8574,localhost:9984 -p 7574

At least on the face of it, this doesn’t seem too far away… I’m sure there is a lot of complexity that I’m not aware of ;-). For example, will Solr start up if one of the three ZKs it wants isn’t available yet?
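
As a rough illustration of where the -z strings above come from: today's embedded ZK listens on the Solr port plus 1000, so the connection string can be derived from the Solr ports of the ZK-enabled nodes. The sketch below assumes everything runs on localhost, as in the examples, and zk_hosts is a hypothetical helper name, not part of bin/solr. Note also that on a single machine the fourth node would need a port other than 7574, since the second node already uses it.

```shell
# Build a ZooKeeper connection string from Solr ports, assuming each
# ZK-enabled node runs its embedded ZooKeeper on (Solr port + 1000).
# zk_hosts is a hypothetical helper, not an existing Solr command.
zk_hosts() (
  host="localhost"   # assumption: all nodes share one machine
  out=""
  for solr_port in "$@"; do
    zk_port=$((solr_port + 1000))
    # Append with a comma separator, except for the first entry.
    out="${out:+$out,}$host:$zk_port"
  done
  printf '%s\n' "$out"
)

# The three ZK-enabled nodes from the four-node example.
ZK_HOST=$(zk_hosts 8983 7574 8984)
echo "$ZK_HOST"
```

With the ports above this prints localhost:9983,localhost:8574,localhost:9984, which is the string every node in the four-node example is given.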



Eric




> On Sep 6, 2023, at 6:14 PM, Jan Høydahl <ja...@cominvent.com> wrote:
> 
> Hi,
> 
> Eric Pugh and I discussed this SIP the other day, as a stepping stone for making cloud mode the default. Perhaps there is new energy for this two years down the road? 
> 
> We don't need to tackle the full dynamic scaling of ZK on day one.
> Just adding a 'zookeeper' node-role so we could have three fixed nodes acting as ZK would be a win and lower the complexity of deploying Solr.
> We could always add magic auto scaling later.
> 
> Jan



Re: Discuss SIP-14 Embedded Zookeeper

Posted by Jan Høydahl <ja...@cominvent.com>.
Hi,

Eric Pugh and I discussed this SIP the other day, as a stepping stone for making cloud mode the default. Perhaps there is new energy for this two years down the road? 

We don't need to tackle the full dynamic scaling of ZK on day one.
Just adding a 'zookeeper' node-role so we could have three fixed nodes acting as ZK would be a win and lower the complexity of deploying Solr.
We could always add magic auto scaling later.

Jan

> 17. jan. 2022 kl. 06:57 skrev Mark Miller <ma...@gmail.com>:
> 
> Yeah, there were two reasons we didn’t push embedded Zookeeper out of the gate and even went so far as to call it a non-production “demo” feature: dynamic reconfiguration as a cluster changed over time, and a Zookeeper instance per Solr node being prohibitive. At least the latter felt theoretically solvable externally, but at the time that just brought us back around to the lack of dynamic configuration.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@solr.apache.org
For additional commands, e-mail: dev-help@solr.apache.org


Re: Discuss SIP-14 Embedded Zookeeper

Posted by Mark Miller <ma...@gmail.com>.
Yeah, there were two reasons we didn’t push embedded Zookeeper out of the
gate and even went so far as to call it a non-production “demo” feature:
dynamic reconfiguration as a cluster changed over time, and a Zookeeper
instance per Solr node being prohibitive. At least the latter felt
theoretically solvable externally, but at the time that just brought us
back around to the lack of dynamic configuration.

Re: Discuss SIP-14 Embedded Zookeeper

Posted by Mike Drob <md...@apache.org>.
I put together a draft PR for the first stage of the SIP, which is to cut
over unit tests to using the embedded ZK. There are a few things missing,
like setting limiters and our injected use of ZKDatabase, but we've also
had discussions about how those are not particularly useful as is.

The biggest change is the removal of the SolrZooKeeper class, which was
mainly reflection scaffolding for getting at internals. We can use zookeeper
testing constructs a bit more cleanly for this, but it does affect the
public API for clients.

https://github.com/apache/solr/pull/522

On Wed, Sep 15, 2021 at 9:17 AM Mike Drob <md...@apache.org> wrote:

> > If people are running 100 or 1000 node clusters and use each node as a
> ZK server, by default, what kind of impact would that have?
>
> Bad. Very very bad. The largest ZK quorum I've personally seen is 9, and
> I've heard rumors of somebody running 15. I think the recommended approach
> for distributing load is to use Observers[1], which may provide some
> tiering benefits or may be redundant with traditional ZK clients. Maybe an
> Observer per failure zone makes sense for the Solr Operator?
>
> [1]: https://zookeeper.apache.org/doc/current/zookeeperObservers.html

Re: Discuss SIP-14 Embedded Zookeeper

Posted by Mike Drob <md...@apache.org>.
> If people are running 100 or 1000 node clusters and use each node as a ZK
server, by default, what kind of impact would that have?

Bad. Very very bad. The largest ZK quorum I've personally seen is 9, and
I've heard rumors of somebody running 15. I think the recommended approach
for distributing load is to use Observers[1], which may provide some
tiering benefits or may be redundant with traditional ZK clients. Maybe an
Observer per failure zone makes sense for the Solr Operator?

[1]: https://zookeeper.apache.org/doc/current/zookeeperObservers.html
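
For a rough sense of why very large quorums hurt: ZooKeeper needs a majority of its voting members to acknowledge each write, so the number of acknowledgements grows with the ensemble, while Observers do not vote at all. The arithmetic below is only a back-of-the-envelope sketch; quorum and tolerated are hypothetical helper names.

```shell
# Majority needed for an ensemble of n voting members, and how many
# voting members can be lost while a majority still remains.
quorum()    ( echo $(( $1 / 2 + 1 )) )
tolerated() ( echo $(( ($1 - 1) / 2 )) )

for n in 3 5 9 15; do
  echo "voters=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated "$n")"
done
```

Going from 3 to 15 voters raises the failures tolerated from 1 to 7, but every write then waits on 8 acknowledgements instead of 2, which is the trade-off Observers are meant to sidestep.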




Re: Discuss SIP-14 Embedded Zookeeper

Posted by Houston Putman <ho...@apache.org>.
If we were to make this work, and support productionized embedded
Zookeeper, then it would absolutely be something that we want to support by
default in the Solr Operator.

I don't think we'd be able to cut the Zookeeper Operator dependency really
quickly, because this is going in at the earliest in Solr 9 and more likely
Solr 10 (probably). The Solr Operator still needs to support older
versions, especially Solr 8 for a fair amount of time. So once the minimum
supported Solr version is one that has this feature, then we can get rid of
the Zookeeper Operator for good. This is probably my favorite thing about
the SIP. The Zookeeper Operator is fine, but removing that dependency would
lift a huge burden off of the Solr Operator's shoulders.

I also think it's a good idea to be able to start Solr in a ZK-only mode.

Also you should be able to tell Solr whether you want it to start as a ZK
member or observer, or not run ZK on that node at all. I'm not extremely in
touch with the ZK community at this point, but what cluster sizes are
people scaling up to nowadays? If people are running 100 or 1000 node
clusters and use each node as a ZK server, by default, what kind of impact
would that have?
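
To make the member/observer/none choice concrete, here is a minimal sketch of how a per-node setting could be turned into the "server.N" lines that ZooKeeper reads from its configuration. The host names, the host:role input format, and the zk_server_lines helper are all assumptions made for illustration; only the ":observer" suffix and the conventional 2888/3888 quorum and election ports come from ZooKeeper itself.

```shell
# Print zoo.cfg "server.N" lines for a list of host:role pairs.
# Roles (hypothetical input format):
#   participant  voting ZK member
#   observer     non-voting ZK member; its own config also needs
#                the line peerType=observer
#   none         node runs Solr only, no ZK
zk_server_lines() (
  id=1
  for entry in "$@"; do
    host=${entry%%:*}   # text before the first colon
    role=${entry#*:}    # text after the first colon
    case "$role" in
      participant) echo "server.$id=$host:2888:3888"; id=$((id + 1)) ;;
      observer)    echo "server.$id=$host:2888:3888:observer"; id=$((id + 1)) ;;
      none)        ;;
      *)           echo "unknown role: $role" >&2; exit 1 ;;
    esac
  done
)

zk_server_lines solr1:participant solr2:participant solr3:participant \
                solr4:observer solr5:none
```

In this sketch a 100-node cluster would list only a handful of participants, optionally a few observers, and mark everything else as none, which keeps the voting quorum small regardless of how many Solr nodes exist.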

On Tue, Sep 14, 2021 at 8:20 PM Mike Drob <md...@apache.org> wrote:

> I like the idea of starting nodes in a ZK-only mode, probably we would
> call it something like coordination mode. It ties in to ideas that I've had
> while discussing with other folks about other Solr node specializations,
> like "edge" nodes that are part of a cluster but do not host collections
> and exist solely for routing http queries to the appropriate places.
>
> I think it could be useful in a k8s deployment as well, but I'd have to
> think about how we want to do all the port magic there. I know that I've
> had conversations with Houston about wanting to move away from
> ZookeeperOperator, but those haven't quite taken hold yet.

Re: Discuss SIP-14 Embedded Zookeeper

Posted by Mike Drob <md...@apache.org>.
I like the idea of starting nodes in a ZK-only mode, probably we would call
it something like coordination mode. It ties in to ideas that I've had
while discussing with other folks about other Solr node specializations,
like "edge" nodes that are part of a cluster but do not host collections
and exist solely for routing http queries to the appropriate places.

I think it could be useful in a k8s deployment as well, but I'd have to
think about how we want to do all the port magic there. I know that I've
had conversations with Houston about wanting to move away from
ZookeeperOperator, but those haven't quite taken hold yet.

On Tue, Sep 14, 2021 at 6:02 PM Jan Høydahl <ja...@cominvent.com> wrote:

> Thanks for kicking this off Mike. I added a few "rejected alternatives"
> and put a few questions for thought in a comment. You may want to keep all
> discussion in this email thread, so here are the questions copied:
>
>
> *This is promising! Question: Would this mode be valuable also for
> Kubernetes deployments, i.e. we could get rid of the ZookeeperOperator and
> instead let the SolrOperator keep track of which Solr pods also act as
> ZK nodes? Would we allow a Solr node to start in a ZK-only mode, i.e. not
> eligible for collections/cores/overseer? This would also support those huge
> clusters where you want dedicated ZKs.*
>
> This also ties in with SIP-6 Solr should own the bootstrap process
> <https://cwiki.apache.org/confluence/display/SOLR/SIP-6+Solr+should+own+the+bootstrap+process>,
> as we'd want to control startup/shutdown behavior wrt the zk, e.g. start
> embedded zk before solr and stop solr before stopping zk. Perhaps also
> gracefully exiting the quorum on planned shutdown?
>
> Jan

Re: Discuss SIP-14 Embedded Zookeeper

Posted by Jan Høydahl <ja...@cominvent.com>.
Thanks for kicking this off Mike. I added a few "rejected alternatives" and put a few questions for thought in a comment. You may want to keep all discussion in this email thread, so here are the questions copied:

This is promising! Question: Would this mode be valuable also for Kubernetes deployments, i.e. we could get rid of the ZookeeperOperator and instead let the SolrOperator keep track of which Solr pods also act as ZK nodes?
Would we allow a Solr node to start in a ZK-only mode, i.e. not eligible for collections/cores/overseer? This would also support those huge clusters where you want dedicated ZKs.

This also ties in with SIP-6 Solr should own the bootstrap process <https://cwiki.apache.org/confluence/display/SOLR/SIP-6+Solr+should+own+the+bootstrap+process>, as we'd want to control startup/shutdown behavior wrt the zk, e.g. start embedded zk before solr and stop solr before stopping zk. Perhaps also gracefully exiting the quorum on planned shutdown?
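
The ordering described above could be sketched roughly as follows. The four lifecycle functions are placeholders that only print what they would do; nothing here reflects how Solr's bootstrap code is actually structured.

```shell
# Placeholder lifecycle steps (hypothetical). A real implementation would
# start and stop the embedded ZooKeeper server and the Solr node here.
start_zk()   { echo "start zk"; }
start_solr() { echo "start solr"; }
stop_solr()  { echo "stop solr"; }
stop_zk()    { echo "stop zk"; }   # could first leave the quorum gracefully

run_node() {
  # ZK must be up before Solr tries to connect to it.
  start_zk || return 1
  if start_solr; then
    echo "serving requests"
    # Shutdown mirrors startup: Solr goes down first ...
    stop_solr
  fi
  # ... and ZK is stopped last, even if Solr never came up.
  stop_zk
}

run_node
```

The point of the mirror-image order is that Solr never runs without its coordination service: ZK is the first thing up and the last thing down.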

Jan

> 14. sep. 2021 kl. 22:09 skrev Mike Drob <md...@apache.org>:
> 
> Devs,
> 
> We've previously discussed maintaining ZK as being an operational hurdle for some groups getting started or migrating to SolrCloud from non-ZK cloud mode. I'd like to discuss the idea of embedding ZK in our own process control.
> 
> Please see the SIP at https://cwiki.apache.org/confluence/display/SOLR/SIP-14+Embedded+Zookeeper
> 
> Thank you,
> Mike