Posted to dev@solr.apache.org by Radu Gheorghe <ra...@sematext.com> on 2023/05/03 14:45:15 UTC

Re: SIP-18: A Solr Kubernetes Module for native integration

Hello,

Sorry for being late to the party. The SIP sounds good to me.

Houston, you already mentioned that work for the module is easy to
break down. You're referring to the fact that pretty much every major
piece of functionality (Authentication, ConfigSets...) can be
developed almost independently of each other, correct? I assume it's not
worth having multiple modules, because you're likely to want
everything, as a user, if you're using Kubernetes.

Best regards,
Radu
--
Elasticsearch/OpenSearch & Solr Consulting, Production Support & Training
Sematext Cloud - Full Stack Observability
https://sematext.com/

On Fri, Apr 21, 2023 at 12:55 AM Arrieta, Alejandro
<aa...@perrinsoftware.com> wrote:
>
> Hello team,
>
> noob warning:
> today I learned what SIP means. with SIP17 and 18 being very interesting
> reads.
> https://cwiki.apache.org/confluence/display/SOLR/Solr+Improvement+Proposals
> Too many telephone references.
> sorry for the interruption.
> Alejandro Arrieta
>
> On Thu, Apr 20, 2023 at 5:27 PM Houston Putman <ho...@apache.org> wrote:
>
> > Thanks for the questions Jason!
> >
> > > So the general idea is that we'd add a Solr contrib/module, and that
> > > module would have a dep on some sort of Kubernetes client so it could
> > > manage certain Solr entities (e.g. security.json, configsets, etc.) as
> > > Kubernetes resources (configmaps, etc.).  Am I understanding that
> > > right?
> > >
> >
> > Yes, absolutely. And possibly other things, like leveraging Kubernetes'
> > secret management to manage
> > credentials for users. (Auto-import BasicAuth secrets with certain labels,
> > integrate with Kubernetes ServiceAccounts, etc.)
> >
> > But yeah, generally the idea is to use Kubernetes state instead of
> > Zookeeper state for certain features.
> >
> > > One place there might be room for improvement in the writeup so far is
> > > around the motivation/value-prop for some of these Solr->Kubernetes
> > > integrations.  The value in some integrations (e.g.
> > > KubernetesSSLCredentialsProvider) is relatively self-evident I think,
> > > but others are a little less clear and could use spelled out
> > > explicitly IMO.  e.g. What's the benefit of storing security.json or
> > > configsets in Kubernetes configmaps over ZooKeeper?
> > >
> >
> > This is a great question.
> >
> > Generally Solr has fairly good tool support for managing various things in
> > Zookeeper.
> >
> > The "zkCli.sh" script and various "bin/solr" commands allow users to easily
> > manage their Zookeeper state to set up
> > Solr to run the way they need it to. This works very well for users running
> > Solr on bare metal and running these commands manually.
> >
> > However, running these commands in Kubernetes is not very convenient and it
> > does not really jibe with
> > Kubernetes' idempotent model. Basically there isn't a good or easy way
> > to run the
> > solr/zk setup commands before a SolrCloud is created. And when we do it in
> > things like an "initContainer",
> > the commands have to be run every time a Solr process is started (or
> > restarted). This isn't really convenient
> > and adds complexity that makes running Solr on Kubernetes much less
> > appealing.
> >
> > Another thing is state management. So let's say that the Solr Operator
> > wants to enable auth by default when running Solr.
> > It has to create a security.json for Solr to use, and generate passwords
> > and secrets for users to use.
> > However, it also needs to set up a user & password for itself (the operator)
> > to use to interact with the cluster.
> > But that's ok, it does it, and it can easily upload this file to Zookeeper
> > in the initContainer if no security.json already exists.
> >
> > However, we need to allow users to update this file themselves to add more
> > users, and do other stuff. So basically we
> > can't let the Solr Operator make any changes to this file. So even if a
> > user decides that they want to change the security.json secret
> > they passed in to the SolrCloud, the operator can't make that change happen,
> > since it can't overwrite what already exists in Zookeeper.
> > This will always be a problem when there are two "sources of truth". One
> > has to be prioritized.
> >
> > If we allow the security.json to be loaded from a Kubernetes secret, then
> > the secret that the user provides is the
> > single source of truth. So even if the security.json is changed
> > through the security UI, the changes will be reflected in
> > the Kubernetes secret. So users are free to overwrite that secret if
> > they want to, given that everyone knows it's the current
> > accepted state of the security.json file.
> >
> > The exact same issues exist with ConfigSets. Many Solr Operator users want
> > to manage their configSets through
> > Kubernetes, just like they manage their SolrClouds. It makes sense: keep
> > all of your Solr infra managed together.
> > However, the operator cannot keep the configSets managed in Zookeeper
> > updated with the configSets managed
> > via Kube ConfigMaps. It's two sources of truth.
> >
> > *TLDR*: Solr has many command line utilities that work well to set up Solr
> > when it's running on bare metal or a VM.
> > However, these solutions do not work well in a cloud system like
> > Kubernetes. If we try to make these things
> > easier to set up in Kubernetes, it ultimately results in 2 sources of truth
> > (Kubernetes and Zookeeper). If we make
> > plugins that load these settings from Kubernetes instead of
> > Zookeeper, we are back down to 1 source
> > of truth. And this single source of truth (obviously) works well in
> > Kubernetes, because the settings are native Kubernetes resources.
> >
> > - Houston
> >
> > On Tue, Apr 11, 2023 at 2:36 PM Jason Gerlowski <ge...@gmail.com>
> > wrote:
> >
> > > Hi Houston,
> > >
> > > So the general idea is that we'd add a Solr contrib/module, and that
> > > module would have a dep on some sort of Kubernetes client so it could
> > > manage certain Solr entities (e.g. security.json, configsets, etc.) as
> > > Kubernetes resources (configmaps, etc.).  Am I understanding that
> > > right?
> > >
> > > > Please let me know if I can explain more, or how I can make the SIP page
> > > > better.
> > >
> > > One place there might be room for improvement in the writeup so far is
> > > around the motivation/value-prop for some of these Solr->Kubernetes
> > > integrations.  The value in some integrations (e.g.
> > > KubernetesSSLCredentialsProvider) is relatively self-evident I think,
> > > but others are a little less clear and could use spelled out
> > > explicitly IMO.  e.g. What's the benefit of storing security.json or
> > > configsets in Kubernetes configmaps over ZooKeeper?
> > >
> > > Best,
> > >
> > > Jason
> > >
> > > On Wed, Apr 5, 2023 at 12:45 PM Houston Putman <ho...@apache.org> wrote:
> > > >
> > > > Hey everyone,
> > > >
> > > > This is a new SIP, not a duplicate of SIP-17 (Autoscaling on Kubernetes),
> > > > and completely unrelated.
> > > >
> > > > Basically there is a lot of very messy logic we do in the Solr Operator to
> > > > bootstrap security and manage various things. This logic must exist because
> > > > Solr has no idea that Kubernetes exists.
> > > > If we can use Kubernetes APIs to pull in information, instead of relying on
> > > > the Solr Operator to inject that information in hacky ways, the user
> > > > experience on Kubernetes is going to get many times better for users
> > > > wanting to secure their SolrClouds. This will also help us use
> > > > authorization by default (which we always preach) via the Solr Operator.
> > > >
> > > > This SIP is not very filled out because I'm still thinking on various
> > > > aspects. But in general, we can attack the different plugins one-by-one and
> > > > the SIP can evolve throughout the process. This SIP is very easy to break
> > > > up, which is nice.
> > > >
> > > > Please let me know if I can explain more, or how I can make the SIP page
> > > > better.
> > > >
> > > > - Houston
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscribe@solr.apache.org
> > > For additional commands, e-mail: dev-help@solr.apache.org
> > >
> > >
> >



Re: SIP-18: A Solr Kubernetes Module for native integration

Posted by Houston Putman <ho...@apache.org>.
Yeah, good question. Everything would be included in a single “kubernetes”
module. So while everything can be done independently, the first feature
will have to create the module, then the other features can be added.

Luckily, adding a module is pretty straightforward, so it doesn't matter
which feature is added first.

- Houston

On Wed, May 3, 2023 at 10:45 AM Radu Gheorghe <ra...@sematext.com> wrote:

> Hello,
>
> Sorry for being late to the party. The SIP sounds good to me.
>
> Houston, you already mentioned that work for the module is easy to
> break down. You're referring to the fact that pretty much every major
> piece of functionality (Authentication, ConfigSets...) can be
> developed almost independent of each other, correct? I assume it's not
> worth having multiple modules, because you're likely to want
> everything, as a user, if you're using Kubernetes.
>
> Best regards,
> Radu
> --
> Elasticsearch/OpenSearch & Solr Consulting, Production Support & Training
> Sematext Cloud - Full Stack Observability
> https://sematext.com/