Posted to users@solr.apache.org by Jan Høydahl <ja...@cominvent.com> on 2022/09/20 20:18:09 UTC

Loading solr.xml from zookeeper

Hi,

It has been possible to load solr.xml centrally from zookeeper for a long time.
However, I'm considering deprecating and removing this feature.
Please see https://issues.apache.org/jira/browse/SOLR-15959 for motivation.

My question to the users list is thus - are you loading solr.xml from zookeeper?
And if yes, why is that capability important for you - i.e. could you not configure it per node?

Jan

Re: Loading solr.xml from zookeeper

Posted by Houston Putman <ho...@apache.org>.
>
> Is there a trusted guide for running solr in docker out there?  I’ve seen
> a few but just wondering if you got one you like the most
>

https://solr.apache.org/guide/solr/latest/deployment-guide/solr-in-docker.html

This is the official guide. It was ported from the docs of the original
docker-solr repository.
There are a lot of improvements that can be made, but the effort should
definitely be made for the docker Ref guide pages.

- Houston

On Wed, Sep 21, 2022 at 1:46 PM Dave <ha...@gmail.com> wrote:

> Is there a trusted guide for running solr in docker out there?  I’ve seen
> a few but just wondering if you got one you like the most
>

Re: Loading solr.xml from zookeeper

Posted by Dave <ha...@gmail.com>.
Is there a trusted guide for running solr in docker out there?  I’ve seen a few but just wondering if you got one you like the most 

> On Sep 21, 2022, at 1:32 PM, David Smiley <ds...@apache.org> wrote:
> 
> ANNAMANENI: can you clarify what you mean by "multiple repositories"; maybe
> "repositories" is a word with specific meaning for your system.  If this
> functionality were removed, would something be harder?  I'm going to guess
> you aren't using Docker / containers yet.
> 
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley

Re: Loading solr.xml from zookeeper

Posted by David Smiley <ds...@apache.org>.
ANNAMANENI: can you clarify what you mean by "multiple repositories"; maybe
"repositories" is a word with specific meaning for your system.  If this
functionality were removed, would something be harder?  I'm going to guess
you aren't using Docker / containers yet.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Wed, Sep 21, 2022 at 12:28 PM ANNAMANENI RAVEENDRA <
a.raveendra432@gmail.com> wrote:

> Hi we are using solr.xml from zk as we are using it for storing multiple
> repositories in a single file.

Re: Loading solr.xml from zookeeper

Posted by ANNAMANENI RAVEENDRA <a....@gmail.com>.
Hi we are using solr.xml from zk as we are using it for storing multiple
repositories in a single file.




On Wed, 21 Sep 2022 at 12:25 PM, Houston Putman <ho...@gmail.com>
wrote:

> I have also never seen people use a ZK solr.xml, and I see the solr.xml as
> a node-config file. I'd be very happy to only support file-system solr.xml
> loading.

Re: Loading solr.xml from zookeeper

Posted by Houston Putman <ho...@gmail.com>.
I have also never seen people use a ZK solr.xml, and I see the solr.xml as
a node-config file. I'd be very happy to only support file-system solr.xml
loading.

On Wed, Sep 21, 2022 at 9:11 AM Jan Høydahl <ja...@cominvent.com> wrote:

> Hi,
>
> Since 9.0 Solr can start with an empty SOLR_HOME as it will use defaults
> in their place <
> https://solr.apache.org/guide/solr/9_0/upgrade-notes/major-changes-in-solr-9.html>:
> "Solr no longer requires a solr.xml in $SOLR_HOME. If one is not found,
> Solr will instead use the default one from $SOLR_TIP/server/solr/solr.xml."
>
> So this motivation for storing solr.xml centrally is no longer valid.
>
> I agree with David's comment on the JIRA that this is also a question
> about what we want solr.xml to be conceptually - a node-config file, or a
> cluster-config file. We have other locations for cluster-wide
> configuration. Today I think solr.xml is a mix of the two. A value like
> "zkClientTimeout" could be cluster-wide while "host" and "hostPort" are of
> course node-local.
>
> I'm also thinking about a rolling upgrade scenario. Say you do a rolling
> upgrade from Solr 8.11 to 9.0. For the 9.0 nodes you need different
> configuration, maybe even radically different XML. Today most values in
> solr.xml are sourced from Java System Properties, which is also a strong
> hint that this is per-node.
>
> Jan
>
> > On 20 Sep 2022, at 22:50, Shawn Heisey <elyograg@elyograg.org.INVALID> wrote:
> >
> > On 9/20/22 14:18, Jan Høydahl wrote:
> >> It has been possible to load solr.xml centrally from zookeeper for a long time.
> >> However, I'm considering deprecating and removing this feature.
> >> Please see https://issues.apache.org/jira/browse/SOLR-15959 for motivation.
> >>
> >> My question to the users list is thus - are you loading solr.xml from zookeeper?
> >> And if yes, why is that capability important for you - i.e. could you not configure it per node?
> >
> > I have done very little with SolrCloud myself.  I converted my tiny little install to cloud with embedded zk, just the one server, one collection and one core.  I do not have solr.xml in ZK.  I did this so I have access to whatever functionality is cloud-only, should a need ever arise.  I fiddle with that install sometimes to try and answer support questions.  Rebuilding the index only takes about ten minutes, so if I screw something up I just restore the working config, delete the data directory, restart, and reindex.
> >
> > I can see a lot of value in being able to fire up a Solr node with only /etc/default/solr.in.sh being provided.  The solr home can then be provided completely empty and the node will start, as long as solr.xml is in ZK.
> >
> > One thing I think we should do is make it so that Solr starts with some defaults, if solr.xml is not found at all.  We can then bikeshed about what the defaults should be.
> >
> > Making it possible for cloud mode to start without solr.xml might remove most people's need to have it live in ZK.  It would make things easier on docker users ... they would be able to attach a completely empty volume for the solr home and Solr would start. They might then go back and add a solr.xml to provide custom settings.
> >
> > Thanks,
> > Shawn
> >
>
>

Re: Loading solr.xml from zookeeper

Posted by Jan Høydahl <ja...@cominvent.com>.
Thanks Shawn

A spring clean of the configs for some future major version would surely be nice. You are welcome to start a dev-list thread on that :)

The important thing for this thread is to decide what solr.xml is and isn’t. I think another key point is whether a config in solr.xml requires a node restart or not. It seems most of them do require a restart, which further implies that the file is node-local. If you were to change solr.xml in zk, nothing would happen until you restarted every node after that edit. So you need to touch each node anyway.
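
To illustrate the earlier point in this thread that most solr.xml values are sourced from Java system properties, here is a minimal Python sketch of `${name:default}` substitution. It is not Solr's actual parser; the `substitute` helper, the fragment text and the host names are assumptions made up for the example. It only shows that two nodes reading the same text at startup can end up with different effective settings.

```python
import re

# Matches ${name} or ${name:default}; a simplified stand-in for the
# property-placeholder syntax, not Solr's real implementation.
_PLACEHOLDER = re.compile(r"\$\{([^}:]+)(?::([^}]*))?\}")


def substitute(text, props):
    """Replace each placeholder with props[name], else with its default."""
    def repl(match):
        name, default = match.group(1), match.group(2)
        if name in props:
            return props[name]
        if default is None:
            raise KeyError(f"no value or default for property {name!r}")
        return default
    return _PLACEHOLDER.sub(repl, text)


# Hypothetical fragment written in the style of solr.xml entries.
FRAGMENT = (
    '<str name="host">${host:}</str>'
    '<int name="zkClientTimeout">${zkClientTimeout:30000}</int>'
)

# Same text, different per-node properties, different effective config.
node_a = substitute(FRAGMENT, {"host": "solr-a.example.com"})
node_b = substitute(
    FRAGMENT, {"host": "solr-b.example.com", "zkClientTimeout": "15000"}
)
```

Because the substitution happens once when the text is read, changing either the text or the properties has no effect on a running node in this sketch, which mirrors the restart argument above.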

Jan Høydahl

> On 21 Sep 2022, at 21:58, Shawn Heisey <ap...@elyograg.org.invalid> wrote:
> 
> On 9/21/22 07:10, Jan Høydahl wrote:
>> Since 9.0 Solr can start with an empty SOLR_HOME as it will use defaults in their place <https://solr.apache.org/guide/solr/9_0/upgrade-notes/major-changes-in-solr-9.html>: "Solr no longer requires a solr.xml in $SOLR_HOME. If one is not found, Solr will instead use the default one from $SOLR_TIP/server/solr/solr.xml."
> 
> You can see how much I keep up with things, didn't know that was already handled.  I haven't used Solr professionally for a few years now, which means there is little time during work hours to keep close track of Solr's progress.  Thank you for letting me know.  I think that means I can delete solr.xml from my tiny Solr install.
> 
>> I agree with David's comment on the JIRA that this is also a question about what we want solr.xml to be conceptually - a node-config file, or a cluster-config file. We have other locations for cluster-wide configuration. Today I think solr.xml is a mix of the two. A value like "zkClientTimeout" could be cluster-wide while "host" and "hostPort" are of course node-local.
> 
> Topics for different threads below, and I am aware that what I am describing is a TON of work.  I would help with it as much as I can.
> 
> I think the entire configuration system needs a revamp.
> 
> 1) Choose one format for all configs.  Currently it is a mix of xml, properties, and json.  I really like the compactness of json, but the official standard does not support comments, and we use those extensively in the out-of-box xml configs.  Many of the libraries that parse json do have comment support, but I worry about relying on nonstandard extensions.  Related:  For JSON support, let's decide whether we are using jackson or noggit and remove the other one.  I suspect that some of our other dependencies depend on jackson, which may make the choice for us. Can jackson be used for the other things we do with XML?  I would like to reduce how many dependencies we have, make the download smaller.
> 
> 2) Make sure the entire config system follows good inheritance rules.  Cluster config takes effect unless node config overrides, and so on.  Cluster config probably only applies to cloud mode, so it should be in ZK, and completely configurable in the admin UI. Default node config in cloud mode should probably actually be part of cluster config, and we could even have node-specific overrides in ZK as well so they are easy to edit centrally.  It would be very cool if there was central editing even for the things that normally go in /etc/default/solr.in.sh.
> 
> 3) I think the admin UI should have an option to turn on in-UI editing of collection/core configurations, and that absolutely everything they can do is available in the UI.  Leave that feature off by default as a security measure, and have a big red security warning on the button that turns it on.
> 
> I imagine a SolrCloud world where EVERYTHING is configurable in the admin UI, with some of it turned off by default for security, where you can even change things like heap size and restart multiple Solr nodes all in the central UI.  It would be very nice if the UI even controlled ZK nodes.  As part of that, eliminating standalone mode is probably prudent.
> 
> A truly ambitious idea would be to have a full software suite that includes creating a VIP so there is automated redundancy of the central UI's IP address, with rpm and deb repos for easy install. Containers are very in right now, so create something similar with docker.
> 
> Thanks,
> Shawn
> 

Re: Loading solr.xml from zookeeper

Posted by Shawn Heisey <ap...@elyograg.org.INVALID>.
On 9/21/22 07:10, Jan Høydahl wrote:
> Since 9.0 Solr can start with an empty SOLR_HOME as it will use defaults in their place <https://solr.apache.org/guide/solr/9_0/upgrade-notes/major-changes-in-solr-9.html>: "Solr no longer requires a solr.xml in $SOLR_HOME. If one is not found, Solr will instead use the default one from $SOLR_TIP/server/solr/solr.xml."

You can see how much I keep up with things, didn't know that was already 
handled.  I haven't used Solr professionally for a few years now, which 
means there is little time during work hours to keep close track of 
Solr's progress.  Thank you for letting me know.  I think that means I 
can delete solr.xml from my tiny Solr install.

> I agree with David's comment on the JIRA that this is also a question about what we want solr.xml to be conceptually - a node-config file, or a cluster-config file. We have other locations for cluster-wide configuration. Today I think solr.xml is a mix of the two. A value like "zkClientTimeout" could be cluster-wide while "host" and "hostPort" are of course node-local.

Topics for different threads below, and I am aware that what I am 
describing is a TON of work.  I would help with it as much as I can.

I think the entire configuration system needs a revamp.

1) Choose one format for all configs.  Currently it is a mix of xml, 
properties, and json.  I really like the compactness of json, but the 
official standard does not support comments, and we use those 
extensively in the out-of-box xml configs.  Many of the libraries that 
parse json do have comment support, but I worry about relying on 
nonstandard extensions.  Related:  For JSON support, let's decide 
whether we are using jackson or noggit and remove the other one.  I 
suspect that some of our other dependencies depend on jackson, which may 
make the choice for us. Can jackson be used for the other things we do 
with XML?  I would like to reduce how many dependencies we have, make 
the download smaller.

2) Make sure the entire config system follows good inheritance rules.  
Cluster config takes effect unless node config overrides, and so on.  
Cluster config probably only applies to cloud mode, so it should be in 
ZK, and completely configurable in the admin UI. Default node config in 
cloud mode should probably actually be part of cluster config, and we 
could even have node-specific overrides in ZK as well so they are easy 
to edit centrally.  It would be very cool if there was central editing 
even for the things that normally go in /etc/default/solr.in.sh.
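
The inheritance rule in item 2 could be sketched roughly like this in Python. This is only an illustration of the proposal, not anything Solr does today; the `effective_config` function and the setting values are assumptions chosen for the example.

```python
from collections import ChainMap


def effective_config(cluster, node):
    """Node-level settings win; anything the node leaves unset
    falls through to the cluster-level value."""
    return dict(ChainMap(node, cluster))


# Hypothetical cluster-wide defaults and one node's overrides.
cluster_defaults = {"zkClientTimeout": 30000, "hostPort": 8983}
node_overrides = {"host": "solr-a.example.com", "hostPort": 8984}

merged = effective_config(cluster_defaults, node_overrides)
```

Here `merged` keeps the cluster-wide `zkClientTimeout`, takes `hostPort` from the node, and adds the node-only `host`.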

3) I think the admin UI should have an option to turn on in-UI editing 
of collection/core configurations, and that absolutely everything they 
can do is available in the UI.  Leave that feature off by default as a 
security measure, and have a big red security warning on the button that 
turns it on.

I imagine a SolrCloud world where EVERYTHING is configurable in the 
admin UI, with some of it turned off by default for security, where you 
can even change things like heap size and restart multiple Solr nodes 
all in the central UI.  It would be very nice if the UI even controlled 
ZK nodes.  As part of that, eliminating standalone mode is probably prudent.

A truly ambitious idea would be to have a full software suite that 
includes creating a VIP so there is automated redundancy of the central 
UI's IP address, with rpm and deb repos for easy install. Containers are 
very in right now, so create something similar with docker.

Thanks,
Shawn


Re: Loading solr.xml from zookeeper

Posted by Shawn Heisey <ap...@elyograg.org.INVALID>.
Sent this once already but it never made it to the list.  Checked apache 
mail archives to make sure it wasn't just me.  It's not there.

On 9/21/22 07:10, Jan Høydahl wrote:
> Since 9.0 Solr can start with an empty SOLR_HOME as it will use 
> defaults in their place 
> <https://solr.apache.org/guide/solr/9_0/upgrade-notes/major-changes-in-solr-9.html>: 
> "Solr no longer requires a solr.xml in $SOLR_HOME. If one is not 
> found, Solr will instead use the default one from 
> $SOLR_TIP/server/solr/solr.xml."

You can see how much I keep up with things, didn't know that was already 
handled.  I haven't used Solr professionally for a few years now, which 
means there is little time during work hours to keep close track of 
Solr's progress.  Thank you for letting me know.  I think that means I 
can delete solr.xml from my tiny Solr install.

> I agree with David's comment on the JIRA that this is also a question 
> about what we want solr.xml to be conceptually - a node-config file, 
> or a cluster-config file. We have other locations for cluster-wide 
> configuration. Today I think solr.xml is a mix of the two. A value 
> like "zkClientTimeout" could be cluster-wide while "host" and 
> "hostPort" are of course node-local.

Topics for different threads below, and I am aware that what I am 
describing is a TON of work.  I would help with it as much as I can.

I think the entire configuration system needs a revamp.

1) Choose one format for all configs.  Currently it is a mix of xml, 
properties, and json.  I really like the compactness of json, but the 
official standard does not support comments, and we use those 
extensively in the out-of-box xml configs.  Many of the libraries that 
parse json do have comment support, but I worry about relying on 
nonstandard extensions.  Related:  For JSON support, let's decide 
whether we are using jackson or noggit and remove the other one.  I 
suspect that some of our other dependencies depend on jackson, which may 
make the choice for us. Can jackson be used for the other things we do 
with XML?  I would like to reduce how many dependencies we have, to make 
the download smaller.

2) Make sure the entire config system follows good inheritance rules.  
Cluster config takes effect unless node config overrides, and so on.  
Cluster config probably only applies to cloud mode, so it should be in 
ZK, and completely configurable in the admin UI. Default node config in 
cloud mode should probably actually be part of cluster config, and we 
could even have node-specific overrides in ZK as well so they are easy 
to edit centrally.  It would be very cool if there was central editing 
even for the things that normally go in /etc/default/solr.in.sh.

3) I think the admin UI should have an option to turn on in-UI editing 
of collection/core configurations, so that absolutely everything those 
configs can do is available in the UI.  Leave that feature off by default as a 
security measure, and have a big red security warning on the button that 
turns it on.
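
To illustrate the inheritance rule in point 2, here is a minimal Python 
sketch, not anything Solr does today.  The function name and the three 
layers (built-in defaults, cluster config, node overrides) are 
assumptions made up for the example; the first layer that defines a key 
wins.

```python
from collections import ChainMap


def effective_config(defaults, cluster, node):
    """Merge three config layers; earlier arguments to ChainMap win.

    Hypothetical helper: node overrides beat cluster config, which
    beats the built-in defaults.
    """
    return dict(ChainMap(node, cluster, defaults))


# Example values are illustrative only.
defaults = {"zkClientTimeout": 30000, "hostPort": 8983}
cluster = {"zkClientTimeout": 60000}   # cluster-wide setting
node = {"hostPort": 8984}              # node-local override

merged = effective_config(defaults, cluster, node)
print(merged["zkClientTimeout"], merged["hostPort"])
```

With the example values, the timeout comes from the cluster layer and 
the port from the node layer.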

I imagine a SolrCloud world where EVERYTHING is configurable in the 
admin UI, with some of it turned off by default for security, where you 
can even change things like heap size and restart multiple Solr nodes 
all in the central UI.  It would be very nice if the UI even controlled 
ZK nodes.  As part of that, eliminating standalone mode is probably prudent.

A truly ambitious idea would be to have a full software suite that 
includes creating a VIP so there is automated redundancy of the central 
UI's IP address, with rpm and deb repos for easy install. Containers are 
very in right now, so create something similar with docker.

Thanks,
Shawn


Re: Loading solr.xml from zookeeper

Posted by Jan Høydahl <ja...@cominvent.com>.
Hi,

Since 9.0, Solr can start with an empty SOLR_HOME because it falls back to defaults <https://solr.apache.org/guide/solr/9_0/upgrade-notes/major-changes-in-solr-9.html>: "Solr no longer requires a solr.xml in $SOLR_HOME. If one is not found, Solr will instead use the default one from $SOLR_TIP/server/solr/solr.xml."

So this motivation for storing solr.xml centrally is no longer valid.
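
The lookup order described in that quote can be sketched in a few lines of Python. This is only an illustration of the fallback, not Solr's actual code; the function name locate_solr_xml is hypothetical, and the two directories are passed in as plain arguments.

```python
from pathlib import Path


def locate_solr_xml(solr_home, solr_tip):
    """Return the solr.xml that would be used under the 9.0 fallback rule.

    Hypothetical helper: prefer $SOLR_HOME/solr.xml, otherwise fall back
    to the default shipped at $SOLR_TIP/server/solr/solr.xml.
    """
    candidate = Path(solr_home) / "solr.xml"
    if candidate.is_file():
        return candidate
    return Path(solr_tip) / "server" / "solr" / "solr.xml"
```

With an empty solr home, the function returns the path under the install directory, which matches the behaviour the upgrade notes describe.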

I agree with David's comment on the JIRA that this is also a question about what we want solr.xml to be conceptually - a node-config file, or a cluster-config file. We have other locations for cluster-wide configuration. Today I think solr.xml is a mix of the two. A value like "zkClientTimeout" could be cluster-wide while "host" and "hostPort" are of course node-local. 

I'm also thinking about the rolling upgrade scenario. Say you do a rolling upgrade from Solr 8.11 to 9.0. For the 9.0 nodes you need different configuration, maybe even radically different XML. Today most values in solr.xml are sourced from Java System Properties, which is also a strong hint that this is per-node.
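
As an illustration of the system-property sourcing: the stock solr.xml uses placeholders of the form ${name:default}, for example ${host:} and ${zkClientTimeout:30000}. A minimal Python sketch of that kind of substitution follows. It is not Solr's implementation; the function name resolve_placeholders is hypothetical, and a plain dict stands in for the JVM system properties.

```python
import re

# Matches ${name} and ${name:default}; the default may be empty.
_PLACEHOLDER = re.compile(r"\$\{([^:}]+)(?::([^}]*))?\}")


def resolve_placeholders(text, props):
    """Replace ${name:default} with props[name], else with the default.

    Hypothetical helper; raises KeyError when a property is missing
    and the placeholder carries no default.
    """
    def substitute(match):
        name, default = match.group(1), match.group(2)
        if name in props:
            return props[name]
        if default is not None:
            return default
        raise KeyError(name)

    return _PLACEHOLDER.sub(substitute, text)
```

Because each node is started with its own set of properties, the same file yields different effective values per node, which is the per-node hint mentioned above.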

Jan

> 20. sep. 2022 kl. 22:50 skrev Shawn Heisey <el...@elyograg.org.INVALID>:
> 
> On 9/20/22 14:18, Jan Høydahl wrote:
>> It has been possible to load solr.xml centrally from zookeeper for a long time.
>> However, I'm considering deprecating and removing this feature.
>> Please see https://issues.apache.org/jira/browse/SOLR-15959 for motivation.
>> 
>> My question to the users list is thus - are you loading solr.xml from zookeeper?
>> And if yes, why is that capability important for you - i.e. could you not configure it per node?
> 
> I have done very little with SolrCloud myself.  I converted my tiny little install to cloud with embedded zk, just the one server, one collection and one core.  I do not have solr.xml in ZK.  I did this so I have access to whatever functionality is cloud-only, should a need ever arise.  I fiddle with that install sometimes to try and answer support questions.  Rebuilding the index only takes about ten minutes, so if I screw something up I just restore the working config, delete the data directory, restart, and reindex.
> 
> I can see a lot of value in being able to fire up a Solr node with only /etc/default/solr.in.sh being provided.  The solr home can then be provided completely empty and the node will start, as long as solr.xml is in ZK.
> 
> One thing I think we should do is make it so that Solr starts with some defaults, if solr.xml is not found at all.  We can then bikeshed about what the defaults should be.
> 
> Making it possible for cloud mode to start without solr.xml might remove most people's need to have it live in ZK.  It would make things easier on docker users ... they would be able to attach a completely empty volume for the solr home and Solr would start. They might then go back and add a solr.xml to provide custom settings.
> 
> Thanks,
> Shawn
> 


Re: Loading solr.xml from zookeeper

Posted by Shawn Heisey <el...@elyograg.org.INVALID>.
On 9/20/22 14:18, Jan Høydahl wrote:
> It has been possible to load solr.xml centrally from zookeeper for a long time.
> However, I'm considering deprecating and removing this feature.
> Please see https://issues.apache.org/jira/browse/SOLR-15959 for motivation.
>
> My question to the users list is thus - are you loading solr.xml from zookeeper?
> And if yes, why is that capability important for you - i.e. could you not configure it per node?

I have done very little with SolrCloud myself.  I converted my tiny 
little install to cloud with embedded zk, just the one server, one 
collection and one core.  I do not have solr.xml in ZK.  I did this so I 
have access to whatever functionality is cloud-only, should a need ever 
arise.  I fiddle with that install sometimes to try and answer support 
questions.  Rebuilding the index only takes about ten minutes, so if I 
screw something up I just restore the working config, delete the data 
directory, restart, and reindex.

I can see a lot of value in being able to fire up a Solr node with only 
/etc/default/solr.in.sh being provided.  The solr home can then be 
provided completely empty and the node will start, as long as solr.xml 
is in ZK.

One thing I think we should do is make it so that Solr starts with some 
defaults, if solr.xml is not found at all.  We can then bikeshed about 
what the defaults should be.

Making it possible for cloud mode to start without solr.xml might remove 
most people's need to have it live in ZK.  It would make things easier 
on docker users ... they would be able to attach a completely empty 
volume for the solr home and Solr would start. They might then go back 
and add a solr.xml to provide custom settings.

Thanks,
Shawn