Posted to solr-user@lucene.apache.org by Timothy Potter <th...@gmail.com> on 2013/09/20 17:56:07 UTC

Need help understanding the use cases behind core auto-discovery

I'm trying to add some information about core.properties and auto-discovery
to Solr in Action, and I'm at a loss for what to tell the reader the purpose
of this feature is.

Can anyone point me to any background information about core
auto-discovery? I'm not interested in the technical implementation details.
Mainly I'm trying to understand the motivation behind this feature, as it
seems unnecessary given the Core Admin API. As best I can tell, it removes
the manual step of firing off a call to the Core Admin API or loading a core
from the Admin UI. If that's it and I'm overthinking it, then cool, but I was
expecting more of an "ah-ha" moment with this feature ;-)

Any insights you can share are appreciated.

Thanks.
Tim

Re: Need help understanding the use cases behind core auto-discovery

Posted by Trey Grainger <so...@gmail.com>.

While on this topic...

Is it still true in Solr 4.5 (RC) that it is not possible to have a shared
config directory? In general, I like the new core.properties mechanism
better, as it removes the unnecessary centralized configuration of cores in
solr.xml. But I have an infrastructure with thousands of Solr cores sharing
the same configs on a single server, and as far as I could tell with
Solr 4.4, the only way to support this with core.properties was to copy and
paste, or create symbolic links to, the whole conf/ folder for every core
(i.e. thousands of identical copies of all config files in my case).

In the old solr.xml format, we could set the instanceDir to have all cores
reference the same folder, but in core.properties there doesn't seem to be
anything like this.  I tried just referencing solrconfig.xml in another
directory, but because everything is now relative to the conf/ directory
under the folder containing core.properties, none of the referenced files
were in the right place.

Is there any better guidance on migrating to core auto-discovery with the
need for a shared config directory (non-SolrCloud mode)? This looked
promising, but it sounds dead from Erick's JIRA comment:
https://issues.apache.org/jira/browse/SOLR-4478

Thanks,

-Trey
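
For readers following along, the symbolic-link workaround described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name, the directory names, and the one-line core.properties content are hypothetical, and whether a given Solr release handles a linked conf/ directory cleanly should be verified rather than taken from this sketch.

```python
import os
import tempfile


def create_core_with_shared_conf(solr_home, core_name, shared_conf):
    """Create <solr_home>/<core_name> with a core.properties and a linked conf/.

    Hypothetical helper: the layout is an assumption for illustration only.
    """
    core_dir = os.path.join(solr_home, core_name)
    os.makedirs(core_dir)
    # Each core still carries its own (tiny) core.properties file.
    with open(os.path.join(core_dir, "core.properties"), "w") as fh:
        fh.write("name=%s\n" % core_name)
    # conf/ is a symbolic link to the shared directory, not a copy of it.
    os.symlink(shared_conf, os.path.join(core_dir, "conf"),
               target_is_directory=True)
    return core_dir


def demo():
    """Build a throwaway Solr home and report which cores see the shared file."""
    with tempfile.TemporaryDirectory() as tmp:
        shared_conf = os.path.join(tmp, "shared_conf")
        os.makedirs(shared_conf)
        with open(os.path.join(shared_conf, "solrconfig.xml"), "w") as fh:
            fh.write("<config/>\n")
        solr_home = os.path.join(tmp, "solr_home")
        os.makedirs(solr_home)
        visible = {}
        for name in ("core_a", "core_b", "core_c"):
            core_dir = create_core_with_shared_conf(solr_home, name, shared_conf)
            visible[name] = os.path.isfile(
                os.path.join(core_dir, "conf", "solrconfig.xml"))
        return visible
```

Calling demo() creates three core directories that all resolve conf/solrconfig.xml to the same physical file, which avoids the thousands of identical copies but still leaves one link per core to manage.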


On Sat, Sep 21, 2013 at 2:25 PM, Erick Erickson <er...@gmail.com> wrote:

> Also consider where SolrCloud is going. Trying to correctly maintain
> all the solr.xml files yourself on all the nodes would have
> been..."interesting". On all the machines in your 200 node cluster.
> With 17 different collections. With nodes coming and going. With
> splitting shards. With.....
>
> Collections are almost guaranteed to be distributed unevenly (e.g. a
> big collection might have 20 shards and a small collection 3 in the
> same cluster). So each node used to require its own unique solr.xml, at
> least for everything inside the <cores> tag. But everything _not_ in the
> <cores> tag is common. Say you wanted to change the
> shardHandlerFactory (or any other setting we put in solr.xml that
> wouldn't have gone into the old <cores> tag). In the old-style way of
> doing things, since the solr.xml file on each node potentially has a
> different set of cores, you'd have to edit each and every one of them.
>
> The older way of doing this is fine as long as each solr.xml on each
> machine is self-consistent. So auto-discovery essentially automates
> that self-consistency.
>
> It also makes it possible to have ZooKeeper manage your solr.xml and
> auto-distribute it to new nodes (or update existing ones), which would have
> taken a lot of effort to get right without auto-discovery. So changing
> the <shardHandlerFactory> consists of changing the solr.xml file and
> pushing it to ZooKeeper (I don't quite remember the right JIRA, but you
> can do this now).
>
> I suppose it's like all other refactorings. Solr.xml had its origin
> in the single-core days; then, when multi-cores came into being, it was
> expanded to include that information, but it eventually became, as Yonik
> says, unnecessary central configuration that started becoming a
> limitation.
>
> FWIW,
> Erick

Re: Need help understanding the use cases behind core auto-discovery

Posted by Erick Erickson <er...@gmail.com>.

Also consider where SolrCloud is going. Trying to correctly maintain
all the solr.xml files yourself on all the nodes would have
been..."interesting". On all the machines in your 200-node cluster.
With 17 different collections. With nodes coming and going. With
splitting shards. With.....

Collections are almost guaranteed to be distributed unevenly (e.g. a
big collection might have 20 shards and a small collection 3 in the
same cluster). So each node used to require its own unique solr.xml, at
least for everything inside the <cores> tag. But everything _not_ in the
<cores> tag is common. Say you wanted to change the
shardHandlerFactory (or any other setting we put in solr.xml that
wouldn't have gone into the old <cores> tag). In the old-style way of
doing things, since the solr.xml file on each node potentially has a
different set of cores, you'd have to edit each and every one of them.

The older way of doing this is fine as long as each solr.xml on each
machine is self-consistent. So auto-discovery essentially automates
that self-consistency.

It also makes it possible to have ZooKeeper manage your solr.xml and
auto-distribute it to new nodes (or update existing ones), which would have
taken a lot of effort to get right without auto-discovery. So changing
the <shardHandlerFactory> consists of changing the solr.xml file and
pushing it to ZooKeeper (I don't quite remember the right JIRA, but you
can do this now).

I suppose it's like all other refactorings. Solr.xml had its origin
in the single-core days; then, when multi-cores came into being, it was
expanded to include that information, but it eventually became, as Yonik
says, unnecessary central configuration that started becoming a
limitation.

FWIW,
Erick

On Fri, Sep 20, 2013 at 9:45 AM, Timothy Potter <th...@gmail.com> wrote:
> Exactly the insight I was looking for! Thanks Yonik ;-)

Re: Need help understanding the use cases behind core auto-discovery

Posted by Timothy Potter <th...@gmail.com>.

Exactly the insight I was looking for! Thanks Yonik ;-)


On Fri, Sep 20, 2013 at 10:37 AM, Yonik Seeley <yo...@lucidworks.com> wrote:

> On Fri, Sep 20, 2013 at 11:56 AM, Timothy Potter <th...@gmail.com> wrote:
> > Trying to add some information about core.properties and auto-discovery in
> > Solr in Action and am at a loss for what to tell the reader is the purpose
> > of this feature.
>
> IMO, it was more a removal of unnecessary central configuration.
> You previously had to list the core in solr.xml, and now you don't.
> Cores should be fully self-describing so that it should be easy to
> move them in the future just by moving the core directory (although
> that may not yet work...)
>
> -Yonik
> http://lucidworks.com

Re: Need help understanding the use cases behind core auto-discovery

Posted by Yonik Seeley <yo...@lucidworks.com>.

On Fri, Sep 20, 2013 at 11:56 AM, Timothy Potter <th...@gmail.com> wrote:
> Trying to add some information about core.properties and auto-discovery in
> Solr in Action and am at a loss for what to tell the reader is the purpose
> of this feature.

IMO, it was more a removal of unnecessary central configuration.
You previously had to list the core in solr.xml, and now you don't.
Cores should be fully self-describing so that it should be easy to
move them in the future just by moving the core directory (although
that may not yet work...)

-Yonik
http://lucidworks.com
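
To make the "self-describing cores" idea above concrete, here is a minimal sketch of what discovery amounts to conceptually. It is not Solr's actual implementation. It assumes that discovery walks the Solr home looking for core.properties files, stops descending once it finds one, and falls back to the directory name when the file gives no name; treat all three as assumptions to check against the Solr version in use. The function and directory names are hypothetical.

```python
import os
import tempfile


def discover_cores(solr_home):
    """Return {core_name: core_dir} for each directory holding core.properties."""
    cores = {}
    for dirpath, dirnames, filenames in os.walk(solr_home):
        if "core.properties" not in filenames:
            continue
        props = {}
        with open(os.path.join(dirpath, "core.properties")) as fh:
            for line in fh:
                line = line.strip()
                if line and not line.startswith("#") and "=" in line:
                    key, value = line.split("=", 1)
                    props[key.strip()] = value.strip()
        # Assumption: an unnamed core takes the name of its directory.
        name = props.get("name", os.path.basename(dirpath))
        cores[name] = dirpath
        # Assumption: nothing below a core directory is searched further.
        dirnames[:] = []
    return cores


def demo():
    """Create a throwaway Solr home and return the discovered core names."""
    with tempfile.TemporaryDirectory() as home:
        # A core with an explicit name.
        os.makedirs(os.path.join(home, "products", "nested"))
        with open(os.path.join(home, "products", "core.properties"), "w") as fh:
            fh.write("name=products\n")
        # A core.properties below an existing core should be ignored.
        with open(os.path.join(home, "products", "nested",
                               "core.properties"), "w") as fh:
            fh.write("name=ignored\n")
        # A core with an empty core.properties, named after its directory.
        os.makedirs(os.path.join(home, "logs"))
        open(os.path.join(home, "logs", "core.properties"), "w").close()
        return sorted(discover_cores(home))
```

Nothing in this sketch consults a central list: the set of cores is whatever the directory tree contains at scan time, which is why moving a core directory would, in principle, be enough to move the core.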
