Posted to dev@lucene.apache.org by Tomás Fernández Löbbe <to...@gmail.com> on 2015/05/09 06:05:31 UTC

Configsets and Config APIs in Solr

I think the concept of ConfigSets has become a bit confusing with the
Config APIs (I'm thinking in SolrCloud mode here). Solr requires that a
configset is pushed to ZooKeeper before creating a collection that uses it.
It supports multiple collections using the same configset, which I think is
great. You could also have a couple of configsets that no collection is
currently using (who knows, maybe one that was recently deprecated, or that
will be used soon, etc). This gives me the idea that configsets are a
separate entity from the collection, not just a collection's configuration.

Config APIs allow you to operate on a collection to add handlers, change
settings, etc. The problem is that you are not really applying the changes
to the collection but to the complete configset. All collections using it
will get the changes, and all of them will be reloaded after a change.
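
A toy sketch of the behaviour described above, assuming two collections that share one configset. This is an in-memory model, not Solr code; every name in it (configsets, collections, set_property, the property key) is hypothetical.

```python
# Toy in-memory model; the dicts stand in for configsets stored in ZooKeeper.
configsets = {"shared_conf": {"overlay": {}}}
collections = {"jan": "shared_conf", "feb": "shared_conf"}  # collection -> configset


def set_property(collection, key, value):
    """Mimic a collection-level config call that edits the shared overlay."""
    configset_name = collections[collection]
    configsets[configset_name]["overlay"][key] = value
    # Every collection pointing at the same configset sees the change
    # and would be reloaded.
    return sorted(c for c, cs in collections.items() if cs == configset_name)


reloaded = set_property("jan", "updateHandler.autoCommit.maxTime", 15000)
print(reloaded)  # ['feb', 'jan']
```

The call names only "jan", yet "feb" is affected too, which is the surprise the paragraph above points out.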

Shouldn't those APIs be at a different level/outside the collection? Maybe
a configset API? Or, maybe the configs (for example, the
configoverlay.json) should only apply to the collection where the API call
was made and not to other collections using the configset?
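
The second option could look roughly like this: the shared configset stays untouched and each collection keeps its own overlay. Again a sketch under assumptions, with hypothetical names and keys, not a description of how Solr stores configoverlay.json.

```python
# Hypothetical model: one shared base configset, one overlay per collection.
base_configsets = {"shared_conf": {"requestHandler./select.rows": 10}}
collection_configset = {"jan": "shared_conf", "feb": "shared_conf"}
collection_overlays = {"jan": {}, "feb": {}}


def set_property(collection, key, value):
    """Record the change only for the collection the call was made on."""
    collection_overlays[collection][key] = value


def effective_config(collection):
    """Base configset values, overridden by that collection's own overlay."""
    merged = dict(base_configsets[collection_configset[collection]])
    merged.update(collection_overlays[collection])
    return merged


set_property("jan", "requestHandler./select.rows", 50)
print(effective_config("jan"))  # jan sees its own override
print(effective_config("feb"))  # feb still sees the base value
```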

Tomás

Re: Configsets and Config APIs in Solr

Posted by "david.w.smiley@gmail.com" <da...@gmail.com>.
+1 Tomas.

On Fri, May 15, 2015 at 12:40 PM Tomás Fernández Löbbe <
tomasflobbe@gmail.com> wrote:

> I agree about differentiating the mutable part (configoverlay, generated
> schema, etc) and the immutable (the configset), but I think it would be
> better if the mutable part is placed under /collections/x/..., otherwise
> "/configs" would have a mix of ConfigSets and collection-specific
> configuration.
>
> On Fri, May 15, 2015 at 6:38 AM, Noble Paul <no...@gmail.com> wrote:
>
>> I think this needs more discussion
>>
>> When a collection is created we should have two things
>>
>> an immutable part and a mutable part
>>
>> for instance my collection name is "x" and it uses schemaless example conf
>>
>> I must now have two conf dirs
>>
>> configs/schemaless and
>> configs/x
>>
>> all the mutable stuff goes to configs/x
>>
>> and configs/schemaless remains immutable
>>
>>
>>
>> On Tue, May 12, 2015 at 2:23 AM, Tomás Fernández Löbbe <
>> tomasflobbe@gmail.com> wrote:
>>
>>> I think this is fine. I don't think we need a new concept of "config
>>> templates", we just need to make it clear that the configset used to create
>>> the collection is not modified by Solr, and that any change done via API
>>> only affects the single collection where the config command is issued.
>>>
>>> I guess the schema API should start using something like configoverlay,
>>> or maybe persist the updated schema to this new path?
>>>
>>> Tomás
>>>
>>> On Fri, May 8, 2015 at 10:28 PM, Noble Paul <no...@gmail.com>
>>> wrote:
>>>
>>>> I agree with you on the point that it causes confusion.
>>>>
>>>> My suggestion would be to have something called "config templates", and
>>>> they are immutable. So we don't need a configset API;
>>>> each collection has its own conf folder.
>>>>
>>>> So, when a collection is created we should go ahead and create a
>>>> corresponding conf dir.
>>>>
>>>> Ideally, it should not copy over all configs from its template. It
>>>> should just store the configoverlay.json, params.json in the collection's
>>>> conf directory and inherit the rest from the template.

Re: Configsets and Config APIs in Solr

Posted by Tomás Fernández Löbbe <to...@gmail.com>.
I think there are still open questions based on people's comments in this
email thread and in SOLR-5955. It seems the concept of a mutable/immutable
ConfigSet should be supported but not forced. Then, what is a mutable
ConfigSet? One that can be edited via API calls? Then, should those Config
APIs be at the collection level or a separate API? Do we also want to
support collection-specific configuration changes?

I think accepting that a shared ConfigSet can be edited by
collection-specific operations is a bad idea.

Tomás

On Mon, May 25, 2015 at 6:58 AM, Noble Paul <no...@gmail.com> wrote:

> >> but I think it would be better if the mutable part is placed under
> >> /collections/x/..., otherwise "/configs"
>
> it makes sense.
>
> The problem we have is managedschema currently writes to the same
>
> If we could change the managedschema behavior somehow it would have been
> better
>
> On Fri, May 22, 2015 at 10:21 PM, Tomás Fernández Löbbe
> <to...@gmail.com> wrote:
> >> TLDR: we should think about this as configset base vs per-collection diff,
> >> not as immutable base vs per-collection mutable.
> >
> > Makes sense. I was mostly thinking of it being immutable from the current
> > Config APIs. Editing a configset for multiple collections is a valid and
> > useful feature; the problem is doing that from inside one collection's API
> > call.
> >
> >> So then the question becomes, do we want an API that can *also* make
> >> collection-specific changes to a shared config?
> >
> > If we feel there is no need for collection-specific config changes, I'm OK,
> > but again, the API should be outside of the collection, like a Configset
> > API. The "generate configset based on X" should also be a command of this
> > API. In addition, this could allow users to edit a configset that's not
> > currently being used by any collection.
> >
> > Tomás
> >
> >
> > On Fri, May 22, 2015 at 7:10 AM, Yonik Seeley <ys...@gmail.com> wrote:
> >>
> >> Makes sense Greg.
> >>
> >> Just looking at it from the ZK perspective (APIs aside), the original
> >> idea behind referencing a config set by name was so that you could
> >> change it in one place and everyone relying on it would get the
> >> changes.
> >>
> >> If one wants collections to have separate independent config sets they
> >> can already do that.
> >>
> >> So then the question becomes, do we want an API that can *also* make
> >> collection-specific changes to a shared config?
> >>
> >> An alternative would be a command to make a copy of a config set, and
> >> a command to switch a specific collection to use that new config set.
> >> Then any further changes would be collection specific.  That's sort of
> >> like SOLR-5955 - config templates - but you can "template" off of any
> >> other config set, at any point in time.  Actually, that type of
> >> functionality seems generally useful regardless.
> >>
> >> -Yonik
> >>
> >>
> >> On Thu, May 21, 2015 at 8:07 PM, Gregory Chanan <gc...@cloudera.com>
> >> wrote:
> >> > I'm +1 on the general idea, but I'm not convinced about the
> >> > mutable/immutable separation.
> >> >
> >> > Do we not think it is valid to modify a single config(set) that affects
> >> > multiple collections?  I can imagine a case where my data with the same
> >> > config(set) is partitioned into many different collections, whether by
> >> > date, sorted order, etc. that all use the same underlying config(set).
> >> > Let's say I have collections partitioned by month and I decide I want to
> >> > add another field; I don't want to have to modify
> >> > jan/schema
> >> > feb/schema
> >> > mar/schema
> >> > etc.
> >> >
> >> > I just want to modify the single underlying config(set).  You can imagine
> >> > having a configset API that lets me do that, so if I wanted to modify a
> >> > single collection's config I would call:
> >> > jan/schema
> >> > but if I wanted to modify the underlying config(set) I would call:
> >> > configset/month_partitioned_config
> >> >
> >> > My point is this: if the problem is that it is confusing to have
> >> > configsets modified when you make collection-level calls, then we should
> >> > fix that (I'm 100% in agreement with that, btw).  You can fix that by
> >> > having a configset and a per-collection diff; defining the configset as
> >> > immutable doesn't solve the problem, only locks us into an implementation
> >> > that doesn't support the use case above.  I'm not even saying we should
> >> > implement a configset API, only that defining this as an immutable vs
> >> > mutable implementation blocks us from doing that.
> >> >
> >> > TLDR: we should think about this as configset base vs per-collection
> >> > diff, not as immutable base vs per-collection mutable.
> >> >
> >> > Thoughts?
> >> > Greg
> >> >
> >> >
> >> > On Tue, May 19, 2015 at 10:52 AM, Tomás Fernández Löbbe
> >> > <to...@gmail.com> wrote:
> >> >>
> >> >> I created https://issues.apache.org/jira/browse/SOLR-7570
> >> >>
> >> >> On Fri, May 15, 2015 at 10:31 AM, Alan Woodward <al...@flax.co.uk>
> >> >> wrote:
> >> >>>
> >> >>> +1
> >> >>>
> >> >>> A nice way of doing it would be to make it part of the
> >> >>> SolrResourceLoader interface.  The ZK resource loader could check in the
> >> >>> collection-specific zknode first, and then under configs/, and we could
> >> >>> add a writeResource() method that writes to the collection-specific node
> >> >>> as well.  Then all config I/O goes via the resource loader, and we have a
> >> >>> way of keeping certain parts immutable.

Re: Configsets and Config APIs in Solr

Posted by Noble Paul <no...@gmail.com>.
>> but I think it would be better if the mutable part is placed under /collections/x/..., otherwise "/configs"

it makes sense.

The problem we have is managedschema currently writes to the same

If we could change the managedschema behavior somehow, it would be better
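
One way to picture that change, combined with the lookup order suggested elsewhere in this thread (collection-specific node first, then the shared configset), is the sketch below. It is only an illustration: `zk` is a plain dict standing in for ZooKeeper, and the "conf" sub-path under /collections/x, the file names, and both function names are assumptions rather than existing Solr behavior.

```python
# Hypothetical sketch; `zk` is a plain dict standing in for ZooKeeper nodes.
zk = {
    "/configs/schemaless/solrconfig.xml": "<config/>",
    "/configs/schemaless/managed-schema": "<schema name='base'/>",
}


def read_resource(collection, configset, name):
    """Prefer the collection-specific node, then fall back to the shared configset."""
    candidates = (
        f"/collections/{collection}/conf/{name}",  # assumed location of the mutable part
        f"/configs/{configset}/{name}",
    )
    for path in candidates:
        if path in zk:
            return zk[path]
    raise KeyError(name)


def write_resource(collection, name, data):
    """Writes always go to the collection-specific node; /configs is never modified."""
    zk[f"/collections/{collection}/conf/{name}"] = data


# A managed-schema update for collection "x" no longer touches the shared copy.
write_resource("x", "managed-schema", "<schema name='x'/>")
```

With this layout a second collection "y" created from the same configset would still read the unmodified schema under /configs/schemaless.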

-- 
-----------------------------------------------------
Noble Paul

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: Configsets and Config APIs in Solr

Posted by Tomás Fernández Löbbe <to...@gmail.com>.
> TLDR: we should think about this as configset base vs per-collection diff,
> not as immutable base vs per-collection mutable.

Makes sense. I was mostly thinking of it being immutable from the current
Config APIs. Editing a configset for multiple collections is a valid and
useful feature; the problem is doing that from inside one collection's API
call.

> So then the question becomes, do we want an API that can *also* make
> collection-specific changes to a shared config?

If we feel there is no need for collection-specific config changes, I'm OK,
but again, the API should be outside of the collection, like a Configset
API. The "generate configset based on X" should also be a command of this
API. In addition, this could allow users to edit a configset that's not
currently being used by any collection.
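
As a rough illustration of such an API living outside the collection, the sketch below models three commands: copy a configset ("generate configset based on X"), edit a configset by name, and point a collection at a different configset. It is an in-memory toy, not a proposal for actual endpoints; the function names, "jan_config", and the "region" field are made up for the example.

```python
import copy

# Hypothetical model: configsets are first-class, collections only reference them.
configsets = {"month_partitioned_config": {"fields": ["id", "timestamp"]}}
collection_configset = {
    "jan": "month_partitioned_config",
    "feb": "month_partitioned_config",
}


def copy_configset(base_name, new_name):
    """'Generate configset based on X': an independent copy of an existing configset."""
    if new_name in configsets:
        raise ValueError(f"configset already exists: {new_name}")
    configsets[new_name] = copy.deepcopy(configsets[base_name])


def add_field(configset_name, field):
    """Edit a configset by name, whether or not any collection uses it."""
    configsets[configset_name]["fields"].append(field)


def use_configset(collection, configset_name):
    """Switch one collection to another configset."""
    if configset_name not in configsets:
        raise KeyError(configset_name)
    collection_configset[collection] = configset_name


copy_configset("month_partitioned_config", "jan_config")
use_configset("jan", "jan_config")
add_field("jan_config", "region")  # only "jan" is affected from here on
```

Edits addressed to "month_partitioned_config" would still reach every collection that references it, so both the shared and the collection-specific use cases stay explicit in the call.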

Tomás


> >>>>>> On Sat, May 9, 2015 at 9:35 AM, Tomás Fernández Löbbe
> >>>>>> <to...@gmail.com> wrote:
> >>>>>>>
> >>>>>>> I think the concept of ConfigSets has become a bit confusing with
> the
> >>>>>>> Config APIs (I'm thinking in SolrCloud mode here). Solr requires
> that a
> >>>>>>> configset is pushed to ZooKeeper before creating a collection that
> uses it.
> >>>>>>> It supports multiple collections using the same configset, which I
> think is
> >>>>>>> great. You could also have a couple of configsets that no
> collection is
> >>>>>>> currently using (who knows, maybe one that was recently
> deprecated, or that
> >>>>>>> will be used soon, etc). This gives me the idea that configsets
> are a
> >>>>>>> separate entity than the collection, not just a collection's
> configuration.
> >>>>>>>
> >>>>>>> Config APIs allow you to operate on a collection to add handlers,
> >>>>>>> change settings, etc. The problem is that you are not really
> applying the
> >>>>>>> changes to the collection but to the complete configset. All
> collections
> >>>>>>> using it will get the changes, and all of them will be reloaded
> after a
> >>>>>>> change.
> >>>>>>>
> >>>>>>> Shouldn't those APIs be at a different level/outside the
> collection?
> >>>>>>> Maybe a configset API? Or, maybe the configs (for example, the
> >>>>>>> configoverlay.json) should only apply to the collection where the
> API call
> >>>>>>> was made and not to other collections using the configset?
> >>>>>>>
> >>>>>>> Tomás
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> -----------------------------------------------------
> >>>>>> Noble Paul
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> -----------------------------------------------------
> >>>> Noble Paul
> >>>
> >>>
> >>>
> >>
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>

Re: Configsets and Config APIs in Solr

Posted by Yonik Seeley <ys...@gmail.com>.
Makes sense Greg.

Just looking at it from the ZK perspective (APIs aside), the original
idea behind referencing a config set by name was so that you could
change it in one place and everyone relying on it would get the
changes.

If one wants collections to have separate independent config sets they
can already do that.

So then the question becomes, do we want an API that can *also* make
collection-specific changes to a shared config?

An alternative would be a command to make a copy of a config set, and
a command to switch a specific collection to use that new config set.
Then any further changes would be collection specific.  That's sort of
like SOLR-5955 - config templates - but you can "template" off of any
other config set, at any point in time.  Actually, that type of
functionality seems generally useful regardless.
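
A minimal sketch of that copy-then-switch idea, using plain Python dicts as a
stand-in for the ZooKeeper state. The function names (copy_configset,
switch_configset) and the data layout are hypothetical, chosen only for
illustration; they are not Solr commands.

```python
import copy

# Hypothetical in-memory stand-in for what would live in ZooKeeper.
configsets = {"shared": {"solrconfig.xml": "<config/>", "schema.xml": "<schema/>"}}
collections = {"jan": "shared", "feb": "shared"}  # collection -> config set name


def copy_configset(source, target):
    """Create an independent copy of an existing config set."""
    if target in configsets:
        raise ValueError(f"config set {target!r} already exists")
    configsets[target] = copy.deepcopy(configsets[source])


def switch_configset(collection, target):
    """Point a single collection at a different config set."""
    if target not in configsets:
        raise KeyError(target)
    collections[collection] = target


# "Template" off the shared config set at this point in time, then diverge.
copy_configset("shared", "jan_conf")
switch_configset("jan", "jan_conf")
configsets["jan_conf"]["schema.xml"] = "<schema><field name='extra'/></schema>"
```

After the switch, edits to "jan_conf" touch only "jan", while "feb" keeps
following "shared".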

-Yonik


On Thu, May 21, 2015 at 8:07 PM, Gregory Chanan <gc...@cloudera.com> wrote:
> I'm +1 on the general idea, but I'm not convinced about the
> mutable/immutable separation.
>
> Do we not think it is valid to modify a single config(set) that affects
> multiple collections?  I can imagine a case where my data with the same
> config(set) is partitioned into many different collections, whether by date,
> sorted order, etc. that all use the same underlying config(set).  Let's say
> I have collections partitioned by month and I decide I want to add another
> field; I don't want to have to modify
> jan/schema
> feb/schema
> mar/schema
> etc.
>
> I just want to modify the single underlying config(set).  You can imagine
> having a configset API that lets me do that, so if I wanted to modify a
> single collection's config I would call:
> jan/schema
> but if I wanted to modify the underlying config(set) I would call:
> configset/month_partitioned_config
>
> My point is this: if the problem is that it is confusing to have configsets
> modified when you make collection-level calls, then we should fix that (I'm
> 100% in agreement with that, btw).  You can fix that by having a configset
> and a per-collection diff; defining the configset as immutable doesn't solve
> the problem, only locks us into an implementation that doesn't support the
> use case above.  I'm not even saying we should implement a configset API,
> only that defining this as an immutable vs mutable implementation blocks us
> from doing that.
>
> TLDR: we should think about this as configset base vs per-collection diff,
> not as immutable base vs per-collection mutable.
>
> Thoughts?
> Greg

Re: Configsets and Config APIs in Solr

Posted by Gregory Chanan <gc...@cloudera.com>.
I'm +1 on the general idea, but I'm not convinced about the
mutable/immutable separation.

Do we not think it is valid to modify a single config(set) that affects
multiple collections?  I can imagine a case where my data with the same
config(set) is partitioned into many different collections, whether by
date, sorted order, etc. that all use the same underlying config(set).
Let's say I have collections partitioned by month and I decide I want to
add another field; I don't want to have to modify
jan/schema
feb/schema
mar/schema
etc.

I just want to modify the single underlying config(set).  You can imagine
having a configset API that lets me do that, so if I wanted to modify a
single collection's config I would call:
jan/schema
but if I wanted to modify the underlying config(set) I would call:
configset/month_partitioned_config

My point is this: if the problem is that it is confusing to have configsets
modified when you make collection-level calls, then we should fix that (I'm
100% in agreement with that, btw).  You can fix that by having a configset
and a per-collection diff; defining the configset as immutable doesn't
solve the problem, only locks us into an implementation that doesn't support
the use case above.  I'm not even saying we should implement a configset
API, only that defining this as an immutable vs mutable implementation
blocks us from doing that.

TLDR: we should think about this as configset base vs per-collection diff,
not as immutable base vs per-collection mutable.
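
To make the "base vs per-collection diff" framing concrete, here is a rough
sketch. The keys ("fields", "autoCommitMs") and the helper effective_config
are made up for the example and do not correspond to real Solr settings or
APIs.

```python
def effective_config(base, diff):
    """Overlay a per-collection diff on the shared base; diff values win."""
    merged = dict(base)
    merged.update(diff)
    return merged


# Shared configset (the base) and one diff per month-partitioned collection.
base = {"fields": ("id", "timestamp"), "autoCommitMs": 15000}
diffs = {"jan": {"autoCommitMs": 5000}, "feb": {}}

# The base stays editable: one change here reaches every collection that
# has not overridden the same key in its diff.
base["fields"] = ("id", "timestamp", "region")

jan = effective_config(base, diffs["jan"])
feb = effective_config(base, diffs["feb"])
```

Here the new field shows up in both jan and feb, while jan keeps its own
autoCommitMs override.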

Thoughts?
Greg


On Tue, May 19, 2015 at 10:52 AM, Tomás Fernández Löbbe <
tomasflobbe@gmail.com> wrote:

> I created https://issues.apache.org/jira/browse/SOLR-7570

Re: Configsets and Config APIs in Solr

Posted by Tomás Fernández Löbbe <to...@gmail.com>.
I created https://issues.apache.org/jira/browse/SOLR-7570

On Fri, May 15, 2015 at 10:31 AM, Alan Woodward <al...@flax.co.uk> wrote:

> +1
>
> A nice way of doing it would be to make it part of the SolrResourceLoader
> interface.  The ZK resource loader could check in the collection-specific
> zknode first, and then under configs/, and we could add a writeResource()
> method that writes to the collection-specific node as well.  Then all
> config I/O goes via the resource loader, and we have a way of keeping
> certain parts immutable.

Re: Configsets and Config APIs in Solr

Posted by Alan Woodward <al...@flax.co.uk>.
+1

A nice way of doing it would be to make it part of the SolrResourceLoader interface.  The ZK resource loader could check in the collection-specific zknode first, and then under configs/, and we could add a writeResource() method that writes to the collection-specific node as well.  Then all config I/O goes via the resource loader, and we have a way of keeping certain parts immutable.
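
A rough sketch of that lookup order, with a dict standing in for ZooKeeper.
The class name LayeredResourceLoader, the method names, and the
/collections/<name>/conf path are assumptions for illustration only; this is
not the actual SolrResourceLoader API.

```python
class LayeredResourceLoader:
    """Reads collection-specific nodes first, then the shared configset."""

    def __init__(self, store, collection, configset):
        self.store = store  # dict of path -> content, standing in for ZK
        self.collection_root = f"/collections/{collection}/conf"  # assumed path
        self.configset_root = f"/configs/{configset}"

    def open_resource(self, name):
        # The collection-specific node shadows the shared configset node.
        for root in (self.collection_root, self.configset_root):
            path = f"{root}/{name}"
            if path in self.store:
                return self.store[path]
        raise FileNotFoundError(name)

    def write_resource(self, name, data):
        # Writes only ever go to the collection-specific node, so the
        # shared configset is never modified through this loader.
        self.store[f"{self.collection_root}/{name}"] = data


store = {"/configs/schemaless/schema.xml": "<schema/>"}
loader = LayeredResourceLoader(store, "x", "schemaless")
before = loader.open_resource("schema.xml")
loader.write_resource("schema.xml", "<schema><field name='f'/></schema>")
after = loader.open_resource("schema.xml")
```

Because every write lands under the collection's own node, other collections
reading /configs/schemaless see no change.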

On 15 May 2015, at 17:39, Tomás Fernández Löbbe <to...@gmail.com> wrote:

> I agree about differentiating the mutable part (configoverlay, generated schema, etc.) and the immutable part (the configset), but I think it would be better if the mutable part is placed under /collections/x/..., otherwise "/configs" would have a mix of ConfigSets and collection-specific configuration.


Re: Configsets and Config APIs in Solr

Posted by Tomás Fernández Löbbe <to...@gmail.com>.
I agree about differentiating the mutable part (configoverlay, generated
schema, etc.) and the immutable part (the configset), but I think it would be
better if the mutable part is placed under /collections/x/..., otherwise
"/configs" would have a mix of ConfigSets and collection-specific
configuration.

On Fri, May 15, 2015 at 6:38 AM, Noble Paul <no...@gmail.com> wrote:

> I think this needs more discussion
>
> When a collection is created we should have two things
>
> an immutable part and a mutable part
>
> for instance my collection name is "x" and it uses schemaless example conf
>
> I must now have two conf dirs
>
> configs/schemaless and
> configs/x
>
> all the mutable stuff goes to configs/x
>
> and configs/schemaless remains immutable

Re: Configsets and Config APIs in Solr

Posted by Noble Paul <no...@gmail.com>.
I think this needs more discussion

When a collection is created we should have two things

an immutable part and a mutable part

for instance my collection name is "x" and it uses schemaless example conf

I must now have two conf dirs

configs/schemaless and
configs/x

all the mutable stuff goes to configs/x

and configs/schemaless remains immutable



On Tue, May 12, 2015 at 2:23 AM, Tomás Fernández Löbbe <
tomasflobbe@gmail.com> wrote:

> I think this is fine. I don't think we need a new concept of "config
> templates", we just need to make it clear that the configset used to create
> the collection is not modified by Solr, and that any change done via API
> only affects the single collection where the config command is issued.
>
> I guess the schema API should start using something like configoverlay, or
> maybe persist the updated schema to this new path?
>
> Tomás


-- 
-----------------------------------------------------
Noble Paul

Re: Configsets and Config APIs in Solr

Posted by Tomás Fernández Löbbe <to...@gmail.com>.
I think this is fine. I don't think we need a new concept of "config
templates", we just need to make it clear that the configset used to create
the collection is not modified by Solr, and that any change done via API
only affects the single collection where the config command is issued.

I guess the schema API should start using something like configoverlay, or
maybe persist the updated schema to this new path?

Tomás

On Fri, May 8, 2015 at 10:28 PM, Noble Paul <no...@gmail.com> wrote:

> I agree with you on the point that it causes confusion.
>
> My suggestion would be to have something called "config templates", and
> they are immutable. So we don't need a configset API;
> each collection has its own conf folder.
>
> So, when a collection is created we should go ahead and create a
> corresponding conf dir.
>
> Ideally, it should not copy over all configs from its template. It should
> just store the configoverlay.json, params.json in the collection's conf
> directory and inherit the rest from the template.
>
>
>
>
>
> On Sat, May 9, 2015 at 9:35 AM, Tomás Fernández Löbbe <
> tomasflobbe@gmail.com> wrote:
>
>> I think the concept of ConfigSets has become a bit confusing with the
>> Config APIs (I'm thinking in SolrCloud mode here). Solr requires that a
>> configset is pushed to ZooKeeper before creating a collection that uses it.
>> It supports multiple collections using the same configset, which I think is
>> great. You could also have a couple of configsets that no collection is
>> currently using (who knows, maybe one that was recently deprecated, or that
>> will be used soon, etc). This gives me the idea that configsets are a
>> separate entity than the collection, not just a collection's configuration.
>>
>> Config APIs allow you to operate on a collection to add handlers, change
>> settings, etc. The problem is that you are not really applying the changes
>> to the collection but to the complete configset. All collections using it
>> will get the changes, and all of them will be reloaded after a change.
>>
>> Shouldn't those APIs be at a different level/outside the collection?
>> Maybe a configset API? Or, maybe the configs (for example, the
>> configoverlay.json) should only apply to the collection where the API call
>> was made and not to other collections using the configset?
>>
>> Tomás
>>
>
>
>
> --
> -----------------------------------------------------
> Noble Paul
>

Re: Configsets and Config APIs in Solr

Posted by Noble Paul <no...@gmail.com>.
I agree with you on the point that it causes confusion.

My suggestion would be to have something called "config templates", and they
are immutable. So, we don't need a configset API.
Each collection has its own conf folder.

So, when a collection is created we should go ahead and create a
corresponding conf dir.

Ideally, it should not copy over all configs from its template. It should
just store the configoverlay.json and params.json in the collection's conf
directory and inherit the rest from the template.
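
To make the lookup concrete, here is a minimal sketch in Python. It assumes a
plain dict standing in for ZooKeeper, and the function names and the
"/collections/<name>/template" pointer are hypothetical, not anything Solr
exposes today. A file is read from the collection's conf directory if it
exists there, and from the template otherwise.

```python
# Files that live in the collection's own conf dir under this proposal.
MUTABLE_FILES = ("configoverlay.json", "params.json")


def create_collection(store, collection, template):
    """Create the conf dir with only the mutable files (hypothetical layout)."""
    for name in MUTABLE_FILES:
        store[f"/collections/{collection}/conf/{name}"] = "{}"
    # Remember which immutable template the collection inherits from.
    store[f"/collections/{collection}/template"] = template


def read_config(store, collection, filename):
    """Prefer the collection's own copy, else fall back to the template."""
    own_path = f"/collections/{collection}/conf/{filename}"
    if own_path in store:
        return store[own_path]
    template = store[f"/collections/{collection}/template"]
    return store[f"/configs/{template}/{filename}"]


# The dict plays the role of ZooKeeper; "basic" is a made-up template name.
store = {"/configs/basic/solrconfig.xml": "<config/>"}
create_collection(store, "x", "basic")
print(read_config(store, "x", "solrconfig.xml"))  # served from the template
print(read_config(store, "x", "params.json"))     # served from the collection
```

Nothing from the template is copied, so many collections can share one
template while each keeps its own overlay and params.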





On Sat, May 9, 2015 at 9:35 AM, Tomás Fernández Löbbe <tomasflobbe@gmail.com> wrote:

> I think the concept of ConfigSets has become a bit confusing with the
> Config APIs (I'm thinking in SolrCloud mode here). Solr requires that a
> configset is pushed to ZooKeeper before creating a collection that uses it.
> It supports multiple collections using the same configset, which I think is
> great. You could also have a couple of configsets that no collection is
> currently using (who knows, maybe one that was recently deprecated, or that
> will be used soon, etc). This gives me the idea that configsets are a
> separate entity than the collection, not just a collection's configuration.
>
> Config APIs allow you to operate on a collection to add handlers, change
> settings, etc. The problem is that you are not really applying the changes
> to the collection but to the complete configset. All collections using it
> will get the changes, and all of them will be reloaded after a change.
>
> Shouldn't those APIs be at a different level/outside the collection? Maybe
> a configset API? Or, maybe the configs (for example, the
> configoverlay.json) should only apply to the collection where the API call
> was made and not to other collections using the configset?
>
> Tomás
>



-- 
-----------------------------------------------------
Noble Paul