Posted to dev@lucene.apache.org by Ishan Chattopadhyaya <ic...@gmail.com> on 2021/10/27 15:07:38 UTC

Re: Is it Time to Deprecate the Legacy Facets API

Should we deprecate classic faceting in 9.x now?

> It's worth investigating deprecating the stats component also. I believe
JSON facets covers that functionality as well. It will be painful for users
though to switch over unfortunately.

+1, let's deprecate the stats component too.
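
For anyone weighing the switch, here is a minimal sketch of how a stats component request might map onto JSON Facet aggregation functions. The field name `price` and the helper names are hypothetical, and it covers only a subset of the statistics the stats component returns, so treat it as an illustration rather than a parity claim.

```python
import json
from urllib.parse import urlencode


def stats_component_params(field):
    """Classic stats component request parameters for a single field."""
    return {"q": "*:*", "rows": 0, "stats": "true", "stats.field": field}


def json_facet_stats_params(field):
    """Roughly comparable request using JSON Facet aggregation functions.

    Only min/max/sum/avg are shown; this is a subset, not full parity.
    """
    facet = {
        "min": f"min({field})",
        "max": f"max({field})",
        "sum": f"sum({field})",
        "avg": f"avg({field})",
    }
    return {"q": "*:*", "rows": 0, "json.facet": json.dumps(facet)}


if __name__ == "__main__":
    # "price" is a hypothetical numeric field used for illustration only.
    print(urlencode(stats_component_params("price")))
    print(urlencode(json_facet_stats_params("price")))
```

The JSON form is more verbose, but each statistic is named explicitly, which is part of why switching over would take some effort for existing users.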


On Thu, Jan 28, 2021 at 5:22 AM Joel Bernstein <jo...@gmail.com> wrote:

> It's worth investigating deprecating the stats component also. I believe
> JSON facets covers that functionality as well. It will be painful for users
> though to switch over unfortunately.
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
> On Fri, Jan 22, 2021 at 1:14 PM Jason Gerlowski <ge...@gmail.com>
> wrote:
>
>> Personally I'd love to see us stop maintaining the duplicated code of
>> the underlying implementations.  I wouldn't mind losing the legacy
>> syntax as well - I'll take a clear, verbose API over a less-clear,
>> concise one any day.  But I'm probably a minority there.
>>
>> Either way I agree with Michael when he said above that the first step
>> would have to be a parity investigation for features and performance.
>>
>> Best,
>>
>> Jason
>>
>> On Fri, Jan 22, 2021 at 10:05 AM Michael Gibney
>> <mi...@michaelgibney.net> wrote:
>> >
>> > I agree it would make long-term sense to consolidate the backend
>> implementation. I think leaving the "classic" user-facing facet API (with
>> JSON Facet module as a backend) would be a good idea. Either way, I think a
>> first step would be checking for parity between existing backend
>> implementations -- possibly in terms of features [1], but certainly in
>> terms of performance for common use cases [2].
>> >
>> > I think removal of the "classic" user-facing API would cause a lot of
>> consternation in the user community. I can even see a
>> non-backward-compatibility argument for preserving the "classic"
>> user-facing API: it's simpler for simple use cases. _If_ the ultimate goal
>> is removal of the "classic" user-facing API (not presuming that it is),
>> that approach could be facilitated in the short term by enticing users
>> towards "JSON Facet" API ... basically with a "feature freeze" on the
>> legacy implementation. No new features [3], no new optimizations [4] for
>> "classic"; concentrate such efforts on JSON Facet. This seems to already be
>> the de facto case, but it could be a more intentional decision -- e.g. in
>> [3] it's straightforward to extend the proposed "facet cache" to the
>> "classic" impl ... but I could see an argument for intentionally not doing
>> so.
>> >
>> > Robert, I think your concerns about UninvertedField could be addressed
>> by the `uninvertible="false"` property (currently defaults to "true" for
>> backward compatibility iiuc; but could default to "false", or at least
>> provide the ability to set the default for all fields to "false" at node
>> level solr.xml? -- I know I've wished for the latter!). Also fwiw I'm not
>> aware of any JSON Facet processors that work with string values in RAM ...
>> I do think all JSON Facet processors use OrdinalMap now, where relevant.
>> >
>> > [1] https://issues.apache.org/jira/browse/SOLR-14921
>> > [2] https://issues.apache.org/jira/browse/SOLR-14764
>> > [3] https://issues.apache.org/jira/browse/SOLR-13807
>> > [4] https://issues.apache.org/jira/browse/SOLR-10732
>> >
>> > On Fri, Jan 22, 2021 at 12:46 AM Robert Muir <rc...@gmail.com> wrote:
>> >>
>> >> Do these two options conflate concerns of input format vs. actual
>> >> algorithm? That was always my disappointment.
>> >>
>> >> I feel like the java apis are off here at the lower level, and it
>> >> hurts the user.
>> >> I'm not talking about the input format from the user; I mean the
>> >> execution of the faceting query.
>> >>
>> >> IMO: building top-level caches (e.g. uninvertedfield) or
>> >> on-the-fly-caches (e.g. fieldcache) is totally trappy already.
>> >> But with the uninvertedfield of json facets it does its own thing,
>> >> even if you went thru the trouble to enable docvalues at index time:
>> >> that's sad.
>> >>
>> >> the code by default should not give the user jvm
>> >> heap/garbage-collector hell. If you want to do that to yourself, for a
>> >> totally static index, IMO that should be opt-in.
>> >>
>> >> But for the record, it is no longer just two shitty choices like
>> >> "top-level vs per-segment". There are different field types, e.g.
>> >> numeric types where the per-segment approach works efficiently.
>> >> Then you have the strings, but there is a newish middle ground for
>> >> Strings: OrdinalMap (lucene Multi* interfaces do it) which builds
>> >> top-level integer structures to speed up string-faceting, but doesn't
>> >> need *string values* in ram.
>> >> It is just integers and mostly compresses as deltas. Adrien compresses
>> >> the shit out of it.
>> >>
>> >> So I'd hate for the user to lose the option here of using docvalues to
>> >> keep faceting out of heap memory, which should not be hassling them
>> >> already in 2021.
>> >> Maybe better to refactor the code such that all these concerns aren't
>> >> unexpectedly tied together.
>> >>
>> >> On Thu, Jan 21, 2021 at 10:08 PM David Smiley <ds...@apache.org>
>> wrote:
>> >> >
>> >> > There's a JIRA issue about this from 5 years ago:
>> https://issues.apache.org/jira/browse/SOLR-7296
>> >> > I don't recall seeing any resistance to the idea of having the JSON
>> Faceting module act as a back-end to the front-end (API surface) of Solr's
>> common/classic/original/whatever faceting API.  I don't think that simple
>> API should go away; its strength is simple/common cases that are
>> comparatively verbose in the JSON one.
>> >> >
>> >> > ~ David Smiley
>> >> > Apache Lucene/Solr Search Developer
>> >> > http://www.linkedin.com/in/davidwsmiley
>> >> >
>> >> >
>> >> > On Thu, Jan 21, 2021 at 9:57 PM Marcus Eagan <ma...@gmail.com>
>> wrote:
>> >> >>
>> >> >> Hi all,
>> >> >>
>> >> >> Sorry to spam the list. I am querying the list in such quick
>> succession because of a realization I came to while on Twitter. Is it time
>> to deprecate the Legacy Facet API?
>> >> >>
>> >> >> I understood in the past that they behaved slightly differently.
>> Now, I'm wondering if it makes sense to keep the legacy facets package as
>> it adds a burden of maintenance to the project. If some activists really
>> want it, I will abandon the effort. If the interest is very light, I
>> suppose they can package it up in a plugin. In fact, I would help if they
>> run into trouble and I am able to help.
>> >> >>
>> >> >> Anyway, let me know what you think. If it's a good idea, I will
>> head over to the chopping block.
>> >> >>
>> >> >> --
>> >> >> Marcus Eagan
>> >> >>
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> >> For additional commands, e-mail: dev-help@lucene.apache.org
>> >>

Re: Is it Time to Deprecate the Legacy Facets API

Posted by Ishan Chattopadhyaya <ic...@gmail.com>.
> Personally I'd love to see us stop maintaining the duplicated code of
> the underlying implementations.  I wouldn't mind losing the legacy
> syntax as well - I'll take a clear, verbose API over a less-clear,
> concise one any day.  But I'm probably a minority there.

+1, agree with Jason here, fully.
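
To make the "concise vs. verbose" trade-off concrete, here is a rough sketch of translating a small subset of classic facet parameters into a JSON Facet request. The function name and the field names `cat` and `manu` are hypothetical; this is not how Solr does it internally, just an illustration of the idea of a classic front-end feeding a JSON Facet back-end. Defaults (for example the default limit) are not identical between the two APIs, so the sketch only copies values that are given explicitly.

```python
import json


def classic_to_json_facet(params):
    """Translate a small subset of classic facet params into a json.facet dict.

    `params` maps parameter names to a string or a list of strings.
    Only facet.field, facet.limit and facet.mincount are handled.
    """
    fields = params.get("facet.field", [])
    if isinstance(fields, str):
        fields = [fields]

    facets = {}
    for field in fields:
        spec = {"type": "terms", "field": field}
        # Copy explicit values only; do not guess at differing defaults.
        if "facet.limit" in params:
            spec["limit"] = int(params["facet.limit"])
        if "facet.mincount" in params:
            spec["mincount"] = int(params["facet.mincount"])
        facets[field] = spec
    return facets


if __name__ == "__main__":
    # Hypothetical field names, for illustration only.
    classic = {"facet": "true", "facet.field": ["cat", "manu"], "facet.limit": "5"}
    print(json.dumps(classic_to_json_facet(classic), indent=2))
```

Three short classic parameters become a nested object per field, which is the verbosity David mentioned, but every option sits next to the facet it applies to.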
