Posted to issues@lucene.apache.org by "Ankur (Jira)" <ji...@apache.org> on 2020/08/05 01:39:00 UTC
[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17171215#comment-17171215 ]
Ankur edited comment on LUCENE-9444 at 8/5/20, 1:38 AM:
--------------------------------------------------------
Thanks for your response [~mikemccand]
Yes, having _dim_ as an additional parameter makes sense.
Regarding the _*BinaryDocValues*_ iterator, I was initially thinking of providing a concrete implementation, *_TaxonomyFacetsLabels_*, of the abstract class *_TaxonomyFacets_*, with a constructor that accepts a *_LeafReaderContext_*. The context would be used to instantiate the BinaryDocValues iterator, which would then be reused across multiple calls to *getLabels(docId, dim)*. That way a caller does not need to know whether a _*BinaryDocValues*_ field exists at all. The downside is that the caller needs to create a new instance of *_TaxonomyFacetsLabels_* for each different *_LeafReaderContext_*.
But thinking about it more, I feel it's simpler to pass the BinaryDocValues iterator as a third argument to *getLabels()*. To handle hierarchical fields, I think it makes sense to return FacetLabel[] instead of String[].
The proposed API signature would look like this:
{{public static FacetLabel[] getLabels(int docId, String dim, BinaryDocValues)}}
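To make the proposal concrete, here is a minimal, self-contained sketch of the lookup that getLabels(docId, dim, ...) would perform: read the document's ordinals, resolve each ordinal to its label path, and keep only the paths under the requested dimension. It deliberately does not use Lucene classes. _OrdinalsSource_ and _Taxonomy_ are hypothetical stand-ins for the per-segment BinaryDocValues decoding and for TaxonomyReader.getPath(ord), and a String[] path stands in for FacetLabel. Matching on the first path component is an assumption about how _dim_ would be applied.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types are the real Lucene classes.
final class FacetLabelsSketch {

  /** Stand-in for decoding one document's ordinals from its per-segment doc values. */
  interface OrdinalsSource {
    int[] ordinalsFor(int docId);
  }

  /** Stand-in for the taxonomy lookup that maps an ordinal to its label path. */
  interface Taxonomy {
    String[] pathFor(int ordinal);
  }

  /**
   * Returns the full label paths of docId that belong to dimension dim.
   * A path such as {"Publish Date", "2010", "10"} keeps its hierarchy,
   * which a flat String per label could not express.
   */
  static List<String[]> getLabels(int docId, String dim,
                                  OrdinalsSource ordinals, Taxonomy taxonomy) {
    List<String[]> labels = new ArrayList<>();
    for (int ord : ordinals.ordinalsFor(docId)) {
      String[] path = taxonomy.pathFor(ord);
      // Assumption: the dimension is the first component of the path.
      if (path.length > 0 && dim.equals(path[0])) {
        labels.add(path);
      }
    }
    return labels;
  }
}
```

With a taxonomy containing Author/Lisa and Publish Date/2010/10, asking for the dimension "Publish Date" on a document that carries both ordinals would return only the second path, with all three components intact.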
was (Author: goankur):
Thanks for your response [~mikemccand]
Yes, having _dim_ as an additional parameter makes sense.
Regarding the _*BinaryDocValues*_ iterator, I was initially thinking of providing a concrete implementation, *_TaxonomyFacetsLabels_*, of the abstract class *_TaxonomyFacets_*, with a constructor that accepts a *_LeafReaderContext_*. The context would be used to instantiate the BinaryDocValues iterator, which would then be reused across multiple calls to *getLabels(docId, dim)*. That way a caller does not need to know whether a _*BinaryDocValues*_ field exists at all. The downside is that the caller needs to create a new instance of *_TaxonomyFacetsLabels_* for each different *_LeafReaderContext_*.
But thinking about it more, I feel it's simpler to pass the BinaryDocValues iterator as a third argument to *getLabels()*.
To handle hierarchical fields, I think it makes sense to return FacetLabel[] instead of String[].
One last thing: should we make the API _*static*_?
The proposed API signature would look like this:
{{public static FacetLabel[] getLabels(int docId, String dim, BinaryDocValues)}}
> Need an API to easily fetch facet labels for a field in a document
> ------------------------------------------------------------------
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/facet
> Affects Versions: 8.6
> Reporter: Ankur
> Priority: Major
>
> A facet field may be included in the list of fields whose values are to be returned for each hit.
> In order to get the facet labels for each hit we need to:
> # Create an instance of _DocValuesOrdinalsReader_ and invoke its _getReader(LeafReaderContext context)_ method to obtain an instance of _OrdinalsSegmentReader_.
> # Call _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ to fetch and decode the binary payload in the document's BinaryDocValues field. This provides the ordinals that refer to facet labels in the taxonomy.
> # Lastly, call _TaxonomyReader.getPath(ord)_ for each ordinal to fetch the labels to be returned.
>
> Ideally there should be a simple API, *String[] getLabels(docId)*, that hides all of the above details and returns the string labels. This could be part of *TaxonomyFacets*, but that's just one idea.
> I am opening this issue to get community feedback and suggestions.
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)