Posted to issues@lucene.apache.org by "Robert Muir (Jira)" <ji...@apache.org> on 2021/08/24 15:38:00 UTC

[jira] [Commented] (LUCENE-10062) Explore using SORTED_NUMERIC doc values to encode taxonomy ordinals for faceting

    [ https://issues.apache.org/jira/browse/LUCENE-10062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17403885#comment-17403885 ] 

Robert Muir commented on LUCENE-10062:
--------------------------------------

+1 for this experiment. I think you are correct: some comparisons and benchmarks were done early on, when numeric doc values were still very green. It would be interesting to see whether the situation has improved.

> Explore using SORTED_NUMERIC doc values to encode taxonomy ordinals for faceting
> --------------------------------------------------------------------------------
>
>                 Key: LUCENE-10062
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10062
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Greg Miller
>            Assignee: Greg Miller
>            Priority: Minor
>
> We currently encode taxonomy ordinals using varint-style packing in a binary doc values field. I suspect there have been a number of improvements to SortedNumericDocValues since taxonomy faceting was first introduced, and I plan to explore replacing the custom binary format we have today with a SORTED_NUMERIC doc values field instead.
> I'll report benchmark results and index size impact here.
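
For readers unfamiliar with the format under discussion, here is a rough sketch of what "varint-style packing" of ordinals can look like. It is not the facet module's actual code: the class name VarintOrdinals and both methods are hypothetical, and the sketch assumes non-negative ordinals that are sorted, de-duplicated, and delta-coded before being written as variable-length ints.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Illustrative only: a hypothetical helper, not Lucene's facet implementation.
// Assumes all ordinals are non-negative.
final class VarintOrdinals {

  // Sort, de-duplicate, delta-code, then write each delta as a variable-length int.
  static byte[] encode(int[] ordinals) {
    int[] sorted = Arrays.stream(ordinals).sorted().distinct().toArray();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    int previous = 0;
    for (int ord : sorted) {
      int delta = ord - previous;
      // 7 payload bits per byte, low bits first; the high bit means "more bytes follow".
      while ((delta & ~0x7F) != 0) {
        out.write((delta & 0x7F) | 0x80);
        delta >>>= 7;
      }
      out.write(delta);
      previous = ord;
    }
    return out.toByteArray();
  }

  // Reverse of encode: read each variable-length delta and accumulate it.
  static int[] decode(byte[] bytes) {
    int[] result = new int[bytes.length]; // upper bound: at most one ordinal per byte
    int count = 0;
    int previous = 0;
    int pos = 0;
    while (pos < bytes.length) {
      int delta = 0;
      int shift = 0;
      byte b;
      do {
        b = bytes[pos++];
        delta |= (b & 0x7F) << shift;
        shift += 7;
      } while ((b & 0x80) != 0);
      previous += delta;
      result[count++] = previous;
    }
    return Arrays.copyOf(result, count);
  }

  public static void main(String[] args) {
    byte[] packed = encode(new int[] {7, 3, 300, 3});
    System.out.println(packed.length + " bytes: " + Arrays.toString(decode(packed)));
  }
}
```

With a SORTED_NUMERIC field, the per-document ordinals would instead be handed to the doc values codec as plain numeric values, so packing of this kind would be left to the codec rather than to custom code in the facet module. Whether that is faster or smaller is exactly what the proposed benchmarks are meant to show.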



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
