Posted to issues@lucene.apache.org by "Robert Muir (Jira)" <ji...@apache.org> on 2021/04/14 21:26:00 UTC

[jira] [Commented] (LUCENE-9843) Remove compression option on doc values

    [ https://issues.apache.org/jira/browse/LUCENE-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17321327#comment-17321327 ] 

Robert Muir commented on LUCENE-9843:
-------------------------------------

I moved this issue to a blocker for 9.0 because I've already seen multiple instances where these compression settings are set inappropriately, and from a back-compat perspective we need to stop the bleeding before we have to support all these variants for a long time.

I'll summarize my proposal above again:
* Remove the option for SORTED term dictionaries and just always compress. This does not impact the speed of per-doc ordinals.
* Remove the option for BINARY and don't compress. It is a catch-all and we don't know the use case. A different codec can be supplied if someone wants block compression over binary, which avoids the back-compat hassle.

Seems the issue could be easily split into two tasks.
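
For readers who want to see the BINARY trade-off concretely, here is a minimal, self-contained sketch of block compression over binary values. It is not Lucene code and not what any Lucene codec does: it swaps in DEFLATE from java.util.zip as a stand-in compressor, and the class name, method names, block size of 32 values, and sample data are all hypothetical. It only illustrates why redundant binary values shrink well, and why reading a single value costs more once the whole block has to be decompressed first.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Illustration only: hypothetical names, DEFLATE used as a stand-in compressor.
public class BinaryBlockCompressionSketch {

    // Compress one block of concatenated binary values.
    static byte[] compress(byte[] block) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        deflater.setInput(block);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Decompress a whole block; a reader must do this even to fetch one value.
    static byte[] decompress(byte[] compressed, int originalLength) throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        byte[] out = new byte[originalLength];
        int off = 0;
        while (off < originalLength && !inflater.finished()) {
            int n = inflater.inflate(out, off, originalLength - off);
            if (n == 0 && inflater.needsInput()) {
                break; // input exhausted before the block was complete
            }
            off += n;
        }
        inflater.end();
        if (off != originalLength) {
            throw new DataFormatException("truncated block");
        }
        return out;
    }

    // Hypothetical block: 32 values that share a long common prefix.
    static byte[] redundantBlock() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 32; i++) {
            sb.append("https://example.org/some/long/shared/prefix/item-").append(i).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws DataFormatException {
        byte[] raw = redundantBlock();
        byte[] packed = compress(raw);
        byte[] restored = decompress(packed, raw.length);
        System.out.println("raw=" + raw.length
            + " compressed=" + packed.length
            + " roundTripOk=" + Arrays.equals(raw, restored));
    }
}
```

Whether the space saving is worth the extra decompression per lookup depends on the workload, which is the reason the proposal above leaves BINARY uncompressed by default and pushes block compression into a separate codec.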

> Remove compression option on doc values
> ---------------------------------------
>
>                 Key: LUCENE-9843
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9843
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Blocker
>
> Options on file formats add complexity and put a big tax on backward-compatibility testing. I'm the one who introduced it in LUCENE-9378, but I would now like to think about what we can do to remove this option.
> For the record, compression was initially introduced because some binary fields have so much redundancy that it's wasteful not to compress them at all. But unfortunately, this slowed down some search workloads and we decided to introduce this option as a way to let users choose the trade-off they want.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
