Posted to issues@lucene.apache.org by "Lu Xugang (Jira)" <ji...@apache.org> on 2021/05/14 10:08:00 UTC

[jira] [Commented] (LUCENE-9957) Use DirectMonotonicWriter to store sorted Values in NumericDocValues/SortedNumericDocValues

    [ https://issues.apache.org/jira/browse/LUCENE-9957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17344510#comment-17344510 ] 

Lu Xugang commented on LUCENE-9957:
-----------------------------------

I ran some simple tests: indexing 10 million documents into a single segment.

UniqueValues >= 256:
||Loop||Branch(Main)||Branch(PR)||Storage change||UniqueValues|| ||
|0|20014826B|14129376B|-29.405%|6321130| |
|1|20011970B|13768928B|-31.196%|6322006| |
|2|20014826B|14145670B|-29.324%|6321066| |
|3|20014826B|14031072B|-29.896%|6319892| |
|4|20014826B|14276230B|-28.671%|6321111| |
|5|20014826B|13998304B|-30.060%|6320938| |
|6|20014826B|13932768B|-30.387%|6320997| |
|7|20014826B|13801696B|-31.042%|6321756| |
|8|20014826B|13768928B|-31.206%|6322336| |
|9|20014826B|14260448B|-28.750%|6321014| |
 
UniqueValues < 256:
||Loop||Branch(Main)||Branch(PR)||Storage change||UniqueValues|| ||
|0|2500076B|66064B|-97.35%|2| |
|1|2500076B|66064B|-97.35%|2| |
|2|5000076B|82454B|-98.35%|4| |
|3|5000076B|82454B|-98.35%|4| |
|4|5000076B|115234B|-97.69%|8| |
|5|5000076B|115234B|-97.69%|8| |
|6|10000076B|180794B|-98.19%|16| |
|7|10000076B|180794B|-98.19%|16| |
|8|10000076B|311914B|-96.88%|32| |
|9|10000076B|311914B|-96.88%|32| |
|10|10000076B|574154B|-94.25%|64| |
|11|10000076B|574154B|-94.25%|64| |
|12|10000076B|1098634B|-89.01%|128| |
|13|10000076B|1098634B|-89.01%|128| |
|14|10000076B|1303509B|-86.96%|255| |
|15|10000076B|1303509B|-86.96%|255|
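
For reference, the percentages in the tables are consistent with the relative size change of the PR branch against main, i.e. (PR - Main) / Main. A minimal Python sketch under that assumption (the function name is illustrative and not part of the patch):

```python
def storage_change_percent(main_bytes: int, pr_bytes: int) -> float:
    """Relative size change of the PR branch versus main, in percent.

    A negative result means the PR branch produced a smaller file.
    """
    return (pr_bytes - main_bytes) / main_bytes * 100.0


# Loop 0 of the "UniqueValues >= 256" table.
change = storage_change_percent(20014826, 14129376)
print(f"{change:.3f}%")
```

Some of the reported figures look truncated rather than rounded to the shown precision, so the last digit may differ by one from what this sketch prints.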

> Use DirectMonotonicWriter to store sorted Values in NumericDocValues/SortedNumericDocValues
> -------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-9957
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9957
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/codecs
>    Affects Versions: 8.8.2
>            Reporter: Lu Xugang
>            Priority: Major
>
> When all values are sorted, using DirectMonotonicWriter to store them can achieve relatively impressive compression.
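
To illustrate why sorted values compress well, here is a minimal sketch of the general idea behind monotonic encoding: approximate a run of non-decreasing values by a straight line and store only each value's small deviation from that line. This is a simplified, single-block illustration in Python, not Lucene's DirectMonotonicWriter code; the function names and the layout of the returned tuple are assumptions made for the example, and the real implementation differs in detail (for example, it operates on blocks).

```python
def encode_monotonic(values):
    """Encode non-decreasing ints as a line plus small non-negative residuals.

    Returns (first, span, base, residuals, bits_per_value). Hypothetical
    layout for illustration only.
    """
    n = len(values)
    if n == 0:
        raise ValueError("need at least one value")
    first = values[0]
    span = values[-1] - first
    steps = max(n - 1, 1)
    # Deviation of each value from the straight line first -> last.
    deviations = [v - (first + span * i // steps) for i, v in enumerate(values)]
    # Shift so every stored residual is non-negative.
    base = min(deviations)
    residuals = [d - base for d in deviations]
    bits_per_value = max(residuals).bit_length()
    return first, span, base, residuals, bits_per_value


def decode_monotonic(first, span, base, residuals):
    """Rebuild the original values from the encoded form."""
    steps = max(len(residuals) - 1, 1)
    return [first + span * i // steps + base + r for i, r in enumerate(residuals)]


values = [1000, 1003, 1007, 1010, 1012, 1019]
first, span, base, residuals, bits = encode_monotonic(values)
assert decode_monotonic(first, span, base, residuals) == values
print("bits per residual:", bits)
print("bits per raw value:", max(values).bit_length())
```

The closer the sorted values are to evenly spaced, the smaller the residuals and the fewer bits each one needs, which is in line with the size reductions reported in the comment above.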


