You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Adrien Grand (JIRA)" <ji...@apache.org> on 2018/05/28 08:15:00 UTC

[jira] [Resolved] (LUCENE-4892) Create a compressed LiveDocsFormat

     [ https://issues.apache.org/jira/browse/LUCENE-4892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrien Grand resolved LUCENE-4892.
----------------------------------
    Resolution: Won't Fix

I was hoping that the smaller memory footprint might make it comparable to FixedBitSet in speed in spite of the higher overhead, but apparently it is not enough:

{noformat}
                    TaskQPS baseline      StdDev   QPS patch      StdDev                Pct diff
       HighTermMonthSort     1221.82      (8.7%)      738.86      (3.4%)  -39.5% ( -47% -  -30%)
                  IntNRQ      126.95      (4.5%)       78.65      (1.4%)  -38.0% ( -42% -  -33%)
                 Prefix3      464.10      (2.5%)      331.94      (1.4%)  -28.5% ( -31% -  -25%)
                HighTerm      572.63      (1.4%)      416.42      (0.7%)  -27.3% ( -28% -  -25%)
   HighTermDayOfYearSort      324.51      (1.9%)      241.01      (1.4%)  -25.7% ( -28% -  -22%)
               OrHighLow      511.62      (1.0%)      387.64      (0.5%)  -24.2% ( -25% -  -22%)
                 MedTerm     1609.19      (1.8%)     1225.70      (0.9%)  -23.8% ( -26% -  -21%)
              OrHighHigh      144.60      (5.0%)      118.70      (2.9%)  -17.9% ( -24% -  -10%)
               OrHighMed      421.88      (4.0%)      349.80      (2.7%)  -17.1% ( -22% -  -10%)
                 LowTerm     3987.51      (6.4%)     3503.75      (3.5%)  -12.1% ( -20% -   -2%)
                Wildcard      241.67      (2.9%)      216.89      (1.7%)  -10.3% ( -14% -   -5%)
              HighPhrase      178.98      (3.9%)      165.85      (3.8%)   -7.3% ( -14% -    0%)
            HighSpanNear      144.91      (2.7%)      134.36      (2.6%)   -7.3% ( -12% -   -2%)
             AndHighHigh      214.80      (1.0%)      199.49      (1.2%)   -7.1% (  -9% -   -4%)
        HighSloppyPhrase      249.43      (3.5%)      232.35      (3.4%)   -6.8% ( -13% -    0%)
                  Fuzzy1      159.45      (2.0%)      149.14      (2.6%)   -6.5% ( -10% -   -1%)
               MedPhrase      175.70      (2.6%)      165.17      (2.5%)   -6.0% ( -10% -    0%)
             MedSpanNear      645.69      (3.0%)      609.97      (2.7%)   -5.5% ( -10% -    0%)
         MedSloppyPhrase      144.37      (2.5%)      136.76      (2.6%)   -5.3% ( -10% -    0%)
               LowPhrase      285.30      (2.5%)      270.93      (2.8%)   -5.0% ( -10% -    0%)
         LowSloppyPhrase      398.34      (1.9%)      383.09      (2.0%)   -3.8% (  -7% -    0%)
             LowSpanNear      517.04      (1.4%)      506.96      (1.4%)   -1.9% (  -4% -    0%)
              AndHighMed     1090.21      (4.2%)     1077.55      (4.1%)   -1.2% (  -9% -    7%)
                 Respell      171.53      (2.8%)      171.52      (2.1%)   -0.0% (  -4% -    5%)
              AndHighLow     1346.57      (2.4%)     1367.21      (1.9%)    1.5% (  -2% -    5%)
                  Fuzzy2       65.94      (8.6%)       67.84     (11.5%)    2.9% ( -15% -   25%)
{noformat}

> Create a compressed LiveDocsFormat
> ----------------------------------
>
>                 Key: LUCENE-4892
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4892
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Adrien Grand
>            Priority: Trivial
>              Labels: gsoc2014
>         Attachments: LUCENE-4892.patch
>
>
> There are lots of use-cases where the number of deleted documents is low. This makes live docs very dense and I think it would be interesting to study the impact of some bitmap compression techniques on performance (intuitively I imagine it will be slower, but since it would make data smaller maybe CPU caches could help us so I'd be curious to see how it would behave). This format would make a good addition to our CheapBastardCodec.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org