You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2018/12/20 15:08:39 UTC

[GitHub] richardstartin opened a new pull request #6764: Consider using RoaringBitmapWriter for bitmap construction

richardstartin opened a new pull request #6764: Consider using RoaringBitmapWriter for bitmap construction
URL: https://github.com/apache/incubator-druid/pull/6764
 
 
   I noticed you are investigating using RoaringBitmap 0.7.30. If you do so, it is worth considering a different mechanism to build your bitmaps which favours ordered insertions by buffering them into a container and appending to the bitmap as late as possible (just before the bitmap is queried, or when the next multiple of 2^16 is crossed). There is no need to manually run optimise bitmaps built this way, because they are run optimised at the container level whenever it is appended to the bitmap.
   
   I am not a druid user and am unlikely to become one soon, so this PR is intended as an FYI about the feature only. 
   
   I ran the `BitmapIterationBenchmark.constructAndIter` on JDK8 on Ubuntu 16.0.4 at `7a09cde4de1953eee75c5033e863cfde8f94d6c1` and got:
   
   ```
   Benchmark                                     (bitmapAlgo)  (n)  (prob)   (size)  Mode  Cnt          Score          Error  Units
   BitmapIterationBenchmark.constructAndIter           bitset  N/A     0.0  1000000  avgt    5         11.958 ±        0.411  ns/op
   BitmapIterationBenchmark.constructAndIter           bitset  N/A   0.001  1000000  avgt    5      55820.663 ±     4765.600  ns/op
   BitmapIterationBenchmark.constructAndIter           bitset  N/A     0.1  1000000  avgt    5     853821.933 ±    10916.693  ns/op
   BitmapIterationBenchmark.constructAndIter           bitset  N/A     0.5  1000000  avgt    5    3014089.931 ±    65283.409  ns/op
   BitmapIterationBenchmark.constructAndIter           bitset  N/A    0.99  1000000  avgt    5    5628379.542 ±   227488.488  ns/op
   BitmapIterationBenchmark.constructAndIter           bitset  N/A     1.0  1000000  avgt    5    5612304.605 ±    54199.692  ns/op
   BitmapIterationBenchmark.constructAndIter          concise  N/A     0.0  1000000  avgt    5          8.073 ±        0.178  ns/op
   BitmapIterationBenchmark.constructAndIter          concise  N/A   0.001  1000000  avgt    5      27473.710 ±      626.528  ns/op
   BitmapIterationBenchmark.constructAndIter          concise  N/A     0.1  1000000  avgt    5    3635751.625 ±    56888.246  ns/op
   BitmapIterationBenchmark.constructAndIter          concise  N/A     0.5  1000000  avgt    5    9798233.678 ±   237069.198  ns/op
   BitmapIterationBenchmark.constructAndIter          concise  N/A    0.99  1000000  avgt    5    9588921.943 ±   214705.602  ns/op
   BitmapIterationBenchmark.constructAndIter          concise  N/A     1.0  1000000  avgt    5    8077899.901 ±   118088.071  ns/op
   BitmapIterationBenchmark.constructAndIter          roaring  N/A     0.0  1000000  avgt    5        131.791 ±        2.237  ns/op
   BitmapIterationBenchmark.constructAndIter          roaring  N/A   0.001  1000000  avgt    5      46860.367 ±     1753.583  ns/op
   BitmapIterationBenchmark.constructAndIter          roaring  N/A     0.1  1000000  avgt    5    1709465.928 ±    38854.875  ns/op
   BitmapIterationBenchmark.constructAndIter          roaring  N/A     0.5  1000000  avgt    5    6898408.274 ±   210501.998  ns/op
   BitmapIterationBenchmark.constructAndIter          roaring  N/A    0.99  1000000  avgt    5   13340397.558 ±   283832.841  ns/op
   BitmapIterationBenchmark.constructAndIter          roaring  N/A     1.0  1000000  avgt    5   13415893.194 ±   170437.084  ns/op
   ```
   
   At `b313193c81ed868a9afe04c658306705f63daaef` I got:
   ```
   Benchmark                                  (bitmapAlgo)  (prob)   (size)  Mode  Cnt         Score         Error  Units
   BitmapIterationBenchmark.constructAndIter        bitset     0.0  1000000  avgt    5        12.665 ±       1.104  ns/op
   BitmapIterationBenchmark.constructAndIter        bitset   0.001  1000000  avgt    5     74471.073 ±   54445.042  ns/op
   BitmapIterationBenchmark.constructAndIter        bitset     0.1  1000000  avgt    5    887366.201 ±   35143.327  ns/op
   BitmapIterationBenchmark.constructAndIter        bitset     0.5  1000000  avgt    5   3166248.403 ±  495669.632  ns/op
   BitmapIterationBenchmark.constructAndIter        bitset    0.99  1000000  avgt    5   6324809.163 ± 1012027.080  ns/op
   BitmapIterationBenchmark.constructAndIter        bitset     1.0  1000000  avgt    5   5913067.177 ±  132629.211  ns/op
   BitmapIterationBenchmark.constructAndIter       concise     0.0  1000000  avgt    5         8.068 ±       0.115  ns/op
   BitmapIterationBenchmark.constructAndIter       concise   0.001  1000000  avgt    5     27547.146 ±     546.018  ns/op
   BitmapIterationBenchmark.constructAndIter       concise     0.1  1000000  avgt    5   3635772.683 ±   56079.798  ns/op
   BitmapIterationBenchmark.constructAndIter       concise     0.5  1000000  avgt    5  10400194.474 ±  147495.368  ns/op
   BitmapIterationBenchmark.constructAndIter       concise    0.99  1000000  avgt    5   9409295.891 ±  122748.484  ns/op
   BitmapIterationBenchmark.constructAndIter       concise     1.0  1000000  avgt    5   8641773.847 ±  193212.416  ns/op
   BitmapIterationBenchmark.constructAndIter       roaring     0.0  1000000  avgt    5        75.142 ±       1.408  ns/op
   BitmapIterationBenchmark.constructAndIter       roaring   0.001  1000000  avgt    5     13348.323 ±     205.987  ns/op
   BitmapIterationBenchmark.constructAndIter       roaring     0.1  1000000  avgt    5   1300962.745 ±   30107.110  ns/op
   BitmapIterationBenchmark.constructAndIter       roaring     0.5  1000000  avgt    5   5255974.593 ±  133759.112  ns/op
   BitmapIterationBenchmark.constructAndIter       roaring    0.99  1000000  avgt    5  11122438.742 ±  168793.197  ns/op
   BitmapIterationBenchmark.constructAndIter       roaring     1.0  1000000  avgt    5  11555370.664 ±  141606.255  ns/op
   ```
   These quick results are quite noisy so require more careful consideration on your part. I am also unaware of how likely out of order insertions are in a typical druid workload, where there would be no penalty for using this abstraction, but there will be no benefit.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org