Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2018/08/04 05:12:34 UTC
[GitHub] clintropolis opened a new pull request #6107: Order rows during incremental index persist when rollup is disabled.
URL: https://github.com/apache/incubator-druid/pull/6107
Resolves #6066 by adding a new method, `Iterable<IncrementalIndexRow> getPersistIterable()`, to the `FactsHolder` interface and using it when persisting incremental indexes. Also adds a benchmark generator schema with 4 low-cardinality dimensions to enable testing this scenario.
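As a rough illustration of the idea (not the code in this patch), the sketch below shows a facts holder that accepts rows in arrival order and hands back a sorted view for persisting. `SketchRow`, `SketchFactsHolder`, and `PlainSketchFactsHolder` are hypothetical, simplified stand-ins, and the ordering by timestamp and then by dictionary-encoded dimension values is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical, simplified stand-in for an incremental index row.
final class SketchRow {
    final long timestamp;
    final int[] dims; // dictionary-encoded dimension values (assumed layout)

    SketchRow(long timestamp, int... dims) {
        this.timestamp = timestamp;
        this.dims = dims;
    }
}

// Hypothetical reduced interface; only the method named in this PR is modeled.
interface SketchFactsHolder {
    void add(SketchRow row);

    Iterable<SketchRow> getPersistIterable();
}

// Holder for the no-rollup case: rows are kept in arrival order and are only
// sorted when a persist asks for them.
final class PlainSketchFactsHolder implements SketchFactsHolder {
    // Assumed ordering: timestamp first, then each dimension value in turn.
    static final Comparator<SketchRow> PERSIST_ORDER = (a, b) -> {
        int c = Long.compare(a.timestamp, b.timestamp);
        if (c != 0) {
            return c;
        }
        int n = Math.min(a.dims.length, b.dims.length);
        for (int i = 0; i < n; i++) {
            c = Integer.compare(a.dims[i], b.dims[i]);
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(a.dims.length, b.dims.length);
    };

    private final List<SketchRow> rows = new ArrayList<>();

    @Override
    public void add(SketchRow row) {
        rows.add(row);
    }

    @Override
    public Iterable<SketchRow> getPersistIterable() {
        // Sort a copy so that ingestion order is left untouched.
        List<SketchRow> sorted = new ArrayList<>(rows);
        sorted.sort(PERSIST_ORDER);
        return sorted;
    }
}

final class PersistOrderDemo {
    public static void main(String[] args) {
        SketchFactsHolder holder = new PlainSketchFactsHolder();
        holder.add(new SketchRow(2L, 1));
        holder.add(new SketchRow(1L, 5));
        holder.add(new SketchRow(1L, 3));
        for (SketchRow row : holder.getPersistIterable()) {
            System.out.println(row.timestamp + " " + row.dims[0]);
        }
    }
}
```

Sorting a copy at persist time keeps ingestion cheap and moves the cost to the persist path, which matches the trade-off reported in the benchmarks.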
Before this patch:
```
Benchmark                        (rollup)  (rollupOpportunity)  (rowsPerSegment)  (schema)  Mode  Cnt       Score        Error  Units
IndexPersistBenchmark.persistV9      true                 none             75000     rollo  avgt   25  429409.821 ± 17771.526  us/op
IndexPersistBenchmark.persistV9      true             moderate             75000     rollo  avgt   25   57578.929 ±  2650.596  us/op
IndexPersistBenchmark.persistV9      true                 high             75000     rollo  avgt   25   11023.976 ±   461.142  us/op
IndexPersistBenchmark.persistV9     false                 none             75000     rollo  avgt   25  414289.365 ± 16384.902  us/op
IndexPersistBenchmark.persistV9     false             moderate             75000     rollo  avgt   25  407060.720 ± 16965.695  us/op
IndexPersistBenchmark.persistV9     false                 high             75000     rollo  avgt   25  400008.825 ± 19613.728  us/op
size [2262258] bytes.
size [276631] bytes.
size [47597] bytes.
size [2280590] bytes.
size [2095354] bytes.
size [2094972] bytes.
```
After:
```
Benchmark                        (rollup)  (rollupOpportunity)  (rowsPerSegment)  (schema)  Mode  Cnt       Score        Error  Units
IndexPersistBenchmark.persistV9      true                 none             75000     rollo  avgt   25  436966.463 ± 45936.358  us/op
IndexPersistBenchmark.persistV9      true             moderate             75000     rollo  avgt   25   54724.237 ±  7500.566  us/op
IndexPersistBenchmark.persistV9      true                 high             75000     rollo  avgt   25   11010.033 ±   718.345  us/op
IndexPersistBenchmark.persistV9     false                 none             75000     rollo  avgt   25  464730.668 ± 30413.613  us/op
IndexPersistBenchmark.persistV9     false             moderate             75000     rollo  avgt   25  523597.179 ± 43443.648  us/op
IndexPersistBenchmark.persistV9     false                 high             75000     rollo  avgt   25  535282.839 ± 46529.297  us/op
size [2262258] bytes.
size [276631] bytes.
size [47597] bytes.
size [2269144] bytes.
size [1475402] bytes.
size [1357298] bytes.
```
The actual difference in segment size will vary considerably from this contrived scenario, but segments should generally be smaller, at the cost of slower index persist time.
Query performance should be unaffected. See #6066 for additional benchmarks and discussion.
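To put the trade-off in relative terms, the sketch below computes the percentage change for the two `rollup=false` cases with rollup opportunity, using the scores and sizes copied from the benchmark output above. It assumes the `size [...]` lines appear in the same order as the benchmark rows, which the output does not state explicitly; `PersistTradeoff` and `percentChange` are names made up for this example.

```java
// Relative change between the "before" and "after" benchmark figures.
final class PersistTradeoff {
    // Percentage change from before to after; negative means a reduction.
    static double percentChange(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        // rollup=false, rollupOpportunity=moderate
        // (size pairing with this row is an assumption)
        System.out.printf("moderate: size %+.1f%%, persist time %+.1f%%%n",
            percentChange(2095354, 1475402),
            percentChange(407060.720, 523597.179));

        // rollup=false, rollupOpportunity=high
        // (size pairing with this row is an assumption)
        System.out.printf("high:     size %+.1f%%, persist time %+.1f%%%n",
            percentChange(2094972, 1357298),
            percentChange(400008.825, 535282.839));
    }
}
```

Under that pairing assumption, segments in these two cases come out roughly 30 to 35 percent smaller, while mean persist time rises by roughly 29 to 34 percent. The error bars on the "after" run are wide, so the timing figures are indicative only.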