You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Paulo Motta (JIRA)" <ji...@apache.org> on 2016/08/31 17:27:20 UTC

[jira] [Commented] (CASSANDRA-12580) Fix merkle tree size calculation

    [ https://issues.apache.org/jira/browse/CASSANDRA-12580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15452820#comment-15452820 ] 

Paulo Motta commented on CASSANDRA-12580:
-----------------------------------------

Attaching patch to fix the calculation formula to:

{{int depth = numPartitions > 0 ? (int) Math.min(Math.ceil(Math.log(numPartitions) / Math.log(2)), 20) : 0;}}

Besides fixing from {{ln}} to {{lg}}, this also changes the rounding formula from {{floor}} to  {{ceil}}, so we overestimate the depth rather than underestimate.

I added a new test on {{ValidationTest}} that runs a validation compaction with N=128 and N=1500 keys and expect the merkle tree depth to be {{ceil(lg(N))}}. I also modified the other tests on this class to use a {{ListenableFuture}} ({{CompletableFuture}} on 3.0+) instead of {{SimpleCondition}}, since the JUnit assertions are not enforced in other threads.


Patch and tests available below:
||2.1||2.2||3.0||trunk||
|[branch|https://github.com/apache/cassandra/compare/cassandra-2.1...pauloricardomg:2.1-CASSANDRA-12580]|[branch|https://github.com/apache/cassandra/compare/cassandra-2.2...pauloricardomg:2.2-CASSANDRA-12580]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:3.0-CASSANDRA-12580]|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-CASSANDRA-12580]|
|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.1-CASSANDRA-12580-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-CASSANDRA-12580-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-CASSANDRA-12580-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-CASSANDRA-12580-testall/lastCompletedBuild/testReport/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.1-CASSANDRA-12580-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-CASSANDRA-12580-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-CASSANDRA-12580-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-CASSANDRA-12580-dtest/lastCompletedBuild/testReport/]|

> Fix merkle tree size calculation
> --------------------------------
>
>                 Key: CASSANDRA-12580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12580
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Paulo Motta
>            Assignee: Paulo Motta
>
> On CASSANDRA-5263 it was introduced dynamic merkle tree sizing based on estimated number of partitions as {{estimatedDepth = lg(numPartitions)}}, but on [CompactionManager.doValidationCompaction|https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L1052] this is being calculated as:
> {{int depth = numPartitions > 0 ? (int) Math.min(Math.floor(Math.log(numPartitions)), 20) : 0;}}
> This is actually calculating {{ln(numPartitions)}} (base-e) instead of {{lg(numPartitions)}} (base-2), which causes merkle trees to lose resolution, what may result in overstreaming.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)