Posted to commits@cassandra.apache.org by "Minh Do (JIRA)" <ji...@apache.org> on 2014/02/03 08:18:10 UTC

[jira] [Commented] (CASSANDRA-5263) Allow Merkle tree maximum depth to be configurable

    [ https://issues.apache.org/jira/browse/CASSANDRA-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889277#comment-13889277 ] 

Minh Do commented on CASSANDRA-5263:
------------------------------------

Using some generated SSTable files within a token range, I ran a test building the Merkle tree at depth 20 and then adding the computed hash values for the rows (69M rows added). These two steps together are equivalent to a validation compaction on a token range, if I am not missing anything.

1. Tree building uses, on average, 15-18% of total CPU resources, and no I/O
2. SSTable scanning and row hash computation use, on average, 10-12% of total 
   CPU resources, with I/O limited by the configurable global compaction rate limiter 


Given Jonathan's pointer on using SSTR.estimatedKeysForRanges() to calculate the number of rows in an SSTable file, and assuming no overlap among SSTable files (worst case), we can estimate how many data rows are in a given token range.
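As a rough sketch of that worst-case estimate (illustrative only, not Cassandra code): the array below stands in for the per-SSTable values that SSTableReader.estimatedKeysForRanges() would return, and with no overlap assumed the range's row count is just their sum.

```java
public class RowEstimate {
    // estimatedKeysPerSstable: hypothetical stand-in for the per-SSTable
    // results of SSTableReader.estimatedKeysForRanges() over the token range.
    public static long estimateRowsInRange(long[] estimatedKeysPerSstable) {
        long total = 0;
        for (long keys : estimatedKeysPerSstable) {
            // Worst case (no overlap): every SSTable contributes distinct rows.
            total += keys;
        }
        return total;
    }
}
```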

From what I understood, here is the formula to calculate the Merkle tree's depth (assuming each data row has a unique hash value):

1. If the number of rows from all SSTables in a given range is approximately equal to the maximum number of hash entries in that range (subject to a CF's partitioner), then we build the tree at depth 20 (the densest case)
2. When the number of rows from all SSTables in a given range does not cover the full hash range (the sparse case), we build a Merkle tree with a depth less than 20. How do we come up with the right depth?  
     depth = 20 * (n rows / max rows) 
where n is the total number of rows across all SSTables and max is the maximum number of hash entries in that token range.
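To make the two cases concrete, here is a minimal sketch of that depth formula (illustrative only; the clamping to [1, 20] and the use of ceiling are my assumptions, not anything decided in this ticket):

```java
public class MerkleDepth {
    public static final int MAX_DEPTH = 20;

    // n   = total number of rows across all SSTables in the token range
    // max = maximum number of hash entries in that token range
    public static int depthFor(long n, long max) {
        if (n >= max) {
            return MAX_DEPTH;  // densest case: build the full-depth tree
        }
        // Sparse case: scale the depth by the fraction of the hash range covered.
        double scaled = MAX_DEPTH * ((double) n / max);
        return Math.max(1, (int) Math.ceil(scaled));
    }
}
```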

However, since different partitioners give different max values, is there anything we can assume to simplify this, such as assuming all partitioners would have the same number of hash entries in a given token range?

> Allow Merkle tree maximum depth to be configurable
> --------------------------------------------------
>
>                 Key: CASSANDRA-5263
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5263
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Config
>    Affects Versions: 1.1.9
>            Reporter: Ahmed Bashir
>            Assignee: Minh Do
>
> Currently, the maximum depth allowed for Merkle trees is hardcoded as 15.  This value should be configurable, just like phi_convict_threshold and other properties.
> Given a cluster with nodes responsible for a large number of row keys, Merkle tree comparisons can result in a large number of unnecessary row keys being streamed.
> Empirical testing indicates that reasonable changes to this depth (18, 20, etc) don't affect the Merkle tree generation and differencing timings all that much, and they can significantly reduce the amount of data being streamed during repair. 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)