Posted to commits@cassandra.apache.org by "Nadav Har'El (JIRA)" <ji...@apache.org> on 2015/11/18 13:27:10 UTC
[jira] [Created] (CASSANDRA-10728) Hash used in repair does not include partition key
Nadav Har'El created CASSANDRA-10728:
----------------------------------------
Summary: Hash used in repair does not include partition key
Key: CASSANDRA-10728
URL: https://issues.apache.org/jira/browse/CASSANDRA-10728
Project: Cassandra
Issue Type: Bug
Reporter: Nadav Har'El
Priority: Minor
When the repair code builds the Merkle tree, it appears to use AbstractCompactedRow.update() to calculate each partition's hash. This method's documentation states that it calculates a "digest with the data bytes of the row (not including row key or row size)". The code itself seems to agree with this comment.
However, I believe that leaving the row (actually, partition) key out of the hash function is a mistake: it means that if two nodes hold the same data under different keys, repair will not notice the discrepancy. Moreover, if two different keys have their data switched, or have the same data, repair will again not notice. Running into this problem in a real repair is not very likely, but I can easily imagine it in a hypothetical use case where all partitions have exactly the same data and only the partition key matters.
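To make the first scenario concrete, here is a minimal Python sketch, not Cassandra code. The function names, the cell layout, and the use of SHA-256 are all assumptions chosen for illustration; digest_with_key() is a hypothetical variant, not a proposed patch.

```python
import hashlib

def digest_without_key(cells):
    """Hash only the partition's data bytes, as the report describes."""
    h = hashlib.sha256()  # stand-in digest; the real algorithm is not assumed here
    for name, value in sorted(cells.items()):
        h.update(name.encode())
        h.update(value.encode())
    return h.digest()

def digest_with_key(key, cells):
    """Hypothetical variant: mix the partition key in before the data."""
    h = hashlib.sha256()
    h.update(key.encode())
    h.update(digest_without_key(cells))
    return h.digest()

# Two nodes hold identical data, but under different partition keys.
node_a = ("alice", {"status": "active"})
node_b = ("bob", {"status": "active"})

# Without the key, the two partitions hash identically.
keyless_match = digest_without_key(node_a[1]) == digest_without_key(node_b[1])
# With the key mixed in, the digests differ.
keyed_match = digest_with_key(*node_a) == digest_with_key(*node_b)
```

In this sketch keyless_match is true and keyed_match is false, which is the discrepancy a key-less digest cannot see.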
I am sorry if I'm mistaken and the partition key is actually taken into account in the Merkle tree, but I tried to find evidence that it is and failed. Glancing over the code, it almost seems that the key is used: Validator.add() calculates rowHash(), which includes the digest (without the partition key) *and* the key's token. But the code then calls MerkleTree.TreeRange.addHash() on that tuple, and that function conspicuously ignores the token and uses only the digest.
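The flow described above can be modelled roughly as follows. This is a simplified Python sketch, not the actual Java code: RowHash, TreeRange and add_hash only mirror the names in the report, and combining digests within a range by XOR is an assumption made for the illustration. The include_token flag is a hypothetical switch showing what changes if the token is mixed into the digest.

```python
import hashlib
from typing import NamedTuple, Optional

class RowHash(NamedTuple):
    token: int     # the key's token, carried along in the tuple
    digest: bytes  # digest of the partition's data only

class TreeRange:
    def __init__(self) -> None:
        self.hash: Optional[bytes] = None

    def add_hash(self, entry: RowHash) -> None:
        # entry.token is never read: only the digest reaches the range hash.
        if self.hash is None:
            self.hash = entry.digest
        else:
            # Assumed combiner: byte-wise XOR of the digests in the range.
            self.hash = bytes(a ^ b for a, b in zip(self.hash, entry.digest))

def range_hash(partitions, include_token=False):
    """Hash every partition of one replica into a single leaf range."""
    leaf = TreeRange()
    for token, data in sorted(partitions.items()):
        # Hypothetical fix: prepend the token to the hashed bytes.
        payload = token.to_bytes(8, "big") + data if include_token else data
        leaf.add_hash(RowHash(token, hashlib.sha256(payload).digest()))
    return leaf.hash

# Two replicas whose partitions have their data switched.
replica_1 = {10: b"x", 20: b"y"}
replica_2 = {10: b"y", 20: b"x"}

swap_undetected = range_hash(replica_1) == range_hash(replica_2)
swap_detected_with_token = range_hash(replica_1, True) != range_hash(replica_2, True)
```

Under these assumptions both flags come out true: the switched data goes unnoticed when only the digest is used, and is noticed once the token contributes to the hashed bytes.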
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)