You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/11/18 18:45:19 UTC

[GitHub] [incubator-druid] michaelschiff commented on issue #8882: equals method in SegmentId looks buggy

michaelschiff commented on issue #8882: equals method in SegmentId looks buggy
URL: https://github.com/apache/incubator-druid/issues/8882#issuecomment-555153707
 
 
   I'm also not certain that the equals method is correct as written
   
   The hashcode is calculated via: (comments removed for readability)
   ```
   int hashCode = partitionNum;
   hashCode = hashCode * 1000003 + version.hashCode();
   hashCode = hashCode * 1000003 + dataSource.hashCode();
   hashCode = hashCode * 1000003 + Long.hashCode(intervalStartMillis);
   hashCode = hashCode * 1000003 + Long.hashCode(intervalEndMillis);
   hashCode = hashCode * 1000003 + Objects.hashCode(intervalChronology);
   ```
   and the comment from equals:
   ```
   // Compare hashCode instead of partitionNum: break the chain quicker if the objects are not equal. If the hashCodes
   // are equal as well as all other fields used to compute them, the partitionNums are also guaranteed to be equal.
   ```
   
   That logic `If the hashCodes are equal as well as all other fields used to compute them, the partitionNums are also guaranteed to be equal.` will hold over an infinite field, but over a finite field (like `int`) it is possible that you can start with two different partition numbers, multiply them by the same sequence of numbers, and arrive at the same answer `% 2^32`.
   
   Take as a simple example, partitions x & y, where `x != y` and there is only one other "field" used to compute the hash-code (which in this example is held constant) and we are using the finite field Mod 8 
   ```
   x = 1
   y = 3
   
   x * 4 Mod 8 = 4
   
   y * 4 Mod 8 = 4
   ```
   
   While a full size counter example is constructed, I want to point out that this discussion could be obviated by adding an additional comparison to the equals method (checking the partition number too).  The comparison of the hashcode can be left in for quicker exit still (since if the hashcodes are different, they are certainly different), but in the case of a hashcode collision all fields needs to be checked.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org