Posted to issues@calcite.apache.org by "Liya Fan (Jira)" <ji...@apache.org> on 2020/02/28 07:19:00 UTC

[jira] [Created] (CALCITE-3836) The hash codes of RelNodes are unreliable

Liya Fan created CALCITE-3836:
---------------------------------

             Summary: The hash codes of RelNodes are unreliable
                 Key: CALCITE-3836
                 URL: https://issues.apache.org/jira/browse/CALCITE-3836
             Project: Calcite
          Issue Type: Bug
          Components: core
            Reporter: Liya Fan


For all subclasses of {{AbstractRelNode}}, the {{hashCode}} method resolves to {{AbstractRelNode#hashCode}}, because that method is declared final and cannot be overridden.

{{AbstractRelNode#hashCode}} depends on {{Object#hashCode}}, which returns the so-called identity hash code. The details of the identity hash code depend on the specific JVM implementation. For many JVMs, the implementation is based on the object's address in memory. The problem is that the address of an object may change within a JVM due to GC, memory compaction, etc. So the hash code of an object may change even if the content of the object has not changed (this can be confirmed from the JavaDoc of {{Object#hashCode}}).

This problem may cause severe issues that are hard to diagnose and debug: for example, an object that is in a hash table but cannot be retrieved, or duplicate objects in a hash map.
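
The failure mode described above can be sketched in isolation. The example below does not use Calcite or the JVM's identity hash code; {{UnstableKey}} and {{UnstableHashDemo}} are hypothetical names, and the changing hash code is simulated with a mutable field purely to show what a hash-based collection does when a key's hash code changes after insertion.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical key whose hash code can be changed; it only simulates the hazard. */
final class UnstableKey {
  private int hash;

  UnstableKey(int hash) {
    this.hash = hash;
  }

  void setHash(int hash) {
    this.hash = hash;
  }

  @Override public int hashCode() {
    return hash;
  }
  // equals is inherited from Object, i.e. reference equality.
}

final class UnstableHashDemo {
  /** Returns whether the key is still found after its hash code changes. */
  static boolean foundAfterChange() {
    Set<UnstableKey> set = new HashSet<>();
    UnstableKey key = new UnstableKey(1);
    set.add(key);
    key.setHash(2); // simulate the hash code changing while the key is stored
    return set.contains(key);
  }

  /** Returns the set size after re-adding the same object under a new hash code. */
  static int sizeAfterReAdd() {
    Set<UnstableKey> set = new HashSet<>();
    UnstableKey key = new UnstableKey(1);
    set.add(key);
    key.setHash(2);
    set.add(key); // the set does not recognize the key it already holds
    return set.size();
  }
}
```

In this sketch the same object ends up stored twice and the lookup misses, which matches the two symptoms listed above.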

To solve the problem, we compute the hash code solely from the node id. This is consistent with the previous semantics and avoids the problem described above.
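
A minimal sketch of the proposed approach follows. It is an illustration, not the actual patch: {{SketchNode}} and {{SketchScan}} are hypothetical stand-ins for {{AbstractRelNode}} and a subclass, and assigning ids from a static counter is an assumption made so the example is self-contained.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a plan node; not Calcite's AbstractRelNode. */
abstract class SketchNode {
  // Assumption: ids are handed out from a process-wide counter.
  private static final AtomicInteger NEXT_ID = new AtomicInteger(0);

  private final int id;

  SketchNode() {
    this.id = NEXT_ID.getAndIncrement();
  }

  public final int getId() {
    return id;
  }

  /** Hash code derived only from the immutable id, so it never changes. */
  @Override public final int hashCode() {
    return Integer.hashCode(id);
  }

  /** Reference equality, so two distinct nodes are never equal. */
  @Override public final boolean equals(Object obj) {
    return this == obj;
  }
}

/** Hypothetical subclass; it inherits the final hashCode and equals. */
final class SketchScan extends SketchNode {
  final String table;

  SketchScan(String table) {
    this.table = table;
  }
}
```

Because the id is fixed at construction and each node receives its own id, the hash code is stable for the lifetime of the node while equality remains by reference, in line with the semantics described above.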
