Posted to issues@calcite.apache.org by "Danny Chen (Jira)" <ji...@apache.org> on 2020/06/16 09:30:00 UTC

[jira] [Comment Edited] (CALCITE-3786) Add Digest interface to enable efficient hashCode(equals) for RexNode and RelNode

    [ https://issues.apache.org/jira/browse/CALCITE-3786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136473#comment-17136473 ] 

Danny Chen edited comment on CALCITE-3786 at 6/16/20, 9:29 AM:
---------------------------------------------------------------

Thanks [~julianhyde] and [~laurent] for the feedback ~

Compared to the old code, this patch does the following:
- Uses #equals to compare two RexCalls, which avoids OOMs like the one in CALCITE-3784
- Uses a cached hashCode for a quick #equals on the digest, and replaces pure string comparison with object equality
- Includes the hints in the digest
- Makes the RexCall normalization more general, so more cases are normalized

So in general, it definitely does not increase memory consumption.

[~laurent] It seems you have some ideas for improving the Digest code; feel free to do that. I tried to answer each comment you left in the PR, and I hope it helps. But I would not revert the PR without another strong reason, sorry.

I will try to provide a memory benchmark then ~


> Add Digest interface to enable efficient hashCode(equals) for RexNode and RelNode
> ---------------------------------------------------------------------------------
>
>                 Key: CALCITE-3786
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3786
>             Project: Calcite
>          Issue Type: New Feature
>          Components: core
>    Affects Versions: 1.21.0
>            Reporter: Vladimir Sitnikov
>            Assignee: Danny Chen
>            Priority: Major
>             Fix For: 1.24.0
>
>          Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> Current digests for RexNode, RelNode, RelType, and similar cases use String concatenation.
> This is easy to implement; however, it has drawbacks:
> 1) String objects cannot be reused. For instance, a RexCall has operands, yet each operand's digest is copied into the call's digest rather than shared. This costs extra memory and extra CPU for string copying.
> 2) There's no way to have multiple #toString() methods. RelType might need multiple digests: "including field names", "excluding field names".
> A suggested resolution might be along the lines of
> {code:java}
> class Digest { // immutable
>   final int hashCode; // speedup hashCode and equals
>   final Object[] contents; // The values are either other Digest objects or Strings
>   String toString(); // e.g. for debugging purposes
>   int compareTo(Digest); // e.g. for debugging purposes.
> }
> {code}
> Note how fields in Kotlin are aligned much better, and it makes it easier to read:
> {code:java}
> class Digest { // immutable
>   val hashCode: Int // speedup hashCode and equals
>   val contents: Array<Any> // The values are either other Digest objects or Strings
>   fun toString(): String // e.g. for debugging purposes
>   fun compareTo(other: Digest): Int // e.g. for debugging purposes.
> }
> {code}
> Then the digest for RexCall could be the bits relevant to RexCall itself plus the digests of the operands (which can be reused as is).
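
The proposal quoted above can be sketched in runnable Java roughly as follows. This is a minimal illustration, not the class that was merged into Calcite: the factory method `of`, the validation, and the string-based `compareTo` are assumptions made for the example. It shows the two points under discussion: the hash is computed once and reused by `equals`, and a parent digest holds references to its operand digests instead of copying their text.

```java
import java.util.Arrays;

/** Immutable digest. Hypothetical sketch for illustration, not Calcite's actual class. */
final class Digest implements Comparable<Digest> {
  private final Object[] contents; // each element is a String or a nested Digest
  private final int hashCode;      // computed once, reused by hashCode() and equals()

  private Digest(Object[] contents) {
    this.contents = contents;
    // Nested digests return their cached hash, so this does not re-walk the tree.
    this.hashCode = Arrays.hashCode(contents);
  }

  /** Builds a digest from strings and/or already-built digests (assumed factory). */
  static Digest of(Object... parts) {
    for (Object part : parts) {
      if (!(part instanceof String) && !(part instanceof Digest)) {
        throw new IllegalArgumentException("expected String or Digest: " + part);
      }
    }
    return new Digest(parts.clone()); // defensive copy keeps the digest immutable
  }

  @Override public int hashCode() {
    return hashCode;
  }

  @Override public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof Digest)) {
      return false;
    }
    Digest other = (Digest) o;
    // Different cached hashes prove inequality without comparing the contents.
    return hashCode == other.hashCode && Arrays.equals(contents, other.contents);
  }

  /** Renders the full text only on demand, e.g. for debugging. */
  @Override public String toString() {
    StringBuilder sb = new StringBuilder();
    for (Object part : contents) {
      sb.append(part); // nested digests render themselves recursively
    }
    return sb.toString();
  }

  /** Orders by rendered text; intended for debugging output only. */
  @Override public int compareTo(Digest other) {
    return toString().compareTo(other.toString());
  }
}
```

With operand digests `x = Digest.of("$0")` and `y = Digest.of("$1")`, a call digest could be built as `Digest.of("+(", x, ", ", y, ")")`. The operand digests are shared by reference, and the concatenated text exists only when `toString()` is called.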


