You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/12/05 10:44:00 UTC

[jira] [Commented] (KAFKA-6308) Connect: Struct equals/hashCode method should use Arrays#deep* methods

    [ https://issues.apache.org/jira/browse/KAFKA-6308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16278375#comment-16278375 ] 

ASF GitHub Bot commented on KAFKA-6308:
---------------------------------------

GitHub user tobiasgies opened a pull request:

    https://github.com/apache/kafka/pull/4293

    KAFKA-6308: Connect Struct should use deepEquals/deepHashCode

    This changes the Struct's equals and hashCode method to use
    Arrays#deepEquals and Arrays#deepHashCode, respectively. This resolves
    a problem where two structs with values of type byte[] would not be considered
    equal even though the byte arrays' contents are equal. By using deepEquals,
    the byte arrays' contents are compared instead of ther identity.
    
    Since this changes the behavior of the equals method for byte array values,
    the behavior of hashCode must change alongside it to ensure the methods
    still fulfill the general contract of "equal objects must have equal hashCodes".
    
    Test rationale:
    All existing unit tests for equals were untouched and continue to work.
    A new test method was added to verify the behavior of equals and hashCode
    for Struct instances that contain a byte array value. I verify the reflixivity and
    transitivity of equals as well as the fact that equal Structs have equal hashCodes
    and not-equal structs do not have equal hashCodes. 
    
    ### Committer Checklist (excluded from commit message)
    - [ ] Verify design and implementation 
    - [ ] Verify test coverage and CI build status
    - [ ] Verify documentation (including upgrade notes)


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tobiasgies/kafka feature/kafka-6308-deepequals

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/4293.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4293
    
----
commit 0e4764d98bdda4246299bfdcc722795ab7834d8e
Author: Tobias Gies <to...@trivago.com>
Date:   2017-12-04T17:19:02Z

    KAFKA-6308 use deepEquals instead of equals for values array comparison

commit 4d2ea566179ea1f584cbc5f6304b5c131c72d2a9
Author: Tobias Gies <to...@trivago.com>
Date:   2017-12-04T17:45:04Z

    KAFKA-6308 use deepHashCode instead of hashCode to ensure contract stability between equals and hashCode

commit 55a3164c27dc2098ab270419032254106c13c533
Author: Tobias Gies <to...@tobiasgies.de>
Date:   2017-12-04T22:02:07Z

    Merge branch 'trunk' into feature/kafka-6308-deepequals

commit 3876c1dd2dcde35fd544bb43c2ea124d84860f2a
Author: Tobias Gies <to...@trivago.com>
Date:   2017-12-05T10:26:41Z

    KAFKA-6308 restructure test assertions based on method contracts

----


> Connect: Struct equals/hashCode method should use Arrays#deep* methods
> ----------------------------------------------------------------------
>
>                 Key: KAFKA-6308
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6308
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect
>    Affects Versions: 1.0.0
>            Reporter: Tobias Gies
>              Labels: easyfix, newbie
>
> At the moment, {{org.apache.kafka.connect.data.Struct#equals}} checks two things, after ensuring the incoming {{Object o}} is indeed of the correct type:
> * Whether the schemas of {{this}} and {{o}} are equal, via {{Objects#equals}}
> * Whether the values of {{this}} and {{o}} are qual, via {{Arrays#equals}}.
> The latter check is problematic. {{Arrays#equals}} is meant for one-dimensional arrays of any kind, and thus simply checks the {{equals}} methods of all corresponding elements of its parameters {{a1}} and {{a2}}. However, elements of the {{Struct#values}} array may themselves be arrays in a specific case, namely if a field has a {{BYTES}} Schema Type and the user's input for this field is of type {{byte[]}}.
> Given that, I would suggest to use {{Arrays#deepEquals}} to compare the {{values}} arrays of two {{Struct}} instances. With similar reasoning, I would also suggest to use {{Arrays#deepHashCode}} in the Struct's {{hashCode}} method.
> This would allow to properly compare and hash structs that get byte arrays passed in as field values instead of the recommended ByteBuffers. An alternative might be to automatically wrap byte arrays passed into any {{put}} method in a ByteBuffer.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)