Posted to issues@drill.apache.org by "Charles Givre (Jira)" <ji...@apache.org> on 2020/11/26 13:25:00 UTC

[jira] [Updated] (DRILL-7812) Broken equals/hashcode contract

     [ https://issues.apache.org/jira/browse/DRILL-7812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Charles Givre updated DRILL-7812:
---------------------------------
    Affects Version/s: 1.17.0

> Broken equals/hashcode contract 
> --------------------------------
>
>                 Key: DRILL-7812
>                 URL: https://issues.apache.org/jira/browse/DRILL-7812
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.17.0
>            Reporter: Rymar Maksym
>            Assignee: Rymar Maksym
>            Priority: Major
>
> The *MaterializedField* class [has a broken equals/hashCode contract|https://github.com/apache/drill/blob/31d6086c4f814c1d7fc476095611e37cc3d95d1c/exec/vector/src/main/java/org/apache/drill/exec/record/MaterializedField.java#L192]:
> {{If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.}}
> In our case the *{{equals()}}* method depends on 2 fields, name and type, while the *{{hashCode()}}* method depends on 3 fields: name, type and child. This leads to serious bugs. For example, it can occur in the *SortRecordBatchBuilder* class [here|https://github.com/apache/drill/blob/31d6086c4f814c1d7fc476095611e37cc3d95d1c/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/sort/SortRecordBatchBuilder.java#L142]:
> {code:java}
> if (batches.keySet().size() > 1) {
>    throw UserException.validationError(null)
>       .message("Sort currently only supports a single schema.")
>       .build(logger);
> }
> {code}
> *batches* is an *{{ArrayListMultimap<BatchSchema, RecordBatchData>}}*, and when a *{{RecordBatchData}}* is inserted with a *{{BatchSchema}}* key, unexpected behavior occurs, because the *{{BatchSchema}}* hashCode is based on the hashCode of MaterializedField:
> {code:java}
> @Override
> public int hashCode() {
>   final int prime = 31;
>   int result = 1;
>   result = prime * result + ((fields == null) ? 0 : fields.hashCode());
>   result = prime * result + ((selectionVectorMode == null) ? 0 : selectionVectorMode.hashCode());
>   return result;
> }{code}
> So *{{RecordBatchData}}* instances with equal *{{BatchSchema}}* keys are added to the *{{ArrayListMultimap}}* under different keys. This is not a common situation; it is most easily reproduced with JSON tables.
>  
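
The effect described above can be sketched in isolation. This is a minimal illustration, not Drill code: Field and Schema are hypothetical stand-ins for MaterializedField and BatchSchema, the type names "MAP" and "INT" are placeholders, and a plain HashMap of lists approximates the hash-keyed ArrayListMultimap. Field.equals() compares name and type only, while Field.hashCode() also mixes in the children, as the report describes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

class BrokenContractDemo {

  // Hypothetical stand-in for MaterializedField.
  static final class Field {
    final String name;
    final String type;
    final List<Field> children;

    Field(String name, String type, Field... children) {
      this.name = name;
      this.type = type;
      this.children = Arrays.asList(children);
    }

    // equals() looks at name and type only ...
    @Override
    public boolean equals(Object o) {
      if (this == o) return true;
      if (!(o instanceof Field)) return false;
      Field other = (Field) o;
      return name.equals(other.name) && type.equals(other.type);
    }

    // ... but hashCode() also includes the children, which breaks the contract.
    @Override
    public int hashCode() {
      return Objects.hash(name, type, children);
    }
  }

  // Hypothetical stand-in for BatchSchema: both methods delegate to the field list.
  static final class Schema {
    final List<Field> fields;

    Schema(Field... fields) {
      this.fields = Arrays.asList(fields);
    }

    @Override
    public boolean equals(Object o) {
      return o instanceof Schema && fields.equals(((Schema) o).fields);
    }

    @Override
    public int hashCode() {
      return 31 + fields.hashCode();
    }
  }

  static Schema plainSchema() {
    return new Schema(new Field("a", "MAP"));
  }

  static Schema nestedSchema() {
    return new Schema(new Field("a", "MAP", new Field("b", "INT")));
  }

  // Inserts one batch per schema and returns how many distinct keys the map holds.
  static int countDistinctKeys() {
    Map<Schema, List<String>> batches = new HashMap<>();
    batches.computeIfAbsent(plainSchema(), k -> new ArrayList<>()).add("batch-1");
    batches.computeIfAbsent(nestedSchema(), k -> new ArrayList<>()).add("batch-2");
    return batches.keySet().size();
  }

  public static void main(String[] args) {
    System.out.println("schemas equal: " + plainSchema().equals(nestedSchema()));
    System.out.println("distinct keys: " + countDistinctKeys());
  }
}
```

In this sketch the two schemas compare equal, yet their hash codes differ, so the map keeps them as separate keys. A check like the keySet().size() > 1 test quoted above would then reject the input even though equals() treats the schemas as the same.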



--
This message was sent by Atlassian Jira
(v8.3.4#803005)