Posted to dev@drill.apache.org by "Rymar Maksym (Jira)" <ji...@apache.org> on 2020/11/25 16:05:00 UTC

[jira] [Created] (DRILL-7812) Broken equals/hashcode contract

Rymar Maksym created DRILL-7812:
-----------------------------------

             Summary: Broken equals/hashcode contract 
                 Key: DRILL-7812
                 URL: https://issues.apache.org/jira/browse/DRILL-7812
             Project: Apache Drill
          Issue Type: Bug
            Reporter: Rymar Maksym
            Assignee: Rymar Maksym


The *MaterializedField* class [has a broken equals/hashCode contract|https://github.com/apache/drill/blob/31d6086c4f814c1d7fc476095611e37cc3d95d1c/exec/vector/src/main/java/org/apache/drill/exec/record/MaterializedField.java#L192]:

{{If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.}}

In our case the *{{equals()}}* method depends on 2 fields, name and type, while the *{{hashCode()}}* method depends on 3 fields: name, type and child. Two fields that differ only in child are therefore equal but may have different hash codes, which leads to serious bugs. For example, one can occur in the *SortRecordBatchBuilder* class [here|https://github.com/apache/drill/blob/31d6086c4f814c1d7fc476095611e37cc3d95d1c/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/sort/SortRecordBatchBuilder.java#L142]:
{code:java}
if (batches.keySet().size() > 1) {
   throw UserException.validationError(null)
      .message("Sort currently only supports a single schema.")
      .build(logger);
}
{code}
*batches* is an *{{ArrayListMultimap<BatchSchema, RecordBatchData>}}*, and when a *{{RecordBatchData}}* is inserted under a *{{BatchSchema}}* key, unexpected behavior occurs, because the hashCode of the *{{BatchSchema}}* key is based on the hashCode of its MaterializedField fields:
{code:java}
@Override
public int hashCode() {
  final int prime = 31;
  int result = 1;
  result = prime * result + ((fields == null) ? 0 : fields.hashCode());
  result = prime * result + ((selectionVectorMode == null) ? 0 : selectionVectorMode.hashCode());
  return result;
}
{code}
So *{{RecordBatchData}}* instances with equal *{{BatchSchema}}* keys are added to the *{{ArrayListMultimap}}* as different entries, and the single-schema check above fails even though the schemas are equal. This situation is not common; it is most easily reproduced with JSON tables.
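
The effect can be illustrated outside Drill with a minimal sketch. Everything in it is hypothetical: {{BrokenField}} and {{ConsistentField}} are simplified stand-ins for MaterializedField (child is a plain String here), and a JDK {{HashMap}} of lists stands in for Guava's {{ArrayListMultimap}}, which also locates its keys by hash. It is not the actual Drill code, and the consistent variant is only one possible way to restore the contract.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

class HashContractDemo {

  /** Hypothetical stand-in: equals() ignores 'child', hashCode() includes it. */
  static final class BrokenField {
    final String name;
    final String type;
    final String child;

    BrokenField(String name, String type, String child) {
      this.name = name;
      this.type = type;
      this.child = child;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) return true;
      if (!(o instanceof BrokenField)) return false;
      BrokenField other = (BrokenField) o;
      return Objects.equals(name, other.name) && Objects.equals(type, other.type);
    }

    @Override
    public int hashCode() {
      return Objects.hash(name, type, child); // uses a field that equals() ignores
    }
  }

  /** Same equals(), but hashCode() uses exactly the fields equals() compares. */
  static final class ConsistentField {
    final String name;
    final String type;
    final String child;

    ConsistentField(String name, String type, String child) {
      this.name = name;
      this.type = type;
      this.child = child;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) return true;
      if (!(o instanceof ConsistentField)) return false;
      ConsistentField other = (ConsistentField) o;
      return Objects.equals(name, other.name) && Objects.equals(type, other.type);
    }

    @Override
    public int hashCode() {
      return Objects.hash(name, type);
    }
  }

  /** Inserts two values under the given keys and returns the number of distinct keys. */
  static <K> int distinctKeys(K first, K second) {
    Map<K, List<String>> batches = new HashMap<>();
    batches.computeIfAbsent(first, k -> new ArrayList<>()).add("batch-1");
    batches.computeIfAbsent(second, k -> new ArrayList<>()).add("batch-2");
    return batches.keySet().size();
  }

  public static void main(String[] args) {
    BrokenField a = new BrokenField("col", "MAP", null);
    BrokenField b = new BrokenField("col", "MAP", "child");
    System.out.println("broken equal: " + a.equals(b));
    System.out.println("broken keys: " + distinctKeys(a, b));

    ConsistentField c = new ConsistentField("col", "MAP", null);
    ConsistentField d = new ConsistentField("col", "MAP", "child");
    System.out.println("consistent keys: " + distinctKeys(c, d));
  }
}
```

With the broken variant the two equal keys end up as separate entries, which is the same condition that makes the {{batches.keySet().size() > 1}} check throw; with the consistent variant they collapse into one key.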

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)