You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by GitBox <gi...@apache.org> on 2020/03/02 16:23:40 UTC

[GitHub] [drill] vvysotskyi opened a new pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value

vvysotskyi opened a new pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value
URL: https://github.com/apache/drill/pull/2006
 
 
   # [DRILL-7615](https://issues.apache.org/jira/browse/DRILL-7615): UNION ALL query returns the wrong result for the decimal value
   
   ## Description
   Added checks for union operator to create casts for the case when both input columns have a decimal type, but its scale differs. Enhanced calculation of resulting scale and precision for the decimal data type.
   
   ## Documentation
   NA
   
   ## Testing
   Added unit tests to check the fix, ran full tests suite.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value

Posted by GitBox <gi...@apache.org>.
vvysotskyi commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value
URL: https://github.com/apache/drill/pull/2006#discussion_r388156185
 
 

 ##########
 File path: exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestVarlenDecimal.java
 ##########
 @@ -139,8 +145,32 @@ public void testWriteReadCsv() throws Exception {
           .baselineValues(new BigDecimal(bigDecimalValue))
           .go();
     } finally {
-      resetSessionOption(ExecConstants.OUTPUT_FORMAT_OPTION);
-      test("drop table if exists dfs.tmp.%s", tableName);
+      client.resetSession(ExecConstants.OUTPUT_FORMAT_OPTION);
+      run("drop table if exists dfs.tmp.%s", tableName);
+    }
+  }
+
+  @Test
+  public void testUnionAllWithDifferentScales() throws Exception {
+    try {
+      run("create table dfs.tmp.t as select cast(999999999999999 as decimal(15,0)) as d");
+
+      String query = "select cast(1000 as decimal(10,1)) as d\n" +
+          "union all \n" +
+          "select 596.000 as d \n" +
+          "union all \n" +
+          "select d from dfs.tmp.t";
+
+      testBuilder()
+          .sqlQuery(query)
+          .unOrdered()
+          .baselineColumns("d")
+          .baselineValues(new BigDecimal("1000.000"))
+          .baselineValues(new BigDecimal("596.000"))
+          .baselineValues(new BigDecimal("999999999999999.000"))
 
 Review comment:
   @arina-ielchiieva, thanks for pointing to the example, initially I tried a similar approach, but union operator returns multiple batches, so in the case of using `client.queryBuilder().sql(sql).rowSet()`, it will fail with [`IllegalStateException`](https://github.com/apache/drill/blob/09b805aea4dafe50555b23945302cf8f6c491de8/exec/java-exec/src/test/java/org/apache/drill/test/QueryBuilder.java#L359). I also tried to use `rowSetIterator()`, as it was advised in the error message, but it returns `QueryRowSetIterator` which actually caused complications for me. I tried several approaches, one of them is iterating through it and comparing separate row sets, which corresponds to returned batches, but such a check would look obscure since the first batch was empty.
   So I ended up with the simpler check for schema using `testBuilder()`.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] arina-ielchiieva commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value

Posted by GitBox <gi...@apache.org>.
arina-ielchiieva commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value
URL: https://github.com/apache/drill/pull/2006#discussion_r388235923
 
 

 ##########
 File path: exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestVarlenDecimal.java
 ##########
 @@ -139,8 +145,32 @@ public void testWriteReadCsv() throws Exception {
           .baselineValues(new BigDecimal(bigDecimalValue))
           .go();
     } finally {
-      resetSessionOption(ExecConstants.OUTPUT_FORMAT_OPTION);
-      test("drop table if exists dfs.tmp.%s", tableName);
+      client.resetSession(ExecConstants.OUTPUT_FORMAT_OPTION);
+      run("drop table if exists dfs.tmp.%s", tableName);
+    }
+  }
+
+  @Test
+  public void testUnionAllWithDifferentScales() throws Exception {
+    try {
+      run("create table dfs.tmp.t as select cast(999999999999999 as decimal(15,0)) as d");
+
+      String query = "select cast(1000 as decimal(10,1)) as d\n" +
+          "union all \n" +
+          "select 596.000 as d \n" +
+          "union all \n" +
+          "select d from dfs.tmp.t";
+
+      testBuilder()
+          .sqlQuery(query)
+          .unOrdered()
+          .baselineColumns("d")
+          .baselineValues(new BigDecimal("1000.000"))
+          .baselineValues(new BigDecimal("596.000"))
+          .baselineValues(new BigDecimal("999999999999999.000"))
 
 Review comment:
   Thanks for explanation.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] paul-rogers commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value

Posted by GitBox <gi...@apache.org>.
paul-rogers commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value
URL: https://github.com/apache/drill/pull/2006#discussion_r386776641
 
 

 ##########
 File path: common/src/main/java/org/apache/drill/common/types/Types.java
 ##########
 @@ -795,15 +807,25 @@ public static boolean isSortable(MinorType type) {
    */
   public static MajorType.Builder calculateTypePrecisionAndScale(MajorType leftType, MajorType rightType, MajorType.Builder typeBuilder) {
     if (leftType.getMinorType().equals(rightType.getMinorType())) {
-      final boolean isScalarString = Types.isScalarStringType(leftType) && Types.isScalarStringType(rightType);
-      final boolean isDecimal = isDecimalType(leftType);
+      boolean isScalarString = Types.isScalarStringType(leftType) && Types.isScalarStringType(rightType);
+      boolean isDecimal = isDecimalType(leftType);
 
-      if ((isScalarString || isDecimal) && leftType.hasPrecision() && rightType.hasPrecision()) {
+      if (isScalarString && leftType.hasPrecision() && rightType.hasPrecision()) {
         typeBuilder.setPrecision(Math.max(leftType.getPrecision(), rightType.getPrecision()));
       }
 
-      if (isDecimal && leftType.hasScale() && rightType.hasScale()) {
-        typeBuilder.setScale(Math.max(leftType.getScale(), rightType.getScale()));
+      if (isDecimal) {
+        int scale = Math.max(leftType.getScale(), rightType.getScale());
+        // resulting precision should take into account resulting scale value and be calculated as
+        // sum of two components:
+        // - max integer digits number (precision - scale) for left and right;
+        // - resulting scale.
+        // So for the case of cast(9999 as decimal(4,0)) and cast(1.23 as decimal(3,2))
+        // resulting scale would be Max(0, 2) = 2 and resulting precision would be Max(4 - 0, 3 - 2) + 2 = 6.
+        // In this case, both values would fit into decimal(6, 2): 9999.00, 1.23
+        int precision = Math.max(leftType.getPrecision() - leftType.getScale(), rightType.getPrecision() - rightType.getScale()) + scale;
+        typeBuilder.setPrecision(precision);
+        typeBuilder.setScale(scale);
 
 Review comment:
   Nit: please split the long line.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] paul-rogers commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value

Posted by GitBox <gi...@apache.org>.
paul-rogers commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value
URL: https://github.com/apache/drill/pull/2006#discussion_r386777176
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/union/UnionAllRecordBatch.java
 ##########
 @@ -184,38 +184,43 @@ private void createUnionAller(RecordBatch inputBatch) {
       ValueVector vvIn = vw.getValueVector();
       ValueVector vvOut = container.getValueVector(index).getValueVector();
 
-      final ErrorCollector collector = new ErrorCollectorImpl();
-      // According to input data names, Minortypes, Datamodes, choose to
+      MaterializedField inField = vvIn.getField();
+      MaterializedField outputField = vvOut.getField();
+
+      ErrorCollector collector = new ErrorCollectorImpl();
+      // According to input data names, MinorTypes, DataModes, choose to
       // transfer directly,
       // rename columns or
-      // cast data types (Minortype or DataMode)
-      if (container.getSchema().getColumn(index).hasSameTypeAndMode(vvIn.getField())
-          && vvIn.getField().getType().getMinorType() != TypeProtos.MinorType.MAP // Per DRILL-5521, existing bug for map transfer
-          ) {
+      // cast data types (MinorType or DataMode)
+      if (outputField.hasSameTypeAndMode(inField)
 
 Review comment:
   Nit: Seems cleaner to add this predicate to `Types`: `isSameTypeAndMode()`, then say:
   
   ```
   if (Types.isSameTypeAndMode(outputField.getType(), inputField.getType()) {
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value

Posted by GitBox <gi...@apache.org>.
vvysotskyi commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value
URL: https://github.com/apache/drill/pull/2006#discussion_r386997680
 
 

 ##########
 File path: exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestVarlenDecimal.java
 ##########
 @@ -139,8 +145,32 @@ public void testWriteReadCsv() throws Exception {
           .baselineValues(new BigDecimal(bigDecimalValue))
           .go();
     } finally {
-      resetSessionOption(ExecConstants.OUTPUT_FORMAT_OPTION);
-      test("drop table if exists dfs.tmp.%s", tableName);
+      client.resetSession(ExecConstants.OUTPUT_FORMAT_OPTION);
+      run("drop table if exists dfs.tmp.%s", tableName);
+    }
+  }
+
+  @Test
+  public void testUnionAllWithDifferentScales() throws Exception {
+    try {
+      run("create table dfs.tmp.t as select cast(999999999999999 as decimal(15,0)) as d");
+
+      String query = "select cast(1000 as decimal(10,1)) as d\n" +
+          "union all \n" +
+          "select 596.000 as d \n" +
+          "union all \n" +
+          "select d from dfs.tmp.t";
+
+      testBuilder()
+          .sqlQuery(query)
+          .unOrdered()
+          .baselineColumns("d")
+          .baselineValues(new BigDecimal("1000.000"))
+          .baselineValues(new BigDecimal("596.000"))
+          .baselineValues(new BigDecimal("999999999999999.000"))
 
 Review comment:
   Thanks, added check for the schema. `testBuilder()` was still used, since query returns several batches, so it became too complex.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] arina-ielchiieva commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value

Posted by GitBox <gi...@apache.org>.
arina-ielchiieva commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value
URL: https://github.com/apache/drill/pull/2006#discussion_r388138604
 
 

 ##########
 File path: exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestVarlenDecimal.java
 ##########
 @@ -139,8 +145,32 @@ public void testWriteReadCsv() throws Exception {
           .baselineValues(new BigDecimal(bigDecimalValue))
           .go();
     } finally {
-      resetSessionOption(ExecConstants.OUTPUT_FORMAT_OPTION);
-      test("drop table if exists dfs.tmp.%s", tableName);
+      client.resetSession(ExecConstants.OUTPUT_FORMAT_OPTION);
+      run("drop table if exists dfs.tmp.%s", tableName);
+    }
+  }
+
+  @Test
+  public void testUnionAllWithDifferentScales() throws Exception {
+    try {
+      run("create table dfs.tmp.t as select cast(999999999999999 as decimal(15,0)) as d");
+
+      String query = "select cast(1000 as decimal(10,1)) as d\n" +
+          "union all \n" +
+          "select 596.000 as d \n" +
+          "union all \n" +
+          "select d from dfs.tmp.t";
+
+      testBuilder()
+          .sqlQuery(query)
+          .unOrdered()
+          .baselineColumns("d")
+          .baselineValues(new BigDecimal("1000.000"))
+          .baselineValues(new BigDecimal("596.000"))
+          .baselineValues(new BigDecimal("999999999999999.000"))
 
 Review comment:
   Not sure why this will be complex.
   Here is an example:
   ```
       RowSet actual = client.queryBuilder().sql(sql).rowSet();
   
       TupleMetadata expectedSchema = new SchemaBuilder()
           // add your schema here 
           .buildSchema();
       RowSet expected = new RowSetBuilder(client.allocator(), expectedSchema)
          // add your columns here
           .build();
       RowSetUtilities.verify(expected, actual);
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value

Posted by GitBox <gi...@apache.org>.
vvysotskyi commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value
URL: https://github.com/apache/drill/pull/2006#discussion_r386916026
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/union/UnionAllRecordBatch.java
 ##########
 @@ -184,38 +184,43 @@ private void createUnionAller(RecordBatch inputBatch) {
       ValueVector vvIn = vw.getValueVector();
       ValueVector vvOut = container.getValueVector(index).getValueVector();
 
-      final ErrorCollector collector = new ErrorCollectorImpl();
-      // According to input data names, Minortypes, Datamodes, choose to
+      MaterializedField inField = vvIn.getField();
+      MaterializedField outputField = vvOut.getField();
+
+      ErrorCollector collector = new ErrorCollectorImpl();
+      // According to input data names, MinorTypes, DataModes, choose to
       // transfer directly,
       // rename columns or
-      // cast data types (Minortype or DataMode)
-      if (container.getSchema().getColumn(index).hasSameTypeAndMode(vvIn.getField())
-          && vvIn.getField().getType().getMinorType() != TypeProtos.MinorType.MAP // Per DRILL-5521, existing bug for map transfer
-          ) {
+      // cast data types (MinorType or DataMode)
+      if (outputField.hasSameTypeAndMode(inField)
 
 Review comment:
   Thanks for the suggestion, done.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] paul-rogers commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value

Posted by GitBox <gi...@apache.org>.
paul-rogers commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value
URL: https://github.com/apache/drill/pull/2006#discussion_r386777577
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/union/UnionAllRecordBatch.java
 ##########
 @@ -184,38 +184,43 @@ private void createUnionAller(RecordBatch inputBatch) {
       ValueVector vvIn = vw.getValueVector();
       ValueVector vvOut = container.getValueVector(index).getValueVector();
 
-      final ErrorCollector collector = new ErrorCollectorImpl();
-      // According to input data names, Minortypes, Datamodes, choose to
+      MaterializedField inField = vvIn.getField();
+      MaterializedField outputField = vvOut.getField();
+
+      ErrorCollector collector = new ErrorCollectorImpl();
+      // According to input data names, MinorTypes, DataModes, choose to
       // transfer directly,
       // rename columns or
-      // cast data types (Minortype or DataMode)
-      if (container.getSchema().getColumn(index).hasSameTypeAndMode(vvIn.getField())
-          && vvIn.getField().getType().getMinorType() != TypeProtos.MinorType.MAP // Per DRILL-5521, existing bug for map transfer
-          ) {
+      // cast data types (MinorType or DataMode)
+      if (outputField.hasSameTypeAndMode(inField)
+          && inField.getType().getMinorType() != TypeProtos.MinorType.MAP // Per DRILL-5521, existing bug for map transfer
+          && (!Types.areDecimalTypes(inField.getType(), outputField.getType())
+              || (Types.areDecimalTypes(outputField.getType(), inField.getType()) // scale should match for decimal data types
+                  && outputField.getType().getScale() == inField.getType().getScale()))) {
 
 Review comment:
   If checking precision and scale (hard to follow), then `Types.isSameType()` might be what you want.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value

Posted by GitBox <gi...@apache.org>.
vvysotskyi commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value
URL: https://github.com/apache/drill/pull/2006#discussion_r386956424
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/union/UnionAllRecordBatch.java
 ##########
 @@ -184,38 +184,43 @@ private void createUnionAller(RecordBatch inputBatch) {
       ValueVector vvIn = vw.getValueVector();
       ValueVector vvOut = container.getValueVector(index).getValueVector();
 
-      final ErrorCollector collector = new ErrorCollectorImpl();
-      // According to input data names, Minortypes, Datamodes, choose to
+      MaterializedField inField = vvIn.getField();
+      MaterializedField outputField = vvOut.getField();
+
+      ErrorCollector collector = new ErrorCollectorImpl();
+      // According to input data names, MinorTypes, DataModes, choose to
       // transfer directly,
       // rename columns or
-      // cast data types (Minortype or DataMode)
-      if (container.getSchema().getColumn(index).hasSameTypeAndMode(vvIn.getField())
-          && vvIn.getField().getType().getMinorType() != TypeProtos.MinorType.MAP // Per DRILL-5521, existing bug for map transfer
-          ) {
+      // cast data types (MinorType or DataMode)
+      if (outputField.hasSameTypeAndMode(inField)
+          && inField.getType().getMinorType() != TypeProtos.MinorType.MAP // Per DRILL-5521, existing bug for map transfer
+          && (!Types.areDecimalTypes(inField.getType(), outputField.getType())
+              || (Types.areDecimalTypes(outputField.getType(), inField.getType()) // scale should match for decimal data types
+                  && outputField.getType().getScale() == inField.getType().getScale()))) {
         // Transfer column
         TransferPair tp = vvIn.makeTransferPair(vvOut);
         transfers.add(tp);
-      } else if (vvIn.getField().getType().getMinorType() == TypeProtos.MinorType.NULL) {
+      } else if (inField.getType().getMinorType() == TypeProtos.MinorType.NULL) {
         continue;
       } else { // Copy data in order to rename the column
-        SchemaPath inputPath = SchemaPath.getSimplePath(vvIn.getField().getName());
-        MaterializedField inField = vvIn.getField();
-        MaterializedField outputField = vvOut.getField();
+        SchemaPath inputPath = SchemaPath.getSimplePath(inField.getName());
 
         LogicalExpression expr = ExpressionTreeMaterializer.materialize(inputPath, inputBatch, collector, context.getFunctionRegistry());
         collector.reportErrors(logger);
 
         // If the inputs' DataMode is required and the outputs' DataMode is not required
         // cast to the one with the least restriction
-        if(inField.getType().getMode() == TypeProtos.DataMode.REQUIRED
+        if (inField.getType().getMode() == TypeProtos.DataMode.REQUIRED
             && outputField.getType().getMode() != TypeProtos.DataMode.REQUIRED) {
           expr = ExpressionTreeMaterializer.convertToNullableType(expr, inField.getType().getMinorType(), context.getFunctionRegistry(), collector);
           collector.reportErrors(logger);
         }
 
-        // If two inputs' MinorTypes are different,
-        // Insert a cast before the Union operation
-        if(inField.getType().getMinorType() != outputField.getType().getMinorType()) {
+        // If two inputs' MinorTypes are different or types are decimal with different scales,
+        // inserts a cast before the Union operation
+        if (inField.getType().getMinorType() != outputField.getType().getMinorType()
+            || (Types.areDecimalTypes(outputField.getType(), inField.getType())
+                && outputField.getType().getScale() != inField.getType().getScale())) {
 
 Review comment:
   This condition differs from the condition above. I didn't find similar checks in Drill code. I have extracted this condition into a separate method.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] paul-rogers commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value

Posted by GitBox <gi...@apache.org>.
paul-rogers commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value
URL: https://github.com/apache/drill/pull/2006#discussion_r386776641
 
 

 ##########
 File path: common/src/main/java/org/apache/drill/common/types/Types.java
 ##########
 @@ -795,15 +807,25 @@ public static boolean isSortable(MinorType type) {
    */
   public static MajorType.Builder calculateTypePrecisionAndScale(MajorType leftType, MajorType rightType, MajorType.Builder typeBuilder) {
     if (leftType.getMinorType().equals(rightType.getMinorType())) {
-      final boolean isScalarString = Types.isScalarStringType(leftType) && Types.isScalarStringType(rightType);
-      final boolean isDecimal = isDecimalType(leftType);
+      boolean isScalarString = Types.isScalarStringType(leftType) && Types.isScalarStringType(rightType);
+      boolean isDecimal = isDecimalType(leftType);
 
-      if ((isScalarString || isDecimal) && leftType.hasPrecision() && rightType.hasPrecision()) {
+      if (isScalarString && leftType.hasPrecision() && rightType.hasPrecision()) {
         typeBuilder.setPrecision(Math.max(leftType.getPrecision(), rightType.getPrecision()));
       }
 
-      if (isDecimal && leftType.hasScale() && rightType.hasScale()) {
-        typeBuilder.setScale(Math.max(leftType.getScale(), rightType.getScale()));
+      if (isDecimal) {
+        int scale = Math.max(leftType.getScale(), rightType.getScale());
+        // resulting precision should take into account resulting scale value and be calculated as
+        // sum of two components:
+        // - max integer digits number (precision - scale) for left and right;
+        // - resulting scale.
+        // So for the case of cast(9999 as decimal(4,0)) and cast(1.23 as decimal(3,2))
+        // resulting scale would be Max(0, 2) = 2 and resulting precision would be Max(4 - 0, 3 - 2) + 2 = 6.
+        // In this case, both values would fit into decimal(6, 2): 9999.00, 1.23
+        int precision = Math.max(leftType.getPrecision() - leftType.getScale(), rightType.getPrecision() - rightType.getScale()) + scale;
+        typeBuilder.setPrecision(precision);
+        typeBuilder.setScale(scale);
 
 Review comment:
   Nit: please split the long lint.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value

Posted by GitBox <gi...@apache.org>.
vvysotskyi commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value
URL: https://github.com/apache/drill/pull/2006#discussion_r386907067
 
 

 ##########
 File path: common/src/main/java/org/apache/drill/common/types/Types.java
 ##########
 @@ -131,6 +132,17 @@ public static boolean isIntervalType(MinorType type) {
     }
   }
 
+  /**
+   * Returns true if all specified types are decimal data types.
+   *
+   * @param types types to check
+   * @return true if all specified types are decimal data type.
+   */
+  public static boolean areDecimalTypes(MajorType... types) {
 
 Review comment:
   This method is used in 3 places, so I think it better to introduce it rather than copy and paste the same code.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] arina-ielchiieva commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value

Posted by GitBox <gi...@apache.org>.
arina-ielchiieva commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value
URL: https://github.com/apache/drill/pull/2006#discussion_r386504880
 
 

 ##########
 File path: common/src/main/java/org/apache/drill/common/types/Types.java
 ##########
 @@ -131,6 +132,17 @@ public static boolean isIntervalType(MinorType type) {
     }
   }
 
+  /**
+   * Returns true if all specified types are decimal data types.
+   *
+   * @param types type to check
 
 Review comment:
   ```suggestion
      * @param types types to check
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] paul-rogers commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value

Posted by GitBox <gi...@apache.org>.
paul-rogers commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value
URL: https://github.com/apache/drill/pull/2006#discussion_r386779081
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/union/UnionAllRecordBatch.java
 ##########
 @@ -184,38 +184,43 @@ private void createUnionAller(RecordBatch inputBatch) {
       ValueVector vvIn = vw.getValueVector();
       ValueVector vvOut = container.getValueVector(index).getValueVector();
 
-      final ErrorCollector collector = new ErrorCollectorImpl();
-      // According to input data names, Minortypes, Datamodes, choose to
+      MaterializedField inField = vvIn.getField();
+      MaterializedField outputField = vvOut.getField();
+
+      ErrorCollector collector = new ErrorCollectorImpl();
+      // According to input data names, MinorTypes, DataModes, choose to
       // transfer directly,
       // rename columns or
-      // cast data types (Minortype or DataMode)
-      if (container.getSchema().getColumn(index).hasSameTypeAndMode(vvIn.getField())
-          && vvIn.getField().getType().getMinorType() != TypeProtos.MinorType.MAP // Per DRILL-5521, existing bug for map transfer
-          ) {
+      // cast data types (MinorType or DataMode)
+      if (outputField.hasSameTypeAndMode(inField)
+          && inField.getType().getMinorType() != TypeProtos.MinorType.MAP // Per DRILL-5521, existing bug for map transfer
+          && (!Types.areDecimalTypes(inField.getType(), outputField.getType())
+              || (Types.areDecimalTypes(outputField.getType(), inField.getType()) // scale should match for decimal data types
+                  && outputField.getType().getScale() == inField.getType().getScale()))) {
         // Transfer column
         TransferPair tp = vvIn.makeTransferPair(vvOut);
         transfers.add(tp);
-      } else if (vvIn.getField().getType().getMinorType() == TypeProtos.MinorType.NULL) {
+      } else if (inField.getType().getMinorType() == TypeProtos.MinorType.NULL) {
         continue;
       } else { // Copy data in order to rename the column
-        SchemaPath inputPath = SchemaPath.getSimplePath(vvIn.getField().getName());
-        MaterializedField inField = vvIn.getField();
-        MaterializedField outputField = vvOut.getField();
+        SchemaPath inputPath = SchemaPath.getSimplePath(inField.getName());
 
         LogicalExpression expr = ExpressionTreeMaterializer.materialize(inputPath, inputBatch, collector, context.getFunctionRegistry());
         collector.reportErrors(logger);
 
         // If the inputs' DataMode is required and the outputs' DataMode is not required
         // cast to the one with the least restriction
-        if(inField.getType().getMode() == TypeProtos.DataMode.REQUIRED
+        if (inField.getType().getMode() == TypeProtos.DataMode.REQUIRED
             && outputField.getType().getMode() != TypeProtos.DataMode.REQUIRED) {
           expr = ExpressionTreeMaterializer.convertToNullableType(expr, inField.getType().getMinorType(), context.getFunctionRegistry(), collector);
           collector.reportErrors(logger);
         }
 
-        // If two inputs' MinorTypes are different,
-        // Insert a cast before the Union operation
-        if(inField.getType().getMinorType() != outputField.getType().getMinorType()) {
+        // If two inputs' MinorTypes are different or types are decimal with different scales,
+        // inserts a cast before the Union operation
+        if (inField.getType().getMinorType() != outputField.getType().getMinorType()
+            || (Types.areDecimalTypes(outputField.getType(), inField.getType())
+                && outputField.getType().getScale() != inField.getType().getScale())) {
 
 Review comment:
   Would seem to be a general case. Maybe define a `Types.areAssignableTypes(MajorType type1, MajorType type2)` Would return true only if one type is assignable from the other directly. We probably have the same condition in many places.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value

Posted by GitBox <gi...@apache.org>.
vvysotskyi commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value
URL: https://github.com/apache/drill/pull/2006#discussion_r388156185
 
 

 ##########
 File path: exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestVarlenDecimal.java
 ##########
 @@ -139,8 +145,32 @@ public void testWriteReadCsv() throws Exception {
           .baselineValues(new BigDecimal(bigDecimalValue))
           .go();
     } finally {
-      resetSessionOption(ExecConstants.OUTPUT_FORMAT_OPTION);
-      test("drop table if exists dfs.tmp.%s", tableName);
+      client.resetSession(ExecConstants.OUTPUT_FORMAT_OPTION);
+      run("drop table if exists dfs.tmp.%s", tableName);
+    }
+  }
+
+  @Test
+  public void testUnionAllWithDifferentScales() throws Exception {
+    try {
+      run("create table dfs.tmp.t as select cast(999999999999999 as decimal(15,0)) as d");
+
+      String query = "select cast(1000 as decimal(10,1)) as d\n" +
+          "union all \n" +
+          "select 596.000 as d \n" +
+          "union all \n" +
+          "select d from dfs.tmp.t";
+
+      testBuilder()
+          .sqlQuery(query)
+          .unOrdered()
+          .baselineColumns("d")
+          .baselineValues(new BigDecimal("1000.000"))
+          .baselineValues(new BigDecimal("596.000"))
+          .baselineValues(new BigDecimal("999999999999999.000"))
 
 Review comment:
   @arina-ielchiieva, thanks for pointing to the example, initially I tried a similar approach, but union operator returns multiple batches, so in the case of using `client.queryBuilder().sql(sql).rowSet()`, it will fail with (`IllegalStateException`)[https://github.com/apache/drill/blob/09b805aea4dafe50555b23945302cf8f6c491de8/exec/java-exec/src/test/java/org/apache/drill/test/QueryBuilder.java#L359]. I also tried to use `rowSetIterator()`, as it was advised in the error message, but it returns `QueryRowSetIterator` which actually caused complications for me. I tried several approaches, one of them is iterating through it and comparing separate row sets, which corresponds to returned batches, but such a check would look obscure since the first batch was empty.
   So I ended up with the simpler check for schema using `testBuilder()`.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] paul-rogers commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value

Posted by GitBox <gi...@apache.org>.
paul-rogers commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value
URL: https://github.com/apache/drill/pull/2006#discussion_r386776273
 
 

 ##########
 File path: common/src/main/java/org/apache/drill/common/types/Types.java
 ##########
 @@ -131,6 +132,17 @@ public static boolean isIntervalType(MinorType type) {
     }
   }
 
+  /**
+   * Returns true if all specified types are decimal data types.
+   *
+   * @param types types to check
+   * @return true if all specified types are decimal data type.
+   */
+  public static boolean areDecimalTypes(MajorType... types) {
 
 Review comment:
   General enough to define as a method? Seems we'd end up having similar methods for many predicates.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value

Posted by GitBox <gi...@apache.org>.
vvysotskyi commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value
URL: https://github.com/apache/drill/pull/2006#discussion_r386928587
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/union/UnionAllRecordBatch.java
 ##########
 @@ -184,38 +184,43 @@ private void createUnionAller(RecordBatch inputBatch) {
       ValueVector vvIn = vw.getValueVector();
       ValueVector vvOut = container.getValueVector(index).getValueVector();
 
-      final ErrorCollector collector = new ErrorCollectorImpl();
-      // According to input data names, Minortypes, Datamodes, choose to
+      MaterializedField inField = vvIn.getField();
+      MaterializedField outputField = vvOut.getField();
+
+      ErrorCollector collector = new ErrorCollectorImpl();
+      // According to input data names, MinorTypes, DataModes, choose to
       // transfer directly,
       // rename columns or
-      // cast data types (Minortype or DataMode)
-      if (container.getSchema().getColumn(index).hasSameTypeAndMode(vvIn.getField())
-          && vvIn.getField().getType().getMinorType() != TypeProtos.MinorType.MAP // Per DRILL-5521, existing bug for map transfer
-          ) {
+      // cast data types (MinorType or DataMode)
+      if (outputField.hasSameTypeAndMode(inField)
+          && inField.getType().getMinorType() != TypeProtos.MinorType.MAP // Per DRILL-5521, existing bug for map transfer
+          && (!Types.areDecimalTypes(inField.getType(), outputField.getType())
+              || (Types.areDecimalTypes(outputField.getType(), inField.getType()) // scale should match for decimal data types
+                  && outputField.getType().getScale() == inField.getType().getScale()))) {
 
 Review comment:
   For Decimal data type, the scale is only essential here, since data transfer may be done for the case of different precisions without corrupting the data. Extracted whole this condition into a separate method to make code more readable.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] asfgit closed pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value
URL: https://github.com/apache/drill/pull/2006
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] arina-ielchiieva commented on issue #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value

Posted by GitBox <gi...@apache.org>.
arina-ielchiieva commented on issue #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value
URL: https://github.com/apache/drill/pull/2006#issuecomment-595182240
 
 
   +1 from my side, @paul-rogers is PR ready to go?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] arina-ielchiieva commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value

Posted by GitBox <gi...@apache.org>.
arina-ielchiieva commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value
URL: https://github.com/apache/drill/pull/2006#discussion_r388138604
 
 

 ##########
 File path: exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestVarlenDecimal.java
 ##########
 @@ -139,8 +145,32 @@ public void testWriteReadCsv() throws Exception {
           .baselineValues(new BigDecimal(bigDecimalValue))
           .go();
     } finally {
-      resetSessionOption(ExecConstants.OUTPUT_FORMAT_OPTION);
-      test("drop table if exists dfs.tmp.%s", tableName);
+      client.resetSession(ExecConstants.OUTPUT_FORMAT_OPTION);
+      run("drop table if exists dfs.tmp.%s", tableName);
+    }
+  }
+
+  @Test
+  public void testUnionAllWithDifferentScales() throws Exception {
+    try {
+      run("create table dfs.tmp.t as select cast(999999999999999 as decimal(15,0)) as d");
+
+      String query = "select cast(1000 as decimal(10,1)) as d\n" +
+          "union all \n" +
+          "select 596.000 as d \n" +
+          "union all \n" +
+          "select d from dfs.tmp.t";
+
+      testBuilder()
+          .sqlQuery(query)
+          .unOrdered()
+          .baselineColumns("d")
+          .baselineValues(new BigDecimal("1000.000"))
+          .baselineValues(new BigDecimal("596.000"))
+          .baselineValues(new BigDecimal("999999999999999.000"))
 
 Review comment:
   Not sure why this will be complex.
   Here is an example:
   ```
       RowSet actual = client.queryBuilder().sql(sql).rowSet();
   
       TupleMetadata expectedSchema = new SchemaBuilder()
           // add your schema here 
           .buildSchema();
       RowSet expected = new RowSetBuilder(client.allocator(), expectedSchema)
          // add your columns here
           .build();
       RowSetUtilities.verify(expected, actual);
   ```
   See `TestCsvIgnoreHeaders` for example.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] paul-rogers commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value

Posted by GitBox <gi...@apache.org>.
paul-rogers commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value
URL: https://github.com/apache/drill/pull/2006#discussion_r386779320
 
 

 ##########
 File path: exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestVarlenDecimal.java
 ##########
 @@ -139,8 +145,32 @@ public void testWriteReadCsv() throws Exception {
           .baselineValues(new BigDecimal(bigDecimalValue))
           .go();
     } finally {
-      resetSessionOption(ExecConstants.OUTPUT_FORMAT_OPTION);
-      test("drop table if exists dfs.tmp.%s", tableName);
+      client.resetSession(ExecConstants.OUTPUT_FORMAT_OPTION);
+      run("drop table if exists dfs.tmp.%s", tableName);
+    }
+  }
+
+  @Test
+  public void testUnionAllWithDifferentScales() throws Exception {
+    try {
+      run("create table dfs.tmp.t as select cast(999999999999999 as decimal(15,0)) as d");
+
+      String query = "select cast(1000 as decimal(10,1)) as d\n" +
+          "union all \n" +
+          "select 596.000 as d \n" +
+          "union all \n" +
+          "select d from dfs.tmp.t";
+
+      testBuilder()
+          .sqlQuery(query)
+          .unOrdered()
+          .baselineColumns("d")
+          .baselineValues(new BigDecimal("1000.000"))
+          .baselineValues(new BigDecimal("596.000"))
+          .baselineValues(new BigDecimal("999999999999999.000"))
 
 Review comment:
   This tests *values*. Would it make sense to the use the `RowSet` version so you can also verify *types*, which is the focus of this PR?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value

Posted by GitBox <gi...@apache.org>.
vvysotskyi commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value
URL: https://github.com/apache/drill/pull/2006#discussion_r386526648
 
 

 ##########
 File path: common/src/main/java/org/apache/drill/common/types/Types.java
 ##########
 @@ -131,6 +132,17 @@ public static boolean isIntervalType(MinorType type) {
     }
   }
 
+  /**
+   * Returns true if all specified types are decimal data types.
+   *
+   * @param types type to check
 
 Review comment:
   Thanks, fixed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services