You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by kiszk <gi...@git.apache.org> on 2018/02/19 19:34:10 UTC

[GitHub] spark pull request #20637: Remove redundant null checks in generated Java co...

GitHub user kiszk opened a pull request:

    https://github.com/apache/spark/pull/20637

    Remove redundant null checks in generated Java code by GenerateUnsafeProjection

    ## What changes were proposed in this pull request?
    
    This PR works for one of TODOs in `GenerateUnsafeProjection` "if the nullability of field is correct, we can use it to save null check" to simplify generated code.  
    When `nullable=false` in `DataType`, `GenerateUnsafeProjection` removed code for null checks in the generated Java code.
    
    The following is an example.
    
    Source code
    ```
        val dataType3 = (new StructType)
          .add("a", StringType, nullable = false)
          .add("b", StringType, nullable = false)
          .add("c", StringType, nullable = false)
        val exprs3 = BoundReference(0, dataType3, nullable = false) :: Nil
        val projection3 = GenerateUnsafeProjection.generate(exprs3)
        projection3.apply(InternalRow(AlwaysNonNull))
    ```
    Generated code without this PR
    ```
    /* 001 */ public java.lang.Object generate(Object[] references) {
    /* 002 */   return new SpecificUnsafeProjection(references);
    /* 003 */ }
    /* 004 */
    /* 005 */ class SpecificUnsafeProjection extends org.apache.spark.sql.catalyst.expressions.UnsafeProjection {
    /* 006 */
    /* 007 */   private Object[] references;
    /* 008 */   private org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder[] mutableStateArray1 = new org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder[1];
    /* 009 */   private org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter[] mutableStateArray2 = new org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter[2];
    /* 010 */   private UnsafeRow[] mutableStateArray = new UnsafeRow[1];
    /* 011 */
    /* 012 */   public SpecificUnsafeProjection(Object[] references) {
    /* 013 */     this.references = references;
    /* 014 */     mutableStateArray[0] = new UnsafeRow(1);
    /* 015 */     mutableStateArray1[0] = new org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder(mutableStateArray[0], 32);
    /* 016 */     mutableStateArray2[0] = new org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter(mutableStateArray1[0], 1);
    /* 017 */     mutableStateArray2[1] = new org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter(mutableStateArray1[0], 3);
    /* 018 */
    /* 019 */   }
    /* 020 */
    /* 021 */   public void initialize(int partitionIndex) {
    /* 022 */
    /* 023 */   }
    /* 024 */
    /* 025 */   // Scala.Function1 need this
    /* 026 */   public java.lang.Object apply(java.lang.Object row) {
    /* 027 */     return apply((InternalRow) row);
    /* 028 */   }
    /* 029 */
    /* 030 */   public UnsafeRow apply(InternalRow i) {
    /* 031 */     mutableStateArray1[0].reset();
    /* 032 */
    /* 033 */     InternalRow value = i.getStruct(0, 3);
    /* 034 */     // Remember the current cursor so that we can calculate how many bytes are
    /* 035 */     // written later.
    /* 036 */     final int tmpCursor = mutableStateArray1[0].cursor;
    /* 037 */
    /* 038 */     final InternalRow tmpInput = value;
    /* 039 */     if (tmpInput instanceof UnsafeRow) {
    /* 040 */
    /* 041 */       final int sizeInBytes = ((UnsafeRow) tmpInput).getSizeInBytes();
    /* 042 */       // grow the global buffer before writing data.
    /* 043 */       mutableStateArray1[0].grow(sizeInBytes);
    /* 044 */       ((UnsafeRow) tmpInput).writeToMemory(mutableStateArray1[0].buffer, mutableStateArray1[0].cursor);
    /* 045 */       mutableStateArray1[0].cursor += sizeInBytes;
    /* 046 */
    /* 047 */     } else {
    /* 048 */       mutableStateArray2[1].reset();
    /* 049 */
    /* 050 */
    /* 051 */       if (tmpInput.isNullAt(0)) {
    /* 052 */         mutableStateArray2[1].setNullAt(0);
    /* 053 */       } else {
    /* 054 */         mutableStateArray2[1].write(0, tmpInput.getUTF8String(0));
    /* 055 */       }
    /* 056 */
    /* 057 */
    /* 058 */       if (tmpInput.isNullAt(1)) {
    /* 059 */         mutableStateArray2[1].setNullAt(1);
    /* 060 */       } else {
    /* 061 */         mutableStateArray2[1].write(1, tmpInput.getUTF8String(1));
    /* 062 */       }
    /* 063 */
    /* 064 */
    /* 065 */       if (tmpInput.isNullAt(2)) {
    /* 066 */         mutableStateArray2[1].setNullAt(2);
    /* 067 */       } else {
    /* 068 */         mutableStateArray2[1].write(2, tmpInput.getUTF8String(2));
    /* 069 */       }
    /* 070 */     }
    /* 071 */
    /* 072 */     mutableStateArray2[0].setOffsetAndSize(0, tmpCursor, mutableStateArray1[0].cursor - tmpCursor);
    /* 073 */     mutableStateArray[0].setTotalSize(mutableStateArray1[0].totalSize());
    /* 074 */     return mutableStateArray[0];
    /* 075 */   }
    /* 076 */
    /* 077 */
    /* 078 */ }
    ```
    
    Generated code with this PR
    ```
    /* 001 */ public java.lang.Object generate(Object[] references) {
    /* 002 */   return new SpecificUnsafeProjection(references);
    /* 003 */ }
    /* 004 */
    /* 005 */ class SpecificUnsafeProjection extends org.apache.spark.sql.catalyst.expressions.UnsafeProjection {
    /* 006 */
    /* 007 */   private Object[] references;
    /* 008 */   private org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder[] mutableStateArray1 = new org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder[1];
    /* 009 */   private org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter[] mutableStateArray2 = new org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter[2];
    /* 010 */   private UnsafeRow[] mutableStateArray = new UnsafeRow[1];
    /* 011 */
    /* 012 */   public SpecificUnsafeProjection(Object[] references) {
    /* 013 */     this.references = references;
    /* 014 */     mutableStateArray[0] = new UnsafeRow(1);
    /* 015 */     mutableStateArray1[0] = new org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder(mutableStateArray[0], 32);
    /* 016 */     mutableStateArray2[0] = new org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter(mutableStateArray1[0], 1);
    /* 017 */     mutableStateArray2[1] = new org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter(mutableStateArray1[0], 3);
    /* 018 */
    /* 019 */   }
    /* 020 */
    /* 021 */   public void initialize(int partitionIndex) {
    /* 022 */
    /* 023 */   }
    /* 024 */
    /* 025 */   // Scala.Function1 need this
    /* 026 */   public java.lang.Object apply(java.lang.Object row) {
    /* 027 */     return apply((InternalRow) row);
    /* 028 */   }
    /* 029 */
    /* 030 */   public UnsafeRow apply(InternalRow i) {
    /* 031 */     mutableStateArray1[0].reset();
    /* 032 */
    /* 033 */     InternalRow value = i.getStruct(0, 3);
    /* 034 */     // Remember the current cursor so that we can calculate how many bytes are
    /* 035 */     // written later.
    /* 036 */     final int tmpCursor = mutableStateArray1[0].cursor;
    /* 037 */
    /* 038 */     final InternalRow tmpInput = value;
    /* 039 */     if (tmpInput instanceof UnsafeRow) {
    /* 040 */
    /* 041 */       final int sizeInBytes = ((UnsafeRow) tmpInput).getSizeInBytes();
    /* 042 */       // grow the global buffer before writing data.
    /* 043 */       mutableStateArray1[0].grow(sizeInBytes);
    /* 044 */       ((UnsafeRow) tmpInput).writeToMemory(mutableStateArray1[0].buffer, mutableStateArray1[0].cursor);
    /* 045 */       mutableStateArray1[0].cursor += sizeInBytes;
    /* 046 */
    /* 047 */     } else {
    /* 048 */       mutableStateArray2[1].reset();
    /* 049 */
    /* 050 */
    /* 051 */       mutableStateArray2[1].write(0, tmpInput.getUTF8String(0));
    /* 052 */
    /* 053 */
    /* 054 */       mutableStateArray2[1].write(1, tmpInput.getUTF8String(1));
    /* 055 */
    /* 056 */
    /* 057 */       mutableStateArray2[1].write(2, tmpInput.getUTF8String(2));
    /* 058 */     }
    /* 059 */
    /* 060 */     mutableStateArray2[0].setOffsetAndSize(0, tmpCursor, mutableStateArray1[0].cursor - tmpCursor);
    /* 061 */     mutableStateArray[0].setTotalSize(mutableStateArray1[0].totalSize());
    /* 062 */     return mutableStateArray[0];
    /* 063 */   }
    /* 064 */
    /* 065 */
    /* 066 */ }
    ```
    
    ## How was this patch tested?
    
    Added new test cases into `GenerateUnsafeProjectionSuite`

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kiszk/spark SPARK-23466

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20637.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20637
    
----
commit e2e9e3604284949d9d762274e2e1f55348851073
Author: Kazuaki Ishizaki <is...@...>
Date:   2018-02-19T19:31:36Z

    initial commit

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r211693601
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala ---
    @@ -43,25 +45,30 @@ object GenerateUnsafeProjection extends CodeGenerator[Seq[Expression], UnsafePro
         case _ => false
       }
     
    -  // TODO: if the nullability of field is correct, we can use it to save null check.
       private def writeStructToBuffer(
           ctx: CodegenContext,
           input: String,
           index: String,
    -      fieldTypes: Seq[DataType],
    +      fieldTypeAndNullables: Seq[Schema],
           rowWriter: String): String = {
         // Puts `input` in a local variable to avoid to re-evaluate it if it's a statement.
         val tmpInput = ctx.freshName("tmpInput")
    -    val fieldEvals = fieldTypes.zipWithIndex.map { case (dt, i) =>
    -      ExprCode(
    -        JavaCode.isNullExpression(s"$tmpInput.isNullAt($i)"),
    -        JavaCode.expression(CodeGenerator.getValue(tmpInput, dt, i.toString), dt))
    +    val fieldEvals = fieldTypeAndNullables.zipWithIndex.map { case (dtNullable, i) =>
    +      val isNull = if (dtNullable.nullable) {
    +        JavaCode.isNullExpression(s"$tmpInput.isNullAt($i)")
    +      } else {
    +        FalseLiteral
    +      }
    +      ExprCode(isNull, JavaCode.expression(
    +        CodeGenerator.getValue(tmpInput, dtNullable.dataType, i.toString), dtNullable.dataType))
         }
     
         val rowWriterClass = classOf[UnsafeRowWriter].getName
         val structRowWriter = ctx.addMutableState(rowWriterClass, "rowWriter",
           v => s"$v = new $rowWriterClass($rowWriter, ${fieldEvals.length});")
         val previousCursor = ctx.freshName("previousCursor")
    +    val structExpressions = writeExpressionsToBuffer(
    +      ctx, tmpInput, fieldEvals, fieldTypeAndNullables.map(_.dataType), structRowWriter)
    --- End diff --
    
    Ah, i see. `expression.genCode()` called in `ctx.generateExpressions()` looks conservation.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #87548 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87548/testReport)** for PR 20637 at commit [`e2e9e36`](https://github.com/apache/spark/commit/e2e9e3604284949d9d762274e2e1f55348851073).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2334/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: Remove redundant null checks in generated Java code by G...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87544/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: Remove redundant null checks in generated Java code by G...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94973/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r211132393
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala ---
    @@ -43,25 +45,30 @@ object GenerateUnsafeProjection extends CodeGenerator[Seq[Expression], UnsafePro
         case _ => false
       }
     
    -  // TODO: if the nullability of field is correct, we can use it to save null check.
       private def writeStructToBuffer(
           ctx: CodegenContext,
           input: String,
           index: String,
    -      fieldTypes: Seq[DataType],
    +      fieldTypeAndNullables: Seq[Schema],
           rowWriter: String): String = {
         // Puts `input` in a local variable to avoid to re-evaluate it if it's a statement.
         val tmpInput = ctx.freshName("tmpInput")
    -    val fieldEvals = fieldTypes.zipWithIndex.map { case (dt, i) =>
    -      ExprCode(
    -        JavaCode.isNullExpression(s"$tmpInput.isNullAt($i)"),
    -        JavaCode.expression(CodeGenerator.getValue(tmpInput, dt, i.toString), dt))
    +    val fieldEvals = fieldTypeAndNullables.zipWithIndex.map { case (dtNullable, i) =>
    +      val isNull = if (dtNullable.nullable) {
    +        JavaCode.isNullExpression(s"$tmpInput.isNullAt($i)")
    +      } else {
    +        FalseLiteral
    +      }
    +      ExprCode(isNull, JavaCode.expression(
    +        CodeGenerator.getValue(tmpInput, dtNullable.dataType, i.toString), dtNullable.dataType))
         }
     
         val rowWriterClass = classOf[UnsafeRowWriter].getName
         val structRowWriter = ctx.addMutableState(rowWriterClass, "rowWriter",
           v => s"$v = new $rowWriterClass($rowWriter, ${fieldEvals.length});")
         val previousCursor = ctx.freshName("previousCursor")
    +    val structExpressions = writeExpressionsToBuffer(
    +      ctx, tmpInput, fieldEvals, fieldTypeAndNullables.map(_.dataType), structRowWriter)
    --- End diff --
    
    Don't we need to pass nullable to `writeExpressionsToBuffer`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r212817682
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala ---
    @@ -223,8 +223,9 @@ trait ExpressionEvalHelper extends GeneratorDrivenPropertyChecks with PlanTestBa
               }
             } else {
               val lit = InternalRow(expected, expected)
    +          val dtAsNullable = expression.dataType.asNullable
    --- End diff --
    
    Sorry, I think I have not explained properly my idea/suggestion and I think there is somthing wrong in what you are saying. Without #22126 and with this PR we have two scenarios:
    
     - Using this `asNullable`: the test fails with incorrect evaluation.
     - Without this `asNullable` the test fails with NullPointerException (it is not passing unexpectedly).
    
    So, IMHO, the right behavior is the second one. For 2 reasons: this is what a user would get in such a scenario; in the first case what you return is a wrong result, which may also be equal to the right one in very weird conditions (with the risk of non-detecting error situation).
    
    So I am strongly against introducing this `asNullable` here. I think we just need to leave it as before.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #95044 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95044/testReport)** for PR 20637 at commit [`d379718`](https://github.com/apache/spark/commit/d379718466f26d38ad5e3a87d22d654aa924d2ae).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r211365612
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala ---
    @@ -43,25 +45,30 @@ object GenerateUnsafeProjection extends CodeGenerator[Seq[Expression], UnsafePro
         case _ => false
       }
     
    -  // TODO: if the nullability of field is correct, we can use it to save null check.
       private def writeStructToBuffer(
           ctx: CodegenContext,
           input: String,
           index: String,
    -      fieldTypes: Seq[DataType],
    +      fieldTypeAndNullables: Seq[Schema],
           rowWriter: String): String = {
         // Puts `input` in a local variable to avoid to re-evaluate it if it's a statement.
         val tmpInput = ctx.freshName("tmpInput")
    -    val fieldEvals = fieldTypes.zipWithIndex.map { case (dt, i) =>
    -      ExprCode(
    -        JavaCode.isNullExpression(s"$tmpInput.isNullAt($i)"),
    -        JavaCode.expression(CodeGenerator.getValue(tmpInput, dt, i.toString), dt))
    +    val fieldEvals = fieldTypeAndNullables.zipWithIndex.map { case (dtNullable, i) =>
    +      val isNull = if (dtNullable.nullable) {
    +        JavaCode.isNullExpression(s"$tmpInput.isNullAt($i)")
    +      } else {
    +        FalseLiteral
    +      }
    +      ExprCode(isNull, JavaCode.expression(
    +        CodeGenerator.getValue(tmpInput, dtNullable.dataType, i.toString), dtNullable.dataType))
         }
     
         val rowWriterClass = classOf[UnsafeRowWriter].getName
         val structRowWriter = ctx.addMutableState(rowWriterClass, "rowWriter",
           v => s"$v = new $rowWriterClass($rowWriter, ${fieldEvals.length});")
         val previousCursor = ctx.freshName("previousCursor")
    +    val structExpressions = writeExpressionsToBuffer(
    +      ctx, tmpInput, fieldEvals, fieldTypeAndNullables.map(_.dataType), structRowWriter)
    --- End diff --
    
    I think `nullable` is passed thru `fieldEvals`.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    with the test removed, do we still need this change? https://github.com/apache/spark/pull/20637/files#diff-41747ec3f56901eb7bfb95d2a217e94dR226


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r208677695
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala ---
    @@ -308,10 +319,10 @@ object GenerateUnsafeProjection extends CodeGenerator[Seq[Expression], UnsafePro
           expressions: Seq[Expression],
           useSubexprElimination: Boolean = false): ExprCode = {
         val exprEvals = ctx.generateExpressions(expressions, useSubexprElimination)
    -    val exprTypes = expressions.map(_.dataType)
    +    val exprTypeAndNullables = expressions.map(e => (e.dataType, e.nullable))
     
    -    val numVarLenFields = exprTypes.count {
    -      case dt if UnsafeRow.isFixedLength(dt) => false
    +    val numVarLenFields = exprTypeAndNullables.count {
    +      case (dt, _) if UnsafeRow.isFixedLength(dt) => false
    --- End diff --
    
    sure


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #95205 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95205/testReport)** for PR 20637 at commit [`d7ef82c`](https://github.com/apache/spark/commit/d7ef82c53c10e6953d077801045ba1438c6670ab).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #94464 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94464/testReport)** for PR 20637 at commit [`3ec2b19`](https://github.com/apache/spark/commit/3ec2b19a0a4f5b1b7090f4a18ac153f6751efeba).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r209160237
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala ---
    @@ -43,25 +43,29 @@ object GenerateUnsafeProjection extends CodeGenerator[Seq[Expression], UnsafePro
         case _ => false
       }
     
    -  // TODO: if the nullability of field is correct, we can use it to save null check.
       private def writeStructToBuffer(
           ctx: CodegenContext,
           input: String,
           index: String,
    -      fieldTypes: Seq[DataType],
    +      fieldTypeAndNullables: Seq[(DataType, Boolean)],
    --- End diff --
    
    shall we create a class for `(DataType, Boolean)`? it can also be used in https://github.com/apache/spark/pull/22063


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r211468226
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala ---
    @@ -43,25 +45,30 @@ object GenerateUnsafeProjection extends CodeGenerator[Seq[Expression], UnsafePro
         case _ => false
       }
     
    -  // TODO: if the nullability of field is correct, we can use it to save null check.
       private def writeStructToBuffer(
           ctx: CodegenContext,
           input: String,
           index: String,
    -      fieldTypes: Seq[DataType],
    +      fieldTypeAndNullables: Seq[Schema],
           rowWriter: String): String = {
         // Puts `input` in a local variable to avoid to re-evaluate it if it's a statement.
         val tmpInput = ctx.freshName("tmpInput")
    -    val fieldEvals = fieldTypes.zipWithIndex.map { case (dt, i) =>
    -      ExprCode(
    -        JavaCode.isNullExpression(s"$tmpInput.isNullAt($i)"),
    -        JavaCode.expression(CodeGenerator.getValue(tmpInput, dt, i.toString), dt))
    +    val fieldEvals = fieldTypeAndNullables.zipWithIndex.map { case (dtNullable, i) =>
    +      val isNull = if (dtNullable.nullable) {
    +        JavaCode.isNullExpression(s"$tmpInput.isNullAt($i)")
    +      } else {
    +        FalseLiteral
    +      }
    +      ExprCode(isNull, JavaCode.expression(
    +        CodeGenerator.getValue(tmpInput, dtNullable.dataType, i.toString), dtNullable.dataType))
         }
     
         val rowWriterClass = classOf[UnsafeRowWriter].getName
         val structRowWriter = ctx.addMutableState(rowWriterClass, "rowWriter",
           v => s"$v = new $rowWriterClass($rowWriter, ${fieldEvals.length});")
         val previousCursor = ctx.freshName("previousCursor")
    +    val structExpressions = writeExpressionsToBuffer(
    +      ctx, tmpInput, fieldEvals, fieldTypeAndNullables.map(_.dataType), structRowWriter)
    --- End diff --
    
    I see here, but another call of `writeExpressionsToBuffer` from `createCode` should pass nullable to `writeExpressionsToBuffer` because `exprEvals.isNull` there is not always `FalseLiteral` even if an expression is non-nullable?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r212832680
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala ---
    @@ -223,8 +223,9 @@ trait ExpressionEvalHelper extends GeneratorDrivenPropertyChecks with PlanTestBa
               }
             } else {
               val lit = InternalRow(expected, expected)
    +          val dtAsNullable = expression.dataType.asNullable
    --- End diff --
    
    I understood what you said but this is not what I have experienced. I applied your patch on current master. Then I reverted #22126. What I saw (and this is what I'd expect indeed) is that:
    
     - with your patch as it is (ie. with `asNullable`) the test failed because the result is incorrect;
     - without `asNullable` the test failed with a NPE.
    
    Probably someone else can also try and check who is wrong here. cc @ueshin @cloud-fan 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95550/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    LGTM


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95479/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2701/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #87548 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87548/testReport)** for PR 20637 at commit [`e2e9e36`](https://github.com/apache/spark/commit/e2e9e3604284949d9d762274e2e1f55348851073).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95109/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #87549 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87549/testReport)** for PR 20637 at commit [`e2e9e36`](https://github.com/apache/spark/commit/e2e9e3604284949d9d762274e2e1f55348851073).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Now the topic becomes detecting bad UTs, instead of "Remove redundant null checks".
    
    Can we focus on "Remove redundant null checks" in this PR and send another PR for detecting bad UTs?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2448/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r212569851
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala ---
    @@ -70,7 +76,7 @@ object GenerateUnsafeProjection extends CodeGenerator[Seq[Expression], UnsafePro
            |  // Remember the current cursor so that we can calculate how many bytes are
            |  // written later.
            |  final int $previousCursor = $rowWriter.cursor();
    -       |  ${writeExpressionsToBuffer(ctx, tmpInput, fieldEvals, fieldTypes, structRowWriter)}
    --- End diff --
    
    nit: this seems an unneeded change


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r211132711
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelperSuite.scala ---
    @@ -35,6 +35,24 @@ class ExpressionEvalHelperSuite extends SparkFunSuite with ExpressionEvalHelper
         val e = intercept[RuntimeException] { checkEvaluation(BadCodegenExpression(), 10) }
         assert(e.getMessage.contains("some_variable"))
       }
    +
    +  test("SPARK-23466: checkEvaluationWithUnsafeProjection should fail if null is compared with " +
    +    "primitive default value") {
    +    val expected = Array(null, -1, 0, 1)
    +    val catalystValue = CatalystTypeConverters.convertToCatalyst(expected)
    +
    +    val expression1 = CreateArray(
    +      Seq(Literal(null, IntegerType), Literal(-1), Literal(0), Literal(1)))
    +    assert(expression1.dataType.containsNull)
    +    checkEvaluationWithUnsafeProjection(expression1, catalystValue)
    +
    +    val expression2 = CreateArray(Seq(Literal(0, IntegerType), Literal(-1), Literal(0), Literal(1)))
    +    assert(!expression2.dataType.containsNull)
    +    val e = intercept[RuntimeException] {
    +      checkEvaluationWithUnsafeProjection(expression2, catalystValue)
    --- End diff --
    
    We should use `checkExceptionInExpression`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    BTW there are so many kinds of bad UTs, we should clearly define the scope when fixing it.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1975/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r208698654
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala ---
    @@ -219,15 +235,17 @@ object GenerateUnsafeProjection extends CodeGenerator[Seq[Expression], UnsafePro
            |  // Remember the current cursor so that we can write numBytes of key array later.
            |  final int $tmpCursor = $rowWriter.cursor();
            |
    -       |  ${writeArrayToBuffer(ctx, s"$tmpInput.keyArray()", keyType, rowWriter)}
    +       |  ${writeArrayToBuffer(
    +               ctx, s"$tmpInput.keyArray()", keyType, false, rowWriter)}
    --- End diff --
    
    this can be on one line right?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87548/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #94896 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94896/testReport)** for PR 20637 at commit [`675409e`](https://github.com/apache/spark/commit/675409e7de8b25082b12433eb3867a20ec6786f4).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Thanks! merging to master.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #94899 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94899/testReport)** for PR 20637 at commit [`84961b4`](https://github.com/apache/spark/commit/84961b44d0f846e241c322f0f80d8dc032f6008d).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r212926645
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala ---
    @@ -223,8 +223,9 @@ trait ExpressionEvalHelper extends GeneratorDrivenPropertyChecks with PlanTestBa
               }
             } else {
               val lit = InternalRow(expected, expected)
    +          val dtAsNullable = expression.dataType.asNullable
    --- End diff --
    
    thanks for your comment @ueshin. If I am the only one with a different opinion that's ok, but I don't really agree, as it means that we are testing something different from what is run by users. Moreover the same can happen everywhere else in the codebase where we don't check nullabilities.
    
    Anyway, if I am the only one with a different opinion, that's fine. Thanks.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #94973 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94973/testReport)** for PR 20637 at commit [`cac9484`](https://github.com/apache/spark/commit/cac948426c7f40fafe829ea10317cf5759e6d78a).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r212882312
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala ---
    @@ -223,8 +223,9 @@ trait ExpressionEvalHelper extends GeneratorDrivenPropertyChecks with PlanTestBa
               }
             } else {
               val lit = InternalRow(expected, expected)
    +          val dtAsNullable = expression.dataType.asNullable
    --- End diff --
    
    I think we need `asNullable` there.
    From the example of `MapZipWith` with wrong nullabilities (before #22126), the test [HigherOrderFunctionsSuite.scala#L312-L314](https://github.com/apache/spark/blob/d7ef82c53c10e6953d077801045ba1438c6670ab/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/HigherOrderFunctionsSuite.scala#L312-L314) should apparently fail and with `asNullable` can detect it, whereas without `asNullable`, the test passes mistakenly.
    
    I can understand that we should fail with a NPE, but as for primitive types with wrong nullabilities (in case where it should be `true` but was `false` by mistake), we use the default value for the type, and create a wrong result instead of throwing a NPE. If without `asNullable`, the expected value will also be converted to the wrong value and the comparison will success between both wrong values, but if with `asNullable`, the expected value will stay the value as is without converting to the wrong value, then we can detect the wrong actual value.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #95550 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95550/testReport)** for PR 20637 at commit [`88c74c6`](https://github.com/apache/spark/commit/88c74c61d5ab64eb860b91261d013702983ac49c).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95044/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #94539 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94539/testReport)** for PR 20637 at commit [`0f7ae11`](https://github.com/apache/spark/commit/0f7ae11df7e53cdcf4576b7f558f3135e951dcb7).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #95479 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95479/testReport)** for PR 20637 at commit [`37dc4d8`](https://github.com/apache/spark/commit/37dc4d8da0d9654c36494b516cfdfb869e66afc2).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Note that if we miss non-primitive type tests unfortunately, we can't detect the bad UTs without `asNullable` because they won't throw any exceptions but just use the default value.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    +1 for ^


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r212832157
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala ---
    @@ -223,8 +223,9 @@ trait ExpressionEvalHelper extends GeneratorDrivenPropertyChecks with PlanTestBa
               }
             } else {
               val lit = InternalRow(expected, expected)
    +          val dtAsNullable = expression.dataType.asNullable
    --- End diff --
    
    > Without this asNullable the test fails with NullPointerException (it is not passing unexpectedly).
    
    As I wrote [here](https://github.com/apache/spark/pull/20637#discussion_r212805339) as `The unexpected test pass can be recreated without #22126 and with #20367 by removing asNullable.`, the following behavior occurs without #22126,
    
    * Using this `asNullable`: this test fails since the result of `MapZipWith` is incorrect
    * Without this `asNullable` the test passes while the result of `MapZipWith` is incorrect
    
    The second one is `not` right behavior. Without `asNullable`, we will have the risk of non-detecting error situation. Therefore, I introduced `asNullable`.



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    @cloud-fan @mgaido91 Do you have any other comments on this?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r212589702
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala ---
    @@ -223,8 +223,9 @@ trait ExpressionEvalHelper extends GeneratorDrivenPropertyChecks with PlanTestBa
               }
             } else {
               val lit = InternalRow(expected, expected)
    +          val dtAsNullable = expression.dataType.asNullable
    --- End diff --
    
    Mmh, but a test using a non-nullable expression returning `null` should fail IMHO. So if without this there are test failures, I think those tests are wrong and/or we are hiding potential bugs in the codebase, right? Are there many test failures without this change?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #94904 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94904/testReport)** for PR 20637 at commit [`84961b4`](https://github.com/apache/spark/commit/84961b44d0f846e241c322f0f80d8dc032f6008d).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2283/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r213884870
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelperSuite.scala ---
    @@ -35,6 +35,24 @@ class ExpressionEvalHelperSuite extends SparkFunSuite with ExpressionEvalHelper
         val e = intercept[RuntimeException] { checkEvaluation(BadCodegenExpression(), 10) }
         assert(e.getMessage.contains("some_variable"))
       }
    +
    +  test("SPARK-23466: checkEvaluationWithUnsafeProjection should fail if null is compared with " +
    --- End diff --
    
    is this necessary? It looks to me that this is a malformed test case: you are putting a wrong expected value.
    
    It's good to improve the test framework to detect this kind of wrongly written tests, but I don't think we have to do it now.
    
    Another topic is, when the nullability doesn't match the data, how Spark should detect it. This is a different story and we can investigate later.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94881/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #94806 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94806/testReport)** for PR 20637 at commit [`99731ca`](https://github.com/apache/spark/commit/99731ca058f0d8946397530aff76d3c55fa93162).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `  case class Schema(dataType: DataType, nullable: Boolean)`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r212683385
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala ---
    @@ -223,8 +223,9 @@ trait ExpressionEvalHelper extends GeneratorDrivenPropertyChecks with PlanTestBa
               }
             } else {
               val lit = InternalRow(expected, expected)
    +          val dtAsNullable = expression.dataType.asNullable
    --- End diff --
    
    Without this change (`dataType.asNullable`), `MapZipWith` test case would not work correctly. `NullPointerException` occurs when `UnsafeArrayData` for `expected` is created.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2214/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/962/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r212805339
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala ---
    @@ -223,8 +223,9 @@ trait ExpressionEvalHelper extends GeneratorDrivenPropertyChecks with PlanTestBa
               }
             } else {
               val lit = InternalRow(expected, expected)
    +          val dtAsNullable = expression.dataType.asNullable
    --- End diff --
    
    I am sorry that I made a mistake to explain somewhere. Let me explain this again.
    
    `MapZipWith` without #22126 was incorrect since it generated `DataType(MapType(IntegerType, IntegerType, valueContainsNull = false))` for the result `Map(1 -> -10, 2 -> -80, 3 -> null, 4 -> null)`. This problem was fixed by #22126. Such a mistake will be deteched if the same issue would occur in the future.
    
    Now, we assume that another function `A` would generate `DataType(MapType(IntegerType, IntegerType, valueContainsNull = false))` for the result `Map(1 -> null)` if the expected result is `Map(1 -> null)`.   
    `ExpressionEvalHelper.checkEvaluationWithUnsafeProjection()` generates `Unsafe...` for `expected`. If this PR (#20367) is not applied, the generated projection code always inserts `isNullAt()`. Therefore, the `UnsafeMapData(1 -> null)` is generated while `DataType(MapType(IntegerType, IntegerType, valueContainsNull = false))`. 
     
    As you know `ExpressionEvalHelper.checkEvaluationWithUnsafeProjection()` uses `DataType` of `expression` when `Unsafe...` for `expected` is generated. This PR (#20367) removes `isNullAt()` from the generated projection code if `DataType.nullable` is `false`. If `Map(1 -> null)` with `DataType(MapType(IntegerType, IntegerType, valueContainsNull = false))` will be converted to `UnsafeArrayData` for `expect`, the test is succeeded **unexpectedly**. This is because `nullbits` in  `UnsafeMapData` for `expect` is not set. This `UnsafeMapData` is equal to the `UnsafeMapData` in `expression`.
    
    If `expression.dataType.asNullable` is applied, the generated projection code for `expected` has `isNullAt()`. Thus, while `expression.dataType` is incorrect, the `expected` is correctly converted into `Unsafe...` with `null` elements.
    
    The **unexpected** test pass can be recreated without #22126 and with #20367 by removing `asNullable`.
    
    The **correct** test failure can be produced without #22126 and with #20367 by keeping `asNullable` as follows:
    
    ```
    org.scalatest.exceptions.TestFailedException: Incorrect evaluation in unsafe mode: map_zip_with(keys: [1,2,3], values: [10,20,30], keys: [1,2,4], values: [-1,-2,-4], lambdafunction(((lambda arg1#10 * lambda arg2#11) * lambda arg3#12), lambda arg1#10, lambda arg2#11, lambda arg3#12, false)), actual: [0,1800000048,6000000048,20,4,0,200000001,400000003,4,0,ffffffb0fffffff6,0,20,4,0,200000001,400000003,4,0,ffffffb0fffffff6,0], expected: [0,1800000048,6000000048,20,4,0,200000001,400000003,4,c,ffffffb0fffffff6,0,20,4,0,200000001,400000003,4,c,ffffffb0fffffff6,0]
    ```
    
    I would appreciate it if you would have better solution.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    cc @ueshin @cloud-fan 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2393/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2027/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r211131717
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala ---
    @@ -43,25 +45,30 @@ object GenerateUnsafeProjection extends CodeGenerator[Seq[Expression], UnsafePro
         case _ => false
       }
     
    -  // TODO: if the nullability of field is correct, we can use it to save null check.
       private def writeStructToBuffer(
           ctx: CodegenContext,
           input: String,
           index: String,
    -      fieldTypes: Seq[DataType],
    +      fieldTypeAndNullables: Seq[Schema],
           rowWriter: String): String = {
         // Puts `input` in a local variable to avoid to re-evaluate it if it's a statement.
         val tmpInput = ctx.freshName("tmpInput")
    -    val fieldEvals = fieldTypes.zipWithIndex.map { case (dt, i) =>
    -      ExprCode(
    -        JavaCode.isNullExpression(s"$tmpInput.isNullAt($i)"),
    -        JavaCode.expression(CodeGenerator.getValue(tmpInput, dt, i.toString), dt))
    +    val fieldEvals = fieldTypeAndNullables.zipWithIndex.map { case (dtNullable, i) =>
    --- End diff --
    
    nit: how about `case (Schema(dt, nullable), i) =>`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #94878 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94878/testReport)** for PR 20637 at commit [`99731ca`](https://github.com/apache/spark/commit/99731ca058f0d8946397530aff76d3c55fa93162).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `  case class Schema(dataType: DataType, nullable: Boolean)`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r209692496
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala ---
    @@ -43,25 +43,29 @@ object GenerateUnsafeProjection extends CodeGenerator[Seq[Expression], UnsafePro
         case _ => false
       }
     
    -  // TODO: if the nullability of field is correct, we can use it to save null check.
       private def writeStructToBuffer(
           ctx: CodegenContext,
           input: String,
           index: String,
    -      fieldTypes: Seq[DataType],
    +      fieldTypeAndNullables: Seq[(DataType, Boolean)],
    --- End diff --
    
    @cloud-fan I found this case class `case class Schema(dataType: DataType, nullable: Boolean)` at two places.
    1. `ScalaReflection.Schema`
    1. `SchemaConverters.SchemaType`
    
    Do we use one of them? Or do we define `org.apache.spark.sql.types.Schema`?



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r212625801
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala ---
    @@ -223,8 +223,9 @@ trait ExpressionEvalHelper extends GeneratorDrivenPropertyChecks with PlanTestBa
               }
             } else {
               val lit = InternalRow(expected, expected)
    +          val dtAsNullable = expression.dataType.asNullable
    --- End diff --
    
    sorry, I am not sure I got this. #22126 was merged, so if we remove this fix from this PR, are there failures? If so are they only related to `MapZipWith`? Because if that is the case, I'd rather fix those tests than this. Am I missing something? Thanks.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: Remove redundant null checks in generated Java code by G...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/20637


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95560/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r213013507
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala ---
    @@ -223,8 +223,9 @@ trait ExpressionEvalHelper extends GeneratorDrivenPropertyChecks with PlanTestBa
               }
             } else {
               val lit = InternalRow(expected, expected)
    +          val dtAsNullable = expression.dataType.asNullable
    --- End diff --
    
    @ueshin @cloud-fan Thank you for good summary.
    
    I think that this does not reduce test coverage.  
    This ` dtAsNullable = expression.dataType.asNullable` is used only for generating `expected`. This `asNullable` does not change `dataType` of `expression`. Thus, this does not change our optimization assumption.



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94896/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r206841463
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala ---
    @@ -142,7 +143,7 @@ object GenerateUnsafeProjection extends CodeGenerator[Seq[Expression], UnsafePro
               case _ => s"$rowWriter.write($index, ${input.value});"
             }
     
    -        if (input.isNull == "false") {
    +        if (input.isNull == "false" || !nullable) {
    --- End diff --
    
    aren't those checks equivalent?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #94539 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94539/testReport)** for PR 20637 at commit [`0f7ae11`](https://github.com/apache/spark/commit/0f7ae11df7e53cdcf4576b7f558f3135e951dcb7).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r213689187
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelperSuite.scala ---
    @@ -35,6 +35,24 @@ class ExpressionEvalHelperSuite extends SparkFunSuite with ExpressionEvalHelper
         val e = intercept[RuntimeException] { checkEvaluation(BadCodegenExpression(), 10) }
         assert(e.getMessage.contains("some_variable"))
       }
    +
    +  test("SPARK-23466: checkEvaluationWithUnsafeProjection should fail if null is compared with " +
    --- End diff --
    
    @cloud-fan This test tries to confirm whether `Array(null, -1, 0, 1)` with `ArrayType(IntegerType, false)` in `expression2` should fail when it is compared with `expected = Array(null, -1, 0, 1) `.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #94878 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94878/testReport)** for PR 20637 at commit [`99731ca`](https://github.com/apache/spark/commit/99731ca058f0d8946397530aff76d3c55fa93162).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    I believe we still need this change.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #94881 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94881/testReport)** for PR 20637 at commit [`99731ca`](https://github.com/apache/spark/commit/99731ca058f0d8946397530aff76d3c55fa93162).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `  case class Schema(dataType: DataType, nullable: Boolean)`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2525/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #95479 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95479/testReport)** for PR 20637 at commit [`37dc4d8`](https://github.com/apache/spark/commit/37dc4d8da0d9654c36494b516cfdfb869e66afc2).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r211812298
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala ---
    @@ -110,7 +116,7 @@ object GenerateUnsafeProjection extends CodeGenerator[Seq[Expression], UnsafePro
             }
     
             val writeField = writeElement(ctx, input.value, index.toString, dt, rowWriter)
    -        if (input.isNull == FalseLiteral) {
    +        if (input.isNull == FalseLiteral || !nullable) {
    --- End diff --
    
    `input.isNull == FalseLiteral || ` is not needed?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #95560 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95560/testReport)** for PR 20637 at commit [`88c74c6`](https://github.com/apache/spark/commit/88c74c61d5ab64eb860b91261d013702983ac49c).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #94899 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94899/testReport)** for PR 20637 at commit [`84961b4`](https://github.com/apache/spark/commit/84961b44d0f846e241c322f0f80d8dc032f6008d).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1960/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r212799852
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala ---
    @@ -223,8 +223,9 @@ trait ExpressionEvalHelper extends GeneratorDrivenPropertyChecks with PlanTestBa
               }
             } else {
               val lit = InternalRow(expected, expected)
    +          val dtAsNullable = expression.dataType.asNullable
    --- End diff --
    
    Mmmh... I am not sure. I just applied you patch on the current master and removed this `asNullable`. I run `MapZipWith` test and it passed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    We need to detect the correctly written test with a wrong result. Let us think about the following `map_zip_with` without #22126.
    
    In the following example, `map_zip_with` without #22126 return `expr` that has `Map(1 -> -10, 4 -> null)` with `MapType(IntegerType, IntegerType, valueContainsNull = false)`. The `DataTypeP is not incorrect since `Map(...)` includes `null`. Thus, the test must be failed.
    
    As described in [the comment](https://github.com/apache/spark/pull/20637#discussion_r212817682),
    
    * With `asNullable`: the test fails with incorrect evaluation (it is not passing expectedly)
    * Without `asNullable`: the test fails with NullPointerException (it is not passing unexpectedly).
    
    Since #22126 has been merged, the test is passed. If someone would unintentionally generate incorrect dataType, we must detect the mistake by failing the test without an exception. To avoid the exception, we need `asNullable`.
    
    Is this an answer to your question?
    
    
    ```
    val miia = Literal.create(Map(1 -> 10),
      MapType(IntegerType, IntegerType, valueContainsNull = false))
    val miib = Literal.create(Map(1 -> -1, 4 -> -4),
      MapType(IntegerType, IntegerType, valueContainsNull = false))
    
    val expr = map_zip_with(miia, miib, multiplyKeyWithValues)
    checkEvaluation(expr, Map(1 -> -10, 4 -> null))
    ```


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r212621229
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala ---
    @@ -223,8 +223,9 @@ trait ExpressionEvalHelper extends GeneratorDrivenPropertyChecks with PlanTestBa
               }
             } else {
               val lit = InternalRow(expected, expected)
    +          val dtAsNullable = expression.dataType.asNullable
    --- End diff --
    
    Without this change (`asNullable`), `NullPointerException` occurs if `expected` has `null` for a primitive type (e.g. `int`).
    If you run `"MapZipWith"` test suite without this PR, you can see `NullPointerException`. If you run `"MapZipWith"` with [the PR](https://github.com/apache/spark/pull/22126), you can see the failure (i.e. hidden potential bug).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    > If someone would unintentionally generate incorrect dataType, we must detect the mistake by failing the test without an exception.
    
    This is the part I don't agree with. If someone writes a bad UT, it is good to get an exception. It is also an hint that the problem is in the UT itself, rather than in the code. But we already discussed this and we have different opinion on this. :)


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #87549 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87549/testReport)** for PR 20637 at commit [`e2e9e36`](https://github.com/apache/spark/commit/e2e9e3604284949d9d762274e2e1f55348851073).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94464/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87549/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Jenkins, retest this please.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: Remove redundant null checks in generated Java code by G...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #87544 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87544/testReport)** for PR 20637 at commit [`e2e9e36`](https://github.com/apache/spark/commit/e2e9e3604284949d9d762274e2e1f55348851073).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r208677733
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala ---
    @@ -142,7 +143,7 @@ object GenerateUnsafeProjection extends CodeGenerator[Seq[Expression], UnsafePro
               case _ => s"$rowWriter.write($index, ${input.value});"
             }
     
    -        if (input.isNull == "false") {
    +        if (input.isNull == "false" || !nullable) {
    --- End diff --
    
    good catch, thanks


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r209822349
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala ---
    @@ -43,25 +43,29 @@ object GenerateUnsafeProjection extends CodeGenerator[Seq[Expression], UnsafePro
         case _ => false
       }
     
    -  // TODO: if the nullability of field is correct, we can use it to save null check.
       private def writeStructToBuffer(
           ctx: CodegenContext,
           input: String,
           index: String,
    -      fieldTypes: Seq[DataType],
    +      fieldTypeAndNullables: Seq[(DataType, Boolean)],
    --- End diff --
    
    I think you can define one within this class for readability. Probably we should take a look to deduplicate them for the instances you pointed out, but not sure yet for now if how much that deduplication improve the readability and if that's worth. Those are all look rather defined and used within small scopes rather locally.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1965/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #94896 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94896/testReport)** for PR 20637 at commit [`675409e`](https://github.com/apache/spark/commit/675409e7de8b25082b12433eb3867a20ec6786f4).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2266/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    can you give a concrete example of how this can detect incorrect results? The removed test case only shows how we can detect a wrongly written test.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94904/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #94806 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94806/testReport)** for PR 20637 at commit [`99731ca`](https://github.com/apache/spark/commit/99731ca058f0d8946397530aff76d3c55fa93162).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r212649063
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala ---
    @@ -223,8 +223,9 @@ trait ExpressionEvalHelper extends GeneratorDrivenPropertyChecks with PlanTestBa
               }
             } else {
               val lit = InternalRow(expected, expected)
    +          val dtAsNullable = expression.dataType.asNullable
    --- End diff --
    
    Without #22126, `MapZipWith` made incorrect `DataType`. However, no test case can detect the failure. This PR can detect that incorrect `DataType` in `MapZipWith` without #22126.
    
    I would appreciate it if you would run the test case without #22126, without and with this PR.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95205/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94878/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #95560 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95560/testReport)** for PR 20637 at commit [`88c74c6`](https://github.com/apache/spark/commit/88c74c61d5ab64eb860b91261d013702983ac49c).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r209180525
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala ---
    @@ -43,25 +43,29 @@ object GenerateUnsafeProjection extends CodeGenerator[Seq[Expression], UnsafePro
         case _ => false
       }
     
    -  // TODO: if the nullability of field is correct, we can use it to save null check.
       private def writeStructToBuffer(
           ctx: CodegenContext,
           input: String,
           index: String,
    -      fieldTypes: Seq[DataType],
    +      fieldTypeAndNullables: Seq[(DataType, Boolean)],
    --- End diff --
    
    I think that it would be good since it is used at `JavaTypeInference` and `higherOrderFunctions`.
    cc @ueshin


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r213556478
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelperSuite.scala ---
    @@ -35,6 +35,24 @@ class ExpressionEvalHelperSuite extends SparkFunSuite with ExpressionEvalHelper
         val e = intercept[RuntimeException] { checkEvaluation(BadCodegenExpression(), 10) }
         assert(e.getMessage.contains("some_variable"))
       }
    +
    +  test("SPARK-23466: checkEvaluationWithUnsafeProjection should fail if null is compared with " +
    --- End diff --
    
    what is this test doing? Let the test framework discover wrongly written test cases?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2287/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #95109 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95109/testReport)** for PR 20637 at commit [`34a4f7b`](https://github.com/apache/spark/commit/34a4f7bc9cd6c79739f06a7ff691856877930005).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94445/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94899/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r212960763
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala ---
    @@ -223,8 +223,9 @@ trait ExpressionEvalHelper extends GeneratorDrivenPropertyChecks with PlanTestBa
               }
             } else {
               val lit = InternalRow(expected, expected)
    +          val dtAsNullable = expression.dataType.asNullable
    --- End diff --
    
    I didn't go through the entire thread. But my opinion is that, the data type nullable should match the real data.
    
    BTW will this reduce test coverage? It seems the optimization for non-nullable fields is not tested if we always assume the expression is nullable.
    
    > we use the default value for the type, and create a wrong result instead of throwing a NPE.
    This is expected and I think it's a common symptom for nullable-mismatch problems. Why can't our test expose it?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r209416053
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala ---
    @@ -43,25 +43,29 @@ object GenerateUnsafeProjection extends CodeGenerator[Seq[Expression], UnsafePro
         case _ => false
       }
     
    -  // TODO: if the nullability of field is correct, we can use it to save null check.
       private def writeStructToBuffer(
           ctx: CodegenContext,
           input: String,
           index: String,
    -      fieldTypes: Seq[DataType],
    +      fieldTypeAndNullables: Seq[(DataType, Boolean)],
    --- End diff --
    
    @cloud-fan What name of the case class do you suggest?  `DataTypeNullable`, or others?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Jenkins, retest this please.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #95109 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95109/testReport)** for PR 20637 at commit [`34a4f7b`](https://github.com/apache/spark/commit/34a4f7bc9cd6c79739f06a7ff691856877930005).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #95044 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95044/testReport)** for PR 20637 at commit [`d379718`](https://github.com/apache/spark/commit/d379718466f26d38ad5e3a87d22d654aa924d2ae).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    The failure of `org.apache.spark.sql.catalyst.expressions.JsonExpressionsSuite.from_json missing fields` is due to passing `null` while the schema has `nullable=false`.
    
    This inconsistency is agreed in the discussion at [SPARK-23173](https://issues.apache.org/jira/browse/SPARK-23173). 
    `Assume that each field in schema passed to from_json is nullable, and ignore the nullability information set in the passed schema.`
    
    When `spark.sql.fromJsonForceNullableSchema=false`, I think that a test is invalid to pass `nullable=false` in the corresponding schema to the missing field.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/963/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #94445 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94445/testReport)** for PR 20637 at commit [`2d5c2eb`](https://github.com/apache/spark/commit/2d5c2ebbc68add2a6a639c83af496bfdd74b3808).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r212668425
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala ---
    @@ -223,8 +223,9 @@ trait ExpressionEvalHelper extends GeneratorDrivenPropertyChecks with PlanTestBa
               }
             } else {
               val lit = InternalRow(expected, expected)
    +          val dtAsNullable = expression.dataType.asNullable
    --- End diff --
    
    Ok, thanks now I think I got it. But I think that the NullPointerException is right to be thrown. That is what a user would get in such a situation. So I don't really think this change should be added.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r213977445
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelperSuite.scala ---
    @@ -35,6 +35,24 @@ class ExpressionEvalHelperSuite extends SparkFunSuite with ExpressionEvalHelper
         val e = intercept[RuntimeException] { checkEvaluation(BadCodegenExpression(), 10) }
         assert(e.getMessage.contains("some_variable"))
       }
    +
    +  test("SPARK-23466: checkEvaluationWithUnsafeProjection should fail if null is compared with " +
    --- End diff --
    
    I see. Let me drop this test case.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    I see. Let me remove the change regarding `asNullable` from this PR. I will create another PR after this PR is merged.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r212584773
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala ---
    @@ -223,8 +223,9 @@ trait ExpressionEvalHelper extends GeneratorDrivenPropertyChecks with PlanTestBa
               }
             } else {
               val lit = InternalRow(expected, expected)
    +          val dtAsNullable = expression.dataType.asNullable
    --- End diff --
    
    Good question. This change related to the part in the description
    ```
    On the other hand, expected will be converted to UnsafeArrayData(null) considering its value.
    ```
    
    Since this PR removes nullchecks in the generated code based on DataType of `expression`,  `null` in `expected` never converted to `null`. To forcefully insert null check in the generated code to convert `experted` to `UnsafeArrayData`, we need `asNullable`.
    
    Does it answer to this question? 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    > When spark.sql.fromJsonForceNullableSchema=false, I think that a test is invalid to pass nullable=false in the corresponding schema to the missing field.
    
    +1. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94806/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #94973 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94973/testReport)** for PR 20637 at commit [`cac9484`](https://github.com/apache/spark/commit/cac948426c7f40fafe829ea10317cf5759e6d78a).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: Remove redundant null checks in generated Java code by G...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/961/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r209161074
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala ---
    @@ -170,6 +174,23 @@ object GenerateUnsafeProjection extends CodeGenerator[Seq[Expression], UnsafePro
     
         val element = CodeGenerator.getValue(tmpInput, et, index)
     
    +    val primitiveTypeName = if (CodeGenerator.isPrimitiveType(jt)) {
    --- End diff --
    
    where do we use it?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: Remove redundant null checks in generated Java code by G...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #87544 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87544/testReport)** for PR 20637 at commit [`e2e9e36`](https://github.com/apache/spark/commit/e2e9e3604284949d9d762274e2e1f55348851073).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94539/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2263/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #94464 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94464/testReport)** for PR 20637 at commit [`3ec2b19`](https://github.com/apache/spark/commit/3ec2b19a0a4f5b1b7090f4a18ac153f6751efeba).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r208696353
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala ---
    @@ -70,7 +72,8 @@ object GenerateUnsafeProjection extends CodeGenerator[Seq[Expression], UnsafePro
            |  // Remember the current cursor so that we can calculate how many bytes are
            |  // written later.
            |  final int $previousCursor = $rowWriter.cursor();
    -       |  ${writeExpressionsToBuffer(ctx, tmpInput, fieldEvals, fieldTypes, structRowWriter)}
    +       |  ${writeExpressionsToBuffer(
    --- End diff --
    
    nit: in situations like this I have been told to move the function call before and assign it to a variable


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r212570823
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala ---
    @@ -223,8 +223,9 @@ trait ExpressionEvalHelper extends GeneratorDrivenPropertyChecks with PlanTestBa
               }
             } else {
               val lit = InternalRow(expected, expected)
    +          val dtAsNullable = expression.dataType.asNullable
    --- End diff --
    
    why do we need `asNullable` here?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r206702526
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala ---
    @@ -308,10 +319,10 @@ object GenerateUnsafeProjection extends CodeGenerator[Seq[Expression], UnsafePro
           expressions: Seq[Expression],
           useSubexprElimination: Boolean = false): ExprCode = {
         val exprEvals = ctx.generateExpressions(expressions, useSubexprElimination)
    -    val exprTypes = expressions.map(_.dataType)
    +    val exprTypeAndNullables = expressions.map(e => (e.dataType, e.nullable))
     
    -    val numVarLenFields = exprTypes.count {
    -      case dt if UnsafeRow.isFixedLength(dt) => false
    +    val numVarLenFields = exprTypeAndNullables.count {
    +      case (dt, _) if UnsafeRow.isFixedLength(dt) => false
    --- End diff --
    
    is `.count { case (dt, _) => !UnsafeRow.isFixedLength(dt) }` more straightforward?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #94881 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94881/testReport)** for PR 20637 at commit [`99731ca`](https://github.com/apache/spark/commit/99731ca058f0d8946397530aff76d3c55fa93162).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r209178573
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala ---
    @@ -170,6 +174,23 @@ object GenerateUnsafeProjection extends CodeGenerator[Seq[Expression], UnsafePro
     
         val element = CodeGenerator.getValue(tmpInput, et, index)
     
    +    val primitiveTypeName = if (CodeGenerator.isPrimitiveType(jt)) {
    --- End diff --
    
    good catch


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2749/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2744/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2280/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #94445 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94445/testReport)** for PR 20637 at commit [`2d5c2eb`](https://github.com/apache/spark/commit/2d5c2ebbc68add2a6a639c83af496bfdd74b3808).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #95205 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95205/testReport)** for PR 20637 at commit [`d7ef82c`](https://github.com/apache/spark/commit/d7ef82c53c10e6953d077801045ba1438c6670ab).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20637#discussion_r211370785
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelperSuite.scala ---
    @@ -35,6 +35,24 @@ class ExpressionEvalHelperSuite extends SparkFunSuite with ExpressionEvalHelper
         val e = intercept[RuntimeException] { checkEvaluation(BadCodegenExpression(), 10) }
         assert(e.getMessage.contains("some_variable"))
       }
    +
    +  test("SPARK-23466: checkEvaluationWithUnsafeProjection should fail if null is compared with " +
    +    "primitive default value") {
    +    val expected = Array(null, -1, 0, 1)
    +    val catalystValue = CatalystTypeConverters.convertToCatalyst(expected)
    +
    +    val expression1 = CreateArray(
    +      Seq(Literal(null, IntegerType), Literal(-1), Literal(0), Literal(1)))
    +    assert(expression1.dataType.containsNull)
    +    checkEvaluationWithUnsafeProjection(expression1, catalystValue)
    +
    +    val expression2 = CreateArray(Seq(Literal(0, IntegerType), Literal(-1), Literal(0), Literal(1)))
    +    assert(!expression2.dataType.containsNull)
    +    val e = intercept[RuntimeException] {
    +      checkEvaluationWithUnsafeProjection(expression2, catalystValue)
    --- End diff --
    
    Unfortunately, `checkEvaluationWithUnsafeProjection()`'s return type is `Unit`...


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #94904 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94904/testReport)** for PR 20637 at commit [`84961b4`](https://github.com/apache/spark/commit/84961b44d0f846e241c322f0f80d8dc032f6008d).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20637: [SPARK-23466][SQL] Remove redundant null checks in gener...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20637
  
    **[Test build #95550 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95550/testReport)** for PR 20637 at commit [`88c74c6`](https://github.com/apache/spark/commit/88c74c61d5ab64eb860b91261d013702983ac49c).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org