Posted to reviews@spark.apache.org by "bersprockets (via GitHub)" <gi...@apache.org> on 2023/02/08 18:29:05 UTC

[GitHub] [spark] bersprockets commented on a diff in pull request #39945: [SPARK-42384][SQL] Check for null input in generated code for mask function

bersprockets commented on code in PR #39945:
URL: https://github.com/apache/spark/pull/39945#discussion_r1100528061


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/maskExpressions.scala:
##########
@@ -223,16 +223,32 @@ case class Mask(
     val fifthGen = children(4).genCode(ctx)
     val resultCode =
       f(firstGen.value, secondGen.value, thirdGen.value, fourthGen.value, fifthGen.value)
-    ev.copy(
-      code = code"""
+    if (nullable) {
+      // this function is somewhat like a `UnaryExpression`, in that only the first child
+      // determines whether the result is null
+      val nullSafeEval = ctx.nullSafeExec(children(0).nullable, firstGen.isNull)(resultCode)
+      ev.copy(code = code"""
+        ${firstGen.code}
+        ${secondGen.code}
+        ${thirdGen.code}
+        ${fourthGen.code}
+        ${fifthGen.code}
+        boolean ${ev.isNull} = ${firstGen.isNull};
+        ${CodeGenerator.javaType(dataType)} ${ev.value} = ${CodeGenerator.defaultValue(dataType)};
+        $nullSafeEval
+      """)
+    } else {

Review Comment:
   Btw, the new generated code will look like this (if the first child is null, the result is null and `Mask.transformInput` is not called):
   ```
   /* 031 */     boolean isNull_1 = i.isNullAt(0);
   /* 032 */     UTF8String value_1 = isNull_1 ?
   /* 033 */     null : (i.getUTF8String(0));
   /* 034 */
   /* 035 */
   /* 036 */
   /* 037 */
   /* 038 */     boolean isNull_0 = isNull_1;
   /* 039 */     UTF8String value_0 = null;
   /* 040 */
   /* 041 */     if (!isNull_1) {
   /* 042 */       value_0 = org.apache.spark.sql.catalyst.expressions.Mask.transformInput(value_1, ((UTF8String) references[0] /* literal */), ((UTF8String) references[1] /* literal */), ((UTF8String) references[2] /* literal */), ((UTF8String) references[3] /* literal */));;
   /* 043 */     }
   /* 044 */     if (isNull_0) {
   /* 045 */       mutableStateArray_0[0].setNullAt(0);
   /* 046 */     } else {
   /* 047 */       mutableStateArray_0[0].write(0, value_0);
   /* 048 */     }
   /* 049 */     return (mutableStateArray_0[0].getRow());
   ```
   Compare the old generated code, which calls `Mask.transformInput` even when the input is null and calls `UnsafeWriter.write(0, value_0)` even if `value_0` is null:
   ```
   /* 031 */     boolean isNull_1 = i.isNullAt(0);
   /* 032 */     UTF8String value_1 = isNull_1 ?
   /* 033 */     null : (i.getUTF8String(0));
   /* 034 */
   /* 035 */
   /* 036 */
   /* 037 */
   /* 038 */     UTF8String value_0 = null;
   /* 039 */     value_0 = org.apache.spark.sql.catalyst.expressions.Mask.transformInput(value_1, ((UTF8String) references[0] /* literal */), ((UTF8String) references[1] /* literal */), ((UTF8String) references[2] /* literal */), ((UTF8String) references[3] /* literal */));;
   /* 040 */     if (false) {
   /* 041 */       mutableStateArray_0[0].setNullAt(0);
   /* 042 */     } else {
   /* 043 */       mutableStateArray_0[0].write(0, value_0);
   /* 044 */     }
   /* 045 */     return (mutableStateArray_0[0].getRow());
   /* 046 */   }
   ```
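
The difference between the two listings can be reduced to a small standalone sketch. This is only an illustration under stated assumptions: `NullGuardSketch`, `maskUnguarded`, `maskNullSafe`, and the simplified `transformInput` below are hypothetical stand-ins written for this example, not Spark's `Mask.transformInput` or its actual generated code. The stand-in assumes non-null input, as an ordinary string routine would.

```java
// Simplified illustration only; not Spark's generated code or its Mask implementation.
final class NullGuardSketch {

    // Hypothetical stand-in for a masking routine: upper-case letters become 'X',
    // lower-case letters become 'x', digits become 'n'. It assumes non-null input.
    static String transformInput(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (Character.isUpperCase(c)) sb.append('X');
            else if (Character.isLowerCase(c)) sb.append('x');
            else if (Character.isDigit(c)) sb.append('n');
            else sb.append(c);
        }
        return sb.toString();
    }

    // Shape of the old path: the transform is called unconditionally, so a null
    // input reaches this stand-in and makes it throw a NullPointerException.
    static String maskUnguarded(String value1) {
        return transformInput(value1);
    }

    // Shape of the new path: the result's null flag is copied from the first
    // input, and the transform runs only when that input is non-null.
    static String maskNullSafe(String value1) {
        boolean isNull1 = (value1 == null);
        boolean isNull0 = isNull1;
        String value0 = null;
        if (!isNull1) {
            value0 = transformInput(value1);
        }
        return isNull0 ? null : value0;
    }

    public static void main(String[] args) {
        System.out.println(maskNullSafe("AbCD123-@$#")); // prints XxXXnnn-@$#
        System.out.println(maskNullSafe(null));          // prints null
    }
}
```

The guarded version mirrors the `if (!isNull_1) { ... }` block in the new listing: the null check happens once, before the call, and the later `isNull_0` test decides between a null result and the masked value.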


