Posted to reviews@spark.apache.org by "beliefer (via GitHub)" <gi...@apache.org> on 2023/03/27 10:23:55 UTC

[GitHub] [spark] beliefer opened a new pull request, #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor array_append and array_prepend with `RuntimeReplaceable`

beliefer opened a new pull request, #40563:
URL: https://github.com/apache/spark/pull/40563

   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   'No'.
   It just updates the internal implementation.
   
   
   ### How was this patch tested?
   Existing test cases.
   




[GitHub] [spark] infoankitp commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "infoankitp (via GitHub)" <gi...@apache.org>.
infoankitp commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1151952607


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1400,120 +1400,24 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
-// scalastyle:off line.size.limit
-@ExpressionDescription(
-  usage = """
-      _FUNC_(array, element) - Add the element at the beginning of the array passed as first
-      argument. Type of element should be the same as the type of the elements of the array.
-      Null element is also prepended to the array. But if the array passed is NULL
-      output is NULL
-    """,
-  examples = """
-    Examples:
-      > SELECT _FUNC_(array('b', 'd', 'c', 'a'), 'd');
-       ["d","b","d","c","a"]
-      > SELECT _FUNC_(array(1, 2, 3, null), null);
-       [null,1,2,3,null]
-      > SELECT _FUNC_(CAST(null as Array<Int>), 2);
-       NULL
-  """,
-  group = "array_funcs",
-  since = "3.5.0")
-case class ArrayPrepend(left: Expression, right: Expression)
-  extends BinaryExpression
-    with ImplicitCastInputTypes
-    with ComplexTypeMergingExpression
-    with QueryErrorsBase {
-
-  override def nullable: Boolean = left.nullable
+trait InsertArrayOneSide extends RuntimeReplaceable
+  with ImplicitCastInputTypes with BinaryLike[Expression] with QueryErrorsBase {

Review Comment:
   Is there any difference between `BinaryLike[Expression]` and `BinaryExpression`? Also, is there a specific reason for removing `ComplexTypeMergingExpression`? I think it can help in assigning the values of the `containsNull` and `nullable` fields.



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -5056,128 +4950,45 @@ case class ArrayCompact(child: Expression)
   """,
   since = "3.4.0",
   group = "array_funcs")
-case class ArrayAppend(left: Expression, right: Expression)
-  extends BinaryExpression
-    with ImplicitCastInputTypes
-    with ComplexTypeMergingExpression
-    with QueryErrorsBase {
-  override def prettyName: String = "array_append"
+case class ArrayAppend(left: Expression, right: Expression) extends InsertArrayOneSide {
 
-  @transient protected lazy val elementType: DataType =
-    inputTypes.head.asInstanceOf[ArrayType].elementType
+  override lazy val replacement: Expression =
+    ArrayInsert(left, Add(ArraySize(left), Literal(1)), right)

Review Comment:
   Instead of `ArraySize(left)`: since `ArrayInsert` supports negative indices, how about using `Literal(-1)` to select the position?
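   
   For illustration, both forms could be compared in a spark-shell (a hypothetical sketch; how `array_insert` handles a negative index may differ across Spark versions, so treat the second query as the thing to verify rather than a statement of its output):
   
   ```
   // Hypothetical spark-shell sketch; assumes a SparkSession `spark` is in scope.
   // Append via an explicit position computed from the array size:
   spark.sql("SELECT array_insert(array(1, 2, 3), size(array(1, 2, 3)) + 1, 4)").show()
   // The suggestion: rely on negative indexing instead. Whether -1 appends after
   // the last element depends on array_insert's negative-index semantics.
   spark.sql("SELECT array_insert(array(1, 2, 3), -1, 4)").show()
   ```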



##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala:
##########
@@ -1855,50 +1855,6 @@ class CollectionExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper
     checkEvaluation(ArrayRepeat(Literal("hi"), Literal(null, IntegerType)), null)
   }
 
-  test("SPARK-41233: ArrayPrepend") {

Review Comment:
   I hope we have verified that none of these test cases are missing from `DataFrameFunctionsSuite`. If any are missing, please help add them.





[GitHub] [spark] beliefer commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1151386273


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1400,120 +1400,24 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
-// scalastyle:off line.size.limit
-@ExpressionDescription(
-  usage = """
-      _FUNC_(array, element) - Add the element at the beginning of the array passed as first
-      argument. Type of element should be the same as the type of the elements of the array.
-      Null element is also prepended to the array. But if the array passed is NULL
-      output is NULL
-    """,
-  examples = """
-    Examples:
-      > SELECT _FUNC_(array('b', 'd', 'c', 'a'), 'd');
-       ["d","b","d","c","a"]
-      > SELECT _FUNC_(array(1, 2, 3, null), null);
-       [null,1,2,3,null]
-      > SELECT _FUNC_(CAST(null as Array<Int>), 2);
-       NULL
-  """,
-  group = "array_funcs",
-  since = "3.5.0")
-case class ArrayPrepend(left: Expression, right: Expression)
-  extends BinaryExpression
-    with ImplicitCastInputTypes
-    with ComplexTypeMergingExpression
-    with QueryErrorsBase {
-
-  override def nullable: Boolean = left.nullable
+trait InsertArrayOneSide extends RuntimeReplaceable

Review Comment:
   `InsertArrayOneSide` should be a better name. Thank you.





[GitHub] [spark] LuciferYang commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1149348790


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1400,120 +1400,27 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
-// scalastyle:off line.size.limit
-@ExpressionDescription(
-  usage = """
-      _FUNC_(array, element) - Add the element at the beginning of the array passed as first
-      argument. Type of element should be the same as the type of the elements of the array.
-      Null element is also prepended to the array. But if the array passed is NULL
-      output is NULL
-    """,
-  examples = """
-    Examples:
-      > SELECT _FUNC_(array('b', 'd', 'c', 'a'), 'd');
-       ["d","b","d","c","a"]
-      > SELECT _FUNC_(array(1, 2, 3, null), null);
-       [null,1,2,3,null]
-      > SELECT _FUNC_(CAST(null as Array<Int>), 2);
-       NULL
-  """,
-  group = "array_funcs",
-  since = "3.5.0")
-case class ArrayPrepend(left: Expression, right: Expression)
-  extends BinaryExpression
-    with ImplicitCastInputTypes
-    with ComplexTypeMergingExpression
-    with QueryErrorsBase {
+trait ArrayInsertEnd extends RuntimeReplaceable

Review Comment:
   Should we rename this trait? `ArrayInsertEnd` does not describe `ArrayPrepend` well.
   
   





[GitHub] [spark] beliefer commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1152684039


##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala:
##########
@@ -1855,50 +1855,6 @@ class CollectionExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper
     checkEvaluation(ArrayRepeat(Literal("hi"), Literal(null, IntegerType)), null)
   }
 
-  test("SPARK-41233: ArrayPrepend") {

Review Comment:
   I checked and added the test cases in `DataFrameFunctionsSuite`.





[GitHub] [spark] beliefer commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1151511918


##########
connector/connect/common/src/test/resources/query-tests/explain-results/function_array_append.explain:
##########
@@ -1,2 +1,2 @@
-Project [array_append(e#0, 1) AS array_append(e, 1)#0]
+Project [array_insert(e#0, (size(e#0, false) + 1), 1) AS array_append(e, 1)#0]

Review Comment:
   Yeah, I forgot it. Thank you for the reminder.





[GitHub] [spark] cloud-fan commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1155673089


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -5056,128 +4950,45 @@ case class ArrayCompact(child: Expression)
   """,
   since = "3.4.0",
   group = "array_funcs")
-case class ArrayAppend(left: Expression, right: Expression)
-  extends BinaryExpression
-    with ImplicitCastInputTypes
-    with ComplexTypeMergingExpression
-    with QueryErrorsBase {
-  override def prettyName: String = "array_append"
+case class ArrayAppend(left: Expression, right: Expression) extends InsertArrayOneSide {
 
-  @transient protected lazy val elementType: DataType =
-    inputTypes.head.asInstanceOf[ArrayType].elementType
+  override lazy val replacement: Expression =
+    ArrayInsert(left, Add(ArraySize(left), Literal(1)), right)
 
-  override def inputTypes: Seq[AbstractDataType] = {
-    (left.dataType, right.dataType) match {
-      case (ArrayType(e1, hasNull), e2) =>
-        TypeCoercion.findTightestCommonType(e1, e2) match {
-          case Some(dt) => Seq(ArrayType(dt, hasNull), dt)
-          case _ => Seq.empty
-        }
-      case _ => Seq.empty
-    }
-  }
+  override def prettyName: String = "array_append"
 
-  override def checkInputDataTypes(): TypeCheckResult = {
-    (left.dataType, right.dataType) match {
-      case (ArrayType(e1, _), e2) if DataTypeUtils.sameType(e1, e2) => TypeCheckResult.TypeCheckSuccess
-      case (ArrayType(e1, _), e2) => DataTypeMismatch(
-        errorSubClass = "ARRAY_FUNCTION_DIFF_TYPES",
-        messageParameters = Map(
-          "functionName" -> toSQLId(prettyName),
-          "leftType" -> toSQLType(left.dataType),
-          "rightType" -> toSQLType(right.dataType),
-          "dataType" -> toSQLType(ArrayType)
-        ))
-      case _ =>
-        DataTypeMismatch(
-          errorSubClass = "UNEXPECTED_INPUT_TYPE",
-          messageParameters = Map(
-            "paramIndex" -> "0",
-            "requiredType" -> toSQLType(ArrayType),
-            "inputSql" -> toSQLExpr(left),
-            "inputType" -> toSQLType(left.dataType)
-          )
-        )
-    }
+  override protected def withNewChildrenInternal(
+      newLeft: Expression, newRight: Expression): ArrayAppend = {
+    copy(left = newLeft, right = newRight)
   }
+}
 
-  override def eval(input: InternalRow): Any = {
-    val value1 = left.eval(input)
-    if (value1 == null) {
-      null
-    } else {
-      val value2 = right.eval(input)
-      nullSafeEval(value1, value2)
-    }
-  }
+@ExpressionDescription(
+  usage = """
+      _FUNC_(array, element) - Add the element at the beginning of the array passed as first
+      argument. Type of element should be the same as the type of the elements of the array.
+      Null element is also prepended to the array. But if the array passed is NULL
+      output is NULL
+    """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array('b', 'd', 'c', 'a'), 'd');
+       ["d","b","d","c","a"]
+      > SELECT _FUNC_(array(1, 2, 3, null), null);
+       [null,1,2,3,null]
+      > SELECT _FUNC_(CAST(null as Array<Int>), 2);
+       NULL
+  """,
+  group = "array_funcs",
+  since = "3.5.0")
+case class ArrayPrepend(left: Expression, right: Expression) extends InsertArrayOneSide {
 
-  override protected def nullSafeEval(arr: Any, elementData: Any): Any = {
-    val arrayData = arr.asInstanceOf[ArrayData]
-    val numberOfElements = arrayData.numElements() + 1
-    if (numberOfElements > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
-      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
-    }
-    val finalData = new Array[Any](numberOfElements)
-    arrayData.foreach(elementType, finalData.update)
-    finalData.update(arrayData.numElements(), elementData)
-    new GenericArrayData(finalData)
-  }
+  override lazy val replacement: Expression = ArrayInsert(left, Literal(0), right)

Review Comment:
   Not related to this PR, but isn't the implementation of `array_insert` wrong? The doc says: `Places val into index pos of array x. Array indices start at 1, or start from the end if index is negative.`
   
   According to the doc, `array_insert(arr, 1, val)` should put `val` as the first element of the array, and `array_insert(arr, 0, val)` should fail since 0 is not a valid index. However, the current implementation of `array_insert` uses a 0-based index. cc @zhengruifeng @HyukjinKwon @dtenedor 





[GitHub] [spark] cloud-fan commented on pull request #40563: [SPARK-41233][FOLLOWUP] Refactor `array_prepend` with `RuntimeReplaceable`

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on PR #40563:
URL: https://github.com/apache/spark/pull/40563#issuecomment-1499056223

   Can we update the PR description? `array_append` is untouched now.




[GitHub] [spark] zhengruifeng commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1155685702


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -5056,128 +4950,45 @@ case class ArrayCompact(child: Expression)
   """,
   since = "3.4.0",
   group = "array_funcs")
-case class ArrayAppend(left: Expression, right: Expression)
-  extends BinaryExpression
-    with ImplicitCastInputTypes
-    with ComplexTypeMergingExpression
-    with QueryErrorsBase {
-  override def prettyName: String = "array_append"
+case class ArrayAppend(left: Expression, right: Expression) extends InsertArrayOneSide {
 
-  @transient protected lazy val elementType: DataType =
-    inputTypes.head.asInstanceOf[ArrayType].elementType
+  override lazy val replacement: Expression =
+    ArrayInsert(left, Add(ArraySize(left), Literal(1)), right)
 
-  override def inputTypes: Seq[AbstractDataType] = {
-    (left.dataType, right.dataType) match {
-      case (ArrayType(e1, hasNull), e2) =>
-        TypeCoercion.findTightestCommonType(e1, e2) match {
-          case Some(dt) => Seq(ArrayType(dt, hasNull), dt)
-          case _ => Seq.empty
-        }
-      case _ => Seq.empty
-    }
-  }
+  override def prettyName: String = "array_append"
 
-  override def checkInputDataTypes(): TypeCheckResult = {
-    (left.dataType, right.dataType) match {
-      case (ArrayType(e1, _), e2) if DataTypeUtils.sameType(e1, e2) => TypeCheckResult.TypeCheckSuccess
-      case (ArrayType(e1, _), e2) => DataTypeMismatch(
-        errorSubClass = "ARRAY_FUNCTION_DIFF_TYPES",
-        messageParameters = Map(
-          "functionName" -> toSQLId(prettyName),
-          "leftType" -> toSQLType(left.dataType),
-          "rightType" -> toSQLType(right.dataType),
-          "dataType" -> toSQLType(ArrayType)
-        ))
-      case _ =>
-        DataTypeMismatch(
-          errorSubClass = "UNEXPECTED_INPUT_TYPE",
-          messageParameters = Map(
-            "paramIndex" -> "0",
-            "requiredType" -> toSQLType(ArrayType),
-            "inputSql" -> toSQLExpr(left),
-            "inputType" -> toSQLType(left.dataType)
-          )
-        )
-    }
+  override protected def withNewChildrenInternal(
+      newLeft: Expression, newRight: Expression): ArrayAppend = {
+    copy(left = newLeft, right = newRight)
   }
+}
 
-  override def eval(input: InternalRow): Any = {
-    val value1 = left.eval(input)
-    if (value1 == null) {
-      null
-    } else {
-      val value2 = right.eval(input)
-      nullSafeEval(value1, value2)
-    }
-  }
+@ExpressionDescription(
+  usage = """
+      _FUNC_(array, element) - Add the element at the beginning of the array passed as first
+      argument. Type of element should be the same as the type of the elements of the array.
+      Null element is also prepended to the array. But if the array passed is NULL
+      output is NULL
+    """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array('b', 'd', 'c', 'a'), 'd');
+       ["d","b","d","c","a"]
+      > SELECT _FUNC_(array(1, 2, 3, null), null);
+       [null,1,2,3,null]
+      > SELECT _FUNC_(CAST(null as Array<Int>), 2);
+       NULL
+  """,
+  group = "array_funcs",
+  since = "3.5.0")
+case class ArrayPrepend(left: Expression, right: Expression) extends InsertArrayOneSide {
 
-  override protected def nullSafeEval(arr: Any, elementData: Any): Any = {
-    val arrayData = arr.asInstanceOf[ArrayData]
-    val numberOfElements = arrayData.numElements() + 1
-    if (numberOfElements > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
-      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
-    }
-    val finalData = new Array[Any](numberOfElements)
-    arrayData.foreach(elementType, finalData.update)
-    finalData.update(arrayData.numElements(), elementData)
-    new GenericArrayData(finalData)
-  }
+  override lazy val replacement: Expression = ArrayInsert(left, Literal(0), right)

Review Comment:
   @cloud-fan it is still 1-based, but it also treats 0 as the first item. See the discussion https://github.com/apache/spark/pull/38867#discussion_r1093243515
   
   ```
   +----------------------------------+----------------------------------+----------------------------------+
   |array_insert(array(1, 2, 3), 1, 4)|array_insert(array(1, 2, 3), 0, 4)|array_insert(array(1, 2, 3), 2, 4)|
   +----------------------------------+----------------------------------+----------------------------------+
   |                      [4, 1, 2, 3]|                      [4, 1, 2, 3]|                      [1, 4, 2, 3]|
   +----------------------------------+----------------------------------+----------------------------------+
   ```





[GitHub] [spark] zhengruifeng commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1155700160


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -5056,128 +4950,45 @@ case class ArrayCompact(child: Expression)
   """,
   since = "3.4.0",
   group = "array_funcs")
-case class ArrayAppend(left: Expression, right: Expression)
-  extends BinaryExpression
-    with ImplicitCastInputTypes
-    with ComplexTypeMergingExpression
-    with QueryErrorsBase {
-  override def prettyName: String = "array_append"
+case class ArrayAppend(left: Expression, right: Expression) extends InsertArrayOneSide {
 
-  @transient protected lazy val elementType: DataType =
-    inputTypes.head.asInstanceOf[ArrayType].elementType
+  override lazy val replacement: Expression =
+    ArrayInsert(left, Add(ArraySize(left), Literal(1)), right)
 
-  override def inputTypes: Seq[AbstractDataType] = {
-    (left.dataType, right.dataType) match {
-      case (ArrayType(e1, hasNull), e2) =>
-        TypeCoercion.findTightestCommonType(e1, e2) match {
-          case Some(dt) => Seq(ArrayType(dt, hasNull), dt)
-          case _ => Seq.empty
-        }
-      case _ => Seq.empty
-    }
-  }
+  override def prettyName: String = "array_append"
 
-  override def checkInputDataTypes(): TypeCheckResult = {
-    (left.dataType, right.dataType) match {
-      case (ArrayType(e1, _), e2) if DataTypeUtils.sameType(e1, e2) => TypeCheckResult.TypeCheckSuccess
-      case (ArrayType(e1, _), e2) => DataTypeMismatch(
-        errorSubClass = "ARRAY_FUNCTION_DIFF_TYPES",
-        messageParameters = Map(
-          "functionName" -> toSQLId(prettyName),
-          "leftType" -> toSQLType(left.dataType),
-          "rightType" -> toSQLType(right.dataType),
-          "dataType" -> toSQLType(ArrayType)
-        ))
-      case _ =>
-        DataTypeMismatch(
-          errorSubClass = "UNEXPECTED_INPUT_TYPE",
-          messageParameters = Map(
-            "paramIndex" -> "0",
-            "requiredType" -> toSQLType(ArrayType),
-            "inputSql" -> toSQLExpr(left),
-            "inputType" -> toSQLType(left.dataType)
-          )
-        )
-    }
+  override protected def withNewChildrenInternal(
+      newLeft: Expression, newRight: Expression): ArrayAppend = {
+    copy(left = newLeft, right = newRight)
   }
+}
 
-  override def eval(input: InternalRow): Any = {
-    val value1 = left.eval(input)
-    if (value1 == null) {
-      null
-    } else {
-      val value2 = right.eval(input)
-      nullSafeEval(value1, value2)
-    }
-  }
+@ExpressionDescription(
+  usage = """
+      _FUNC_(array, element) - Add the element at the beginning of the array passed as first
+      argument. Type of element should be the same as the type of the elements of the array.
+      Null element is also prepended to the array. But if the array passed is NULL
+      output is NULL
+    """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array('b', 'd', 'c', 'a'), 'd');
+       ["d","b","d","c","a"]
+      > SELECT _FUNC_(array(1, 2, 3, null), null);
+       [null,1,2,3,null]
+      > SELECT _FUNC_(CAST(null as Array<Int>), 2);
+       NULL
+  """,
+  group = "array_funcs",
+  since = "3.5.0")
+case class ArrayPrepend(left: Expression, right: Expression) extends InsertArrayOneSide {
 
-  override protected def nullSafeEval(arr: Any, elementData: Any): Any = {
-    val arrayData = arr.asInstanceOf[ArrayData]
-    val numberOfElements = arrayData.numElements() + 1
-    if (numberOfElements > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
-      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
-    }
-    val finalData = new Array[Any](numberOfElements)
-    arrayData.foreach(elementType, finalData.update)
-    finalData.update(arrayData.numElements(), elementData)
-    new GenericArrayData(finalData)
-  }
+  override lazy val replacement: Expression = ArrayInsert(left, Literal(0), right)

Review Comment:
   Got it, let me send a fix for it.





[GitHub] [spark] zhengruifeng commented on pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng commented on PR #40563:
URL: https://github.com/apache/spark/pull/40563#issuecomment-1492810135

   ping @cloud-fan 




[GitHub] [spark] beliefer commented on pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #40563:
URL: https://github.com/apache/spark/pull/40563#issuecomment-1495822244

   cc @MaxGekk 




[GitHub] [spark] beliefer commented on pull request #40563: [SPARK-41233][FOLLOWUP] Refactor `array_prepend` with `RuntimeReplaceable`

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #40563:
URL: https://github.com/apache/spark/pull/40563#issuecomment-1499895163

   > Can we update the PR description? `array_append` is untouched now.
   
   Updated.




[GitHub] [spark] cloud-fan commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1155698105


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -5056,128 +4950,45 @@ case class ArrayCompact(child: Expression)
   """,
   since = "3.4.0",
   group = "array_funcs")
-case class ArrayAppend(left: Expression, right: Expression)
-  extends BinaryExpression
-    with ImplicitCastInputTypes
-    with ComplexTypeMergingExpression
-    with QueryErrorsBase {
-  override def prettyName: String = "array_append"
+case class ArrayAppend(left: Expression, right: Expression) extends InsertArrayOneSide {
 
-  @transient protected lazy val elementType: DataType =
-    inputTypes.head.asInstanceOf[ArrayType].elementType
+  override lazy val replacement: Expression =
+    ArrayInsert(left, Add(ArraySize(left), Literal(1)), right)
 
-  override def inputTypes: Seq[AbstractDataType] = {
-    (left.dataType, right.dataType) match {
-      case (ArrayType(e1, hasNull), e2) =>
-        TypeCoercion.findTightestCommonType(e1, e2) match {
-          case Some(dt) => Seq(ArrayType(dt, hasNull), dt)
-          case _ => Seq.empty
-        }
-      case _ => Seq.empty
-    }
-  }
+  override def prettyName: String = "array_append"
 
-  override def checkInputDataTypes(): TypeCheckResult = {
-    (left.dataType, right.dataType) match {
-      case (ArrayType(e1, _), e2) if DataTypeUtils.sameType(e1, e2) => TypeCheckResult.TypeCheckSuccess
-      case (ArrayType(e1, _), e2) => DataTypeMismatch(
-        errorSubClass = "ARRAY_FUNCTION_DIFF_TYPES",
-        messageParameters = Map(
-          "functionName" -> toSQLId(prettyName),
-          "leftType" -> toSQLType(left.dataType),
-          "rightType" -> toSQLType(right.dataType),
-          "dataType" -> toSQLType(ArrayType)
-        ))
-      case _ =>
-        DataTypeMismatch(
-          errorSubClass = "UNEXPECTED_INPUT_TYPE",
-          messageParameters = Map(
-            "paramIndex" -> "0",
-            "requiredType" -> toSQLType(ArrayType),
-            "inputSql" -> toSQLExpr(left),
-            "inputType" -> toSQLType(left.dataType)
-          )
-        )
-    }
+  override protected def withNewChildrenInternal(
+      newLeft: Expression, newRight: Expression): ArrayAppend = {
+    copy(left = newLeft, right = newRight)
   }
+}
 
-  override def eval(input: InternalRow): Any = {
-    val value1 = left.eval(input)
-    if (value1 == null) {
-      null
-    } else {
-      val value2 = right.eval(input)
-      nullSafeEval(value1, value2)
-    }
-  }
+@ExpressionDescription(
+  usage = """
+      _FUNC_(array, element) - Add the element at the beginning of the array passed as first
+      argument. Type of element should be the same as the type of the elements of the array.
+      Null element is also prepended to the array. But if the array passed is NULL
+      output is NULL
+    """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array('b', 'd', 'c', 'a'), 'd');
+       ["d","b","d","c","a"]
+      > SELECT _FUNC_(array(1, 2, 3, null), null);
+       [null,1,2,3,null]
+      > SELECT _FUNC_(CAST(null as Array<Int>), 2);
+       NULL
+  """,
+  group = "array_funcs",
+  since = "3.5.0")
+case class ArrayPrepend(left: Expression, right: Expression) extends InsertArrayOneSide {
 
-  override protected def nullSafeEval(arr: Any, elementData: Any): Any = {
-    val arrayData = arr.asInstanceOf[ArrayData]
-    val numberOfElements = arrayData.numElements() + 1
-    if (numberOfElements > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
-      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
-    }
-    val finalData = new Array[Any](numberOfElements)
-    arrayData.foreach(elementType, finalData.update)
-    finalData.update(arrayData.numElements(), elementData)
-    new GenericArrayData(finalData)
-  }
+  override lazy val replacement: Expression = ArrayInsert(left, Literal(0), right)

Review Comment:
   Sorry for missing that discussion. I think failing is more consistent with the other functions that use a 1-based index.
   
   And here we should use `1` as the index, to be more future-proof.
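   
   In code, the suggestion amounts to something like this (a sketch of the proposed change based on the diff above, not necessarily the final patch):
   
   ```
   case class ArrayPrepend(left: Expression, right: Expression) extends InsertArrayOneSide {
     // Suggested change: use the 1-based first position instead of Literal(0),
     // so the rewrite no longer depends on 0 being tolerated as an index.
     override lazy val replacement: Expression = ArrayInsert(left, Literal(1), right)
     override def prettyName: String = "array_prepend"
     override protected def withNewChildrenInternal(
         newLeft: Expression, newRight: Expression): ArrayPrepend =
       copy(left = newLeft, right = newRight)
   }
   ```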





[GitHub] [spark] beliefer commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1152647435


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1400,120 +1400,24 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
-// scalastyle:off line.size.limit
-@ExpressionDescription(
-  usage = """
-      _FUNC_(array, element) - Add the element at the beginning of the array passed as first
-      argument. Type of element should be the same as the type of the elements of the array.
-      Null element is also prepended to the array. But if the array passed is NULL
-      output is NULL
-    """,
-  examples = """
-    Examples:
-      > SELECT _FUNC_(array('b', 'd', 'c', 'a'), 'd');
-       ["d","b","d","c","a"]
-      > SELECT _FUNC_(array(1, 2, 3, null), null);
-       [null,1,2,3,null]
-      > SELECT _FUNC_(CAST(null as Array<Int>), 2);
-       NULL
-  """,
-  group = "array_funcs",
-  since = "3.5.0")
-case class ArrayPrepend(left: Expression, right: Expression)
-  extends BinaryExpression
-    with ImplicitCastInputTypes
-    with ComplexTypeMergingExpression
-    with QueryErrorsBase {
-
-  override def nullable: Boolean = left.nullable
+trait InsertArrayOneSide extends RuntimeReplaceable
+  with ImplicitCastInputTypes with BinaryLike[Expression] with QueryErrorsBase {

Review Comment:
   `BinaryExpression` can be evaluated directly. In contrast, an implementation of `RuntimeReplaceable` cannot be evaluated: the optimizer rewrites it to its `replacement` before execution.
   `ArrayInsert` already implements `ComplexTypeMergingExpression`.
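   
   To make the distinction concrete, here is a simplified sketch of the contract (the names mirror Spark's `Expression.scala` and the stack trace later in this thread, but this is an illustration rather than the exact source):
   
   ```
   // Sketch: a RuntimeReplaceable-style expression is never evaluated itself.
   // eval() throws, and the optimizer substitutes `replacement` before execution.
   trait RuntimeReplaceableSketch extends Expression {
     def replacement: Expression
     // The output type is simply delegated to the replacement expression.
     override def dataType: DataType = replacement.dataType
     // Direct evaluation is intentionally unsupported.
     override def eval(input: InternalRow = null): Any =
       throw QueryExecutionErrors.cannotEvaluateExpressionError(this)
   }
   ```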





[GitHub] [spark] HyukjinKwon commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1151406434


##########
connector/connect/common/src/test/resources/query-tests/explain-results/function_array_append.explain:
##########
@@ -1,2 +1,2 @@
-Project [array_append(e#0, 1) AS array_append(e, 1)#0]
+Project [array_insert(e#0, (size(e#0, false) + 1), 1) AS array_append(e, 1)#0]

Review Comment:
   You can fix `array_insert` so its `prettyName` uses `getTagValue(FunctionRegistry.FUNC_ALIAS).getOrElse(name)`, and set the alias in `FunctionRegistry` by passing `setAlias = true`.
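   
   Roughly (a sketch of the suggested pattern; the registration line is an assumption about where the alias would be declared):
   
   ```
   // In ArrayInsert: report the alias name when the registry attached one.
   override def prettyName: String =
     getTagValue(FunctionRegistry.FUNC_ALIAS).getOrElse("array_insert")
   
   // Hypothetical registration marking array_append as an alias, so the
   // FUNC_ALIAS tag gets set on the built expression:
   // expression[ArrayInsert]("array_append", setAlias = true)
   ```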





[GitHub] [spark] LuciferYang commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1150655780


##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala:
##########
@@ -1855,50 +1855,6 @@ class CollectionExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper
     checkEvaluation(ArrayRepeat(Literal("hi"), Literal(null, IntegerType)), null)
   }
 
-  test("SPARK-41233: ArrayPrepend") {

Review Comment:
   Should we keep these cases? When I revert this file, the tests fail with this PR:
   
   ```
   Exception evaluating array_prepend([1,2,3,4], 0)
   ScalaTestFailureLocation: org.apache.spark.sql.catalyst.expressions.ExpressionEvalHelper at (ExpressionEvalHelper.scala:246)
   org.scalatest.exceptions.TestFailedException: Exception evaluating array_prepend([1,2,3,4], 0)
   	at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472)
   	at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:471)
   	at org.scalatest.funsuite.AnyFunSuite.newAssertionFailedException(AnyFunSuite.scala:1564)
   	at org.scalatest.Assertions.fail(Assertions.scala:949)
   	at org.scalatest.Assertions.fail$(Assertions.scala:945)
   	at org.scalatest.funsuite.AnyFunSuite.fail(AnyFunSuite.scala:1564)
   	at org.apache.spark.sql.catalyst.expressions.ExpressionEvalHelper.checkEvaluationWithoutCodegen(ExpressionEvalHelper.scala:246)
   	at org.apache.spark.sql.catalyst.expressions.ExpressionEvalHelper.checkEvaluationWithoutCodegen$(ExpressionEvalHelper.scala:240)
   	at org.apache.spark.sql.catalyst.expressions.CollectionExpressionsSuite.checkEvaluationWithoutCodegen(CollectionExpressionsSuite.scala:40)
   	at org.apache.spark.sql.catalyst.expressions.ExpressionEvalHelper.checkEvaluation(ExpressionEvalHelper.scala:87)
   	at org.apache.spark.sql.catalyst.expressions.ExpressionEvalHelper.checkEvaluation$(ExpressionEvalHelper.scala:82)
   	at org.apache.spark.sql.catalyst.expressions.CollectionExpressionsSuite.checkEvaluation(CollectionExpressionsSuite.scala:40)
   	at org.apache.spark.sql.catalyst.expressions.CollectionExpressionsSuite.$anonfun$new$445(CollectionExpressionsSuite.scala:1864)
   	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
   	at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
   	at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
   	at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
   	at org.scalatest.Transformer.apply(Transformer.scala:22)
   	at org.scalatest.Transformer.apply(Transformer.scala:20)
   	at org.scalatest.funsuite.AnyFunSuiteLike$$anon$1.apply(AnyFunSuiteLike.scala:226)
   	at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:221)
   	at org.scalatest.funsuite.AnyFunSuiteLike.invokeWithFixture$1(AnyFunSuiteLike.scala:224)
   	at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTest$1(AnyFunSuiteLike.scala:236)
   	at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
   	at org.scalatest.funsuite.AnyFunSuiteLike.runTest(AnyFunSuiteLike.scala:236)
   	at org.scalatest.funsuite.AnyFunSuiteLike.runTest$(AnyFunSuiteLike.scala:218)
   	at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(SparkFunSuite.scala:67)
   	at org.scalatest.BeforeAndAfterEach.runTest(BeforeAndAfterEach.scala:234)
   	at org.scalatest.BeforeAndAfterEach.runTest$(BeforeAndAfterEach.scala:227)
   	at org.apache.spark.SparkFunSuite.runTest(SparkFunSuite.scala:67)
   	at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTests$1(AnyFunSuiteLike.scala:269)
   	at org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:413)
   	at scala.collection.immutable.List.foreach(List.scala:431)
   	at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
   	at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:396)
   	at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:475)
   	at org.scalatest.funsuite.AnyFunSuiteLike.runTests(AnyFunSuiteLike.scala:269)
   	at org.scalatest.funsuite.AnyFunSuiteLike.runTests$(AnyFunSuiteLike.scala:268)
   	at org.scalatest.funsuite.AnyFunSuite.runTests(AnyFunSuite.scala:1564)
   	at org.scalatest.Suite.run(Suite.scala:1114)
   	at org.scalatest.Suite.run$(Suite.scala:1096)
   	at org.scalatest.funsuite.AnyFunSuite.org$scalatest$funsuite$AnyFunSuiteLike$$super$run(AnyFunSuite.scala:1564)
   	at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$run$1(AnyFunSuiteLike.scala:273)
   	at org.scalatest.SuperEngine.runImpl(Engine.scala:535)
   	at org.scalatest.funsuite.AnyFunSuiteLike.run(AnyFunSuiteLike.scala:273)
   	at org.scalatest.funsuite.AnyFunSuiteLike.run$(AnyFunSuiteLike.scala:272)
   	at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterAll$$super$run(SparkFunSuite.scala:67)
   	at org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:213)
   	at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)
   	at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)
   	at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:67)
   	at org.scalatest.tools.SuiteRunner.run(SuiteRunner.scala:47)
   	at org.scalatest.tools.Runner$.$anonfun$doRunRunRunDaDoRunRun$13(Runner.scala:1321)
   	at org.scalatest.tools.Runner$.$anonfun$doRunRunRunDaDoRunRun$13$adapted(Runner.scala:1315)
   	at scala.collection.immutable.List.foreach(List.scala:431)
   	at org.scalatest.tools.Runner$.doRunRunRunDaDoRunRun(Runner.scala:1315)
   	at org.scalatest.tools.Runner$.$anonfun$runOptionallyWithPassFailReporter$24(Runner.scala:992)
   	at org.scalatest.tools.Runner$.$anonfun$runOptionallyWithPassFailReporter$24$adapted(Runner.scala:970)
   	at org.scalatest.tools.Runner$.withClassLoaderAndDispatchReporter(Runner.scala:1481)
   	at org.scalatest.tools.Runner$.runOptionallyWithPassFailReporter(Runner.scala:970)
   	at org.scalatest.tools.Runner$.run(Runner.scala:798)
   	at org.scalatest.tools.Runner.run(Runner.scala)
   	at org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.runScalaTest2or3(ScalaTestRunner.java:43)
   	at org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.main(ScalaTestRunner.java:26)
   Caused by: org.apache.spark.SparkException: [INTERNAL_ERROR] Cannot evaluate expression: array_prepend([1,2,3,4], 0)
   	at org.apache.spark.SparkException$.internalError(SparkException.scala:77)
   	at org.apache.spark.SparkException$.internalError(SparkException.scala:81)
   	at org.apache.spark.sql.errors.QueryExecutionErrors$.cannotEvaluateExpressionError(QueryExecutionErrors.scala:66)
   	at org.apache.spark.sql.catalyst.expressions.RuntimeReplaceable.eval(Expression.scala:409)
   	at org.apache.spark.sql.catalyst.expressions.RuntimeReplaceable.eval$(Expression.scala:408)
   	at org.apache.spark.sql.catalyst.expressions.ArrayPrepend.eval(collectionOperations.scala:4984)
   	at org.apache.spark.sql.catalyst.expressions.ExpressionEvalHelper.evaluateWithoutCodegen(ExpressionEvalHelper.scala:237)
   	at org.apache.spark.sql.catalyst.expressions.ExpressionEvalHelper.evaluateWithoutCodegen$(ExpressionEvalHelper.scala:231)
   	at org.apache.spark.sql.catalyst.expressions.CollectionExpressionsSuite.evaluateWithoutCodegen(CollectionExpressionsSuite.scala:40)
   	at org.apache.spark.sql.catalyst.expressions.ExpressionEvalHelper.checkEvaluationWithoutCodegen(ExpressionEvalHelper.scala:245)
   	... 57 more
   ```
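   
   Since a `RuntimeReplaceable` expression cannot go through `checkEvaluation`, end-to-end coverage has to move to DataFrame-level tests. A minimal sketch of such a test (assuming `QueryTest`-style helpers and the `array_prepend` DataFrame function; the exact suite wiring is hypothetical):
   
   ```
   // Hypothetical DataFrameFunctionsSuite-style test: it exercises the optimized
   // plan, where ArrayPrepend has already been replaced by ArrayInsert.
   test("SPARK-41233: array_prepend via the DataFrame API") {
     import testImplicits._
     val df = Seq((Seq(1, 2, 3, 4), 0)).toDF("a", "b")
     checkAnswer(
       df.select(array_prepend(col("a"), col("b"))),
       Seq(Row(Seq(0, 1, 2, 3, 4))))
   }
   ```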
   
   





[GitHub] [spark] cloud-fan commented on pull request #40563: [SPARK-41233][FOLLOWUP] Refactor `array_prepend` with `RuntimeReplaceable`

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on PR #40563:
URL: https://github.com/apache/spark/pull/40563#issuecomment-1499063099

   On second thought, this makes performance worse. We should improve `ArrayInsert` to generate more efficient code when the position argument is a constant.
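   
   To illustrate the concern in plain Scala (a standalone sketch, not Spark codegen): the general insert path has to do index arithmetic for an arbitrary position, while a known-constant position can be specialized into a straight copy, which is what constant-position codegen could exploit:
   
   ```
   // General 1-based insert: the position is handled at runtime.
   def insertAt[T: scala.reflect.ClassTag](arr: Array[T], pos: Int, elem: T): Array[T] = {
     val idx = pos - 1 // the caller is assumed to have validated the position
     val out = new Array[T](arr.length + 1)
     System.arraycopy(arr, 0, out, 0, idx)
     out(idx) = elem
     System.arraycopy(arr, idx, out, idx + 1, arr.length - idx)
     out
   }
   
   // Specialized prepend: the position is known, so no index arithmetic remains.
   def prepend[T: scala.reflect.ClassTag](arr: Array[T], elem: T): Array[T] = {
     val out = new Array[T](arr.length + 1)
     out(0) = elem
     System.arraycopy(arr, 0, out, 1, arr.length)
     out
   }
   ```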




[GitHub] [spark] infoankitp commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "infoankitp (via GitHub)" <gi...@apache.org>.
infoankitp commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1153083910


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1400,120 +1400,24 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
-// scalastyle:off line.size.limit
-@ExpressionDescription(
-  usage = """
-      _FUNC_(array, element) - Add the element at the beginning of the array passed as first
-      argument. Type of element should be the same as the type of the elements of the array.
-      Null element is also prepended to the array. But if the array passed is NULL
-      output is NULL
-    """,
-  examples = """
-    Examples:
-      > SELECT _FUNC_(array('b', 'd', 'c', 'a'), 'd');
-       ["d","b","d","c","a"]
-      > SELECT _FUNC_(array(1, 2, 3, null), null);
-       [null,1,2,3,null]
-      > SELECT _FUNC_(CAST(null as Array<Int>), 2);
-       NULL
-  """,
-  group = "array_funcs",
-  since = "3.5.0")
-case class ArrayPrepend(left: Expression, right: Expression)
-  extends BinaryExpression
-    with ImplicitCastInputTypes
-    with ComplexTypeMergingExpression
-    with QueryErrorsBase {
-
-  override def nullable: Boolean = left.nullable
+trait InsertArrayOneSide extends RuntimeReplaceable
+  with ImplicitCastInputTypes with BinaryLike[Expression] with QueryErrorsBase {

Review Comment:
   Makes sense! Thanks so much!





[GitHub] [spark] LuciferYang commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1149997582


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -5056,128 +4956,45 @@ case class ArrayCompact(child: Expression)
   """,
   since = "3.4.0",
   group = "array_funcs")
-case class ArrayAppend(left: Expression, right: Expression)
-  extends BinaryExpression
-    with ImplicitCastInputTypes
-    with ComplexTypeMergingExpression
-    with QueryErrorsBase {
-  override def prettyName: String = "array_append"
+case class ArrayAppend(arr: Expression, ele: Expression) extends InsertArrayOneSide {
 
-  @transient protected lazy val elementType: DataType =
-    inputTypes.head.asInstanceOf[ArrayType].elementType
+  override lazy val replacement: Expression =
+    ArrayInsert(arr, Add(ArraySize(left), Literal(1)), ele)
 
-  override def inputTypes: Seq[AbstractDataType] = {
-    (left.dataType, right.dataType) match {
-      case (ArrayType(e1, hasNull), e2) =>
-        TypeCoercion.findTightestCommonType(e1, e2) match {
-          case Some(dt) => Seq(ArrayType(dt, hasNull), dt)
-          case _ => Seq.empty
-        }
-      case _ => Seq.empty
-    }
-  }
-
-  override def checkInputDataTypes(): TypeCheckResult = {
-    (left.dataType, right.dataType) match {
-      case (ArrayType(e1, _), e2) if DataTypeUtils.sameType(e1, e2) => TypeCheckResult.TypeCheckSuccess
-      case (ArrayType(e1, _), e2) => DataTypeMismatch(
-        errorSubClass = "ARRAY_FUNCTION_DIFF_TYPES",
-        messageParameters = Map(
-          "functionName" -> toSQLId(prettyName),
-          "leftType" -> toSQLType(left.dataType),
-          "rightType" -> toSQLType(right.dataType),
-          "dataType" -> toSQLType(ArrayType)
-        ))
-      case _ =>
-        DataTypeMismatch(
-          errorSubClass = "UNEXPECTED_INPUT_TYPE",
-          messageParameters = Map(
-            "paramIndex" -> "0",
-            "requiredType" -> toSQLType(ArrayType),
-            "inputSql" -> toSQLExpr(left),
-            "inputType" -> toSQLType(left.dataType)
-          )
-        )
-    }
-  }
+  override def prettyName: String = "array_append"
 
-  override def eval(input: InternalRow): Any = {
-    val value1 = left.eval(input)
-    if (value1 == null) {
-      null
-    } else {
-      val value2 = right.eval(input)
-      nullSafeEval(value1, value2)
-    }
+  override protected def withNewChildrenInternal(
+      newLeft: Expression, newRight: Expression): Expression = {

Review Comment:
   Should the return type change to `ArrayAppend`?



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -5056,128 +4956,45 @@ case class ArrayCompact(child: Expression)
   """,
   since = "3.4.0",
   group = "array_funcs")
-case class ArrayAppend(left: Expression, right: Expression)
-  extends BinaryExpression
-    with ImplicitCastInputTypes
-    with ComplexTypeMergingExpression
-    with QueryErrorsBase {
-  override def prettyName: String = "array_append"
+case class ArrayAppend(arr: Expression, ele: Expression) extends InsertArrayOneSide {
 
-  @transient protected lazy val elementType: DataType =
-    inputTypes.head.asInstanceOf[ArrayType].elementType
+  override lazy val replacement: Expression =
+    ArrayInsert(arr, Add(ArraySize(left), Literal(1)), ele)
 
-  override def inputTypes: Seq[AbstractDataType] = {
-    (left.dataType, right.dataType) match {
-      case (ArrayType(e1, hasNull), e2) =>
-        TypeCoercion.findTightestCommonType(e1, e2) match {
-          case Some(dt) => Seq(ArrayType(dt, hasNull), dt)
-          case _ => Seq.empty
-        }
-      case _ => Seq.empty
-    }
-  }
-
-  override def checkInputDataTypes(): TypeCheckResult = {
-    (left.dataType, right.dataType) match {
-      case (ArrayType(e1, _), e2) if DataTypeUtils.sameType(e1, e2) => TypeCheckResult.TypeCheckSuccess
-      case (ArrayType(e1, _), e2) => DataTypeMismatch(
-        errorSubClass = "ARRAY_FUNCTION_DIFF_TYPES",
-        messageParameters = Map(
-          "functionName" -> toSQLId(prettyName),
-          "leftType" -> toSQLType(left.dataType),
-          "rightType" -> toSQLType(right.dataType),
-          "dataType" -> toSQLType(ArrayType)
-        ))
-      case _ =>
-        DataTypeMismatch(
-          errorSubClass = "UNEXPECTED_INPUT_TYPE",
-          messageParameters = Map(
-            "paramIndex" -> "0",
-            "requiredType" -> toSQLType(ArrayType),
-            "inputSql" -> toSQLExpr(left),
-            "inputType" -> toSQLType(left.dataType)
-          )
-        )
-    }
-  }
+  override def prettyName: String = "array_append"
 
-  override def eval(input: InternalRow): Any = {
-    val value1 = left.eval(input)
-    if (value1 == null) {
-      null
-    } else {
-      val value2 = right.eval(input)
-      nullSafeEval(value1, value2)
-    }
+  override protected def withNewChildrenInternal(
+      newLeft: Expression, newRight: Expression): Expression = {
+    copy(arr = newLeft, ele = newRight)
   }
+}
 
-  override protected def nullSafeEval(arr: Any, elementData: Any): Any = {
-    val arrayData = arr.asInstanceOf[ArrayData]
-    val numberOfElements = arrayData.numElements() + 1
-    if (numberOfElements > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
-      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
-    }
-    val finalData = new Array[Any](numberOfElements)
-    arrayData.foreach(elementType, finalData.update)
-    finalData.update(arrayData.numElements(), elementData)
-    new GenericArrayData(finalData)
-  }
+@ExpressionDescription(
+  usage = """
+      _FUNC_(array, element) - Add the element at the beginning of the array passed as first
+      argument. Type of element should be the same as the type of the elements of the array.
+      Null element is also prepended to the array. But if the array passed is NULL
+      output is NULL
+    """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array('b', 'd', 'c', 'a'), 'd');
+       ["d","b","d","c","a"]
+      > SELECT _FUNC_(array(1, 2, 3, null), null);
+       [null,1,2,3,null]
+      > SELECT _FUNC_(CAST(null as Array<Int>), 2);
+       NULL
+  """,
+  group = "array_funcs",
+  since = "3.5.0")
+case class ArrayPrepend(arr: Expression, ele: Expression) extends InsertArrayOneSide {
 
-  override def nullable: Boolean = left.nullable
-  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
-    val leftGen = left.genCode(ctx)
-    val rightGen = right.genCode(ctx)
-    val f = (eval1: String, eval2: String) => {
-      val newArraySize = ctx.freshName("newArraySize")
-      val i = ctx.freshName("i")
-      val values = ctx.freshName("values")
-      val allocation = CodeGenerator.createArrayData(
-        values, elementType, newArraySize, s" $prettyName failed.")
-      val assignment = CodeGenerator.createArrayAssignment(
-        values, elementType, eval1, i, i, left.dataType.asInstanceOf[ArrayType].containsNull)
-      s"""
-         |int $newArraySize = $eval1.numElements() + 1;
-         |$allocation
-         |int $i = 0;
-         |while ($i < $eval1.numElements()) {
-         |  $assignment
-         |  $i ++;
-         |}
-         |${CodeGenerator.setArrayElement(values, elementType, i, eval2, Some(rightGen.isNull))}
-         |${ev.value} = $values;
-         |""".stripMargin
-    }
-    val resultCode = f(leftGen.value, rightGen.value)
-    if (nullable) {
-      val nullSafeEval =
-        leftGen.code + rightGen.code + ctx.nullSafeExec(left.nullable, leftGen.isNull) {
-          s"""
-            ${ev.isNull} = false; // resultCode could change nullability.
-            $resultCode
-          """
-        }
-      ev.copy(code =
-        code"""
-        boolean ${ev.isNull} = true;
-        ${CodeGenerator.javaType(dataType)} ${ev.value} = ${CodeGenerator.defaultValue(dataType)};
-        $nullSafeEval
-      """)
-    } else {
-      ev.copy(code =
-        code"""
-        ${leftGen.code}
-        ${rightGen.code}
-        ${CodeGenerator.javaType(dataType)} ${ev.value} = ${CodeGenerator.defaultValue(dataType)};
-        $resultCode""", isNull = FalseLiteral)
-    }
-  }
+  override lazy val replacement: Expression = ArrayInsert(arr, Literal(0), ele)
 
-  /**
-   * Returns the [[DataType]] of the result of evaluating this expression. It is invalid to query
-   * the dataType of an unresolved expression (i.e., when `resolved` == false).
-   */
-  override def dataType: DataType = if (right.nullable) left.dataType.asNullable else left.dataType
-  protected def withNewChildrenInternal(newLeft: Expression, newRight: Expression): ArrayAppend =
-    copy(left = newLeft, right = newRight)
+  override def prettyName: String = "array_prepend"
 
+  override protected def withNewChildrenInternal(
+      newLeft: Expression, newRight: Expression): Expression = {

Review Comment:
   Should the return type be changed to `ArrayPrepend` (instead of `Expression`)?





[GitHub] [spark] LuciferYang commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1150671855


##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala:
##########
@@ -1855,50 +1855,6 @@ class CollectionExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper
     checkEvaluation(ArrayRepeat(Literal("hi"), Literal(null, IntegerType)), null)
   }
 
-  test("SPARK-41233: ArrayPrepend") {

Review Comment:
   hmm... please ignore the comments above
   
   





[GitHub] [spark] LuciferYang commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1155706565


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -5056,128 +4950,45 @@ case class ArrayCompact(child: Expression)
   """,
   since = "3.4.0",
   group = "array_funcs")
-case class ArrayAppend(left: Expression, right: Expression)
-  extends BinaryExpression
-    with ImplicitCastInputTypes
-    with ComplexTypeMergingExpression
-    with QueryErrorsBase {
-  override def prettyName: String = "array_append"
+case class ArrayAppend(left: Expression, right: Expression) extends InsertArrayOneSide {
 
-  @transient protected lazy val elementType: DataType =
-    inputTypes.head.asInstanceOf[ArrayType].elementType
+  override lazy val replacement: Expression =
+    ArrayInsert(left, Add(ArraySize(left), Literal(1)), right)
 
-  override def inputTypes: Seq[AbstractDataType] = {
-    (left.dataType, right.dataType) match {
-      case (ArrayType(e1, hasNull), e2) =>
-        TypeCoercion.findTightestCommonType(e1, e2) match {
-          case Some(dt) => Seq(ArrayType(dt, hasNull), dt)
-          case _ => Seq.empty
-        }
-      case _ => Seq.empty
-    }
-  }
+  override def prettyName: String = "array_append"
 
-  override def checkInputDataTypes(): TypeCheckResult = {
-    (left.dataType, right.dataType) match {
-      case (ArrayType(e1, _), e2) if DataTypeUtils.sameType(e1, e2) => TypeCheckResult.TypeCheckSuccess
-      case (ArrayType(e1, _), e2) => DataTypeMismatch(
-        errorSubClass = "ARRAY_FUNCTION_DIFF_TYPES",
-        messageParameters = Map(
-          "functionName" -> toSQLId(prettyName),
-          "leftType" -> toSQLType(left.dataType),
-          "rightType" -> toSQLType(right.dataType),
-          "dataType" -> toSQLType(ArrayType)
-        ))
-      case _ =>
-        DataTypeMismatch(
-          errorSubClass = "UNEXPECTED_INPUT_TYPE",
-          messageParameters = Map(
-            "paramIndex" -> "0",
-            "requiredType" -> toSQLType(ArrayType),
-            "inputSql" -> toSQLExpr(left),
-            "inputType" -> toSQLType(left.dataType)
-          )
-        )
-    }
+  override protected def withNewChildrenInternal(
+      newLeft: Expression, newRight: Expression): ArrayAppend = {
+    copy(left = newLeft, right = newRight)
   }
+}
 
-  override def eval(input: InternalRow): Any = {
-    val value1 = left.eval(input)
-    if (value1 == null) {
-      null
-    } else {
-      val value2 = right.eval(input)
-      nullSafeEval(value1, value2)
-    }
-  }
+@ExpressionDescription(
+  usage = """
+      _FUNC_(array, element) - Add the element at the beginning of the array passed as first
+      argument. Type of element should be the same as the type of the elements of the array.
+      Null element is also prepended to the array. But if the array passed is NULL
+      output is NULL
+    """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array('b', 'd', 'c', 'a'), 'd');
+       ["d","b","d","c","a"]
+      > SELECT _FUNC_(array(1, 2, 3, null), null);
+       [null,1,2,3,null]
+      > SELECT _FUNC_(CAST(null as Array<Int>), 2);
+       NULL
+  """,
+  group = "array_funcs",
+  since = "3.5.0")
+case class ArrayPrepend(left: Expression, right: Expression) extends InsertArrayOneSide {
 
-  override protected def nullSafeEval(arr: Any, elementData: Any): Any = {
-    val arrayData = arr.asInstanceOf[ArrayData]
-    val numberOfElements = arrayData.numElements() + 1
-    if (numberOfElements > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
-      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
-    }
-    val finalData = new Array[Any](numberOfElements)
-    arrayData.foreach(elementType, finalData.update)
-    finalData.update(arrayData.numElements(), elementData)
-    new GenericArrayData(finalData)
-  }
+  override lazy val replacement: Expression = ArrayInsert(left, Literal(0), right)

Review Comment:
   I chose to remove support for index `0` ...





[GitHub] [spark] beliefer commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1155912650


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -5056,128 +4950,45 @@ case class ArrayCompact(child: Expression)
   """,
   since = "3.4.0",
   group = "array_funcs")
-case class ArrayAppend(left: Expression, right: Expression)
-  extends BinaryExpression
-    with ImplicitCastInputTypes
-    with ComplexTypeMergingExpression
-    with QueryErrorsBase {
-  override def prettyName: String = "array_append"
+case class ArrayAppend(left: Expression, right: Expression) extends InsertArrayOneSide {
 
-  @transient protected lazy val elementType: DataType =
-    inputTypes.head.asInstanceOf[ArrayType].elementType
+  override lazy val replacement: Expression =
+    ArrayInsert(left, Add(ArraySize(left), Literal(1)), right)
 
-  override def inputTypes: Seq[AbstractDataType] = {
-    (left.dataType, right.dataType) match {
-      case (ArrayType(e1, hasNull), e2) =>
-        TypeCoercion.findTightestCommonType(e1, e2) match {
-          case Some(dt) => Seq(ArrayType(dt, hasNull), dt)
-          case _ => Seq.empty
-        }
-      case _ => Seq.empty
-    }
-  }
+  override def prettyName: String = "array_append"
 
-  override def checkInputDataTypes(): TypeCheckResult = {
-    (left.dataType, right.dataType) match {
-      case (ArrayType(e1, _), e2) if DataTypeUtils.sameType(e1, e2) => TypeCheckResult.TypeCheckSuccess
-      case (ArrayType(e1, _), e2) => DataTypeMismatch(
-        errorSubClass = "ARRAY_FUNCTION_DIFF_TYPES",
-        messageParameters = Map(
-          "functionName" -> toSQLId(prettyName),
-          "leftType" -> toSQLType(left.dataType),
-          "rightType" -> toSQLType(right.dataType),
-          "dataType" -> toSQLType(ArrayType)
-        ))
-      case _ =>
-        DataTypeMismatch(
-          errorSubClass = "UNEXPECTED_INPUT_TYPE",
-          messageParameters = Map(
-            "paramIndex" -> "0",
-            "requiredType" -> toSQLType(ArrayType),
-            "inputSql" -> toSQLExpr(left),
-            "inputType" -> toSQLType(left.dataType)
-          )
-        )
-    }
+  override protected def withNewChildrenInternal(
+      newLeft: Expression, newRight: Expression): ArrayAppend = {
+    copy(left = newLeft, right = newRight)
   }
+}
 
-  override def eval(input: InternalRow): Any = {
-    val value1 = left.eval(input)
-    if (value1 == null) {
-      null
-    } else {
-      val value2 = right.eval(input)
-      nullSafeEval(value1, value2)
-    }
-  }
+@ExpressionDescription(
+  usage = """
+      _FUNC_(array, element) - Add the element at the beginning of the array passed as first
+      argument. Type of element should be the same as the type of the elements of the array.
+      Null element is also prepended to the array. But if the array passed is NULL
+      output is NULL
+    """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array('b', 'd', 'c', 'a'), 'd');
+       ["d","b","d","c","a"]
+      > SELECT _FUNC_(array(1, 2, 3, null), null);
+       [null,1,2,3,null]
+      > SELECT _FUNC_(CAST(null as Array<Int>), 2);
+       NULL
+  """,
+  group = "array_funcs",
+  since = "3.5.0")
+case class ArrayPrepend(left: Expression, right: Expression) extends InsertArrayOneSide {
 
-  override protected def nullSafeEval(arr: Any, elementData: Any): Any = {
-    val arrayData = arr.asInstanceOf[ArrayData]
-    val numberOfElements = arrayData.numElements() + 1
-    if (numberOfElements > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
-      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
-    }
-    val finalData = new Array[Any](numberOfElements)
-    arrayData.foreach(elementType, finalData.update)
-    finalData.update(arrayData.numElements(), elementData)
-    new GenericArrayData(finalData)
-  }
+  override lazy val replacement: Expression = ArrayInsert(left, Literal(0), right)

Review Comment:
   Appending the value to the end of the array for index 0 seems hard to understand.
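
   To illustrate the ambiguity, a hedged sketch of the two competing readings of position 0 (both are hypothetical expected values; neither is committed behavior):

      // Two readings of array_insert(array(1, 2, 3), 0, 9), written as
      // hypothetical expected results for illustration only:
      val prependReading = Seq(9, 1, 2, 3) // 0 means "before the first element"
      val appendReading  = Seq(1, 2, 3, 9) // 0 means "after the last element"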





[GitHub] [spark] LuciferYang commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1150115729


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -5056,128 +4956,45 @@ case class ArrayCompact(child: Expression)
   """,
   since = "3.4.0",
   group = "array_funcs")
-case class ArrayAppend(left: Expression, right: Expression)
-  extends BinaryExpression
-    with ImplicitCastInputTypes
-    with ComplexTypeMergingExpression
-    with QueryErrorsBase {
-  override def prettyName: String = "array_append"
+case class ArrayAppend(arr: Expression, ele: Expression) extends InsertArrayOneSide {
 
-  @transient protected lazy val elementType: DataType =
-    inputTypes.head.asInstanceOf[ArrayType].elementType
+  override lazy val replacement: Expression =
+    ArrayInsert(arr, Add(ArraySize(left), Literal(1)), ele)

Review Comment:
   This uses both `arr` and `left` in the same expression; should one of them be used consistently?





[GitHub] [spark] beliefer commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1151380334


##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala:
##########
@@ -1855,50 +1855,6 @@ class CollectionExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper
     checkEvaluation(ArrayRepeat(Literal("hi"), Literal(null, IntegerType)), null)
   }
 
-  test("SPARK-41233: ArrayPrepend") {

Review Comment:
   > Any reason to delete the test cases? They should still be relevant, right?
   
   `RuntimeReplaceable` expressions do not support direct execution; they are replaced with another expression before evaluation.
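
   For context, a condensed sketch of the pattern (simplified from the PR diff above; the real class also mixes in `ImplicitCastInputTypes` and `QueryErrorsBase`). The node only declares a `replacement`; Spark rewrites it to that expression during analysis, which is why the dedicated eval/codegen paths, and the unit tests that exercised them, can be dropped:

      // Condensed from the diff above for illustration; not the full class.
      case class ArrayAppend(left: Expression, right: Expression)
        extends RuntimeReplaceable with BinaryLike[Expression] {
        // The optimizer substitutes this tree for the node before execution.
        override lazy val replacement: Expression =
          ArrayInsert(left, Add(ArraySize(left), Literal(1)), right)
        override def prettyName: String = "array_append"
        override protected def withNewChildrenInternal(
            newLeft: Expression, newRight: Expression): ArrayAppend =
          copy(left = newLeft, right = newRight)
      }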





[GitHub] [spark] cloud-fan commented on pull request #40563: [SPARK-41233][FOLLOWUP] Refactor `array_prepend` with `RuntimeReplaceable`

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on PR #40563:
URL: https://github.com/apache/spark/pull/40563#issuecomment-1519867307

   thanks, merging to master!




[GitHub] [spark] beliefer commented on pull request #40563: [SPARK-41233][FOLLOWUP] Refactor `array_prepend` with `RuntimeReplaceable`

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #40563:
URL: https://github.com/apache/spark/pull/40563#issuecomment-1499895556

   > After a second thought, this makes the performance worse. We should improve `ArrayInsert` to generate more efficient code if the position argument is constant. For example, if the position argument is constant 1, then we don't need to generate code to check if position is negative or not.
   
   You mean we should create another PR to simplify the code of `ArrayInsert`?
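
   As a hedged sketch of the suggestion (the helper below is hypothetical, not the actual `ArrayInsert` codegen): when the position argument is a foldable literal, its sign is known at planning time, so codegen could emit only the matching branch instead of a runtime check for negative positions.

      import org.apache.spark.sql.catalyst.expressions.{Expression, Literal}
      import org.apache.spark.sql.types.IntegerType

      // Hypothetical helper: returns the sign of a constant position,
      // letting codegen drop the negative-position branch when it is known.
      def knownPositionSign(pos: Expression): Option[Int] = pos match {
        case Literal(v: Int, IntegerType) => Some(math.signum(v))
        case _ => None
      }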




[GitHub] [spark] beliefer commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1151672500


##########
connector/connect/common/src/test/resources/query-tests/explain-results/function_array_append.explain:
##########
@@ -1,2 +1,2 @@
-Project [array_append(e#0, 1) AS array_append(e, 1)#0]
+Project [array_insert(e#0, (size(e#0, false) + 1), 1) AS array_append(e, 1)#0]

Review Comment:
   A lot of golden files are in the same situation as this one.
   Refer: https://github.com/apache/spark/blob/f6d6ec138fe104bc7317993fe90bfdc672c11938/connector/connect/common/src/test/resources/query-tests/explain-results/function_map_contains_key.explain





[GitHub] [spark] beliefer commented on pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #40563:
URL: https://github.com/apache/spark/pull/40563#issuecomment-1487887194

   ping @zhengruifeng @HyukjinKwon @dongjoon-hyun cc @infoankitp @navinvishy 




[GitHub] [spark] beliefer commented on pull request #40563: [SPARK-41233][FOLLOWUP] Refactor `array_prepend` with `RuntimeReplaceable`

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #40563:
URL: https://github.com/apache/spark/pull/40563#issuecomment-1533956516

   @cloud-fan Thanks!




[GitHub] [spark] infoankitp commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "infoankitp (via GitHub)" <gi...@apache.org>.
infoankitp commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1151970771


##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala:
##########
@@ -1855,50 +1855,6 @@ class CollectionExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper
     checkEvaluation(ArrayRepeat(Literal("hi"), Literal(null, IntegerType)), null)
   }
 
-  test("SPARK-41233: ArrayPrepend") {

Review Comment:
   I hope we have verified that all of the test cases deleted here are covered in DataFrameFunctionsSuite. If any are missing, please help add them.





[GitHub] [spark] cloud-fan closed pull request #40563: [SPARK-41233][FOLLOWUP] Refactor `array_prepend` with `RuntimeReplaceable`

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan closed pull request #40563: [SPARK-41233][FOLLOWUP] Refactor `array_prepend` with `RuntimeReplaceable`
URL: https://github.com/apache/spark/pull/40563




[GitHub] [spark] infoankitp commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "infoankitp (via GitHub)" <gi...@apache.org>.
infoankitp commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1153083574


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -5056,128 +4950,45 @@ case class ArrayCompact(child: Expression)
   """,
   since = "3.4.0",
   group = "array_funcs")
-case class ArrayAppend(left: Expression, right: Expression)
-  extends BinaryExpression
-    with ImplicitCastInputTypes
-    with ComplexTypeMergingExpression
-    with QueryErrorsBase {
-  override def prettyName: String = "array_append"
+case class ArrayAppend(left: Expression, right: Expression) extends InsertArrayOneSide {
 
-  @transient protected lazy val elementType: DataType =
-    inputTypes.head.asInstanceOf[ArrayType].elementType
+  override lazy val replacement: Expression =
+    ArrayInsert(left, Add(ArraySize(left), Literal(1)), right)

Review Comment:
   Oh! Thanks for confirming!





[GitHub] [spark] beliefer commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1152648264


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -5056,128 +4950,45 @@ case class ArrayCompact(child: Expression)
   """,
   since = "3.4.0",
   group = "array_funcs")
-case class ArrayAppend(left: Expression, right: Expression)
-  extends BinaryExpression
-    with ImplicitCastInputTypes
-    with ComplexTypeMergingExpression
-    with QueryErrorsBase {
-  override def prettyName: String = "array_append"
+case class ArrayAppend(left: Expression, right: Expression) extends InsertArrayOneSide {
 
-  @transient protected lazy val elementType: DataType =
-    inputTypes.head.asInstanceOf[ArrayType].elementType
+  override lazy val replacement: Expression =
+    ArrayInsert(left, Add(ArraySize(left), Literal(1)), right)

Review Comment:
   `Literal(-1)` inserts the element before the last one.
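
   For reference, the behavior being described, written in the style of the suite's assertions (this assumes the negative-index semantics documented at the time, which are exactly what this thread is debating):

      // Under the semantics described above, -1 inserts *before* the last
      // element, so the new value lands in the second-to-last slot:
      checkEvaluation(
        ArrayInsert(
          Literal.create(Seq(5, 4, 3, 2), ArrayType(IntegerType)),
          Literal(-1),
          Literal(1)),
        Seq(5, 4, 3, 1, 2))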





[GitHub] [spark] cloud-fan commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1155718705


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -5056,128 +4950,45 @@ case class ArrayCompact(child: Expression)
   """,
   since = "3.4.0",
   group = "array_funcs")
-case class ArrayAppend(left: Expression, right: Expression)
-  extends BinaryExpression
-    with ImplicitCastInputTypes
-    with ComplexTypeMergingExpression
-    with QueryErrorsBase {
-  override def prettyName: String = "array_append"
+case class ArrayAppend(left: Expression, right: Expression) extends InsertArrayOneSide {
 
-  @transient protected lazy val elementType: DataType =
-    inputTypes.head.asInstanceOf[ArrayType].elementType
+  override lazy val replacement: Expression =
+    ArrayInsert(left, Add(ArraySize(left), Literal(1)), right)
 
-  override def inputTypes: Seq[AbstractDataType] = {
-    (left.dataType, right.dataType) match {
-      case (ArrayType(e1, hasNull), e2) =>
-        TypeCoercion.findTightestCommonType(e1, e2) match {
-          case Some(dt) => Seq(ArrayType(dt, hasNull), dt)
-          case _ => Seq.empty
-        }
-      case _ => Seq.empty
-    }
-  }
+  override def prettyName: String = "array_append"
 
-  override def checkInputDataTypes(): TypeCheckResult = {
-    (left.dataType, right.dataType) match {
-      case (ArrayType(e1, _), e2) if DataTypeUtils.sameType(e1, e2) => TypeCheckResult.TypeCheckSuccess
-      case (ArrayType(e1, _), e2) => DataTypeMismatch(
-        errorSubClass = "ARRAY_FUNCTION_DIFF_TYPES",
-        messageParameters = Map(
-          "functionName" -> toSQLId(prettyName),
-          "leftType" -> toSQLType(left.dataType),
-          "rightType" -> toSQLType(right.dataType),
-          "dataType" -> toSQLType(ArrayType)
-        ))
-      case _ =>
-        DataTypeMismatch(
-          errorSubClass = "UNEXPECTED_INPUT_TYPE",
-          messageParameters = Map(
-            "paramIndex" -> "0",
-            "requiredType" -> toSQLType(ArrayType),
-            "inputSql" -> toSQLExpr(left),
-            "inputType" -> toSQLType(left.dataType)
-          )
-        )
-    }
+  override protected def withNewChildrenInternal(
+      newLeft: Expression, newRight: Expression): ArrayAppend = {
+    copy(left = newLeft, right = newRight)
   }
+}
 
-  override def eval(input: InternalRow): Any = {
-    val value1 = left.eval(input)
-    if (value1 == null) {
-      null
-    } else {
-      val value2 = right.eval(input)
-      nullSafeEval(value1, value2)
-    }
-  }
+@ExpressionDescription(
+  usage = """
+      _FUNC_(array, element) - Add the element at the beginning of the array passed as first
+      argument. Type of element should be the same as the type of the elements of the array.
+      Null element is also prepended to the array. But if the array passed is NULL
+      output is NULL
+    """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array('b', 'd', 'c', 'a'), 'd');
+       ["d","b","d","c","a"]
+      > SELECT _FUNC_(array(1, 2, 3, null), null);
+       [null,1,2,3,null]
+      > SELECT _FUNC_(CAST(null as Array<Int>), 2);
+       NULL
+  """,
+  group = "array_funcs",
+  since = "3.5.0")
+case class ArrayPrepend(left: Expression, right: Expression) extends InsertArrayOneSide {
 
-  override protected def nullSafeEval(arr: Any, elementData: Any): Any = {
-    val arrayData = arr.asInstanceOf[ArrayData]
-    val numberOfElements = arrayData.numElements() + 1
-    if (numberOfElements > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
-      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
-    }
-    val finalData = new Array[Any](numberOfElements)
-    arrayData.foreach(elementType, finalData.update)
-    finalData.update(arrayData.numElements(), elementData)
-    new GenericArrayData(finalData)
-  }
+  override lazy val replacement: Expression = ArrayInsert(left, Literal(0), right)

Review Comment:
   great, let's fix it soon before the 3.4 release. also cc @xinrong-meng 





[GitHub] [spark] cloud-fan commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1157222572


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1419,101 +1418,26 @@ case class ArrayContains(left: Expression, right: Expression)
   """,
   group = "array_funcs",
   since = "3.5.0")
-case class ArrayPrepend(left: Expression, right: Expression)
-  extends BinaryExpression
-    with ImplicitCastInputTypes
-    with ComplexTypeMergingExpression
-    with QueryErrorsBase {
+case class ArrayPrepend(left: Expression, right: Expression) extends RuntimeReplaceable
+  with ImplicitCastInputTypes with BinaryLike[Expression] with QueryErrorsBase {
 
-  override def nullable: Boolean = left.nullable
+  override lazy val replacement: Expression = ArrayInsert(left, Literal(0), right)

Review Comment:
   this will fail now that index `0` is no longer supported.
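
   A hedged sketch of the natural follow-up (assuming positions stay strictly 1-based once index `0` is rejected; this guesses at the shape of the fix rather than quoting it):

      // With index 0 no longer accepted by array_insert, prepending maps to
      // position 1, the first 1-based slot:
      override lazy val replacement: Expression = ArrayInsert(left, Literal(1), right)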





[GitHub] [spark] navinvishy commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "navinvishy (via GitHub)" <gi...@apache.org>.
navinvishy commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1151369024


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1400,120 +1400,24 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
-// scalastyle:off line.size.limit
-@ExpressionDescription(
-  usage = """
-      _FUNC_(array, element) - Add the element at the beginning of the array passed as first
-      argument. Type of element should be the same as the type of the elements of the array.
-      Null element is also prepended to the array. But if the array passed is NULL
-      output is NULL
-    """,
-  examples = """
-    Examples:
-      > SELECT _FUNC_(array('b', 'd', 'c', 'a'), 'd');
-       ["d","b","d","c","a"]
-      > SELECT _FUNC_(array(1, 2, 3, null), null);
-       [null,1,2,3,null]
-      > SELECT _FUNC_(CAST(null as Array<Int>), 2);
-       NULL
-  """,
-  group = "array_funcs",
-  since = "3.5.0")
-case class ArrayPrepend(left: Expression, right: Expression)
-  extends BinaryExpression
-    with ImplicitCastInputTypes
-    with ComplexTypeMergingExpression
-    with QueryErrorsBase {
-
-  override def nullable: Boolean = left.nullable
+trait InsertArrayOneSide extends RuntimeReplaceable

Review Comment:
   nit: Should we just call this `RuntimeReplaceableBinary` or something? Just a thought, since there doesn't seem to be anything specific about arrays in this trait.
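
   As a sketch, the rename would only touch the trait declaration (the name follows the suggestion above and is hypothetical):

      // Nothing array-specific here, so a generic name would also fit:
      trait RuntimeReplaceableBinary extends RuntimeReplaceable
        with ImplicitCastInputTypes with BinaryLike[Expression] with QueryErrorsBase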



##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala:
##########
@@ -1855,50 +1855,6 @@ class CollectionExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper
     checkEvaluation(ArrayRepeat(Literal("hi"), Literal(null, IntegerType)), null)
   }
 
-  test("SPARK-41233: ArrayPrepend") {

Review Comment:
   Any reason to delete the test cases? They should still be relevant, right?





[GitHub] [spark] beliefer commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1151668147


##########
connector/connect/common/src/test/resources/query-tests/explain-results/function_array_append.explain:
##########
@@ -1,2 +1,2 @@
-Project [array_append(e#0, 1) AS array_append(e, 1)#0]
+Project [array_insert(e#0, (size(e#0, false) + 1), 1) AS array_append(e, 1)#0]

Review Comment:
   @HyukjinKwon I tested and found this doesn't help.





[GitHub] [spark] cloud-fan commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1155700366


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -5056,128 +4950,45 @@ case class ArrayCompact(child: Expression)
   """,
   since = "3.4.0",
   group = "array_funcs")
-case class ArrayAppend(left: Expression, right: Expression)
-  extends BinaryExpression
-    with ImplicitCastInputTypes
-    with ComplexTypeMergingExpression
-    with QueryErrorsBase {
-  override def prettyName: String = "array_append"
+case class ArrayAppend(left: Expression, right: Expression) extends InsertArrayOneSide {
 
-  @transient protected lazy val elementType: DataType =
-    inputTypes.head.asInstanceOf[ArrayType].elementType
+  override lazy val replacement: Expression =
+    ArrayInsert(left, Add(ArraySize(left), Literal(1)), right)
 
-  override def inputTypes: Seq[AbstractDataType] = {
-    (left.dataType, right.dataType) match {
-      case (ArrayType(e1, hasNull), e2) =>
-        TypeCoercion.findTightestCommonType(e1, e2) match {
-          case Some(dt) => Seq(ArrayType(dt, hasNull), dt)
-          case _ => Seq.empty
-        }
-      case _ => Seq.empty
-    }
-  }
+  override def prettyName: String = "array_append"
 
-  override def checkInputDataTypes(): TypeCheckResult = {
-    (left.dataType, right.dataType) match {
-      case (ArrayType(e1, _), e2) if DataTypeUtils.sameType(e1, e2) => TypeCheckResult.TypeCheckSuccess
-      case (ArrayType(e1, _), e2) => DataTypeMismatch(
-        errorSubClass = "ARRAY_FUNCTION_DIFF_TYPES",
-        messageParameters = Map(
-          "functionName" -> toSQLId(prettyName),
-          "leftType" -> toSQLType(left.dataType),
-          "rightType" -> toSQLType(right.dataType),
-          "dataType" -> toSQLType(ArrayType)
-        ))
-      case _ =>
-        DataTypeMismatch(
-          errorSubClass = "UNEXPECTED_INPUT_TYPE",
-          messageParameters = Map(
-            "paramIndex" -> "0",
-            "requiredType" -> toSQLType(ArrayType),
-            "inputSql" -> toSQLExpr(left),
-            "inputType" -> toSQLType(left.dataType)
-          )
-        )
-    }
+  override protected def withNewChildrenInternal(
+      newLeft: Expression, newRight: Expression): ArrayAppend = {
+    copy(left = newLeft, right = newRight)
   }
+}
 
-  override def eval(input: InternalRow): Any = {
-    val value1 = left.eval(input)
-    if (value1 == null) {
-      null
-    } else {
-      val value2 = right.eval(input)
-      nullSafeEval(value1, value2)
-    }
-  }
+@ExpressionDescription(
+  usage = """
+      _FUNC_(array, element) - Add the element at the beginning of the array passed as first
+      argument. Type of element should be the same as the type of the elements of the array.
+      Null element is also prepended to the array. But if the array passed is NULL
+      output is NULL
+    """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array('b', 'd', 'c', 'a'), 'd');
+       ["d","b","d","c","a"]
+      > SELECT _FUNC_(array(1, 2, 3, null), null);
+       [null,1,2,3,null]
+      > SELECT _FUNC_(CAST(null as Array<Int>), 2);
+       NULL
+  """,
+  group = "array_funcs",
+  since = "3.5.0")
+case class ArrayPrepend(left: Expression, right: Expression) extends InsertArrayOneSide {
 
-  override protected def nullSafeEval(arr: Any, elementData: Any): Any = {
-    val arrayData = arr.asInstanceOf[ArrayData]
-    val numberOfElements = arrayData.numElements() + 1
-    if (numberOfElements > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
-      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
-    }
-    val finalData = new Array[Any](numberOfElements)
-    arrayData.foreach(elementType, finalData.update)
-    finalData.update(arrayData.numElements(), elementData)
-    new GenericArrayData(finalData)
-  }
+  override lazy val replacement: Expression = ArrayInsert(left, Literal(0), right)

Review Comment:
   If we do want to reserve index `0` for some special behavior in `array_insert`, I think appending the value to the end of the array seems more convenient?





[GitHub] [spark] zhengruifeng commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1155707809


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -5056,128 +4950,45 @@ case class ArrayCompact(child: Expression)
   """,
   since = "3.4.0",
   group = "array_funcs")
-case class ArrayAppend(left: Expression, right: Expression)
-  extends BinaryExpression
-    with ImplicitCastInputTypes
-    with ComplexTypeMergingExpression
-    with QueryErrorsBase {
-  override def prettyName: String = "array_append"
+case class ArrayAppend(left: Expression, right: Expression) extends InsertArrayOneSide {
 
-  @transient protected lazy val elementType: DataType =
-    inputTypes.head.asInstanceOf[ArrayType].elementType
+  override lazy val replacement: Expression =
+    ArrayInsert(left, Add(ArraySize(left), Literal(1)), right)
 
-  override def inputTypes: Seq[AbstractDataType] = {
-    (left.dataType, right.dataType) match {
-      case (ArrayType(e1, hasNull), e2) =>
-        TypeCoercion.findTightestCommonType(e1, e2) match {
-          case Some(dt) => Seq(ArrayType(dt, hasNull), dt)
-          case _ => Seq.empty
-        }
-      case _ => Seq.empty
-    }
-  }
+  override def prettyName: String = "array_append"
 
-  override def checkInputDataTypes(): TypeCheckResult = {
-    (left.dataType, right.dataType) match {
-      case (ArrayType(e1, _), e2) if DataTypeUtils.sameType(e1, e2) => TypeCheckResult.TypeCheckSuccess
-      case (ArrayType(e1, _), e2) => DataTypeMismatch(
-        errorSubClass = "ARRAY_FUNCTION_DIFF_TYPES",
-        messageParameters = Map(
-          "functionName" -> toSQLId(prettyName),
-          "leftType" -> toSQLType(left.dataType),
-          "rightType" -> toSQLType(right.dataType),
-          "dataType" -> toSQLType(ArrayType)
-        ))
-      case _ =>
-        DataTypeMismatch(
-          errorSubClass = "UNEXPECTED_INPUT_TYPE",
-          messageParameters = Map(
-            "paramIndex" -> "0",
-            "requiredType" -> toSQLType(ArrayType),
-            "inputSql" -> toSQLExpr(left),
-            "inputType" -> toSQLType(left.dataType)
-          )
-        )
-    }
+  override protected def withNewChildrenInternal(
+      newLeft: Expression, newRight: Expression): ArrayAppend = {
+    copy(left = newLeft, right = newRight)
   }
+}
 
-  override def eval(input: InternalRow): Any = {
-    val value1 = left.eval(input)
-    if (value1 == null) {
-      null
-    } else {
-      val value2 = right.eval(input)
-      nullSafeEval(value1, value2)
-    }
-  }
+@ExpressionDescription(
+  usage = """
+      _FUNC_(array, element) - Add the element at the beginning of the array passed as first
+      argument. Type of element should be the same as the type of the elements of the array.
+      Null element is also prepended to the array. But if the array passed is NULL
+      output is NULL
+    """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array('b', 'd', 'c', 'a'), 'd');
+       ["d","b","d","c","a"]
+      > SELECT _FUNC_(array(1, 2, 3, null), null);
+       [null,1,2,3,null]
+      > SELECT _FUNC_(CAST(null as Array<Int>), 2);
+       NULL
+  """,
+  group = "array_funcs",
+  since = "3.5.0")
+case class ArrayPrepend(left: Expression, right: Expression) extends InsertArrayOneSide {
 
-  override protected def nullSafeEval(arr: Any, elementData: Any): Any = {
-    val arrayData = arr.asInstanceOf[ArrayData]
-    val numberOfElements = arrayData.numElements() + 1
-    if (numberOfElements > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
-      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
-    }
-    val finalData = new Array[Any](numberOfElements)
-    arrayData.foreach(elementType, finalData.update)
-    finalData.update(arrayData.numElements(), elementData)
-    new GenericArrayData(finalData)
-  }
+  override lazy val replacement: Expression = ArrayInsert(left, Literal(0), right)

Review Comment:
   I think we don't have to reserve index `0`; making it fail seems good to me.
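
   A hedged sketch of what failing could look like (control flow only; `pos` stands for the evaluated position argument, and Spark would presumably route this through its error-class framework rather than a plain exception):

      // Reject position 0 up front instead of reserving it for a special
      // append/prepend meaning; positions stay strictly 1-based.
      if (pos == 0) {
        throw new IllegalArgumentException(
          "array_insert position must not be 0; indexes are 1-based")
      }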





[GitHub] [spark] cloud-fan commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1155722493


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -5056,128 +4950,45 @@ case class ArrayCompact(child: Expression)
   """,
   since = "3.4.0",
   group = "array_funcs")
-case class ArrayAppend(left: Expression, right: Expression)
-  extends BinaryExpression
-    with ImplicitCastInputTypes
-    with ComplexTypeMergingExpression
-    with QueryErrorsBase {
-  override def prettyName: String = "array_append"
+case class ArrayAppend(left: Expression, right: Expression) extends InsertArrayOneSide {
 
-  @transient protected lazy val elementType: DataType =
-    inputTypes.head.asInstanceOf[ArrayType].elementType
+  override lazy val replacement: Expression =
+    ArrayInsert(left, Add(ArraySize(left), Literal(1)), right)

Review Comment:
   A risk is that `left` can be evaluated twice if common subexpression elimination is disabled, or if the underlying execution engine (a native engine) does not support it.
   
   I'd prefer to keep `ArrayAppend` as it is, or update `ArrayInsert` to provide an easy way to do append.
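
   To make the risk concrete: the array child appears twice in the proposed replacement, once as the insertion target and once inside the position computation, so each row could evaluate it twice when nothing deduplicates the two subtrees:

      ArrayInsert(
        left,                             // occurrence 1: the array to modify
        Add(ArraySize(left), Literal(1)), // occurrence 2: computes the position
        right)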





[GitHub] [spark] cloud-fan commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable`

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1157221445


##########
connector/connect/common/src/test/resources/query-tests/explain-results/function_array_prepend.explain:
##########
@@ -1,2 +1,2 @@
-Project [array_prepend(e#0, 1) AS array_prepend(e, 1)#0]
+Project [array_insert(e#0, 0, 1) AS array_prepend(e, 1)#0]

Review Comment:
   can we fix this?


