You are viewing a plain text version of this content. The canonical link for it is here.

Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/12/06 23:40:42 UTC

[GitHub] [spark] navinvishy opened a new pull request, #38947: SPARK-41231: Adds an array_prepend function to catalyst

navinvishy opened a new pull request, #38947:
URL: https://github.com/apache/spark/pull/38947

### What changes were proposed in this pull request?
Adds a new array function array_prepend to catalyst.

### Why are the changes needed?
This adds a function that exists in many SQL implementations, specifically Snowflake: https://docs.snowflake.com/en/developer-guide/snowpark/reference/python/api/snowflake.snowpark.functions.array_prepend.html

### Does this PR introduce _any_ user-facing change?
Yes.

### How was this patch tested?
Added unit tests.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] navinvishy commented on a diff in pull request #38947: [SPARK-41231][SQL] Adds an array_prepend function to catalyst

Posted by GitBox <gi...@apache.org>.

navinvishy commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1042621443


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,119 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage =
+    "_FUNC_(array, value) - Returns an array containing value as well as all elements from array. The new element is positioned at the beginning of the array.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), 4);
+       [4, 1, 2, 3]
+  """,
+  group = "array_funcs",
+  since = "3.4.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with NullIntolerant
+    with QueryErrorsBase {
+
+  override def nullSafeEval(arr: Any, value: Any): Any = {
+    val numberOfElements = arr.asInstanceOf[ArrayData].numElements()
+    if (numberOfElements + 1 > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val newArray = new Array[Any](numberOfElements + 1)
+    newArray(0) = value
+    var pos = 1
+    arr

Review Comment:
   I'm using `ArrayData.foreach` right? Maybe you meant somewhere else?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] amaliujia commented on a diff in pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

Posted by "amaliujia (via GitHub)" <gi...@apache.org>.

amaliujia commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1107724565


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,149 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage =
+    "_FUNC_(array, value) - Returns an array containing value as well as all elements from array. The new element is positioned at the beginning of the array.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), 4);
+       [4, 1, 2, 3]
+  """,
+  group = "array_funcs",
+  since = "3.4.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with ComplexTypeMergingExpression
+    with QueryErrorsBase {
+
+  override def nullable: Boolean = left.nullable
+
+  override def eval(input: InternalRow): Any = {
+    val value1 = left.eval(input)
+    if (value1 == null) {
+      null
+    } else {
+      val value2 = right.eval(input)
+      nullSafeEval(value1, value2)
+    }
+  }
+  override def nullSafeEval(arr: Any, value: Any): Any = {
+    val numberOfElements = arr.asInstanceOf[ArrayData].numElements()
+    if (numberOfElements + 1 > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val newArray = new Array[Any](numberOfElements + 1)
+    newArray(0) = value
+    var pos = 1
+    arr
+      .asInstanceOf[ArrayData]
+      .foreach(
+        right.dataType,
+        (i, v) => {
+          newArray(pos) = v
+          pos += 1
+        })
+    new GenericArrayData(newArray)
+  }
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val leftGen = left.genCode(ctx)
+    val rightGen = right.genCode(ctx)
+    val f = (arr: String, value: String) => {
+      val newArraySize = ctx.freshName("newArraySize")
+      val newArray = ctx.freshName("newArray")
+      val i = ctx.freshName("i")
+      val pos = ctx.freshName("pos")
+      val allocation = CodeGenerator.createArrayData(
+        newArray,
+        right.dataType,
+        newArraySize,
+        s" $prettyName failed.")
+      val assignment =
+        CodeGenerator.createArrayAssignment(newArray, right.dataType, arr, pos, i, false)
+      val newElemAssignment =
+        CodeGenerator.setArrayElement(newArray, right.dataType, pos, value, Some(rightGen.isNull))
+      s"""
+         |int $pos = 0;
+         |int $newArraySize = $arr.numElements() + 1;
+         |$allocation
+         |$newElemAssignment
+         |$pos = $pos + 1;
+         |for (int $i = 0; $i < $arr.numElements(); $i ++) {
+         |  $assignment
+         |  $pos = $pos + 1;
+         |}
+         |${ev.value} = $newArray;
+         |""".stripMargin
+    }
+    val resultCode = f(leftGen.value, rightGen.value)
+    if(nullable) {
+      val nullSafeEval = leftGen.code + rightGen.code + ctx.nullSafeExec(nullable, leftGen.isNull) {
+        s"""
+           |${ev.isNull} = false;
+           |${resultCode}
+           |""".stripMargin
+      }
+      ev.copy(code =
+        code"""
+        boolean ${ev.isNull} = true;
+        ${CodeGenerator.javaType(dataType)} ${ev.value} = ${CodeGenerator.defaultValue(dataType)};
+        $nullSafeEval
+      """)
+    } else {
+      ev.copy(code =
+        code"""
+            ${leftGen.code}
+            ${rightGen.code}
+            ${CodeGenerator.javaType(dataType)} ${ev.value} = ${CodeGenerator.defaultValue(dataType)};
+            $resultCode""", isNull = FalseLiteral)
+    }
+  }
+
+  override def prettyName: String = "array_prepend"
+
+  override protected def withNewChildrenInternal(
+      newLeft: Expression, newRight: Expression): ArrayPrepend =
+    copy(left = newLeft, right = newRight)
+
+  override def dataType: DataType = left.dataType

Review Comment:
   I guess there is a test case missing to capture this nullability case? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] LuciferYang commented on pull request #38947: [SPARK-41233][SQL][PYTHON] Add `array_prepend` function

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.

LuciferYang commented on PR #38947:
URL: https://github.com/apache/spark/pull/38947#issuecomment-1471792565

   we need add a new rule to `CheckConnectJvmClientCompatibility`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] LuciferYang commented on a diff in pull request #38947: [SPARK-41231][SQL] Adds an array_prepend function to catalyst

Posted by GitBox <gi...@apache.org>.

LuciferYang commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1042026795


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,119 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage =
+    "_FUNC_(array, value) - Returns an array containing value as well as all elements from array. The new element is positioned at the beginning of the array.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), 4);
+       [4, 1, 2, 3]
+  """,
+  group = "array_funcs",
+  since = "3.4.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with NullIntolerant
+    with QueryErrorsBase {
+
+  override def nullSafeEval(arr: Any, value: Any): Any = {
+    val numberOfElements = arr.asInstanceOf[ArrayData].numElements()
+    if (numberOfElements + 1 > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val newArray = new Array[Any](numberOfElements + 1)
+    newArray(0) = value
+    var pos = 1
+    arr
+      .asInstanceOf[ArrayData]
+      .foreach(
+        right.dataType,
+        (i, v) => {
+          newArray(pos) = v
+          pos += 1
+        })
+    new GenericArrayData(newArray)
+  }
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    nullSafeCodeGen(
+      ctx,
+      ev,
+      (arr, value) => {
+        val newArraySize = ctx.freshName("newArraySize")
+        val newArray = ctx.freshName("newArray")
+        val i = ctx.freshName("i")
+        val pos = ctx.freshName("pos")
+        val allocation = CodeGenerator.createArrayData(
+          newArray,
+          right.dataType,
+          newArraySize,
+          s" $prettyName failed.")
+        val assignment =
+          CodeGenerator.createArrayAssignment(newArray, right.dataType, arr, pos, i, false)
+        val newElemAssignment =
+          CodeGenerator.setArrayElement(newArray, right.dataType, pos, value)
+        s"""
+           |int $pos = 0;
+           |int $newArraySize = $arr.numElements() + 1;
+           |$allocation
+           |$newElemAssignment
+           |$pos = $pos + 1;
+           |for (int $i = 0; $i < $arr.numElements(); $i ++) {
+           |  $assignment
+           |  $pos = $pos + 1;
+           |}
+           |${ev.value} = $newArray;
+           |""".stripMargin
+      })
+  }
+
+  override def prettyName: String = "array_prepend"
+  override protected def withNewChildrenInternal(
+                                                  newLeft: Expression,
+                                                  newRight: Expression): ArrayPrepend =
+    copy(left = newLeft, right = newRight)
+  override def dataType: DataType = left.dataType
+  override def checkInputDataTypes(): TypeCheckResult = {
+    (left.dataType, right.dataType) match {
+      case (_, NullType) | (NullType, _) =>
+        DataTypeMismatch(
+          errorSubClass = "NULL_TYPE",
+          messageParameters = Map("functionName" -> toSQLId(prettyName)))
+      case (l, _) if !ArrayType.acceptsType(l) =>
+        DataTypeMismatch(
+          errorSubClass = "UNEXPECTED_INPUT_TYPE",
+          messageParameters = Map(
+            "paramIndex" -> "1",
+            "requiredType" -> toSQLType(ArrayType),
+            "inputSql" -> toSQLExpr(left),
+            "inputType" -> toSQLType(left.dataType)))
+      case (ArrayType(e1, _), e2) if e1.sameType(e2) =>
+        TypeUtils.checkForOrderingExpr(e2, prettyName)

Review Comment:
   Why need  check `e2` can be ordered ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] cloud-fan commented on a diff in pull request #38947: [SPARK-41233][SQL][PYTHON] Add `array_prepend` function

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.

cloud-fan commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1133450986


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,151 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = """
+      _FUNC_(array, element) - Add the element at the beginning of the array passed as first
+      argument. Type of element should be similar to type of the elements of the array.
+      Null element is also prepended to the array. But if the array passed is NULL
+      output is NULL
+    """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array('b', 'd', 'c', 'a'), 'd');
+       ["d","b","d","c","a"]
+      > SELECT _FUNC_(array(1, 2, 3, null), null);
+       [null,1,2,3,null]
+      > SELECT _FUNC_(CAST(null as Array<Int>), 2);
+       NULL
+  """,
+  group = "array_funcs",
+  since = "3.5.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with ComplexTypeMergingExpression
+    with QueryErrorsBase {
+
+  override def nullable: Boolean = left.nullable
+
+  @transient protected lazy val elementType: DataType =
+    inputTypes.head.asInstanceOf[ArrayType].elementType
+
+  override def eval(input: InternalRow): Any = {
+    val value1 = left.eval(input)
+    if (value1 == null) {
+      null
+    } else {
+      val value2 = right.eval(input)
+      nullSafeEval(value1, value2)
+    }
+  }
+  override def nullSafeEval(arr: Any, elementData: Any): Any = {
+    val arrayData = arr.asInstanceOf[ArrayData]
+    val numberOfElements = arrayData.numElements() + 1
+    if (numberOfElements > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val finalData = new Array[Any](numberOfElements)
+    finalData.update(0, elementData)
+    arrayData.foreach(elementType, (i: Int, v: Any) => finalData.update(i + 1, v))
+    new GenericArrayData(finalData)
+  }
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val leftGen = left.genCode(ctx)
+    val rightGen = right.genCode(ctx)
+    val f = (arr: String, value: String) => {
+      val newArraySize = ctx.freshName("newArraySize")
+      val newArray = ctx.freshName("newArray")
+      val i = ctx.freshName("i")
+      val iPlus1 = s"$i+1"
+      val zero = "0"
+      val allocation = CodeGenerator.createArrayData(
+        newArray,
+        elementType,
+        newArraySize,

Review Comment:
   do we really need this variable? can we just pass `s"$arr.numElements() + 1"`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] cloud-fan commented on a diff in pull request #38947: [SPARK-41233][SQL][PYTHON] Add `array_prepend` function

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.

cloud-fan commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1133450131


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,151 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = """
+      _FUNC_(array, element) - Add the element at the beginning of the array passed as first
+      argument. Type of element should be similar to type of the elements of the array.

Review Comment:
   We don't need to document the type coercion behavior in each function. We have a dedicated doc to explain what "similar" type is: https://spark.apache.org/docs/latest/sql-ref-ansi-compliance.html#type-promotion-and-precedence



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] HyukjinKwon commented on pull request #38947: [SPARK-41233][SQL][PYTHON] Add `array_prepend` function

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.

HyukjinKwon commented on PR #38947:
URL: https://github.com/apache/spark/pull/38947#issuecomment-1471792165

   I reopened the PR. @navinvishy would you mind rebasing and syncing to the latest master branch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] LuciferYang commented on pull request #38947: [SPARK-41233][SQL][PYTHON] Add `array_prepend` function

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.

LuciferYang commented on PR #38947:
URL: https://github.com/apache/spark/pull/38947#issuecomment-1471794732

   @navinvishy like `ProblemFilters.exclude[Problem]("org.apache.spark.sql.functions.array_prepend"),` and you can run ` dev/connect-jvm-client-mima-check ` to check the result


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] navinvishy commented on a diff in pull request #38947: [SPARK-41233][SQL][PYTHON] Add `array_prepend` function

Posted by "navinvishy (via GitHub)" <gi...@apache.org>.

navinvishy commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1138152130


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,151 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = """
+      _FUNC_(array, element) - Add the element at the beginning of the array passed as first
+      argument. Type of element should be similar to type of the elements of the array.
+      Null element is also prepended to the array. But if the array passed is NULL
+      output is NULL
+    """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array('b', 'd', 'c', 'a'), 'd');
+       ["d","b","d","c","a"]
+      > SELECT _FUNC_(array(1, 2, 3, null), null);
+       [null,1,2,3,null]
+      > SELECT _FUNC_(CAST(null as Array<Int>), 2);
+       NULL
+  """,
+  group = "array_funcs",
+  since = "3.5.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with ComplexTypeMergingExpression
+    with QueryErrorsBase {
+
+  override def nullable: Boolean = left.nullable
+
+  @transient protected lazy val elementType: DataType =
+    inputTypes.head.asInstanceOf[ArrayType].elementType
+
+  override def eval(input: InternalRow): Any = {
+    val value1 = left.eval(input)
+    if (value1 == null) {
+      null
+    } else {
+      val value2 = right.eval(input)
+      nullSafeEval(value1, value2)
+    }
+  }
+  override def nullSafeEval(arr: Any, elementData: Any): Any = {
+    val arrayData = arr.asInstanceOf[ArrayData]
+    val numberOfElements = arrayData.numElements() + 1
+    if (numberOfElements > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val finalData = new Array[Any](numberOfElements)
+    finalData.update(0, elementData)
+    arrayData.foreach(elementType, (i: Int, v: Any) => finalData.update(i + 1, v))
+    new GenericArrayData(finalData)
+  }
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val leftGen = left.genCode(ctx)
+    val rightGen = right.genCode(ctx)
+    val f = (arr: String, value: String) => {
+      val newArraySize = ctx.freshName("newArraySize")
+      val newArray = ctx.freshName("newArray")
+      val i = ctx.freshName("i")
+      val iPlus1 = s"$i+1"
+      val zero = "0"
+      val allocation = CodeGenerator.createArrayData(
+        newArray,
+        elementType,
+        newArraySize,
+        s" $prettyName failed.")
+      val assignment =
+        CodeGenerator.createArrayAssignment(newArray, elementType, arr, iPlus1, i, false)
+      val newElemAssignment =
+        CodeGenerator.setArrayElement(newArray, elementType, zero, value, Some(rightGen.isNull))
+      s"""
+         |int $newArraySize = $arr.numElements() + 1;
+         |$allocation
+         |$newElemAssignment
+         |for (int $i = 0; $i < $arr.numElements(); $i ++) {
+         |  $assignment
+         |}
+         |${ev.value} = $newArray;
+         |""".stripMargin
+    }
+    val resultCode = f(leftGen.value, rightGen.value)
+    if(nullable) {
+      val nullSafeEval = leftGen.code + rightGen.code + ctx.nullSafeExec(nullable, leftGen.isNull) {
+        s"""
+           |${ev.isNull} = false;
+           |${resultCode}
+           |""".stripMargin
+      }
+      ev.copy(code =
+        code"""
+        boolean ${ev.isNull} = true;
+        ${CodeGenerator.javaType(dataType)} ${ev.value} = ${CodeGenerator.defaultValue(dataType)};
+        $nullSafeEval
+      """)

Review Comment:
   Changed this, thanks!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] navinvishy commented on a diff in pull request #38947: [SPARK-41233][SQL][PYTHON] Add `array_prepend` function

Posted by "navinvishy (via GitHub)" <gi...@apache.org>.

navinvishy commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1140881182


##########
sql/core/src/main/scala/org/apache/spark/sql/functions.scala:
##########
@@ -4043,6 +4043,16 @@ object functions {
   def array_compact(column: Column): Column = withExpr {
     ArrayCompact(column.expr)
   }
+    /**

Review Comment:
   @LuciferYang thanks. I've made the updates. There were some failures that seem to be unrelated to this change. I reran the workflow and seems to have succeeded this time.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] zhengruifeng commented on a diff in pull request #38947: [SPARK-41231][SQL] Adds an array_prepend function to catalyst

Posted by GitBox <gi...@apache.org>.

zhengruifeng commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1042143651


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,119 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage =
+    "_FUNC_(array, value) - Returns an array containing value as well as all elements from array. The new element is positioned at the beginning of the array.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), 4);
+       [4, 1, 2, 3]
+  """,
+  group = "array_funcs",
+  since = "3.4.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with NullIntolerant
+    with QueryErrorsBase {
+
+  override def nullSafeEval(arr: Any, value: Any): Any = {
+    val numberOfElements = arr.asInstanceOf[ArrayData].numElements()
+    if (numberOfElements + 1 > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val newArray = new Array[Any](numberOfElements + 1)
+    newArray(0) = value
+    var pos = 1
+    arr

Review Comment:
   I think we should use while loop since it's hot path



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,119 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage =
+    "_FUNC_(array, value) - Returns an array containing value as well as all elements from array. The new element is positioned at the beginning of the array.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), 4);
+       [4, 1, 2, 3]
+  """,
+  group = "array_funcs",
+  since = "3.4.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with NullIntolerant
+    with QueryErrorsBase {
+
+  override def nullSafeEval(arr: Any, value: Any): Any = {
+    val numberOfElements = arr.asInstanceOf[ArrayData].numElements()
+    if (numberOfElements + 1 > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val newArray = new Array[Any](numberOfElements + 1)
+    newArray(0) = value
+    var pos = 1
+    arr
+      .asInstanceOf[ArrayData]
+      .foreach(
+        right.dataType,
+        (i, v) => {
+          newArray(pos) = v
+          pos += 1
+        })
+    new GenericArrayData(newArray)
+  }
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    nullSafeCodeGen(
+      ctx,
+      ev,
+      (arr, value) => {
+        val newArraySize = ctx.freshName("newArraySize")
+        val newArray = ctx.freshName("newArray")
+        val i = ctx.freshName("i")
+        val pos = ctx.freshName("pos")
+        val allocation = CodeGenerator.createArrayData(
+          newArray,
+          right.dataType,
+          newArraySize,
+          s" $prettyName failed.")
+        val assignment =
+          CodeGenerator.createArrayAssignment(newArray, right.dataType, arr, pos, i, false)
+        val newElemAssignment =
+          CodeGenerator.setArrayElement(newArray, right.dataType, pos, value)
+        s"""
+           |int $pos = 0;
+           |int $newArraySize = $arr.numElements() + 1;
+           |$allocation
+           |$newElemAssignment
+           |$pos = $pos + 1;
+           |for (int $i = 0; $i < $arr.numElements(); $i ++) {
+           |  $assignment
+           |  $pos = $pos + 1;
+           |}
+           |${ev.value} = $newArray;
+           |""".stripMargin
+      })
+  }
+
+  override def prettyName: String = "array_prepend"
+  override protected def withNewChildrenInternal(

Review Comment:
   indent



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,119 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage =
+    "_FUNC_(array, value) - Returns an array containing value as well as all elements from array. The new element is positioned at the beginning of the array.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), 4);
+       [4, 1, 2, 3]
+  """,
+  group = "array_funcs",
+  since = "3.4.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with NullIntolerant
+    with QueryErrorsBase {
+
+  override def nullSafeEval(arr: Any, value: Any): Any = {
+    val numberOfElements = arr.asInstanceOf[ArrayData].numElements()
+    if (numberOfElements + 1 > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val newArray = new Array[Any](numberOfElements + 1)
+    newArray(0) = value
+    var pos = 1
+    arr
+      .asInstanceOf[ArrayData]
+      .foreach(
+        right.dataType,
+        (i, v) => {
+          newArray(pos) = v
+          pos += 1
+        })
+    new GenericArrayData(newArray)
+  }
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    nullSafeCodeGen(
+      ctx,
+      ev,
+      (arr, value) => {
+        val newArraySize = ctx.freshName("newArraySize")
+        val newArray = ctx.freshName("newArray")
+        val i = ctx.freshName("i")
+        val pos = ctx.freshName("pos")
+        val allocation = CodeGenerator.createArrayData(
+          newArray,
+          right.dataType,
+          newArraySize,
+          s" $prettyName failed.")
+        val assignment =
+          CodeGenerator.createArrayAssignment(newArray, right.dataType, arr, pos, i, false)
+        val newElemAssignment =
+          CodeGenerator.setArrayElement(newArray, right.dataType, pos, value)
+        s"""
+           |int $pos = 0;
+           |int $newArraySize = $arr.numElements() + 1;
+           |$allocation
+           |$newElemAssignment
+           |$pos = $pos + 1;
+           |for (int $i = 0; $i < $arr.numElements(); $i ++) {
+           |  $assignment
+           |  $pos = $pos + 1;
+           |}
+           |${ev.value} = $newArray;
+           |""".stripMargin
+      })
+  }
+
+  override def prettyName: String = "array_prepend"
+  override protected def withNewChildrenInternal(
+                                                  newLeft: Expression,
+                                                  newRight: Expression): ArrayPrepend =
+    copy(left = newLeft, right = newRight)
+  override def dataType: DataType = left.dataType
+  override def checkInputDataTypes(): TypeCheckResult = {
+    (left.dataType, right.dataType) match {
+      case (_, NullType) | (NullType, _) =>
+        DataTypeMismatch(
+          errorSubClass = "NULL_TYPE",
+          messageParameters = Map("functionName" -> toSQLId(prettyName)))
+      case (l, _) if !ArrayType.acceptsType(l) =>
+        DataTypeMismatch(
+          errorSubClass = "UNEXPECTED_INPUT_TYPE",
+          messageParameters = Map(
+            "paramIndex" -> "1",
+            "requiredType" -> toSQLType(ArrayType),
+            "inputSql" -> toSQLExpr(left),
+            "inputType" -> toSQLType(left.dataType)))
+      case (ArrayType(e1, _), e2) if e1.sameType(e2) =>
+        TypeUtils.checkForOrderingExpr(e2, prettyName)
+      case _ =>
+        DataTypeMismatch(
+          errorSubClass = "ARRAY_FUNCTION_DIFF_TYPES",
+          messageParameters = Map(
+            "functionName" -> toSQLId(prettyName),
+            "dataType" -> toSQLType(ArrayType),
+            "leftType" -> toSQLType(left.dataType),
+            "rightType" -> toSQLType(right.dataType)))
+    }
+  }
+  override def inputTypes: Seq[AbstractDataType] = {
+    (left.dataType, right.dataType) match {
+      case (_, NullType) => Seq.empty

Review Comment:
   why need to deal with `NullType` here?



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,119 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage =
+    "_FUNC_(array, value) - Returns an array containing value as well as all elements from array. The new element is positioned at the beginning of the array.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), 4);
+       [4, 1, 2, 3]
+  """,
+  group = "array_funcs",
+  since = "3.4.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with NullIntolerant
+    with QueryErrorsBase {
+
+  override def nullSafeEval(arr: Any, value: Any): Any = {
+    val numberOfElements = arr.asInstanceOf[ArrayData].numElements()
+    if (numberOfElements + 1 > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val newArray = new Array[Any](numberOfElements + 1)
+    newArray(0) = value
+    var pos = 1
+    arr
+      .asInstanceOf[ArrayData]
+      .foreach(
+        right.dataType,
+        (i, v) => {
+          newArray(pos) = v
+          pos += 1
+        })
+    new GenericArrayData(newArray)
+  }
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    nullSafeCodeGen(
+      ctx,
+      ev,
+      (arr, value) => {
+        val newArraySize = ctx.freshName("newArraySize")
+        val newArray = ctx.freshName("newArray")
+        val i = ctx.freshName("i")
+        val pos = ctx.freshName("pos")
+        val allocation = CodeGenerator.createArrayData(
+          newArray,
+          right.dataType,
+          newArraySize,
+          s" $prettyName failed.")
+        val assignment =
+          CodeGenerator.createArrayAssignment(newArray, right.dataType, arr, pos, i, false)
+        val newElemAssignment =
+          CodeGenerator.setArrayElement(newArray, right.dataType, pos, value)
+        s"""
+           |int $pos = 0;
+           |int $newArraySize = $arr.numElements() + 1;
+           |$allocation
+           |$newElemAssignment
+           |$pos = $pos + 1;
+           |for (int $i = 0; $i < $arr.numElements(); $i ++) {
+           |  $assignment
+           |  $pos = $pos + 1;
+           |}
+           |${ev.value} = $newArray;
+           |""".stripMargin
+      })
+  }
+
+  override def prettyName: String = "array_prepend"
+  override protected def withNewChildrenInternal(
+                                                  newLeft: Expression,
+                                                  newRight: Expression): ArrayPrepend =
+    copy(left = newLeft, right = newRight)
+  override def dataType: DataType = left.dataType
+  override def checkInputDataTypes(): TypeCheckResult = {

Review Comment:
   I think we can refer to `ArrayRemove#checkInputDataTypes` here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38947: SPARK-41231: Adds an array_prepend function to catalyst

Posted by GitBox <gi...@apache.org>.

HyukjinKwon commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1041690900


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -119,21 +117,24 @@ case class Size(child: Expression, legacySizeOfNull: Boolean)
     val value = child.eval(input)
     if (value == null) {
       if (legacySizeOfNull) -1 else null
-    } else child.dataType match {
-      case _: ArrayType => value.asInstanceOf[ArrayData].numElements()
-      case _: MapType => value.asInstanceOf[MapData].numElements()
-      case other => throw QueryExecutionErrors.unsupportedOperandTypeForSizeFunctionError(other)
-    }
+    } else

Review Comment:
   Can we remove unrelated style changes?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] amaliujia commented on a diff in pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

Posted by "amaliujia (via GitHub)" <gi...@apache.org>.

amaliujia commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1107724565


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,149 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage =
+    "_FUNC_(array, value) - Returns an array containing value as well as all elements from array. The new element is positioned at the beginning of the array.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), 4);
+       [4, 1, 2, 3]
+  """,
+  group = "array_funcs",
+  since = "3.4.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with ComplexTypeMergingExpression
+    with QueryErrorsBase {
+
+  override def nullable: Boolean = left.nullable
+
+  override def eval(input: InternalRow): Any = {
+    val value1 = left.eval(input)
+    if (value1 == null) {
+      null
+    } else {
+      val value2 = right.eval(input)
+      nullSafeEval(value1, value2)
+    }
+  }
+  override def nullSafeEval(arr: Any, value: Any): Any = {
+    val numberOfElements = arr.asInstanceOf[ArrayData].numElements()
+    if (numberOfElements + 1 > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val newArray = new Array[Any](numberOfElements + 1)
+    newArray(0) = value
+    var pos = 1
+    arr
+      .asInstanceOf[ArrayData]
+      .foreach(
+        right.dataType,
+        (i, v) => {
+          newArray(pos) = v
+          pos += 1
+        })
+    new GenericArrayData(newArray)
+  }
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val leftGen = left.genCode(ctx)
+    val rightGen = right.genCode(ctx)
+    val f = (arr: String, value: String) => {
+      val newArraySize = ctx.freshName("newArraySize")
+      val newArray = ctx.freshName("newArray")
+      val i = ctx.freshName("i")
+      val pos = ctx.freshName("pos")
+      val allocation = CodeGenerator.createArrayData(
+        newArray,
+        right.dataType,
+        newArraySize,
+        s" $prettyName failed.")
+      val assignment =
+        CodeGenerator.createArrayAssignment(newArray, right.dataType, arr, pos, i, false)
+      val newElemAssignment =
+        CodeGenerator.setArrayElement(newArray, right.dataType, pos, value, Some(rightGen.isNull))
+      s"""
+         |int $pos = 0;
+         |int $newArraySize = $arr.numElements() + 1;
+         |$allocation
+         |$newElemAssignment
+         |$pos = $pos + 1;
+         |for (int $i = 0; $i < $arr.numElements(); $i ++) {
+         |  $assignment
+         |  $pos = $pos + 1;
+         |}
+         |${ev.value} = $newArray;
+         |""".stripMargin
+    }
+    val resultCode = f(leftGen.value, rightGen.value)
+    if(nullable) {
+      val nullSafeEval = leftGen.code + rightGen.code + ctx.nullSafeExec(nullable, leftGen.isNull) {
+        s"""
+           |${ev.isNull} = false;
+           |${resultCode}
+           |""".stripMargin
+      }
+      ev.copy(code =
+        code"""
+        boolean ${ev.isNull} = true;
+        ${CodeGenerator.javaType(dataType)} ${ev.value} = ${CodeGenerator.defaultValue(dataType)};
+        $nullSafeEval
+      """)
+    } else {
+      ev.copy(code =
+        code"""
+            ${leftGen.code}
+            ${rightGen.code}
+            ${CodeGenerator.javaType(dataType)} ${ev.value} = ${CodeGenerator.defaultValue(dataType)};
+            $resultCode""", isNull = FalseLiteral)
+    }
+  }
+
+  override def prettyName: String = "array_prepend"
+
+  override protected def withNewChildrenInternal(
+      newLeft: Expression, newRight: Expression): ArrayPrepend =
+    copy(left = newLeft, right = newRight)
+
+  override def dataType: DataType = left.dataType

Review Comment:
   I guess there is a test case missing to capture this nullability case?  Something like left array is not nullable but right parameter is a null?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] navinvishy commented on pull request #38947: [SPARK-41233][SQL][PYTHON] Add `array_prepend` function

Posted by "navinvishy (via GitHub)" <gi...@apache.org>.

navinvishy commented on PR #38947:
URL: https://github.com/apache/spark/pull/38947#issuecomment-1462275861

   @dongjoon-hyun @beliefer @LuciferYang @cloud-fan @HyukjinKwon gentle ping to please take a look. Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] HyukjinKwon commented on pull request #38947: [SPARK-41233][SQL][PYTHON] Add `array_prepend` function

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.

HyukjinKwon commented on PR #38947:
URL: https://github.com/apache/spark/pull/38947#issuecomment-1471790173

   Seems like this broke the mima check for Spark connect. I am reverting this for now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] zhengruifeng commented on pull request #38947: [SPARK-41233][SQL][PYTHON] Add `array_prepend` function

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.

zhengruifeng commented on PR #38947:
URL: https://github.com/apache/spark/pull/38947#issuecomment-1475463815

   merged to master again 😄 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] navinvishy commented on pull request #38947: [SPARK-41233][SQL] Adds an array_prepend function to catalyst

Posted by GitBox <gi...@apache.org>.

navinvishy commented on PR #38947:
URL: https://github.com/apache/spark/pull/38947#issuecomment-1341514611

   > The jira number should be [SPARK-41233](https://issues.apache.org/jira/browse/SPARK-41233)
   
   Changed. Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] navinvishy commented on a diff in pull request #38947: SPARK-41231: Adds an array_prepend function to catalyst

Posted by GitBox <gi...@apache.org>.

navinvishy commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1041588232


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala:
##########
@@ -692,6 +696,7 @@ object FunctionRegistry {
     expression[Sequence]("sequence"),
     expression[ArrayRepeat]("array_repeat"),
     expression[ArrayRemove]("array_remove"),
+    expression[ArrayPrepend]("array_prepend"),

Review Comment:
   Relevant diff here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] zhengruifeng commented on a diff in pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

Posted by GitBox <gi...@apache.org>.

zhengruifeng commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1057148245


##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala:
##########
@@ -1840,6 +1840,47 @@ class CollectionExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper
     checkEvaluation(ArrayRepeat(Literal("hi"), Literal(null, IntegerType)), null)
   }
 
+  test("SPARK-41233: ArrayPrepend") {
+    val a0 = Literal.create(Seq(1, 2, 3, 4), ArrayType(IntegerType))
+    val a1 = Literal.create(Seq("a", "b", "c"), ArrayType(StringType))
+    val a2 = Literal.create(Seq.empty[Integer], ArrayType(IntegerType))
+    val a3 = Literal.create(null, ArrayType(StringType))
+
+    checkEvaluation(ArrayPrepend(a0, Literal(0)), Seq(0, 1, 2, 3, 4))
+    checkEvaluation(ArrayPrepend(a1, Literal("a")), Seq("a", "a", "b", "c"))
+    checkEvaluation(ArrayPrepend(a2, Literal(1)), Seq(1))
+    checkEvaluation(ArrayPrepend(a2, Literal(null, IntegerType)), null)

Review Comment:
   sorry for the late reply, we have a update on the null handling, that is:
   1, if the array is NULL, returns NULL;
   2, if the array is NOT NULL, the element is NULL, prepend the NULL value into the array.
   
   would you mind updating this behavior? thanks



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] navinvishy commented on a diff in pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

Posted by "navinvishy (via GitHub)" <gi...@apache.org>.

navinvishy commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1119413659


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,119 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage =
+    "_FUNC_(array, value) - Returns an array containing value as well as all elements from array. The new element is positioned at the beginning of the array.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), 4);
+       [4, 1, 2, 3]
+  """,
+  group = "array_funcs",
+  since = "3.4.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with NullIntolerant
+    with QueryErrorsBase {
+
+  override def nullSafeEval(arr: Any, value: Any): Any = {
+    val numberOfElements = arr.asInstanceOf[ArrayData].numElements()
+    if (numberOfElements + 1 > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val newArray = new Array[Any](numberOfElements + 1)
+    newArray(0) = value
+    var pos = 1
+    arr
+      .asInstanceOf[ArrayData]
+      .foreach(
+        right.dataType,
+        (i, v) => {
+          newArray(pos) = v
+          pos += 1
+        })
+    new GenericArrayData(newArray)
+  }
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    nullSafeCodeGen(
+      ctx,
+      ev,
+      (arr, value) => {
+        val newArraySize = ctx.freshName("newArraySize")
+        val newArray = ctx.freshName("newArray")
+        val i = ctx.freshName("i")
+        val pos = ctx.freshName("pos")
+        val allocation = CodeGenerator.createArrayData(
+          newArray,
+          right.dataType,
+          newArraySize,
+          s" $prettyName failed.")
+        val assignment =
+          CodeGenerator.createArrayAssignment(newArray, right.dataType, arr, pos, i, false)
+        val newElemAssignment =
+          CodeGenerator.setArrayElement(newArray, right.dataType, pos, value)
+        s"""
+           |int $pos = 0;
+           |int $newArraySize = $arr.numElements() + 1;
+           |$allocation
+           |$newElemAssignment
+           |$pos = $pos + 1;
+           |for (int $i = 0; $i < $arr.numElements(); $i ++) {
+           |  $assignment
+           |  $pos = $pos + 1;
+           |}
+           |${ev.value} = $newArray;
+           |""".stripMargin
+      })
+  }
+
+  override def prettyName: String = "array_prepend"
+  override protected def withNewChildrenInternal(
+                                                  newLeft: Expression,
+                                                  newRight: Expression): ArrayPrepend =
+    copy(left = newLeft, right = newRight)
+  override def dataType: DataType = left.dataType
+  override def checkInputDataTypes(): TypeCheckResult = {
+    (left.dataType, right.dataType) match {
+      case (_, NullType) | (NullType, _) =>
+        DataTypeMismatch(
+          errorSubClass = "NULL_TYPE",
+          messageParameters = Map("functionName" -> toSQLId(prettyName)))
+      case (l, _) if !ArrayType.acceptsType(l) =>
+        DataTypeMismatch(
+          errorSubClass = "UNEXPECTED_INPUT_TYPE",
+          messageParameters = Map(
+            "paramIndex" -> "1",
+            "requiredType" -> toSQLType(ArrayType),
+            "inputSql" -> toSQLExpr(left),
+            "inputType" -> toSQLType(left.dataType)))
+      case (ArrayType(e1, _), e2) if e1.sameType(e2) =>
+        TypeUtils.checkForOrderingExpr(e2, prettyName)
+      case _ =>
+        DataTypeMismatch(
+          errorSubClass = "ARRAY_FUNCTION_DIFF_TYPES",
+          messageParameters = Map(
+            "functionName" -> toSQLId(prettyName),
+            "dataType" -> toSQLType(ArrayType),
+            "leftType" -> toSQLType(left.dataType),
+            "rightType" -> toSQLType(right.dataType)))
+    }
+  }
+  override def inputTypes: Seq[AbstractDataType] = {
+    (left.dataType, right.dataType) match {
+      case (_, NullType) => Seq.empty

Review Comment:
   This matches what is in `ArrayAppend` now.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] LuciferYang commented on a diff in pull request #38947: [SPARK-41233][SQL][PYTHON] Add `array_prepend` function

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.

LuciferYang commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1140918554


##########
sql/core/src/main/scala/org/apache/spark/sql/functions.scala:
##########
@@ -4043,6 +4043,16 @@ object functions {
   def array_compact(column: Column): Column = withExpr {
     ArrayCompact(column.expr)
   }
+    /**

Review Comment:
   All Passed now :)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] zhengruifeng closed pull request #38947: [SPARK-41233][SQL][PYTHON] Add `array_prepend` function

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.

zhengruifeng closed pull request #38947: [SPARK-41233][SQL][PYTHON] Add `array_prepend` function
URL: https://github.com/apache/spark/pull/38947


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] navinvishy commented on pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

Posted by "navinvishy (via GitHub)" <gi...@apache.org>.

navinvishy commented on PR #38947:
URL: https://github.com/apache/spark/pull/38947#issuecomment-1407297181

   > > Could we wait array_append merged then this PR could follow some convention and make a better abstract ? @zhengruifeng @LuciferYang @HyukjinKwon
   > 
   > If it is possible, I think we should wait
   
   Looks like array_append has already been merged? https://github.com/apache/spark/pull/38865


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] zhengruifeng commented on pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.

zhengruifeng commented on PR #38947:
URL: https://github.com/apache/spark/pull/38947#issuecomment-1409581323

   @navinvishy mind fix the scala linter failure?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

Posted by GitBox <gi...@apache.org>.

dongjoon-hyun commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1045166213


##########
sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala:
##########
@@ -2651,6 +2651,58 @@ class DataFrameFunctionsSuite extends QueryTest with SharedSparkSession {
     )
   }
 
+  test("array prepend") {

Review Comment:
   ```scala
   - test("array prepend") {
   + test("SPARK-41233: array prepend") {
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

Posted by GitBox <gi...@apache.org>.

dongjoon-hyun commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1045166142


##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala:
##########
@@ -1840,6 +1840,47 @@ class CollectionExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper
     checkEvaluation(ArrayRepeat(Literal("hi"), Literal(null, IntegerType)), null)
   }
 
+  test("Array prepend") {

Review Comment:
   ```scala
   - test("Array prepend") {
   + test("SPARK-41233: ArrayPrepend") {
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] navinvishy commented on a diff in pull request #38947: [SPARK-41231][SQL] Adds an array_prepend function to catalyst

Posted by GitBox <gi...@apache.org>.

navinvishy commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1042618665


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,119 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage =
+    "_FUNC_(array, value) - Returns an array containing value as well as all elements from array. The new element is positioned at the beginning of the array.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), 4);
+       [4, 1, 2, 3]
+  """,
+  group = "array_funcs",
+  since = "3.4.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with NullIntolerant
+    with QueryErrorsBase {
+
+  override def nullSafeEval(arr: Any, value: Any): Any = {
+    val numberOfElements = arr.asInstanceOf[ArrayData].numElements()
+    if (numberOfElements + 1 > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val newArray = new Array[Any](numberOfElements + 1)
+    newArray(0) = value
+    var pos = 1
+    arr
+      .asInstanceOf[ArrayData]
+      .foreach(
+        right.dataType,
+        (i, v) => {
+          newArray(pos) = v
+          pos += 1
+        })
+    new GenericArrayData(newArray)
+  }
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    nullSafeCodeGen(
+      ctx,
+      ev,
+      (arr, value) => {
+        val newArraySize = ctx.freshName("newArraySize")
+        val newArray = ctx.freshName("newArray")
+        val i = ctx.freshName("i")
+        val pos = ctx.freshName("pos")
+        val allocation = CodeGenerator.createArrayData(
+          newArray,
+          right.dataType,
+          newArraySize,
+          s" $prettyName failed.")
+        val assignment =
+          CodeGenerator.createArrayAssignment(newArray, right.dataType, arr, pos, i, false)
+        val newElemAssignment =
+          CodeGenerator.setArrayElement(newArray, right.dataType, pos, value)
+        s"""
+           |int $pos = 0;
+           |int $newArraySize = $arr.numElements() + 1;
+           |$allocation
+           |$newElemAssignment
+           |$pos = $pos + 1;
+           |for (int $i = 0; $i < $arr.numElements(); $i ++) {
+           |  $assignment
+           |  $pos = $pos + 1;
+           |}
+           |${ev.value} = $newArray;
+           |""".stripMargin
+      })
+  }
+
+  override def prettyName: String = "array_prepend"
+  override protected def withNewChildrenInternal(
+                                                  newLeft: Expression,
+                                                  newRight: Expression): ArrayPrepend =
+    copy(left = newLeft, right = newRight)
+  override def dataType: DataType = left.dataType
+  override def checkInputDataTypes(): TypeCheckResult = {
+    (left.dataType, right.dataType) match {
+      case (_, NullType) | (NullType, _) =>
+        DataTypeMismatch(
+          errorSubClass = "NULL_TYPE",
+          messageParameters = Map("functionName" -> toSQLId(prettyName)))
+      case (l, _) if !ArrayType.acceptsType(l) =>
+        DataTypeMismatch(
+          errorSubClass = "UNEXPECTED_INPUT_TYPE",
+          messageParameters = Map(
+            "paramIndex" -> "1",
+            "requiredType" -> toSQLType(ArrayType),
+            "inputSql" -> toSQLExpr(left),
+            "inputType" -> toSQLType(left.dataType)))
+      case (ArrayType(e1, _), e2) if e1.sameType(e2) =>
+        TypeUtils.checkForOrderingExpr(e2, prettyName)
+      case _ =>
+        DataTypeMismatch(
+          errorSubClass = "ARRAY_FUNCTION_DIFF_TYPES",
+          messageParameters = Map(
+            "functionName" -> toSQLId(prettyName),
+            "dataType" -> toSQLType(ArrayType),
+            "leftType" -> toSQLType(left.dataType),
+            "rightType" -> toSQLType(right.dataType)))
+    }
+  }
+  override def inputTypes: Seq[AbstractDataType] = {
+    (left.dataType, right.dataType) match {
+      case (_, NullType) => Seq.empty

Review Comment:
   I thought this is good since we're declaring this to be `NullIntolerant`. I was also following what `ArrayContains` does.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] navinvishy commented on a diff in pull request #38947: SPARK-41231: Adds an array_prepend function to catalyst

Posted by GitBox <gi...@apache.org>.

navinvishy commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1041587895


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1376,35 +1418,148 @@ case class ArrayContains(left: Expression, right: Expression)
            |   break;
            |}
          """.stripMargin
-      } else {
-        s"""
+        } else {
+          s"""
            |if (${ctx.genEqual(right.dataType, value, getValue)}) {
            |  ${ev.value} = true;
            |  break;
            |}
          """.stripMargin
-      }
-      s"""
+        }
+        s"""
          |for (int $i = 0; $i < $arr.numElements(); $i ++) {
          |  $loopBodyCode
          |}
        """.stripMargin
-    })
+      })
   }
 
   override def prettyName: String = "array_contains"
 
   override protected def withNewChildrenInternal(
-      newLeft: Expression, newRight: Expression): ArrayContains =
+      newLeft: Expression,
+      newRight: Expression): ArrayContains =
     copy(left = newLeft, right = newRight)
 }
+@ExpressionDescription(

Review Comment:
   Relevant diff here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] navinvishy commented on a diff in pull request #38947: SPARK-41231: Adds an array_prepend function to catalyst

Posted by GitBox <gi...@apache.org>.

navinvishy commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1041588680


##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala:
##########
@@ -1840,6 +1939,47 @@ class CollectionExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper
     checkEvaluation(ArrayRepeat(Literal("hi"), Literal(null, IntegerType)), null)
   }
 
+  test("Array prepend") {

Review Comment:
   Relevant diff here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] zhengruifeng commented on a diff in pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.

zhengruifeng commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1119610173


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,145 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage =

Review Comment:
   please also document the null handling like `ArrayApend`:
   
   ```
    Type of element should be similar to type of the elements of the array.
         Null element is also appended into the array. But if the array passed, is NULL
         output is NULL
   ```
   
   examples
   ```
         > SELECT _FUNC_(array(1, 2, 3, null), null);
          [null,1,2,3,null]
         > SELECT _FUNC_(CAST(null as Array<Int>), 2);
          NULL
   ```
   



##########
python/pyspark/sql/functions.py:
##########
@@ -7619,6 +7619,36 @@ def get(col: "ColumnOrName", index: Union["ColumnOrName", int]) -> Column:
     return _invoke_function_over_columns("get", col, index)
 
 
+@try_remote_functions
+def array_prepend(col: "ColumnOrName", element: Any) -> Column:
+    """
+    Collection function: Returns an array containing element as
+    well as all elements from array. The new element is positioned
+    at the beginning of the array.
+
+    .. versionadded:: 3.4.0
+
+    Parameters
+    ----------
+    col : :class:`~pyspark.sql.Column` or str
+        name of column containing array
+    element :
+        element to be prepended to the array
+
+    Returns
+    -------
+    :class:`~pyspark.sql.Column`
+        an array excluding given value.
+
+    Examples
+    --------
+    >>> df = spark.createDataFrame([([2, 3, 4],), ([],)], ['data'])
+    >>> df.select(array_prepend(df.data, 1)).collect()
+    [Row(array_prepend(data, 1)=[1, 2, 3, 4]), Row(array_prepend(data, 1)=[1])]
+    """
+    return _invoke_function("array_prepend", _to_java_column(col), element)

Review Comment:
   ```suggestion
       return _invoke_function_over_columns("array_prepend", col, lit(value))
   ```



##########
python/pyspark/sql/functions.py:
##########
@@ -7619,6 +7619,36 @@ def get(col: "ColumnOrName", index: Union["ColumnOrName", int]) -> Column:
     return _invoke_function_over_columns("get", col, index)
 
 
+@try_remote_functions
+def array_prepend(col: "ColumnOrName", element: Any) -> Column:
+    """
+    Collection function: Returns an array containing element as
+    well as all elements from array. The new element is positioned
+    at the beginning of the array.
+
+    .. versionadded:: 3.4.0
+
+    Parameters
+    ----------
+    col : :class:`~pyspark.sql.Column` or str
+        name of column containing array
+    element :
+        element to be prepended to the array

Review Comment:
   ```suggestion
       value :
           a literal value, or a :class:`~pyspark.sql.Column` expression.
   ```



##########
python/pyspark/sql/functions.py:
##########
@@ -7619,6 +7619,36 @@ def get(col: "ColumnOrName", index: Union["ColumnOrName", int]) -> Column:
     return _invoke_function_over_columns("get", col, index)
 
 
+@try_remote_functions
+def array_prepend(col: "ColumnOrName", element: Any) -> Column:

Review Comment:
   ```suggestion
   def array_prepend(col: "ColumnOrName", value: Any) -> Column:
   ```
   
   to be consistent with `array_append`



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,145 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage =
+    "_FUNC_(array, value) - Returns an array containing value as well as all elements from array. The new element is positioned at the beginning of the array.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array('b', 'd', 'c', 'a'), 'd');
+       ["d","b","d","c","a"]
+  """,
+  group = "array_funcs",
+  since = "3.4.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with ComplexTypeMergingExpression
+    with QueryErrorsBase {
+
+  override def nullable: Boolean = left.nullable
+
+  @transient protected lazy val elementType: DataType =
+    inputTypes.head.asInstanceOf[ArrayType].elementType
+
+  override def eval(input: InternalRow): Any = {
+    val value1 = left.eval(input)
+    if (value1 == null) {
+      null
+    } else {
+      val value2 = right.eval(input)
+      nullSafeEval(value1, value2)
+    }
+  }
+  override def nullSafeEval(arr: Any, elementData: Any): Any = {
+    val arrayData = arr.asInstanceOf[ArrayData]
+    val numberOfElements = arrayData.numElements() + 1
+    if (numberOfElements > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val finalData = new Array[Any](numberOfElements)
+    finalData.update(0, elementData)
+    arrayData.foreach(elementType, (i: Int, v: Any) => finalData.update(i + 1, v))
+    new GenericArrayData(finalData)
+  }
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val leftGen = left.genCode(ctx)
+    val rightGen = right.genCode(ctx)
+    val f = (arr: String, value: String) => {
+      val newArraySize = ctx.freshName("newArraySize")
+      val newArray = ctx.freshName("newArray")
+      val i = ctx.freshName("i")
+      val pos = ctx.freshName("pos")

Review Comment:
   I guess we only need one of var `i` and `pos`? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] LuciferYang commented on a diff in pull request #38947: [SPARK-41233][SQL][PYTHON] Add `array_prepend` function

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.

LuciferYang commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1139918574


##########
sql/core/src/main/scala/org/apache/spark/sql/functions.scala:
##########
@@ -4043,6 +4043,16 @@ object functions {
   def array_compact(column: Column): Column = withExpr {
     ArrayCompact(column.expr)
   }
+    /**

Review Comment:
   nit: The indentation of comments is different from others



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] zhengruifeng commented on a diff in pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.

zhengruifeng commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1105245346


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,149 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage =
+    "_FUNC_(array, value) - Returns an array containing value as well as all elements from array. The new element is positioned at the beginning of the array.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), 4);
+       [4, 1, 2, 3]
+  """,
+  group = "array_funcs",
+  since = "3.4.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with ComplexTypeMergingExpression
+    with QueryErrorsBase {
+
+  override def nullable: Boolean = left.nullable
+
+  override def eval(input: InternalRow): Any = {
+    val value1 = left.eval(input)
+    if (value1 == null) {
+      null
+    } else {
+      val value2 = right.eval(input)
+      nullSafeEval(value1, value2)
+    }
+  }
+  override def nullSafeEval(arr: Any, value: Any): Any = {
+    val numberOfElements = arr.asInstanceOf[ArrayData].numElements()
+    if (numberOfElements + 1 > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val newArray = new Array[Any](numberOfElements + 1)
+    newArray(0) = value
+    var pos = 1
+    arr
+      .asInstanceOf[ArrayData]
+      .foreach(
+        right.dataType,
+        (i, v) => {
+          newArray(pos) = v
+          pos += 1
+        })
+    new GenericArrayData(newArray)
+  }
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val leftGen = left.genCode(ctx)
+    val rightGen = right.genCode(ctx)
+    val f = (arr: String, value: String) => {
+      val newArraySize = ctx.freshName("newArraySize")
+      val newArray = ctx.freshName("newArray")
+      val i = ctx.freshName("i")
+      val pos = ctx.freshName("pos")
+      val allocation = CodeGenerator.createArrayData(
+        newArray,
+        right.dataType,
+        newArraySize,
+        s" $prettyName failed.")
+      val assignment =
+        CodeGenerator.createArrayAssignment(newArray, right.dataType, arr, pos, i, false)
+      val newElemAssignment =
+        CodeGenerator.setArrayElement(newArray, right.dataType, pos, value, Some(rightGen.isNull))

Review Comment:
   I think we should not use `right.dataType` in `assignment ` and `newElemAssignment `.
   
   may use the
   ```
     @transient protected lazy val elementType: DataType =
       inputTypes.head.asInstanceOf[ArrayType].elementType
   ```



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,149 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage =
+    "_FUNC_(array, value) - Returns an array containing value as well as all elements from array. The new element is positioned at the beginning of the array.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), 4);
+       [4, 1, 2, 3]
+  """,
+  group = "array_funcs",
+  since = "3.4.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with ComplexTypeMergingExpression
+    with QueryErrorsBase {
+
+  override def nullable: Boolean = left.nullable
+
+  override def eval(input: InternalRow): Any = {
+    val value1 = left.eval(input)
+    if (value1 == null) {
+      null
+    } else {
+      val value2 = right.eval(input)
+      nullSafeEval(value1, value2)
+    }
+  }
+  override def nullSafeEval(arr: Any, value: Any): Any = {
+    val numberOfElements = arr.asInstanceOf[ArrayData].numElements()
+    if (numberOfElements + 1 > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val newArray = new Array[Any](numberOfElements + 1)
+    newArray(0) = value
+    var pos = 1
+    arr
+      .asInstanceOf[ArrayData]
+      .foreach(
+        right.dataType,
+        (i, v) => {
+          newArray(pos) = v
+          pos += 1
+        })
+    new GenericArrayData(newArray)
+  }
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val leftGen = left.genCode(ctx)
+    val rightGen = right.genCode(ctx)
+    val f = (arr: String, value: String) => {
+      val newArraySize = ctx.freshName("newArraySize")
+      val newArray = ctx.freshName("newArray")
+      val i = ctx.freshName("i")
+      val pos = ctx.freshName("pos")
+      val allocation = CodeGenerator.createArrayData(
+        newArray,
+        right.dataType,
+        newArraySize,
+        s" $prettyName failed.")
+      val assignment =
+        CodeGenerator.createArrayAssignment(newArray, right.dataType, arr, pos, i, false)
+      val newElemAssignment =
+        CodeGenerator.setArrayElement(newArray, right.dataType, pos, value, Some(rightGen.isNull))
+      s"""
+         |int $pos = 0;
+         |int $newArraySize = $arr.numElements() + 1;
+         |$allocation
+         |$newElemAssignment
+         |$pos = $pos + 1;
+         |for (int $i = 0; $i < $arr.numElements(); $i ++) {
+         |  $assignment
+         |  $pos = $pos + 1;
+         |}
+         |${ev.value} = $newArray;
+         |""".stripMargin
+    }
+    val resultCode = f(leftGen.value, rightGen.value)
+    if(nullable) {
+      val nullSafeEval = leftGen.code + rightGen.code + ctx.nullSafeExec(nullable, leftGen.isNull) {
+        s"""
+           |${ev.isNull} = false;
+           |${resultCode}
+           |""".stripMargin
+      }
+      ev.copy(code =
+        code"""
+        boolean ${ev.isNull} = true;
+        ${CodeGenerator.javaType(dataType)} ${ev.value} = ${CodeGenerator.defaultValue(dataType)};
+        $nullSafeEval
+      """)
+    } else {
+      ev.copy(code =
+        code"""
+            ${leftGen.code}
+            ${rightGen.code}
+            ${CodeGenerator.javaType(dataType)} ${ev.value} = ${CodeGenerator.defaultValue(dataType)};
+            $resultCode""", isNull = FalseLiteral)
+    }
+  }
+
+  override def prettyName: String = "array_prepend"
+
+  override protected def withNewChildrenInternal(
+      newLeft: Expression, newRight: Expression): ArrayPrepend =
+    copy(left = newLeft, right = newRight)
+
+  override def dataType: DataType = left.dataType

Review Comment:
   ```suggestion
     override def dataType: DataType = if (right.nullable) left.dataType.asNullable else left.dataType
   ```



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,119 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage =
+    "_FUNC_(array, value) - Returns an array containing value as well as all elements from array. The new element is positioned at the beginning of the array.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), 4);
+       [4, 1, 2, 3]
+  """,
+  group = "array_funcs",
+  since = "3.4.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with NullIntolerant
+    with QueryErrorsBase {
+
+  override def nullSafeEval(arr: Any, value: Any): Any = {
+    val numberOfElements = arr.asInstanceOf[ArrayData].numElements()
+    if (numberOfElements + 1 > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val newArray = new Array[Any](numberOfElements + 1)
+    newArray(0) = value
+    var pos = 1
+    arr

Review Comment:
   would you minding make this `nullSafeEval` method and `doGenCode` method more consistent with `ArrayAppend`:
   
   ```
     @transient protected lazy val elementType: DataType =
       inputTypes.head.asInstanceOf[ArrayType].elementType
   
     override protected def nullSafeEval(arr: Any, elementData: Any): Any = {
       val arrayData = arr.asInstanceOf[ArrayData]
       val numberOfElements = arrayData.numElements() + 1
       if (numberOfElements > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
         throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
       }
       val finalData = new Array[Any](numberOfElements)
       finalData.update(0, elementData)
       arrayData.foreach(elementType, (i: Int, v: Any) => finalData.update(i + 1, v))
       new GenericArrayData(finalData)
     }
   ```
   
   I think we can make a better abstract later



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] zhengruifeng commented on pull request #38947: [SPARK-41233][SQL][PYTHON] Add `array_prepend` function

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.

zhengruifeng commented on PR #38947:
URL: https://github.com/apache/spark/pull/38947#issuecomment-1465385482

   @navinvishy  sorry, it seems late for 3.4.0.
   would you mind changing the version from `3.4.0` to `3.5.0`?
   
   I am going to merge this PR to master this week if no more comments.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] LuciferYang commented on pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

Posted by GitBox <gi...@apache.org>.

LuciferYang commented on PR #38947:
URL: https://github.com/apache/spark/pull/38947#issuecomment-1365031813

   @navinvishy cloud you resolve the conflict?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] zhengruifeng commented on pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

Posted by GitBox <gi...@apache.org>.

zhengruifeng commented on PR #38947:
URL: https://github.com/apache/spark/pull/38947#issuecomment-1365029531

   also cc @beliefer 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] LuciferYang commented on a diff in pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

Posted by GitBox <gi...@apache.org>.

LuciferYang commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1046742799


##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala:
##########
@@ -1840,6 +1840,47 @@ class CollectionExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper
     checkEvaluation(ArrayRepeat(Literal("hi"), Literal(null, IntegerType)), null)
   }
 
+  test("Array prepend") {

Review Comment:
   This is not a bug fix, also need to add jira number?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] LuciferYang commented on pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.

LuciferYang commented on PR #38947:
URL: https://github.com/apache/spark/pull/38947#issuecomment-1407296674

   > Could we wait array_append merged then this PR could follow some convention and make a better abstract ? @zhengruifeng @LuciferYang @HyukjinKwon
   
   If it is possible, I think we should wait
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] cloud-fan commented on a diff in pull request #38947: [SPARK-41233][SQL][PYTHON] Add `array_prepend` function

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.

cloud-fan commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1133454186


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,151 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = """
+      _FUNC_(array, element) - Add the element at the beginning of the array passed as first
+      argument. Type of element should be similar to type of the elements of the array.
+      Null element is also prepended to the array. But if the array passed is NULL
+      output is NULL
+    """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array('b', 'd', 'c', 'a'), 'd');
+       ["d","b","d","c","a"]
+      > SELECT _FUNC_(array(1, 2, 3, null), null);
+       [null,1,2,3,null]
+      > SELECT _FUNC_(CAST(null as Array<Int>), 2);
+       NULL
+  """,
+  group = "array_funcs",
+  since = "3.5.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with ComplexTypeMergingExpression
+    with QueryErrorsBase {
+
+  override def nullable: Boolean = left.nullable
+
+  @transient protected lazy val elementType: DataType =
+    inputTypes.head.asInstanceOf[ArrayType].elementType
+
+  override def eval(input: InternalRow): Any = {
+    val value1 = left.eval(input)
+    if (value1 == null) {
+      null
+    } else {
+      val value2 = right.eval(input)
+      nullSafeEval(value1, value2)
+    }
+  }
+  override def nullSafeEval(arr: Any, elementData: Any): Any = {
+    val arrayData = arr.asInstanceOf[ArrayData]
+    val numberOfElements = arrayData.numElements() + 1
+    if (numberOfElements > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val finalData = new Array[Any](numberOfElements)
+    finalData.update(0, elementData)
+    arrayData.foreach(elementType, (i: Int, v: Any) => finalData.update(i + 1, v))
+    new GenericArrayData(finalData)
+  }
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val leftGen = left.genCode(ctx)
+    val rightGen = right.genCode(ctx)
+    val f = (arr: String, value: String) => {
+      val newArraySize = ctx.freshName("newArraySize")
+      val newArray = ctx.freshName("newArray")
+      val i = ctx.freshName("i")
+      val iPlus1 = s"$i+1"
+      val zero = "0"
+      val allocation = CodeGenerator.createArrayData(
+        newArray,
+        elementType,
+        newArraySize,
+        s" $prettyName failed.")
+      val assignment =
+        CodeGenerator.createArrayAssignment(newArray, elementType, arr, iPlus1, i, false)
+      val newElemAssignment =
+        CodeGenerator.setArrayElement(newArray, elementType, zero, value, Some(rightGen.isNull))
+      s"""
+         |int $newArraySize = $arr.numElements() + 1;
+         |$allocation
+         |$newElemAssignment
+         |for (int $i = 0; $i < $arr.numElements(); $i ++) {
+         |  $assignment
+         |}
+         |${ev.value} = $newArray;
+         |""".stripMargin
+    }
+    val resultCode = f(leftGen.value, rightGen.value)
+    if(nullable) {
+      val nullSafeEval = leftGen.code + rightGen.code + ctx.nullSafeExec(nullable, leftGen.isNull) {
+        s"""
+           |${ev.isNull} = false;
+           |${resultCode}
+           |""".stripMargin
+      }
+      ev.copy(code =
+        code"""
+        boolean ${ev.isNull} = true;
+        ${CodeGenerator.javaType(dataType)} ${ev.value} = ${CodeGenerator.defaultValue(dataType)};
+        $nullSafeEval
+      """)
+    } else {
+      ev.copy(code =
+        code"""

Review Comment:
   ditto



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] zhengruifeng commented on pull request #38947: [SPARK-41233][SQL][PYTHON] Add `array_prepend` function

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.

zhengruifeng commented on PR #38947:
URL: https://github.com/apache/spark/pull/38947#issuecomment-1471205562

   @navinvishy would you mind addressing wenchen's comments? we can merge it then.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] LuciferYang commented on pull request #38947: [SPARK-41231][SQL] Adds an array_prepend function to catalyst

Posted by GitBox <gi...@apache.org>.

LuciferYang commented on PR #38947:
URL: https://github.com/apache/spark/pull/38947#issuecomment-1340705645

   The jira number should be `[SPARK-41233](https://issues.apache.org/jira/browse/SPARK-41233)` 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] navinvishy commented on a diff in pull request #38947: SPARK-41231: Adds an array_prepend function to catalyst

Posted by GitBox <gi...@apache.org>.

navinvishy commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1041589578


##########
sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala:
##########
@@ -2645,78 +2257,103 @@ class DataFrameFunctionsSuite extends QueryTest with SharedSparkSession {
         "paramIndex" -> "2",
         "inputSql" -> "\"1.0\"",
         "inputType" -> "\"DECIMAL(2,1)\"",
-        "requiredType" -> "\"INT\""
-      ),
-      queryContext = Array(ExpectedContext("", "", 0, 19, "array_repeat(a, 1.0)"))
-    )
+        "requiredType" -> "\"INT\""),
+      queryContext = Array(ExpectedContext("", "", 0, 19, "array_repeat(a, 1.0)")))
   }
 
+  test("array prepend") {

Review Comment:
   Relevant diff here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] navinvishy commented on a diff in pull request #38947: [SPARK-41231][SQL] Adds an array_prepend function to catalyst

Posted by GitBox <gi...@apache.org>.

navinvishy commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1041817673


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -119,21 +117,24 @@ case class Size(child: Expression, legacySizeOfNull: Boolean)
     val value = child.eval(input)
     if (value == null) {
       if (legacySizeOfNull) -1 else null
-    } else child.dataType match {
-      case _: ArrayType => value.asInstanceOf[ArrayData].numElements()
-      case _: MapType => value.asInstanceOf[MapData].numElements()
-      case other => throw QueryExecutionErrors.unsupportedOperandTypeForSizeFunctionError(other)
-    }
+    } else

Review Comment:
   Yeah there were too many of those. I've removed them now.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] zhengruifeng commented on pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.

zhengruifeng commented on PR #38947:
URL: https://github.com/apache/spark/pull/38947#issuecomment-1449588306

   cc @dongjoon-hyun @beliefer @LuciferYang @cloud-fan @HyukjinKwon would you mind taking another look? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] navinvishy commented on pull request #38947: [SPARK-41233][SQL][PYTHON] Add `array_prepend` function

Posted by "navinvishy (via GitHub)" <gi...@apache.org>.

navinvishy commented on PR #38947:
URL: https://github.com/apache/spark/pull/38947#issuecomment-1465390080

   > @navinvishy sorry, it seems late for 3.4.0. would you mind changing the version from `3.4.0` to `3.5.0`?
   > 
   > I am going to merge this PR to master this week if no more comments.
   
   I've updated the version. Thanks @zhengruifeng !


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] navinvishy closed pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

Posted by "navinvishy (via GitHub)" <gi...@apache.org>.

navinvishy closed pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function
URL: https://github.com/apache/spark/pull/38947


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] zhengruifeng commented on a diff in pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.

zhengruifeng commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1105246370


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,149 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage =
+    "_FUNC_(array, value) - Returns an array containing value as well as all elements from array. The new element is positioned at the beginning of the array.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), 4);
+       [4, 1, 2, 3]
+  """,
+  group = "array_funcs",
+  since = "3.4.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with ComplexTypeMergingExpression
+    with QueryErrorsBase {
+
+  override def nullable: Boolean = left.nullable
+
+  override def eval(input: InternalRow): Any = {
+    val value1 = left.eval(input)
+    if (value1 == null) {
+      null
+    } else {
+      val value2 = right.eval(input)
+      nullSafeEval(value1, value2)
+    }
+  }
+  override def nullSafeEval(arr: Any, value: Any): Any = {
+    val numberOfElements = arr.asInstanceOf[ArrayData].numElements()
+    if (numberOfElements + 1 > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val newArray = new Array[Any](numberOfElements + 1)
+    newArray(0) = value
+    var pos = 1
+    arr
+      .asInstanceOf[ArrayData]
+      .foreach(
+        right.dataType,
+        (i, v) => {
+          newArray(pos) = v
+          pos += 1
+        })
+    new GenericArrayData(newArray)
+  }
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val leftGen = left.genCode(ctx)
+    val rightGen = right.genCode(ctx)
+    val f = (arr: String, value: String) => {
+      val newArraySize = ctx.freshName("newArraySize")
+      val newArray = ctx.freshName("newArray")
+      val i = ctx.freshName("i")
+      val pos = ctx.freshName("pos")
+      val allocation = CodeGenerator.createArrayData(
+        newArray,
+        right.dataType,
+        newArraySize,
+        s" $prettyName failed.")
+      val assignment =
+        CodeGenerator.createArrayAssignment(newArray, right.dataType, arr, pos, i, false)
+      val newElemAssignment =
+        CodeGenerator.setArrayElement(newArray, right.dataType, pos, value, Some(rightGen.isNull))
+      s"""
+         |int $pos = 0;
+         |int $newArraySize = $arr.numElements() + 1;
+         |$allocation
+         |$newElemAssignment
+         |$pos = $pos + 1;
+         |for (int $i = 0; $i < $arr.numElements(); $i ++) {
+         |  $assignment
+         |  $pos = $pos + 1;
+         |}
+         |${ev.value} = $newArray;
+         |""".stripMargin
+    }
+    val resultCode = f(leftGen.value, rightGen.value)
+    if(nullable) {
+      val nullSafeEval = leftGen.code + rightGen.code + ctx.nullSafeExec(nullable, leftGen.isNull) {
+        s"""
+           |${ev.isNull} = false;
+           |${resultCode}
+           |""".stripMargin
+      }
+      ev.copy(code =
+        code"""
+        boolean ${ev.isNull} = true;
+        ${CodeGenerator.javaType(dataType)} ${ev.value} = ${CodeGenerator.defaultValue(dataType)};
+        $nullSafeEval
+      """)
+    } else {
+      ev.copy(code =
+        code"""
+            ${leftGen.code}
+            ${rightGen.code}
+            ${CodeGenerator.javaType(dataType)} ${ev.value} = ${CodeGenerator.defaultValue(dataType)};
+            $resultCode""", isNull = FalseLiteral)
+    }
+  }
+
+  override def prettyName: String = "array_prepend"
+
+  override protected def withNewChildrenInternal(
+      newLeft: Expression, newRight: Expression): ArrayPrepend =
+    copy(left = newLeft, right = newRight)
+
+  override def dataType: DataType = left.dataType

Review Comment:
   see https://github.com/apache/spark/commit/718b6b7ed8277d5f6577367ab0d49f60f9777df7



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] zhengruifeng commented on pull request #38947: [SPARK-41233][SQL][PYTHON] Add `array_prepend` function

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.

zhengruifeng commented on PR #38947:
URL: https://github.com/apache/spark/pull/38947#issuecomment-1471632549

   merged into master, thank you @navinvishy for working on this!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] navinvishy commented on a diff in pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

Posted by "navinvishy (via GitHub)" <gi...@apache.org>.

navinvishy commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1118278902


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,149 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage =
+    "_FUNC_(array, value) - Returns an array containing value as well as all elements from array. The new element is positioned at the beginning of the array.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), 4);
+       [4, 1, 2, 3]
+  """,
+  group = "array_funcs",
+  since = "3.4.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with ComplexTypeMergingExpression
+    with QueryErrorsBase {
+
+  override def nullable: Boolean = left.nullable
+
+  override def eval(input: InternalRow): Any = {
+    val value1 = left.eval(input)
+    if (value1 == null) {
+      null
+    } else {
+      val value2 = right.eval(input)
+      nullSafeEval(value1, value2)
+    }
+  }
+  override def nullSafeEval(arr: Any, value: Any): Any = {
+    val numberOfElements = arr.asInstanceOf[ArrayData].numElements()
+    if (numberOfElements + 1 > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val newArray = new Array[Any](numberOfElements + 1)
+    newArray(0) = value
+    var pos = 1
+    arr
+      .asInstanceOf[ArrayData]
+      .foreach(
+        right.dataType,
+        (i, v) => {
+          newArray(pos) = v
+          pos += 1
+        })
+    new GenericArrayData(newArray)
+  }
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val leftGen = left.genCode(ctx)
+    val rightGen = right.genCode(ctx)
+    val f = (arr: String, value: String) => {
+      val newArraySize = ctx.freshName("newArraySize")
+      val newArray = ctx.freshName("newArray")
+      val i = ctx.freshName("i")
+      val pos = ctx.freshName("pos")
+      val allocation = CodeGenerator.createArrayData(
+        newArray,
+        right.dataType,
+        newArraySize,
+        s" $prettyName failed.")
+      val assignment =
+        CodeGenerator.createArrayAssignment(newArray, right.dataType, arr, pos, i, false)
+      val newElemAssignment =
+        CodeGenerator.setArrayElement(newArray, right.dataType, pos, value, Some(rightGen.isNull))

Review Comment:
   @zhengruifeng thanks, I've made this change.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] navinvishy commented on a diff in pull request #38947: SPARK-41231: Adds an array_prepend function to catalyst

Posted by GitBox <gi...@apache.org>.

navinvishy commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1041589182


##########
sql/core/src/main/scala/org/apache/spark/sql/functions.scala:
##########
@@ -4018,10 +4181,22 @@ object functions {
    * @group collection_funcs
    * @since 2.4.0
    */
+
   def array_remove(column: Column, element: Any): Column = withExpr {
     ArrayRemove(column.expr, lit(element).expr)
   }
 
+  /**

Review Comment:
   Relevant diff here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] LuciferYang commented on pull request #38947: [SPARK-41231][SQL] Adds an array_prepend function to catalyst

Posted by GitBox <gi...@apache.org>.

LuciferYang commented on PR #38947:
URL: https://github.com/apache/spark/pull/38947#issuecomment-1340714253

   @HyukjinKwon `ArrayAppend` and `ArrayPrepend` are special cases of `ArrayInsert`. Is it better to implement each function independently or consider them as a whole?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] amaliujia commented on a diff in pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

Posted by "amaliujia (via GitHub)" <gi...@apache.org>.

amaliujia commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1107726442


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,119 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage =
+    "_FUNC_(array, value) - Returns an array containing value as well as all elements from array. The new element is positioned at the beginning of the array.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), 4);
+       [4, 1, 2, 3]
+  """,
+  group = "array_funcs",
+  since = "3.4.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with NullIntolerant
+    with QueryErrorsBase {
+
+  override def nullSafeEval(arr: Any, value: Any): Any = {
+    val numberOfElements = arr.asInstanceOf[ArrayData].numElements()
+    if (numberOfElements + 1 > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val newArray = new Array[Any](numberOfElements + 1)
+    newArray(0) = value
+    var pos = 1
+    arr

Review Comment:
   +1 to be consistent with `ArrayAppend`. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] navinvishy commented on a diff in pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

Posted by "navinvishy (via GitHub)" <gi...@apache.org>.

navinvishy commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1121154959


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,145 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage =
+    "_FUNC_(array, value) - Returns an array containing value as well as all elements from array. The new element is positioned at the beginning of the array.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array('b', 'd', 'c', 'a'), 'd');
+       ["d","b","d","c","a"]
+  """,
+  group = "array_funcs",
+  since = "3.4.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with ComplexTypeMergingExpression
+    with QueryErrorsBase {
+
+  override def nullable: Boolean = left.nullable
+
+  @transient protected lazy val elementType: DataType =
+    inputTypes.head.asInstanceOf[ArrayType].elementType
+
+  override def eval(input: InternalRow): Any = {
+    val value1 = left.eval(input)
+    if (value1 == null) {
+      null
+    } else {
+      val value2 = right.eval(input)
+      nullSafeEval(value1, value2)
+    }
+  }
+  override def nullSafeEval(arr: Any, elementData: Any): Any = {
+    val arrayData = arr.asInstanceOf[ArrayData]
+    val numberOfElements = arrayData.numElements() + 1
+    if (numberOfElements > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val finalData = new Array[Any](numberOfElements)
+    finalData.update(0, elementData)
+    arrayData.foreach(elementType, (i: Int, v: Any) => finalData.update(i + 1, v))
+    new GenericArrayData(finalData)
+  }
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val leftGen = left.genCode(ctx)
+    val rightGen = right.genCode(ctx)
+    val f = (arr: String, value: String) => {
+      val newArraySize = ctx.freshName("newArraySize")
+      val newArray = ctx.freshName("newArray")
+      val i = ctx.freshName("i")
+      val pos = ctx.freshName("pos")

Review Comment:
   @zhengruifeng I assume you mean that we could use `i+1` here for the assignment instead of using a different var `pos`? I've done that here - please let me know if that works.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] navinvishy commented on pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

Posted by "navinvishy (via GitHub)" <gi...@apache.org>.

navinvishy commented on PR #38947:
URL: https://github.com/apache/spark/pull/38947#issuecomment-1407298641

   > > > Could we wait array_append merged then this PR could follow some convention and make a better abstract ? @zhengruifeng @LuciferYang @HyukjinKwon
   > > 
   > > 
   > > If it is possible, I think we should wait
   > 
   > Looks like array_append has already been merged? #38865
   
   Also, I'm happy to try and unify `ArrayPrepend` and `ArrayAppend` (and maybe some others?) if that is useful.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] cloud-fan commented on a diff in pull request #38947: [SPARK-41233][SQL][PYTHON] Add `array_prepend` function

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.

cloud-fan commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1133448219


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,151 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = """
+      _FUNC_(array, element) - Add the element at the beginning of the array passed as first
+      argument. Type of element should be similar to type of the elements of the array.

Review Comment:
   ```suggestion
         argument. Type of element should be the same to the type of the elements of the array.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] cloud-fan commented on a diff in pull request #38947: [SPARK-41233][SQL][PYTHON] Add `array_prepend` function

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.

cloud-fan commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1133454113


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,151 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = """
+      _FUNC_(array, element) - Add the element at the beginning of the array passed as first
+      argument. Type of element should be similar to type of the elements of the array.
+      Null element is also prepended to the array. But if the array passed is NULL
+      output is NULL
+    """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array('b', 'd', 'c', 'a'), 'd');
+       ["d","b","d","c","a"]
+      > SELECT _FUNC_(array(1, 2, 3, null), null);
+       [null,1,2,3,null]
+      > SELECT _FUNC_(CAST(null as Array<Int>), 2);
+       NULL
+  """,
+  group = "array_funcs",
+  since = "3.5.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with ComplexTypeMergingExpression
+    with QueryErrorsBase {
+
+  override def nullable: Boolean = left.nullable
+
+  @transient protected lazy val elementType: DataType =
+    inputTypes.head.asInstanceOf[ArrayType].elementType
+
+  override def eval(input: InternalRow): Any = {
+    val value1 = left.eval(input)
+    if (value1 == null) {
+      null
+    } else {
+      val value2 = right.eval(input)
+      nullSafeEval(value1, value2)
+    }
+  }
+  override def nullSafeEval(arr: Any, elementData: Any): Any = {
+    val arrayData = arr.asInstanceOf[ArrayData]
+    val numberOfElements = arrayData.numElements() + 1
+    if (numberOfElements > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val finalData = new Array[Any](numberOfElements)
+    finalData.update(0, elementData)
+    arrayData.foreach(elementType, (i: Int, v: Any) => finalData.update(i + 1, v))
+    new GenericArrayData(finalData)
+  }
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val leftGen = left.genCode(ctx)
+    val rightGen = right.genCode(ctx)
+    val f = (arr: String, value: String) => {
+      val newArraySize = ctx.freshName("newArraySize")
+      val newArray = ctx.freshName("newArray")
+      val i = ctx.freshName("i")
+      val iPlus1 = s"$i+1"
+      val zero = "0"
+      val allocation = CodeGenerator.createArrayData(
+        newArray,
+        elementType,
+        newArraySize,
+        s" $prettyName failed.")
+      val assignment =
+        CodeGenerator.createArrayAssignment(newArray, elementType, arr, iPlus1, i, false)
+      val newElemAssignment =
+        CodeGenerator.setArrayElement(newArray, elementType, zero, value, Some(rightGen.isNull))
+      s"""
+         |int $newArraySize = $arr.numElements() + 1;
+         |$allocation
+         |$newElemAssignment
+         |for (int $i = 0; $i < $arr.numElements(); $i ++) {
+         |  $assignment
+         |}
+         |${ev.value} = $newArray;
+         |""".stripMargin
+    }
+    val resultCode = f(leftGen.value, rightGen.value)
+    if(nullable) {
+      val nullSafeEval = leftGen.code + rightGen.code + ctx.nullSafeExec(nullable, leftGen.isNull) {
+        s"""
+           |${ev.isNull} = false;
+           |${resultCode}
+           |""".stripMargin
+      }
+      ev.copy(code =
+        code"""
+        boolean ${ev.isNull} = true;
+        ${CodeGenerator.javaType(dataType)} ${ev.value} = ${CodeGenerator.defaultValue(dataType)};
+        $nullSafeEval
+      """)

Review Comment:
   please please the same code style as
   ```
   s"""
      |...
      |...
      |""". stripMargin
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] navinvishy commented on a diff in pull request #38947: [SPARK-41233][SQL][PYTHON] Add `array_prepend` function

Posted by "navinvishy (via GitHub)" <gi...@apache.org>.

navinvishy commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1138150738


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,151 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = """
+      _FUNC_(array, element) - Add the element at the beginning of the array passed as first
+      argument. Type of element should be similar to type of the elements of the array.
+      Null element is also prepended to the array. But if the array passed is NULL
+      output is NULL
+    """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array('b', 'd', 'c', 'a'), 'd');
+       ["d","b","d","c","a"]
+      > SELECT _FUNC_(array(1, 2, 3, null), null);
+       [null,1,2,3,null]
+      > SELECT _FUNC_(CAST(null as Array<Int>), 2);
+       NULL
+  """,
+  group = "array_funcs",
+  since = "3.5.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with ComplexTypeMergingExpression
+    with QueryErrorsBase {
+
+  override def nullable: Boolean = left.nullable
+
+  @transient protected lazy val elementType: DataType =
+    inputTypes.head.asInstanceOf[ArrayType].elementType
+
+  override def eval(input: InternalRow): Any = {
+    val value1 = left.eval(input)
+    if (value1 == null) {
+      null
+    } else {
+      val value2 = right.eval(input)
+      nullSafeEval(value1, value2)
+    }
+  }
+  override def nullSafeEval(arr: Any, elementData: Any): Any = {
+    val arrayData = arr.asInstanceOf[ArrayData]
+    val numberOfElements = arrayData.numElements() + 1
+    if (numberOfElements > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val finalData = new Array[Any](numberOfElements)
+    finalData.update(0, elementData)
+    arrayData.foreach(elementType, (i: Int, v: Any) => finalData.update(i + 1, v))
+    new GenericArrayData(finalData)
+  }
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val leftGen = left.genCode(ctx)
+    val rightGen = right.genCode(ctx)
+    val f = (arr: String, value: String) => {
+      val newArraySize = ctx.freshName("newArraySize")
+      val newArray = ctx.freshName("newArray")
+      val i = ctx.freshName("i")
+      val iPlus1 = s"$i+1"
+      val zero = "0"
+      val allocation = CodeGenerator.createArrayData(
+        newArray,
+        elementType,
+        newArraySize,

Review Comment:
   Yes, that wasn't really necessary. I've removed it, thanks!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] navinvishy commented on pull request #38947: [SPARK-41233][SQL][PYTHON] Add `array_prepend` function

Posted by "navinvishy (via GitHub)" <gi...@apache.org>.

navinvishy commented on PR #38947:
URL: https://github.com/apache/spark/pull/38947#issuecomment-1471352117

   > @navinvishy would you mind addressing wenchen's comments? we can merge it then.
   
   I've addressed them. Thanks for checking, @zhengruifeng !


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] navinvishy commented on a diff in pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

Posted by "navinvishy (via GitHub)" <gi...@apache.org>.

navinvishy commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1118278674


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,149 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage =
+    "_FUNC_(array, value) - Returns an array containing value as well as all elements from array. The new element is positioned at the beginning of the array.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), 4);
+       [4, 1, 2, 3]
+  """,
+  group = "array_funcs",
+  since = "3.4.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with ComplexTypeMergingExpression
+    with QueryErrorsBase {
+
+  override def nullable: Boolean = left.nullable
+
+  override def eval(input: InternalRow): Any = {
+    val value1 = left.eval(input)
+    if (value1 == null) {
+      null
+    } else {
+      val value2 = right.eval(input)
+      nullSafeEval(value1, value2)
+    }
+  }
+  override def nullSafeEval(arr: Any, value: Any): Any = {
+    val numberOfElements = arr.asInstanceOf[ArrayData].numElements()
+    if (numberOfElements + 1 > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val newArray = new Array[Any](numberOfElements + 1)
+    newArray(0) = value
+    var pos = 1
+    arr
+      .asInstanceOf[ArrayData]
+      .foreach(
+        right.dataType,
+        (i, v) => {
+          newArray(pos) = v
+          pos += 1
+        })
+    new GenericArrayData(newArray)
+  }
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val leftGen = left.genCode(ctx)
+    val rightGen = right.genCode(ctx)
+    val f = (arr: String, value: String) => {
+      val newArraySize = ctx.freshName("newArraySize")
+      val newArray = ctx.freshName("newArray")
+      val i = ctx.freshName("i")
+      val pos = ctx.freshName("pos")
+      val allocation = CodeGenerator.createArrayData(
+        newArray,
+        right.dataType,
+        newArraySize,
+        s" $prettyName failed.")
+      val assignment =
+        CodeGenerator.createArrayAssignment(newArray, right.dataType, arr, pos, i, false)
+      val newElemAssignment =
+        CodeGenerator.setArrayElement(newArray, right.dataType, pos, value, Some(rightGen.isNull))
+      s"""
+         |int $pos = 0;
+         |int $newArraySize = $arr.numElements() + 1;
+         |$allocation
+         |$newElemAssignment
+         |$pos = $pos + 1;
+         |for (int $i = 0; $i < $arr.numElements(); $i ++) {
+         |  $assignment
+         |  $pos = $pos + 1;
+         |}
+         |${ev.value} = $newArray;
+         |""".stripMargin
+    }
+    val resultCode = f(leftGen.value, rightGen.value)
+    if(nullable) {
+      val nullSafeEval = leftGen.code + rightGen.code + ctx.nullSafeExec(nullable, leftGen.isNull) {
+        s"""
+           |${ev.isNull} = false;
+           |${resultCode}
+           |""".stripMargin
+      }
+      ev.copy(code =
+        code"""
+        boolean ${ev.isNull} = true;
+        ${CodeGenerator.javaType(dataType)} ${ev.value} = ${CodeGenerator.defaultValue(dataType)};
+        $nullSafeEval
+      """)
+    } else {
+      ev.copy(code =
+        code"""
+            ${leftGen.code}
+            ${rightGen.code}
+            ${CodeGenerator.javaType(dataType)} ${ev.value} = ${CodeGenerator.defaultValue(dataType)};
+            $resultCode""", isNull = FalseLiteral)
+    }
+  }
+
+  override def prettyName: String = "array_prepend"
+
+  override protected def withNewChildrenInternal(
+      newLeft: Expression, newRight: Expression): ArrayPrepend =
+    copy(left = newLeft, right = newRight)
+
+  override def dataType: DataType = left.dataType

Review Comment:
   @zhengruifeng I've made this change and added a test for null cases for both left and right params.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] navinvishy commented on pull request #38947: SPARK-41231: Adds an array_prepend function to catalyst

Posted by GitBox <gi...@apache.org>.

navinvishy commented on PR #38947:
URL: https://github.com/apache/spark/pull/38947#issuecomment-1340167033

   Running `./dev/scalafmt` produced a lot of formatting changes on these files. I've added comments to the portions that are relevant to the task so they are easy to find for reviewing.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] zhengruifeng commented on a diff in pull request #38947: [SPARK-41231][SQL] Adds an array_prepend function to catalyst

Posted by GitBox <gi...@apache.org>.

zhengruifeng commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1042152011


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,119 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage =
+    "_FUNC_(array, value) - Returns an array containing value as well as all elements from array. The new element is positioned at the beginning of the array.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), 4);
+       [4, 1, 2, 3]
+  """,
+  group = "array_funcs",
+  since = "3.4.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with NullIntolerant
+    with QueryErrorsBase {
+
+  override def nullSafeEval(arr: Any, value: Any): Any = {
+    val numberOfElements = arr.asInstanceOf[ArrayData].numElements()
+    if (numberOfElements + 1 > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val newArray = new Array[Any](numberOfElements + 1)
+    newArray(0) = value
+    var pos = 1
+    arr

Review Comment:
   or use `ArrayData.foreach`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] navinvishy commented on a diff in pull request #38947: [SPARK-41231][SQL] Adds an array_prepend function to catalyst

Posted by GitBox <gi...@apache.org>.

navinvishy commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1042617849


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,119 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage =
+    "_FUNC_(array, value) - Returns an array containing value as well as all elements from array. The new element is positioned at the beginning of the array.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), 4);
+       [4, 1, 2, 3]
+  """,
+  group = "array_funcs",
+  since = "3.4.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with NullIntolerant
+    with QueryErrorsBase {
+
+  override def nullSafeEval(arr: Any, value: Any): Any = {
+    val numberOfElements = arr.asInstanceOf[ArrayData].numElements()
+    if (numberOfElements + 1 > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val newArray = new Array[Any](numberOfElements + 1)
+    newArray(0) = value
+    var pos = 1
+    arr
+      .asInstanceOf[ArrayData]
+      .foreach(
+        right.dataType,
+        (i, v) => {
+          newArray(pos) = v
+          pos += 1
+        })
+    new GenericArrayData(newArray)
+  }
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    nullSafeCodeGen(
+      ctx,
+      ev,
+      (arr, value) => {
+        val newArraySize = ctx.freshName("newArraySize")
+        val newArray = ctx.freshName("newArray")
+        val i = ctx.freshName("i")
+        val pos = ctx.freshName("pos")
+        val allocation = CodeGenerator.createArrayData(
+          newArray,
+          right.dataType,
+          newArraySize,
+          s" $prettyName failed.")
+        val assignment =
+          CodeGenerator.createArrayAssignment(newArray, right.dataType, arr, pos, i, false)
+        val newElemAssignment =
+          CodeGenerator.setArrayElement(newArray, right.dataType, pos, value)
+        s"""
+           |int $pos = 0;
+           |int $newArraySize = $arr.numElements() + 1;
+           |$allocation
+           |$newElemAssignment
+           |$pos = $pos + 1;
+           |for (int $i = 0; $i < $arr.numElements(); $i ++) {
+           |  $assignment
+           |  $pos = $pos + 1;
+           |}
+           |${ev.value} = $newArray;
+           |""".stripMargin
+      })
+  }
+
+  override def prettyName: String = "array_prepend"
+  override protected def withNewChildrenInternal(
+                                                  newLeft: Expression,
+                                                  newRight: Expression): ArrayPrepend =
+    copy(left = newLeft, right = newRight)
+  override def dataType: DataType = left.dataType
+  override def checkInputDataTypes(): TypeCheckResult = {
+    (left.dataType, right.dataType) match {
+      case (_, NullType) | (NullType, _) =>
+        DataTypeMismatch(
+          errorSubClass = "NULL_TYPE",
+          messageParameters = Map("functionName" -> toSQLId(prettyName)))
+      case (l, _) if !ArrayType.acceptsType(l) =>
+        DataTypeMismatch(
+          errorSubClass = "UNEXPECTED_INPUT_TYPE",
+          messageParameters = Map(
+            "paramIndex" -> "1",
+            "requiredType" -> toSQLType(ArrayType),
+            "inputSql" -> toSQLExpr(left),
+            "inputType" -> toSQLType(left.dataType)))
+      case (ArrayType(e1, _), e2) if e1.sameType(e2) =>
+        TypeUtils.checkForOrderingExpr(e2, prettyName)

Review Comment:
   There seems to be an ordering requirement for `ArrayData` since it has a `compare` and is backed by a `scala.collection.Seq`. But I'm ok with just returning a `TypeCheckResult.TypeCheckSuccess` here. I followed what `ArrayContains` did here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] navinvishy commented on a diff in pull request #38947: [SPARK-41231][SQL] Adds an array_prepend function to catalyst

Posted by GitBox <gi...@apache.org>.

navinvishy commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1042620296


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,119 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage =
+    "_FUNC_(array, value) - Returns an array containing value as well as all elements from array. The new element is positioned at the beginning of the array.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), 4);
+       [4, 1, 2, 3]
+  """,
+  group = "array_funcs",
+  since = "3.4.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with NullIntolerant
+    with QueryErrorsBase {
+
+  override def nullSafeEval(arr: Any, value: Any): Any = {
+    val numberOfElements = arr.asInstanceOf[ArrayData].numElements()
+    if (numberOfElements + 1 > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val newArray = new Array[Any](numberOfElements + 1)
+    newArray(0) = value
+    var pos = 1
+    arr
+      .asInstanceOf[ArrayData]
+      .foreach(
+        right.dataType,
+        (i, v) => {
+          newArray(pos) = v
+          pos += 1
+        })
+    new GenericArrayData(newArray)
+  }
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    nullSafeCodeGen(
+      ctx,
+      ev,
+      (arr, value) => {
+        val newArraySize = ctx.freshName("newArraySize")
+        val newArray = ctx.freshName("newArray")
+        val i = ctx.freshName("i")
+        val pos = ctx.freshName("pos")
+        val allocation = CodeGenerator.createArrayData(
+          newArray,
+          right.dataType,
+          newArraySize,
+          s" $prettyName failed.")
+        val assignment =
+          CodeGenerator.createArrayAssignment(newArray, right.dataType, arr, pos, i, false)
+        val newElemAssignment =
+          CodeGenerator.setArrayElement(newArray, right.dataType, pos, value)
+        s"""
+           |int $pos = 0;
+           |int $newArraySize = $arr.numElements() + 1;
+           |$allocation
+           |$newElemAssignment
+           |$pos = $pos + 1;
+           |for (int $i = 0; $i < $arr.numElements(); $i ++) {
+           |  $assignment
+           |  $pos = $pos + 1;
+           |}
+           |${ev.value} = $newArray;
+           |""".stripMargin
+      })
+  }
+
+  override def prettyName: String = "array_prepend"
+  override protected def withNewChildrenInternal(
+                                                  newLeft: Expression,
+                                                  newRight: Expression): ArrayPrepend =
+    copy(left = newLeft, right = newRight)
+  override def dataType: DataType = left.dataType
+  override def checkInputDataTypes(): TypeCheckResult = {

Review Comment:
   I did what `ArrayContains` does. Maybe we should consolidate this, since it makes sense for many of these to do the same thing? eg. `ArrayContains`, `ArrayRemove`, `ArrayPrepend`, `ArrayAppend` etc.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on pull request #38947: [SPARK-41233][SQL] Adds an array_prepend function to catalyst

Posted by GitBox <gi...@apache.org>.

AmplabJenkins commented on PR #38947:
URL: https://github.com/apache/spark/pull/38947#issuecomment-1343716355

   Can one of the admins verify this patch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] beliefer commented on pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

Posted by GitBox <gi...@apache.org>.

beliefer commented on PR #38947:
URL: https://github.com/apache/spark/pull/38947#issuecomment-1365775810

   Could we wait array_append merged then this PR could follow some convention and make a better abstract ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] navinvishy commented on a diff in pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

Posted by "navinvishy (via GitHub)" <gi...@apache.org>.

navinvishy commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1089643942


##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala:
##########
@@ -1840,6 +1840,47 @@ class CollectionExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper
     checkEvaluation(ArrayRepeat(Literal("hi"), Literal(null, IntegerType)), null)
   }
 
+  test("SPARK-41233: ArrayPrepend") {
+    val a0 = Literal.create(Seq(1, 2, 3, 4), ArrayType(IntegerType))
+    val a1 = Literal.create(Seq("a", "b", "c"), ArrayType(StringType))
+    val a2 = Literal.create(Seq.empty[Integer], ArrayType(IntegerType))
+    val a3 = Literal.create(null, ArrayType(StringType))
+
+    checkEvaluation(ArrayPrepend(a0, Literal(0)), Seq(0, 1, 2, 3, 4))
+    checkEvaluation(ArrayPrepend(a1, Literal("a")), Seq("a", "a", "b", "c"))
+    checkEvaluation(ArrayPrepend(a2, Literal(1)), Seq(1))
+    checkEvaluation(ArrayPrepend(a2, Literal(null, IntegerType)), null)

Review Comment:
   @zhengruifeng sorry about the delay here. I've updated this to reflect the behavior you mentioned. Fixed the tests to show that as well. Thanks!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] LuciferYang commented on pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.

LuciferYang commented on PR #38947:
URL: https://github.com/apache/spark/pull/38947#issuecomment-1407311495

   I found `ArrayAppend ` already merged


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] navinvishy commented on pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

Posted by "navinvishy (via GitHub)" <gi...@apache.org>.

navinvishy commented on PR #38947:
URL: https://github.com/apache/spark/pull/38947#issuecomment-1422013069

   > @navinvishy mind fix the scala linter failure?
   
   Sorry for the delay @zhengruifeng . Fixing now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] zhengruifeng commented on a diff in pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.

zhengruifeng commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1109187704


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,149 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage =
+    "_FUNC_(array, value) - Returns an array containing value as well as all elements from array. The new element is positioned at the beginning of the array.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), 4);
+       [4, 1, 2, 3]
+  """,
+  group = "array_funcs",
+  since = "3.4.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with ComplexTypeMergingExpression
+    with QueryErrorsBase {
+
+  override def nullable: Boolean = left.nullable
+
+  override def eval(input: InternalRow): Any = {
+    val value1 = left.eval(input)
+    if (value1 == null) {
+      null
+    } else {
+      val value2 = right.eval(input)
+      nullSafeEval(value1, value2)
+    }
+  }
+  override def nullSafeEval(arr: Any, value: Any): Any = {
+    val numberOfElements = arr.asInstanceOf[ArrayData].numElements()
+    if (numberOfElements + 1 > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val newArray = new Array[Any](numberOfElements + 1)
+    newArray(0) = value
+    var pos = 1
+    arr
+      .asInstanceOf[ArrayData]
+      .foreach(
+        right.dataType,
+        (i, v) => {
+          newArray(pos) = v
+          pos += 1
+        })
+    new GenericArrayData(newArray)
+  }
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val leftGen = left.genCode(ctx)
+    val rightGen = right.genCode(ctx)
+    val f = (arr: String, value: String) => {
+      val newArraySize = ctx.freshName("newArraySize")
+      val newArray = ctx.freshName("newArray")
+      val i = ctx.freshName("i")
+      val pos = ctx.freshName("pos")
+      val allocation = CodeGenerator.createArrayData(
+        newArray,
+        right.dataType,
+        newArraySize,
+        s" $prettyName failed.")
+      val assignment =
+        CodeGenerator.createArrayAssignment(newArray, right.dataType, arr, pos, i, false)
+      val newElemAssignment =
+        CodeGenerator.setArrayElement(newArray, right.dataType, pos, value, Some(rightGen.isNull))
+      s"""
+         |int $pos = 0;
+         |int $newArraySize = $arr.numElements() + 1;
+         |$allocation
+         |$newElemAssignment
+         |$pos = $pos + 1;
+         |for (int $i = 0; $i < $arr.numElements(); $i ++) {
+         |  $assignment
+         |  $pos = $pos + 1;
+         |}
+         |${ev.value} = $newArray;
+         |""".stripMargin
+    }
+    val resultCode = f(leftGen.value, rightGen.value)
+    if(nullable) {
+      val nullSafeEval = leftGen.code + rightGen.code + ctx.nullSafeExec(nullable, leftGen.isNull) {
+        s"""
+           |${ev.isNull} = false;
+           |${resultCode}
+           |""".stripMargin
+      }
+      ev.copy(code =
+        code"""
+        boolean ${ev.isNull} = true;
+        ${CodeGenerator.javaType(dataType)} ${ev.value} = ${CodeGenerator.defaultValue(dataType)};
+        $nullSafeEval
+      """)
+    } else {
+      ev.copy(code =
+        code"""
+            ${leftGen.code}
+            ${rightGen.code}
+            ${CodeGenerator.javaType(dataType)} ${ev.value} = ${CodeGenerator.defaultValue(dataType)};
+            $resultCode""", isNull = FalseLiteral)
+    }
+  }
+
+  override def prettyName: String = "array_prepend"
+
+  override protected def withNewChildrenInternal(
+      newLeft: Expression, newRight: Expression): ArrayPrepend =
+    copy(left = newLeft, right = newRight)
+
+  override def dataType: DataType = left.dataType

Review Comment:
   Yes, I think so



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] navinvishy commented on a diff in pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

Posted by "navinvishy (via GitHub)" <gi...@apache.org>.

navinvishy commented on code in PR #38947:
URL: https://github.com/apache/spark/pull/38947#discussion_r1118279709


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -1399,6 +1399,119 @@ case class ArrayContains(left: Expression, right: Expression)
     copy(left = newLeft, right = newRight)
 }
 
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage =
+    "_FUNC_(array, value) - Returns an array containing value as well as all elements from array. The new element is positioned at the beginning of the array.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), 4);
+       [4, 1, 2, 3]
+  """,
+  group = "array_funcs",
+  since = "3.4.0")
+case class ArrayPrepend(left: Expression, right: Expression)
+  extends BinaryExpression
+    with ImplicitCastInputTypes
+    with NullIntolerant
+    with QueryErrorsBase {
+
+  override def nullSafeEval(arr: Any, value: Any): Any = {
+    val numberOfElements = arr.asInstanceOf[ArrayData].numElements()
+    if (numberOfElements + 1 > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+      throw QueryExecutionErrors.concatArraysWithElementsExceedLimitError(numberOfElements)
+    }
+    val newArray = new Array[Any](numberOfElements + 1)
+    newArray(0) = value
+    var pos = 1
+    arr

Review Comment:
   Thanks, I've made this consistent with `ArrayAppend`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] zhengruifeng closed pull request #38947: [SPARK-41233][SQL][PYTHON] Add `array_prepend` function

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.

zhengruifeng closed pull request #38947: [SPARK-41233][SQL][PYTHON] Add `array_prepend` function
URL: https://github.com/apache/spark/pull/38947


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org