You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "beliefer (via GitHub)" <gi...@apache.org> on 2023/04/18 05:44:00 UTC

[GitHub] [spark] beliefer opened a new pull request, #40833: [SPARK-43137][SQL] Improve ArrayInsert if the position is foldable and positive.

beliefer opened a new pull request, #40833:
URL: https://github.com/apache/spark/pull/40833

   ### What changes were proposed in this pull request?
   Currently, Spark supports the `array_insert` and `array_prepend`. Users insert an element into the head of array is common operation. Considered, we want make array_prepend reuse the implementation of array_insert, but it seems a bit performance worse if the position is foldable and positive.
   The reason is that always do the check for position is negative or positive, and the code is too long. Too long code will lead to JIT failed.
   
   
   ### Why are the changes needed?
   Improve ArrayInsert if the position is foldable and positive.
   
   
   ### Does this PR introduce _any_ user-facing change?
   'No'.
   Just change the inner implementation.
   
   
   ### How was this patch tested?
   Exists test cases.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a diff in pull request #40833: [SPARK-43137][SQL] Improve ArrayInsert if the position is foldable and positive.

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #40833:
URL: https://github.com/apache/spark/pull/40833#discussion_r1169721404


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -4898,69 +4930,99 @@ case class ArrayInsert(srcArrayExpr: Expression, posExpr: Expression, itemExpr:
       val assignment = CodeGenerator.createArrayAssignment(values, elementType, arr,
         adjustedAllocIdx, i, first.dataType.asInstanceOf[ArrayType].containsNull)
       val errorContext = getContextOrNullCode(ctx)
-
-      s"""
-         |int $itemInsertionIndex = 0;
-         |int $resLength = 0;
-         |int $adjustedAllocIdx = 0;
-         |boolean $insertedItemIsNull = ${itemExpr.isNull};
-         |
-         |if ($pos == 0) {
-         |  throw QueryExecutionErrors.invalidIndexOfZeroError($errorContext);
-         |}
-         |
-         |if ($pos < 0 && (java.lang.Math.abs($pos) > $arr.numElements())) {
-         |
-         |  $resLength = java.lang.Math.abs($pos) + 1;
-         |  if ($resLength > ${ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH}) {
-         |    throw QueryExecutionErrors.createArrayWithElementsExceedLimitError($resLength);
-         |  }
-         |
-         |  $allocation
-         |  for (int $i = 0; $i < $arr.numElements(); $i ++) {
-         |    $adjustedAllocIdx = $i + 1 + java.lang.Math.abs($pos + $arr.numElements());
-         |    $assignment
-         |  }
-         |  ${CodeGenerator.setArrayElement(
-              values, elementType, itemInsertionIndex, item, Some(insertedItemIsNull))}
-         |
-         |  for (int $j = $pos + $arr.numElements(); $j < 0; $j ++) {
-         |    $values.setNullAt($j + 1 + java.lang.Math.abs($pos + $arr.numElements()));
-         |  }
-         |
-         |  ${ev.value} = $values;
-         |} else {
-         |
-         |  $itemInsertionIndex = 0;
-         |  if ($pos < 0) {
-         |    $itemInsertionIndex = $pos + $arr.numElements();
-         |  } else if ($pos > 0) {
-         |    $itemInsertionIndex = $pos - 1;
-         |  }
-         |
-         |  $resLength = java.lang.Math.max($arr.numElements() + 1, $itemInsertionIndex + 1);
-         |  if ($resLength > ${ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH}) {
-         |    throw QueryExecutionErrors.createArrayWithElementsExceedLimitError($resLength);
-         |  }
-         |
-         |  $allocation
-         |  for (int $i = 0; $i < $arr.numElements(); $i ++) {
-         |    $adjustedAllocIdx = $i;
-         |    if ($i >= $itemInsertionIndex) {
-         |      $adjustedAllocIdx = $adjustedAllocIdx + 1;
-         |    }
-         |    $assignment
-         |  }
-         |  ${CodeGenerator.setArrayElement(
+      if (position.isDefined) {
+        s"""
+           |int $itemInsertionIndex = ${position.get} - 1;

Review Comment:
   ```suggestion
              |int $itemInsertionIndex = ${position.get - 1};
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan closed pull request #40833: [SPARK-43137][SQL] Improve ArrayInsert if the position is foldable and positive.

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan closed pull request #40833: [SPARK-43137][SQL] Improve ArrayInsert if the position is foldable and positive.
URL: https://github.com/apache/spark/pull/40833


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer commented on pull request #40833: [SPARK-43137][SQL] Improve ArrayInsert if the position is foldable and positive.

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #40833:
URL: https://github.com/apache/spark/pull/40833#issuecomment-1514120599

   @cloud-fan The failed GA is unrelated to this PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on pull request #40833: [SPARK-43137][SQL] Improve ArrayInsert if the position is foldable and positive.

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on PR #40833:
URL: https://github.com/apache/spark/pull/40833#issuecomment-1514809153

   thanks, merging to master!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a diff in pull request #40833: [SPARK-43137][SQL] Improve ArrayInsert if the position is foldable and positive.

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #40833:
URL: https://github.com/apache/spark/pull/40833#discussion_r1169725227


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##########
@@ -4806,6 +4806,17 @@ case class ArrayInsert(srcArrayExpr: Expression, posExpr: Expression, itemExpr:
     }
   }
 
+  private lazy val position = if (second.foldable) {

Review Comment:
   ```suggestion
     private lazy val positivePos = if (second.foldable) {
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer commented on pull request #40833: [SPARK-43137][SQL] Improve ArrayInsert if the position is foldable and positive.

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #40833:
URL: https://github.com/apache/spark/pull/40833#issuecomment-1515638428

   @cloud-fan Thank you!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org