You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "beliefer (via GitHub)" <gi...@apache.org> on 2023/10/24 11:39:34 UTC

[PR] [SPARK-45649][SQL] Unify the prepare framework for `OffsetWindowFunctionFrame` [spark]

beliefer opened a new pull request, #43507:
URL: https://github.com/apache/spark/pull/43507

   ### What changes were proposed in this pull request?
   Currently, the implementation of the `prepare` of all the `OffsetWindowFunctionFrame` have the same code logic show below.
   ```
     override def prepare(rows: ExternalAppendOnlyUnsafeRowArray): Unit = {
       if (offset > rows.length) {
         fillDefaultValue(EmptyRow)
       } else {
         resetStates(rows)
         if (ignoreNulls) {
           ...
         } else {
           ...
         }
       }
     }
   ```
   This PR want unify the prepare framework for `OffsetWindowFunctionFrame`
   
   ### Why are the changes needed?
   Unify the prepare framework for `OffsetWindowFunctionFrame`
   
   
   ### Does this PR introduce _any_ user-facing change?
   'No'.
   Inner update.
   
   
   ### How was this patch tested?
   Exists test cases.
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   'No'.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45649][SQL] Unify the prepare framework for `OffsetWindowFunctionFrame` [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #43507:
URL: https://github.com/apache/spark/pull/43507#discussion_r1374095537


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowFunctionFrame.scala:
##########
@@ -284,7 +293,7 @@ class FrameLessOffsetWindowFunctionFrame(
       }
   } else {
     (current: InternalRow) =>
-      if (inputIndex >= 0 && inputIndex < input.length) {
+      if (inputIndex >= 0 && input != null && inputIndex < input.length) {

Review Comment:
   I feel the code is a bit subtle now. In the branch above here, there is no null check: `while (nextSelectedRow == EmptyRow && inputIndex < input.length) {`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45649][SQL] Unify the prepare framework for `OffsetWindowFunctionFrame` [spark]

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #43507:
URL: https://github.com/apache/spark/pull/43507#issuecomment-1783758706

   @cloud-fan Thank you!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45649][SQL] Unify the prepare framework for `OffsetWindowFunctionFrame` [spark]

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #43507:
URL: https://github.com/apache/spark/pull/43507#discussion_r1372717736


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowFunctionFrame.scala:
##########
@@ -284,7 +293,7 @@ class FrameLessOffsetWindowFunctionFrame(
       }
   } else {
     (current: InternalRow) =>
-      if (inputIndex >= 0 && inputIndex < input.length) {
+      if (inputIndex >= 0 && input != null && inputIndex < input.length) {

Review Comment:
   If `Math.abs(offset) > rows.length` is true, no need to assign a value to `input`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45649][SQL] Unify the prepare framework for `OffsetWindowFunctionFrame` [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #43507:
URL: https://github.com/apache/spark/pull/43507#discussion_r1373100286


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowFunctionFrame.scala:
##########
@@ -284,7 +293,7 @@ class FrameLessOffsetWindowFunctionFrame(
       }
   } else {
     (current: InternalRow) =>
-      if (inputIndex >= 0 && inputIndex < input.length) {
+      if (inputIndex >= 0 && input != null && inputIndex < input.length) {

Review Comment:
   Why we don't need the null check before? where did we instantiate `input`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45649][SQL] Unify the prepare framework for `OffsetWindowFunctionFrame` [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #43507:
URL: https://github.com/apache/spark/pull/43507#discussion_r1400162514


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowFunctionFrame.scala:
##########
@@ -196,24 +224,15 @@ class FrameLessOffsetWindowFunctionFrame(
     offset: Int,
     ignoreNulls: Boolean = false)
   extends OffsetWindowFunctionFrameBase(
-    target, ordinal, expressions, inputSchema, newMutableProjection, offset) {
+    target, ordinal, expressions, inputSchema, newMutableProjection, offset, ignoreNulls) {
 
-  override def prepare(rows: ExternalAppendOnlyUnsafeRowArray): Unit = {
-    resetStates(rows)
-    if (ignoreNulls) {
-      if (Math.abs(offset) > rows.length) {
-        fillDefaultValue(EmptyRow)

Review Comment:
   This PR is not a pure refactor. Some logic is changed. e.g. before this PR, `fillDefaultValue(EmptyRow)` is never called if `ignoreNulls == false`. We need to understand the code better before making these changes.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45649][SQL] Unify the prepare framework for `OffsetWindowFunctionFrame` [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan closed pull request #43507: [SPARK-45649][SQL] Unify the prepare framework for `OffsetWindowFunctionFrame`
URL: https://github.com/apache/spark/pull/43507


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45649][SQL] Unify the prepare framework for `OffsetWindowFunctionFrame` [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #43507:
URL: https://github.com/apache/spark/pull/43507#discussion_r1372650267


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowFunctionFrame.scala:
##########
@@ -284,7 +293,7 @@ class FrameLessOffsetWindowFunctionFrame(
       }
   } else {
     (current: InternalRow) =>
-      if (inputIndex >= 0 && inputIndex < input.length) {
+      if (inputIndex >= 0 && input != null && inputIndex < input.length) {

Review Comment:
   why this change?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45649][SQL] Unify the prepare framework for `OffsetWindowFunctionFrame` [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on PR #43507:
URL: https://github.com/apache/spark/pull/43507#issuecomment-1782942277

   thanks, merging to master!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45649][SQL] Unify the prepare framework for `OffsetWindowFunctionFrame` [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #43507:
URL: https://github.com/apache/spark/pull/43507#discussion_r1374093866


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowFunctionFrame.scala:
##########
@@ -175,6 +178,23 @@ abstract class OffsetWindowFunctionFrameBase(
     }
   }
 
+  override def prepare(rows: ExternalAppendOnlyUnsafeRowArray): Unit = {

Review Comment:
   shall we overwrite `doWrite` do skip if `input == null`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45649][SQL] Unify the prepare framework for `OffsetWindowFunctionFrame` [spark]

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #43507:
URL: https://github.com/apache/spark/pull/43507#discussion_r1374123448


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowFunctionFrame.scala:
##########
@@ -175,6 +178,23 @@ abstract class OffsetWindowFunctionFrameBase(
     }
   }
 
+  override def prepare(rows: ExternalAppendOnlyUnsafeRowArray): Unit = {

Review Comment:
   Got it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45649][SQL] Unify the prepare framework for `OffsetWindowFunctionFrame` [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on PR #43507:
URL: https://github.com/apache/spark/pull/43507#issuecomment-1820405407

   This has been reverted. See the reason in https://issues.apache.org/jira/browse/SPARK-45649


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45649][SQL] Unify the prepare framework for `OffsetWindowFunctionFrame` [spark]

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #43507:
URL: https://github.com/apache/spark/pull/43507#issuecomment-1780978940

   The GA failure is unrelated.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45649][SQL] Unify the prepare framework for `OffsetWindowFunctionFrame` [spark]

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #43507:
URL: https://github.com/apache/spark/pull/43507#discussion_r1374004543


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowFunctionFrame.scala:
##########
@@ -284,7 +293,7 @@ class FrameLessOffsetWindowFunctionFrame(
       }
   } else {
     (current: InternalRow) =>
-      if (inputIndex >= 0 && inputIndex < input.length) {
+      if (inputIndex >= 0 && input != null && inputIndex < input.length) {

Review Comment:
   Because the origin code always call `resetStates`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45649][SQL] Unify the prepare framework for `OffsetWindowFunctionFrame` [spark]

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #43507:
URL: https://github.com/apache/spark/pull/43507#issuecomment-1778923842

   ping @cloud-fan 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45649][SQL] Unify the prepare framework for `OffsetWindowFunctionFrame` [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #43507:
URL: https://github.com/apache/spark/pull/43507#discussion_r1372646818


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowFunctionFrame.scala:
##########
@@ -175,6 +176,23 @@ abstract class OffsetWindowFunctionFrameBase(
     }
   }
 
+  override def prepare(rows: ExternalAppendOnlyUnsafeRowArray): Unit = {
+    if (offset > rows.length) {

Review Comment:
   shall we use `Math.abs(offset)`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45649][SQL] Unify the prepare framework for `OffsetWindowFunctionFrame` [spark]

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #43507:
URL: https://github.com/apache/spark/pull/43507#issuecomment-1778721347

   The GA failure is unrelated.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org