Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/10/29 03:47:12 UTC
[GitHub] [spark] beliefer opened a new pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
beliefer opened a new pull request #30178:
URL: https://github.com/apache/spark/pull/30178
### What changes were proposed in this pull request?
https://github.com/apache/spark/pull/29800 provides a performance improvement for `NTH_VALUE`.
`FIRST_VALUE` could also use `UnboundedOffsetWindowFunctionFrame` and `UnboundedPrecedingOffsetWindowFunctionFrame`.
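As a rough illustration of why an offset-based frame can help (a standalone sketch, not Spark's implementation): when a row frame starts at UNBOUNDED PRECEDING, the first row of the frame is always the first row of the partition, so it can be read once by position instead of being re-derived for every row. The object and method names below are hypothetical, and a partition is modelled as a plain `Vector`.

```scala
// Hypothetical, self-contained model; none of these names come from Spark.
object FirstValueSketch {

  // Naive evaluation: for each row i, rebuild the frame [0, i]
  // (ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) and take its head.
  def firstValueNaive[A](partition: Vector[A]): Vector[A] =
    partition.indices.map(i => partition.slice(0, i + 1).head).toVector

  // Offset-style evaluation: the n-th row (1-based) of such a frame is the
  // n-th row of the partition, available once the frame holds at least n rows.
  // None stands in for SQL NULL when the frame is still too short.
  def nthValueUnboundedPreceding[A](partition: Vector[A], n: Int): Vector[Option[A]] =
    partition.indices.map { i =>
      if (i + 1 >= n) Some(partition(n - 1)) else None
    }.toVector
}
```

With `n = 1` the offset-style version yields the same values as the naive one for every row, which is the equivalence between `FIRST_VALUE` and `NTH_VALUE(..., 1)` that this sketch assumes the change relies on.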
### Why are the changes needed?
Improve the performance for `FIRST_VALUE`.
### Does this PR introduce _any_ user-facing change?
'No'.
### How was this patch tested?
Jenkins test.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725512545
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725373573
**[Test build #130934 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130934/testReport)** for PR 30178 at commit [`fd7e02e`](https://github.com/apache/spark/commit/fd7e02eccb5c4ac6112e34a43634c8f656a447d3).
[GitHub] [spark] cloud-fan commented on a change in pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #30178:
URL: https://github.com/apache/spark/pull/30178#discussion_r521389613
##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/OptimizeWindowFunctionsSuite.scala
##########
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.dsl.plans._
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.catalyst.expressions.aggregate.First
+import org.apache.spark.sql.catalyst.plans.PlanTest
+import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.RuleExecutor
+
+class OptimizeWindowFunctionsSuite extends PlanTest {
+ object Optimize extends RuleExecutor[LogicalPlan] {
+ val batches = Batch("OptimizeWindowFunctions", FixedPoint(10),
+ OptimizeWindowFunctions) :: Nil
+ }
+
+ test("check OptimizeWindowFunctions") {
Review comment:
let's add a negative test: if the window frame is ordered, don't optimize.
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724475678
**[Test build #130826 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130826/testReport)** for PR 30178 at commit [`2f3fbda`](https://github.com/apache/spark/commit/2f3fbda5857d861021aa272d3dec085def4f638b).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725868607
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723745492
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130762/
Test FAILed.
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724581136
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-718691371
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723967193
Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723025136
**[Test build #130709 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130709/testReport)** for PR 30178 at commit [`879d6c7`](https://github.com/apache/spark/commit/879d6c7687e57004cc7f5925e53afb11e2064f9f).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725458289
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725816099
Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-718688759
**[Test build #130397 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130397/testReport)** for PR 30178 at commit [`181186c`](https://github.com/apache/spark/commit/181186c6cce9c3b4e3061dc84b667ee898dd3f40).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723736454
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725868607
Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-718445051
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35001/
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724578203
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724581136
Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723011048
Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35319/
[GitHub] [spark] beliefer commented on a change in pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #30178:
URL: https://github.com/apache/spark/pull/30178#discussion_r521779003
##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/OptimizeWindowFunctionsSuite.scala
##########
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.dsl.plans._
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.catalyst.expressions.aggregate.First
+import org.apache.spark.sql.catalyst.plans.PlanTest
+import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.RuleExecutor
+
+class OptimizeWindowFunctionsSuite extends PlanTest {
+ object Optimize extends RuleExecutor[LogicalPlan] {
+ val batches = Batch("OptimizeWindowFunctions", FixedPoint(10),
+ OptimizeWindowFunctions) :: Nil
+ }
+
+ test("check OptimizeWindowFunctions") {
Review comment:
OK
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-718356033
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34996/
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725412421
Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35536/
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724476717
[GitHub] [spark] SparkQA removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724420337
**[Test build #130826 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130826/testReport)** for PR 30178 at commit [`2f3fbda`](https://github.com/apache/spark/commit/2f3fbda5857d861021aa272d3dec085def4f638b).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725412440
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723745486
Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725841076
**[Test build #130967 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130967/testReport)** for PR 30178 at commit [`68d3388`](https://github.com/apache/spark/commit/68d3388001615841685e9942c4220d7904f33665).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724581145
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/35457/
Test FAILed.
[GitHub] [spark] SparkQA removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723723310
**[Test build #130764 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130764/testReport)** for PR 30178 at commit [`7b99d27`](https://github.com/apache/spark/commit/7b99d2720b38e5f9a67a9c0810cde23b6e1ae797).
[GitHub] [spark] SparkQA removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725370017
**[Test build #130933 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130933/testReport)** for PR 30178 at commit [`f851a4c`](https://github.com/apache/spark/commit/f851a4ceae8eaca7a1460f48f0266b3fd42519af).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725427516
[GitHub] [spark] SparkQA removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725902781
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-718342791
**[Test build #130393 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130393/testReport)** for PR 30178 at commit [`181186c`](https://github.com/apache/spark/commit/181186c6cce9c3b4e3061dc84b667ee898dd3f40).
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723739706
Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35373/
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723723310
**[Test build #130764 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130764/testReport)** for PR 30178 at commit [`7b99d27`](https://github.com/apache/spark/commit/7b99d2720b38e5f9a67a9c0810cde23b6e1ae797).
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724560196
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35453/
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725924793
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35583/
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-722958337
**[Test build #130709 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130709/testReport)** for PR 30178 at commit [`879d6c7`](https://github.com/apache/spark/commit/879d6c7687e57004cc7f5925e53afb11e2064f9f).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725911942
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723745421
**[Test build #130762 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130762/testReport)** for PR 30178 at commit [`57c7ef1`](https://github.com/apache/spark/commit/57c7ef1fe17fe7448505f8b1976f153b275125af).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723733146
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35373/
[GitHub] [spark] cloud-fan closed pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
cloud-fan closed pull request #30178:
URL: https://github.com/apache/spark/pull/30178
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724679268
**[Test build #130849 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130849/testReport)** for PR 30178 at commit [`2f3fbda`](https://github.com/apache/spark/commit/2f3fbda5857d861021aa272d3dec085def4f638b).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725816104
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/35565/
Test PASSed.
[GitHub] [spark] SparkQA removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-718408701
**[Test build #130397 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130397/testReport)** for PR 30178 at commit [`181186c`](https://github.com/apache/spark/commit/181186c6cce9c3b4e3061dc84b667ee898dd3f40).
[GitHub] [spark] SparkQA removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724536515
**[Test build #130849 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130849/testReport)** for PR 30178 at commit [`2f3fbda`](https://github.com/apache/spark/commit/2f3fbda5857d861021aa272d3dec085def4f638b).
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725370017
**[Test build #130933 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130933/testReport)** for PR 30178 at commit [`f851a4c`](https://github.com/apache/spark/commit/f851a4ceae8eaca7a1460f48f0266b3fd42519af).
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725427516
[GitHub] [spark] beliefer commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-718407326
retest this please
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725960801
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35586/
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723823063
Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35382/
[GitHub] [spark] SparkQA removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723788824
**[Test build #130773 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130773/testReport)** for PR 30178 at commit [`0c953ff`](https://github.com/apache/spark/commit/0c953ff5e8b249debcf0fe2050840afdbb176ad8).
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723736454
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724581123
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35457/
[GitHub] [spark] beliefer commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725913697
retest this please
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723895898
Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35386/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723967214
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130778/
Test FAILed.
[GitHub] [spark] beliefer commented on a change in pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #30178:
URL: https://github.com/apache/spark/pull/30178#discussion_r521828212
##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/OptimizeWindowFunctionsSuite.scala
##########
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.dsl.plans._
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.catalyst.expressions.aggregate.First
+import org.apache.spark.sql.catalyst.plans.PlanTest
+import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.RuleExecutor
+
+class OptimizeWindowFunctionsSuite extends PlanTest {
+  object Optimize extends RuleExecutor[LogicalPlan] {
+    val batches = Batch("OptimizeWindowFunctions", FixedPoint(10),
+      OptimizeWindowFunctions) :: Nil
+  }
+
+  val testRelation = LocalRelation('a.double, 'b.double, 'c.string)
+  val a = testRelation.output(0)
+  val b = testRelation.output(1)
+  val c = testRelation.output(2)
+
+  test("replace first(col) by nth_value(col, 1) if the window frame is ordered") {
+    val inputPlan = testRelation.select(
+      WindowExpression(
+        First(a, false).toAggregateExpression(),
+        WindowSpecDefinition(b :: Nil, c.asc :: Nil,
+          SpecifiedWindowFrame(RowFrame, UnboundedPreceding, CurrentRow))))
Review comment:
In the case of `RangeFrame`, there is no need to convert `first` to `nth_value`.
[GitHub] [spark] beliefer commented on a change in pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #30178:
URL: https://github.com/apache/spark/pull/30178#discussion_r521828212
##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/OptimizeWindowFunctionsSuite.scala
##########
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.dsl.plans._
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.catalyst.expressions.aggregate.First
+import org.apache.spark.sql.catalyst.plans.PlanTest
+import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.RuleExecutor
+
+class OptimizeWindowFunctionsSuite extends PlanTest {
+  object Optimize extends RuleExecutor[LogicalPlan] {
+    val batches = Batch("OptimizeWindowFunctions", FixedPoint(10),
+      OptimizeWindowFunctions) :: Nil
+  }
+
+  val testRelation = LocalRelation('a.double, 'b.double, 'c.string)
+  val a = testRelation.output(0)
+  val b = testRelation.output(1)
+  val c = testRelation.output(2)
+
+  test("replace first(col) by nth_value(col, 1) if the window frame is ordered") {
+    val inputPlan = testRelation.select(
+      WindowExpression(
+        First(a, false).toAggregateExpression(),
+        WindowSpecDefinition(b :: Nil, c.asc :: Nil,
+          SpecifiedWindowFrame(RowFrame, UnboundedPreceding, CurrentRow))))
Review comment:
Good question. In the case of `RangeFrame`, there is no need to convert `first` to `nth_value`.
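For readers following the row-frame case exercised by the quoted test, here is a minimal, self-contained sketch in plain Scala (no Spark dependency) of why `first(col)` and `nth_value(col, 1)` agree when the frame is `ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW`. It assumes a partition is modelled as a `Seq` already sorted by the ORDER BY key and that nulls are not ignored, matching `First(a, false)`. The names `FirstVsNthValue`, `firstOverGrowingFrame` and `nthValueOne` are hypothetical and are not part of Spark or of this PR, and `RangeFrame` is deliberately not modelled.

```scala
// Hypothetical, simplified model of one window partition (not Spark code).
object FirstVsNthValue {

  // FIRST over ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, evaluated
  // naively: rebuild the frame for every row and take its first element.
  def firstOverGrowingFrame[A](partition: Seq[A]): Seq[A] =
    partition.indices.map(i => partition.take(i + 1).head)

  // NTH_VALUE(col, 1) over the same frame: the frame always starts at the
  // first row of the partition, so the value can be looked up once and reused.
  def nthValueOne[A](partition: Seq[A]): Seq[A] =
    partition.headOption match {
      case Some(v) => Seq.fill(partition.size)(v)
      case None    => Seq.empty
    }
}
```

Both functions return the same sequence for any input partition; the second one simply avoids recomputing the frame per row, which is the kind of saving an offset-based frame can offer under these assumptions.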
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725500528
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130933/
Test FAILed.
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-718462684
Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35001/
[GitHub] [spark] beliefer commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724536194
retest this please
[GitHub] [spark] beliefer commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724507630
retest this please
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724476717
Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724437876
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35435/
[GitHub] [spark] beliefer commented on a change in pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #30178:
URL: https://github.com/apache/spark/pull/30178#discussion_r526545753
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##########
@@ -806,6 +807,18 @@ object CollapseRepartition extends Rule[LogicalPlan] {
}
}
+/**
+ * Replaces first(col) to nth_value(col, 1) for better performance.
+ */
+object OptimizeWindowFunctions extends Rule[LogicalPlan] {
+ def apply(plan: LogicalPlan): LogicalPlan = plan resolveExpressions {
+ case we @ WindowExpression(AggregateExpression(first: First, _, _, _, _), spec)
+ if spec.orderSpec.nonEmpty &&
+ spec.frameSpecification.asInstanceOf[SpecifiedWindowFrame].frameType == RowFrame =>
Review comment:
It is harmless to not do the `UnboundedPreceding` check.
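One reading of this comment is that, within any row frame, the first row of the frame is also the row at offset 1, so `first` and `nth_value(col, 1)` agree whether or not the frame starts at `UnboundedPreceding`. The sketch below checks that on plain Scala collections. It is an illustration under simplifying assumptions (a single partition, rows already sorted, no null handling), and the helper names `frame`, `first`, and `nthValue` are hypothetical rather than Spark APIs.

```scala
// Toy check that first(frame) == nth_value(frame, 1) for row frames
// with or without an unbounded lower bound. Not Spark code.
object RowFrameFirstSketch {
  // Rows visible to row i under ROWS BETWEEN <preceding> PRECEDING AND
  // CURRENT ROW; None models UNBOUNDED PRECEDING.
  def frame[A](rows: Vector[A], i: Int, preceding: Option[Int]): Vector[A] = {
    val start = preceding.map(p => math.max(0, i - p)).getOrElse(0)
    rows.slice(start, i + 1)
  }

  // first: the first row of the frame, if the frame is non-empty.
  def first[A](frameRows: Vector[A]): Option[A] = frameRows.headOption

  // nth_value with a 1-based offset into the frame.
  def nthValue[A](frameRows: Vector[A], n: Int): Option[A] =
    frameRows.lift(n - 1)

  // True when both functions agree on every row of the partition.
  def agree[A](rows: Vector[A], preceding: Option[Int]): Boolean =
    rows.indices.forall { i =>
      val f = frame(rows, i, preceding)
      first(f) == nthValue(f, 1)
    }
}
```

For example, `RowFrameFirstSketch.agree(Vector(10, 20, 30, 40), Some(1))` and the same call with `None` both hold, which matches the claim that skipping the lower-bound check does not change results in this simplified model.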
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-718462709
[GitHub] [spark] beliefer commented on a change in pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #30178:
URL: https://github.com/apache/spark/pull/30178#discussion_r521774289
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##########
@@ -806,6 +807,18 @@ object CollapseRepartition extends Rule[LogicalPlan] {
}
}
+/**
+ * Substitute the aggregate expression which uses [[First]] as the aggregate function
Review comment:
OK
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723840203
**[Test build #130773 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130773/testReport)** for PR 30178 at commit [`0c953ff`](https://github.com/apache/spark/commit/0c953ff5e8b249debcf0fe2050840afdbb176ad8).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725512545
Merged build finished. Test FAILed.
[GitHub] [spark] beliefer commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725084965
cc @cloud-fan
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725500517
Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724534425
**[Test build #130845 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130845/testReport)** for PR 30178 at commit [`2f3fbda`](https://github.com/apache/spark/commit/2f3fbda5857d861021aa272d3dec085def4f638b).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725914668
**[Test build #130980 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130980/testReport)** for PR 30178 at commit [`3a7f4e7`](https://github.com/apache/spark/commit/3a7f4e740eb5a9cecf880bc5cc294b2459e98cf1).
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-718362027
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725816099
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-718362021
Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34996/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724476729
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130826/
Test FAILed.
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723882781
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35386/
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724563333
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35457/
[GitHub] [spark] beliefer commented on a change in pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #30178:
URL: https://github.com/apache/spark/pull/30178#discussion_r526545753
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##########
@@ -806,6 +807,18 @@ object CollapseRepartition extends Rule[LogicalPlan] {
}
}
+/**
+ * Replaces first(col) to nth_value(col, 1) for better performance.
+ */
+object OptimizeWindowFunctions extends Rule[LogicalPlan] {
+ def apply(plan: LogicalPlan): LogicalPlan = plan resolveExpressions {
+ case we @ WindowExpression(AggregateExpression(first: First, _, _, _, _), spec)
+ if spec.orderSpec.nonEmpty &&
+ spec.frameSpecification.asInstanceOf[SpecifiedWindowFrame].frameType == RowFrame =>
Review comment:
OK
[GitHub] [spark] SparkQA removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-722958337
**[Test build #130709 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130709/testReport)** for PR 30178 at commit [`879d6c7`](https://github.com/apache/spark/commit/879d6c7687e57004cc7f5925e53afb11e2064f9f).
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723967193
[GitHub] [spark] SparkQA removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725373573
**[Test build #130934 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130934/testReport)** for PR 30178 at commit [`fd7e02e`](https://github.com/apache/spark/commit/fd7e02eccb5c4ac6112e34a43634c8f656a447d3).
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723025652
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725868615
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/35573/
Test FAILed.
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725456994
**[Test build #130930 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130930/testReport)** for PR 30178 at commit [`e296eb6`](https://github.com/apache/spark/commit/e296eb65acb5ee30045c71e54777216cc1ebd243).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723731152
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35371/
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725856249
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35573/
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725902747
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725940177
Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35583/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723758691
Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725442665
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724508955
**[Test build #130845 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130845/testReport)** for PR 30178 at commit [`2f3fbda`](https://github.com/apache/spark/commit/2f3fbda5857d861021aa272d3dec085def4f638b).
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725417561
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35539/
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723721434
**[Test build #130762 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130762/testReport)** for PR 30178 at commit [`57c7ef1`](https://github.com/apache/spark/commit/57c7ef1fe17fe7448505f8b1976f153b275125af).
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725427494
Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35539/
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724536515
**[Test build #130849 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130849/testReport)** for PR 30178 at commit [`2f3fbda`](https://github.com/apache/spark/commit/2f3fbda5857d861021aa272d3dec085def4f638b).
[GitHub] [spark] SparkQA removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725322600
**[Test build #130930 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130930/testReport)** for PR 30178 at commit [`e296eb6`](https://github.com/apache/spark/commit/e296eb65acb5ee30045c71e54777216cc1ebd243).
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725411540
Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35535/
[GitHub] [spark] SparkQA removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-718342791
**[Test build #130393 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130393/testReport)** for PR 30178 at commit [`181186c`](https://github.com/apache/spark/commit/181186c6cce9c3b4e3061dc84b667ee898dd3f40).
[GitHub] [spark] cloud-fan commented on a change in pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #30178:
URL: https://github.com/apache/spark/pull/30178#discussion_r521808606
##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/OptimizeWindowFunctionsSuite.scala
##########
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.dsl.plans._
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.catalyst.expressions.aggregate.First
+import org.apache.spark.sql.catalyst.plans.PlanTest
+import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.RuleExecutor
+
+class OptimizeWindowFunctionsSuite extends PlanTest {
+  object Optimize extends RuleExecutor[LogicalPlan] {
+    val batches = Batch("OptimizeWindowFunctions", FixedPoint(10),
+      OptimizeWindowFunctions) :: Nil
+  }
+
+  val testRelation = LocalRelation('a.double, 'b.double, 'c.string)
+  val a = testRelation.output(0)
+  val b = testRelation.output(1)
+  val c = testRelation.output(2)
+
+  test("replace first(col) by nth_value(col, 1) if the window frame is ordered") {
+    val inputPlan = testRelation.select(
+      WindowExpression(
+        First(a, false).toAggregateExpression(),
+        WindowSpecDefinition(b :: Nil, c.asc :: Nil,
+          SpecifiedWindowFrame(RowFrame, UnboundedPreceding, CurrentRow))))
Review comment:
how about `RangeFrame`?
[GitHub] [spark] SparkQA removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725797504
**[Test build #130959 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130959/testReport)** for PR 30178 at commit [`72ceacc`](https://github.com/apache/spark/commit/72ceacc2677c66d884d5e4f4ac124c3f61975edd).
[GitHub] [spark] beliefer commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723853552
retest this please
[GitHub] [spark] cloud-fan commented on a change in pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #30178:
URL: https://github.com/apache/spark/pull/30178#discussion_r521890293
##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/OptimizeWindowFunctionsSuite.scala
##########
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.dsl.plans._
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.catalyst.expressions.aggregate.First
+import org.apache.spark.sql.catalyst.plans.PlanTest
+import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.RuleExecutor
+
+class OptimizeWindowFunctionsSuite extends PlanTest {
+  object Optimize extends RuleExecutor[LogicalPlan] {
+    val batches = Batch("OptimizeWindowFunctions", FixedPoint(10),
+      OptimizeWindowFunctions) :: Nil
+  }
+
+  val testRelation = LocalRelation('a.double, 'b.double, 'c.string)
+  val a = testRelation.output(0)
+  val b = testRelation.output(1)
+  val c = testRelation.output(2)
+
+  test("replace first(col) by nth_value(col, 1)") {
+    val inputPlan = testRelation.select(
+      WindowExpression(
+        First(a, false).toAggregateExpression(),
+        WindowSpecDefinition(b :: Nil, c.asc :: Nil,
+          SpecifiedWindowFrame(RowFrame, UnboundedPreceding, CurrentRow))))
+    val correctAnswer = testRelation.select(
+      WindowExpression(
+        NthValue(a, Literal(1), false),
+        WindowSpecDefinition(b :: Nil, c.asc :: Nil,
+          SpecifiedWindowFrame(RowFrame, UnboundedPreceding, CurrentRow))))
+
+    val optimized = Optimize.execute(inputPlan)
+    assert(optimized == correctAnswer)
+  }
+
+  test("can't replace first(col) by nth_value(col, 1) if the window frame type is row") {
Review comment:
`row` -> `range`
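The two test titles quoted above suggest the rewrite is guarded by the frame type: `first(col)` becomes `nth_value(col, 1)` for a row frame, and is left alone for a range frame. A minimal sketch of that guard follows, using a toy model rather than Spark's Catalyst classes. Every name in it (`ToyRowFrame`, `ToyRangeFrame`, `ToyFirst`, `ToyNthValue`, `ToyWindowExpr`, `ToyOptimizer`) is hypothetical, and the `ordered` condition is an assumption taken from the earlier test title "if the window frame is ordered", not from the actual rule.

```scala
// Toy model only: these are NOT Spark's Catalyst classes, all names are hypothetical.
sealed trait ToyFrameType
case object ToyRowFrame extends ToyFrameType
case object ToyRangeFrame extends ToyFrameType

sealed trait ToyWindowFunc
final case class ToyFirst(column: String) extends ToyWindowFunc
final case class ToyNthValue(column: String, offset: Int) extends ToyWindowFunc

// A window expression reduced to the three properties the guard looks at.
final case class ToyWindowExpr(func: ToyWindowFunc, frameType: ToyFrameType, ordered: Boolean)

object ToyOptimizer {
  // Rewrite first(col) to nth_value(col, 1) only for an ordered row frame
  // (assumed guard); every other expression is returned unchanged.
  def optimize(expr: ToyWindowExpr): ToyWindowExpr = expr match {
    case ToyWindowExpr(ToyFirst(col), ToyRowFrame, true) =>
      expr.copy(func = ToyNthValue(col, 1))
    case other => other
  }
}
```

With this model, the renamed test corresponds to checking that a `ToyRangeFrame` expression comes back from `ToyOptimizer.optimize` unchanged, while the `ToyRowFrame` case is rewritten.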
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725511488
**[Test build #130934 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130934/testReport)** for PR 30178 at commit [`fd7e02e`](https://github.com/apache/spark/commit/fd7e02eccb5c4ac6112e34a43634c8f656a447d3).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723025652
Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724508955
**[Test build #130845 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130845/testReport)** for PR 30178 at commit [`2f3fbda`](https://github.com/apache/spark/commit/2f3fbda5857d861021aa272d3dec085def4f638b).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723895941
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723736444
Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35371/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723011063
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725911882
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725499570
**[Test build #130933 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130933/testReport)** for PR 30178 at commit [`f851a4c`](https://github.com/apache/spark/commit/f851a4ceae8eaca7a1460f48f0266b3fd42519af).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
* `class OptimizeWindowFunctionsSuite extends PlanTest `
[GitHub] [spark] SparkQA removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725914668
**[Test build #130980 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130980/testReport)** for PR 30178 at commit [`3a7f4e7`](https://github.com/apache/spark/commit/3a7f4e740eb5a9cecf880bc5cc294b2459e98cf1).
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725412440
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724444130
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725940198
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725973080
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724534636
Merged build finished. Test FAILed.
[GitHub] [spark] cloud-fan commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-726130588
Thanks, merging to master!
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723858208
**[Test build #130778 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130778/testReport)** for PR 30178 at commit [`0c953ff`](https://github.com/apache/spark/commit/0c953ff5e8b249debcf0fe2050840afdbb176ad8).
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723758536
**[Test build #130764 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130764/testReport)** for PR 30178 at commit [`7b99d27`](https://github.com/apache/spark/commit/7b99d2720b38e5f9a67a9c0810cde23b6e1ae797).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725816087
Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35565/
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723000961
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35319/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725903800
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723823086
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723895941
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723745486
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723809806
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35382/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724680298
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725500517
[GitHub] [spark] cloud-fan commented on a change in pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #30178:
URL: https://github.com/apache/spark/pull/30178#discussion_r521390409
##########
File path: sql/core/src/test/resources/sql-tests/inputs/window.sql
##########
@@ -239,6 +264,11 @@ SELECT
employee_name,
department,
salary,
+ FIRST_VALUE(employee_name) OVER (
Review comment:
We can use the named window frame syntax to avoid duplicating the window frame definition.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723758702
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130764/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725411559
[GitHub] [spark] cloud-fan commented on a change in pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #30178:
URL: https://github.com/apache/spark/pull/30178#discussion_r521388978
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##########
@@ -806,6 +807,18 @@ object CollapseRepartition extends Rule[LogicalPlan] {
}
}
+/**
+ * Substitute the aggregate expression which uses [[First]] as the aggregate function
Review comment:
nit: `Replaces first(col) to nth_value(col, 1) for better performance.`
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-718408701
**[Test build #130397 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130397/testReport)** for PR 30178 at commit [`181186c`](https://github.com/apache/spark/commit/181186c6cce9c3b4e3061dc84b667ee898dd3f40).
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725810045
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35565/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-718407119
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-718406582
**[Test build #130393 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130393/testReport)** for PR 30178 at commit [`181186c`](https://github.com/apache/spark/commit/181186c6cce9c3b4e3061dc84b667ee898dd3f40).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725411559
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724444110
Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35435/
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723840707
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723025665
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130709/
[GitHub] [spark] cloud-fan commented on a change in pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #30178:
URL: https://github.com/apache/spark/pull/30178#discussion_r525826043
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##########
@@ -806,6 +807,18 @@ object CollapseRepartition extends Rule[LogicalPlan] {
}
}
+/**
+ * Replaces first(col) to nth_value(col, 1) for better performance.
+ */
+object OptimizeWindowFunctions extends Rule[LogicalPlan] {
+ def apply(plan: LogicalPlan): LogicalPlan = plan resolveExpressions {
+ case we @ WindowExpression(AggregateExpression(first: First, _, _, _, _), spec)
+ if spec.orderSpec.nonEmpty &&
+ spec.frameSpecification.asInstanceOf[SpecifiedWindowFrame].frameType == RowFrame =>
Review comment:
Shall we also check that the lower bound is `UnboundedPreceding`? Otherwise we can't use the offset optimization for `nth_value`, and `first` is probably faster than `nth_value(1)`.
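The guard suggested in this comment can be sketched in isolation. This is a minimal, hedged illustration and not the code from the PR: the types below (`FrameType`, `Bound`, `Frame`, `WindowFn`, `RewriteSketch`) are simplified stand-ins that only borrow names from Catalyst, and the `ordered` flag is an assumed substitute for `spec.orderSpec.nonEmpty`.

```scala
// Simplified stand-ins for Catalyst's window frame types (NOT the real Spark classes).
sealed trait FrameType
case object RowFrame extends FrameType
case object RangeFrame extends FrameType

sealed trait Bound
case object UnboundedPreceding extends Bound
case object CurrentRow extends Bound
final case class Offset(rows: Int) extends Bound

final case class Frame(frameType: FrameType, lower: Bound, upper: Bound)

// Hypothetical stand-ins for the two window functions under discussion.
sealed trait WindowFn
final case class First(col: String) extends WindowFn
final case class NthValue(col: String, n: Int) extends WindowFn

object RewriteSketch {
  // Rewrite first(col) to nth_value(col, 1) only when the window is ordered,
  // the frame is row-based, and the lower bound is UnboundedPreceding.
  // Any other combination keeps the original function unchanged.
  def rewrite(fn: WindowFn, ordered: Boolean, frame: Frame): WindowFn = fn match {
    case First(col)
        if ordered &&
          frame.frameType == RowFrame &&
          frame.lower == UnboundedPreceding =>
      NthValue(col, 1)
    case other => other
  }
}
```

With these stand-ins, a frame whose lower bound is `Offset(-1)` leaves `First` untouched, which is the case the reviewer is concerned about.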
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725512560
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130934/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724578203
[GitHub] [spark] beliefer commented on a change in pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #30178:
URL: https://github.com/apache/spark/pull/30178#discussion_r526545753
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##########
@@ -806,6 +807,18 @@ object CollapseRepartition extends Rule[LogicalPlan] {
}
}
+/**
+ * Replaces first(col) to nth_value(col, 1) for better performance.
+ */
+object OptimizeWindowFunctions extends Rule[LogicalPlan] {
+ def apply(plan: LogicalPlan): LogicalPlan = plan resolveExpressions {
+ case we @ WindowExpression(AggregateExpression(first: First, _, _, _, _), spec)
+ if spec.orderSpec.nonEmpty &&
+ spec.frameSpecification.asInstanceOf[SpecifiedWindowFrame].frameType == RowFrame =>
Review comment:
OK. I created https://github.com/apache/spark/pull/30419 to add this check.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-726061968
[GitHub] [spark] cloud-fan commented on a change in pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #30178:
URL: https://github.com/apache/spark/pull/30178#discussion_r521808339
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##########
@@ -806,6 +807,17 @@ object CollapseRepartition extends Rule[LogicalPlan] {
}
}
+/**
+ * Replaces first(col) to nth_value(col, 1) for better performance.
+ */
+object OptimizeWindowFunctions extends Rule[LogicalPlan] {
+ def apply(plan: LogicalPlan): LogicalPlan = plan resolveExpressions {
+ case we @ WindowExpression(AggregateExpression(first: First, _, _, _, _), spec)
+ if !spec.orderSpec.isEmpty =>
Review comment:
Use `nonEmpty` instead of `!...isEmpty`.
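For reference, the two spellings are equivalent on any Scala `Seq`. The sketch below uses a hypothetical `Seq[String]` as a stand-in for `spec.orderSpec`; `NonEmptySketch` and its members are illustrative names only.

```scala
object NonEmptySketch {
  // Hypothetical stand-in for spec.orderSpec (a Seq of sort orders in Catalyst).
  val orderSpec: Seq[String] = Seq("salary")
  val emptySpec: Seq[String] = Seq.empty

  // Negated isEmpty, as written in the diff under review.
  def verbose(spec: Seq[String]): Boolean = !spec.isEmpty

  // The form the reviewer asks for; same result, reads without the negation.
  def idiomatic(spec: Seq[String]): Boolean = spec.nonEmpty
}
```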
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724444130
[GitHub] [spark] SparkQA removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723858208
**[Test build #130778 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130778/testReport)** for PR 30178 at commit [`0c953ff`](https://github.com/apache/spark/commit/0c953ff5e8b249debcf0fe2050840afdbb176ad8).
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723011063
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723739717
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-718362027
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725940198
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724534636
[GitHub] [spark] SparkQA removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723721434
**[Test build #130762 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130762/testReport)** for PR 30178 at commit [`57c7ef1`](https://github.com/apache/spark/commit/57c7ef1fe17fe7448505f8b1976f153b275125af).
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725398569
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35536/
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725401743
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35535/
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725431796
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35540/
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723788824
**[Test build #130773 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130773/testReport)** for PR 30178 at commit [`0c953ff`](https://github.com/apache/spark/commit/0c953ff5e8b249debcf0fe2050840afdbb176ad8).
[GitHub] [spark] beliefer commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-726474375
@cloud-fan Thanks for your help!
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723966740
**[Test build #130778 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130778/testReport)** for PR 30178 at commit [`0c953ff`](https://github.com/apache/spark/commit/0c953ff5e8b249debcf0fe2050840afdbb176ad8).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-726061968
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725868588
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35573/
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724420337
**[Test build #130826 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130826/testReport)** for PR 30178 at commit [`2f3fbda`](https://github.com/apache/spark/commit/2f3fbda5857d861021aa272d3dec085def4f638b).
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723823086
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725442665
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725322600
**[Test build #130930 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130930/testReport)** for PR 30178 at commit [`e296eb6`](https://github.com/apache/spark/commit/e296eb65acb5ee30045c71e54777216cc1ebd243).
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725973080
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724534660
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130845/
Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725458289
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725442641
Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35540/
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725797504
**[Test build #130959 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130959/testReport)** for PR 30178 at commit [`72ceacc`](https://github.com/apache/spark/commit/72ceacc2677c66d884d5e4f4ac124c3f61975edd).
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723739717
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723840707
Merged build finished. Test FAILed.
[GitHub] [spark] beliefer commented on a change in pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #30178:
URL: https://github.com/apache/spark/pull/30178#discussion_r521785541
##########
File path: sql/core/src/test/resources/sql-tests/inputs/window.sql
##########
@@ -239,6 +264,11 @@ SELECT
employee_name,
department,
salary,
+ FIRST_VALUE(employee_name) OVER (
Review comment:
OK
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-718462709
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723758691
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725912407
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130967/
Test FAILed.
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725911942
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725903800
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724578178
Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35453/
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725912390
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-723840714
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130773/
Test FAILed.
[GitHub] [spark] beliefer commented on a change in pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #30178:
URL: https://github.com/apache/spark/pull/30178#discussion_r521823258
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##########
@@ -806,6 +807,17 @@ object CollapseRepartition extends Rule[LogicalPlan] {
   }
 }
+/**
+ * Replaces first(col) to nth_value(col, 1) for better performance.
+ */
+object OptimizeWindowFunctions extends Rule[LogicalPlan] {
+  def apply(plan: LogicalPlan): LogicalPlan = plan resolveExpressions {
+    case we @ WindowExpression(AggregateExpression(first: First, _, _, _, _), spec)
+      if !spec.orderSpec.isEmpty =>
Review comment:
OK
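
Editorial sketch (not part of the PR): the quoted hunk stops at the guard, so the right-hand side of the rule is not visible here. Going by the doc comment and the expected plan in the accompanying test suite (`NthValue(a, Literal(1), false)`), the rewrite presumably swaps `first(col)` for `nth_value(col, 1)` when the window has an ordering. The snippet below models that idea with small hypothetical case classes (`Col`, `Lit`, `First`, `NthValue`, `WindowSpec`, `WindowExpr`) standing in for the Catalyst nodes; it does not use Spark and is not the actual implementation.

```scala
// Hypothetical, simplified stand-ins for Catalyst expression nodes (assumptions, not Spark classes).
sealed trait Expr
final case class Col(name: String) extends Expr
final case class Lit(value: Int) extends Expr
final case class First(child: Expr, ignoreNulls: Boolean) extends Expr
final case class NthValue(input: Expr, offset: Expr, ignoreNulls: Boolean) extends Expr
final case class WindowSpec(partitionBy: List[Expr], orderBy: List[Expr])
final case class WindowExpr(function: Expr, spec: WindowSpec) extends Expr

object OptimizeWindowFunctionsSketch {
  // Rewrite first(col) to nth_value(col, 1) only when the window spec has an ordering,
  // mirroring the `!spec.orderSpec.isEmpty` guard quoted in the review hunk.
  def rewrite(e: Expr): Expr = e match {
    case we @ WindowExpr(first: First, spec) if spec.orderBy.nonEmpty =>
      we.copy(function = NthValue(first.child, Lit(1), first.ignoreNulls))
    case other => other // anything else is left untouched
  }
}
```

In this model, a window expression over `First(Col("a"), false)` ordered by `c` becomes one over `NthValue(Col("a"), Lit(1), false)`, while the same expression with an empty ordering is returned unchanged.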
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-718407119
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-718691371
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-726059283
**[Test build #130980 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130980/testReport)** for PR 30178 at commit [`3a7f4e7`](https://github.com/apache/spark/commit/3a7f4e740eb5a9cecf880bc5cc294b2459e98cf1).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] beliefer commented on a change in pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #30178:
URL: https://github.com/apache/spark/pull/30178#discussion_r521894055
##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/OptimizeWindowFunctionsSuite.scala
##########
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.dsl.plans._
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.catalyst.expressions.aggregate.First
+import org.apache.spark.sql.catalyst.plans.PlanTest
+import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.RuleExecutor
+
+class OptimizeWindowFunctionsSuite extends PlanTest {
+  object Optimize extends RuleExecutor[LogicalPlan] {
+    val batches = Batch("OptimizeWindowFunctions", FixedPoint(10),
+      OptimizeWindowFunctions) :: Nil
+  }
+
+  val testRelation = LocalRelation('a.double, 'b.double, 'c.string)
+  val a = testRelation.output(0)
+  val b = testRelation.output(1)
+  val c = testRelation.output(2)
+
+  test("replace first(col) by nth_value(col, 1)") {
+    val inputPlan = testRelation.select(
+      WindowExpression(
+        First(a, false).toAggregateExpression(),
+        WindowSpecDefinition(b :: Nil, c.asc :: Nil,
+          SpecifiedWindowFrame(RowFrame, UnboundedPreceding, CurrentRow))))
+    val correctAnswer = testRelation.select(
+      WindowExpression(
+        NthValue(a, Literal(1), false),
+        WindowSpecDefinition(b :: Nil, c.asc :: Nil,
+          SpecifiedWindowFrame(RowFrame, UnboundedPreceding, CurrentRow))))
+
+    val optimized = Optimize.execute(inputPlan)
+    assert(optimized == correctAnswer)
+  }
+
+  test("can't replace first(col) by nth_value(col, 1) if the window frame type is row") {
Review comment:
OK
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-724680298
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30178:
URL: https://github.com/apache/spark/pull/30178#issuecomment-725973055
Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35586/