You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/03/13 09:13:45 UTC

[GitHub] [spark] LuciferYang opened a new pull request #35833: [SPARK-38360][SQL][FOLLOWUP] Replace `TreeNode.collectFirst` + `isDefined/isEmpty` with `exists`

LuciferYang opened a new pull request #35833:
URL: https://github.com/apache/spark/pull/35833


   ### What changes were proposed in this pull request?
   This is a follow up of SPARK-38360 to simplify code patterns related to `TreeNode.collectFirst`, the simplified rules are as follows:
   
   - `treeNode.collectFirst(condition).isDefined` -> `treeNode.exists(condition)`
   - `treeNode.collectFirst(condition).isEmpty` -> `!treeNode.exists(condition)`
   
   
   ### Why are the changes needed?
   Code simplification
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   ### How was this patch tested?
   Pass GA


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on pull request #35833: [SPARK-38360][SQL][AVRO][SS][FOLLOWUP] Replace `TreeNode.collectFirst` + `isDefined/isEmpty` with `exists`

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on pull request #35833:
URL: https://github.com/apache/spark/pull/35833#issuecomment-1066423981


   thanks all


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang edited a comment on pull request #35833: [SPARK-38360][SQL][FOLLOWUP] Replace `TreeNode.collectFirst` + `isDefined/isEmpty` with `exists`

Posted by GitBox <gi...@apache.org>.
LuciferYang edited a comment on pull request #35833:
URL: https://github.com/apache/spark/pull/35833#issuecomment-1066058444


   @zhengruifeng 
   
   this is the follow of SPARK-38360 refer to your comments (https://github.com/apache/spark/pull/35694#pullrequestreview-905784925)
   
   also cc @cloud-fan @viirya @HyukjinKwon 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on a change in pull request #35833: [SPARK-38360][SQL][AVRO][SS][FOLLOWUP] Replace `TreeNode.collectFirst` + `isDefined/isEmpty` with `exists`

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on a change in pull request #35833:
URL: https://github.com/apache/spark/pull/35833#discussion_r825539735



##########
File path: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaContinuousSourceSuite.scala
##########
@@ -212,13 +212,14 @@ class KafkaContinuousSourceTopicDeletionSuite extends KafkaContinuousTest {
         testUtils.createTopic(topic2, partitions = 5)
         eventually(timeout(streamingTimeout)) {
           assert(
-            query.lastExecution.executedPlan.collectFirst {
-              case scan: ContinuousScanExec
-                if scan.stream.isInstanceOf[KafkaContinuousStream] =>
-                scan.stream.asInstanceOf[KafkaContinuousStream]
-            }.exists { stream =>
-              // Ensure the new topic is present and the old topic is gone.
-              stream.knownPartitions.exists(_.topic == topic2)
+            query.lastExecution.executedPlan.exists {

Review comment:
       I guess this logic is changed?
   
   suppose there are two `ContinuousScanExec` and the first one do not satisify `stream.knownPartitions.exists(_.topic == topic2)` but the second one satisify it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a change in pull request #35833: [SPARK-38360][SQL][AVRO][SS][FOLLOWUP] Replace `TreeNode.collectFirst` + `isDefined/isEmpty` with `exists`

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on a change in pull request #35833:
URL: https://github.com/apache/spark/pull/35833#discussion_r825554190



##########
File path: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaContinuousSourceSuite.scala
##########
@@ -212,13 +212,14 @@ class KafkaContinuousSourceTopicDeletionSuite extends KafkaContinuousTest {
         testUtils.createTopic(topic2, partitions = 5)
         eventually(timeout(streamingTimeout)) {
           assert(
-            query.lastExecution.executedPlan.collectFirst {
-              case scan: ContinuousScanExec
-                if scan.stream.isInstanceOf[KafkaContinuousStream] =>
-                scan.stream.asInstanceOf[KafkaContinuousStream]
-            }.exists { stream =>
-              // Ensure the new topic is present and the old topic is gone.
-              stream.knownPartitions.exists(_.topic == topic2)
+            query.lastExecution.executedPlan.exists {

Review comment:
       You are right. The logic has changed, [0031b96](https://github.com/apache/spark/pull/35833/commits/0031b96ae460fa0adf42ae14d0ea6c66b5df0501) revert this changed.
   
   Thanks ~
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a change in pull request #35833: [SPARK-38360][SQL][AVRO][SS][FOLLOWUP] Replace `TreeNode.collectFirst` + `isDefined/isEmpty` with `exists`

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on a change in pull request #35833:
URL: https://github.com/apache/spark/pull/35833#discussion_r825554258



##########
File path: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaContinuousTest.scala
##########
@@ -45,11 +45,15 @@ trait KafkaContinuousTest extends KafkaSourceTest {
     testUtils.addPartitions(topic, newCount)
     eventually(timeout(streamingTimeout)) {
       assert(
-        query.lastExecution.executedPlan.collectFirst {
-          case scan: ContinuousScanExec
-              if scan.stream.isInstanceOf[KafkaContinuousStream] =>
-            scan.stream.asInstanceOf[KafkaContinuousStream]
-        }.exists(_.knownPartitions.size == newCount),
+        query.lastExecution.executedPlan.exists {

Review comment:
       [0031b96](https://github.com/apache/spark/pull/35833/commits/0031b96ae460fa0adf42ae14d0ea6c66b5df0501) revert this changed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng closed pull request #35833: [SPARK-38360][SQL][AVRO][SS][FOLLOWUP] Replace `TreeNode.collectFirst` + `isDefined/isEmpty` with `exists`

Posted by GitBox <gi...@apache.org>.
zhengruifeng closed pull request #35833:
URL: https://github.com/apache/spark/pull/35833


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on a change in pull request #35833: [SPARK-38360][SQL][AVRO][SS][FOLLOWUP] Replace `TreeNode.collectFirst` + `isDefined/isEmpty` with `exists`

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on a change in pull request #35833:
URL: https://github.com/apache/spark/pull/35833#discussion_r825539858



##########
File path: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaContinuousTest.scala
##########
@@ -45,11 +45,15 @@ trait KafkaContinuousTest extends KafkaSourceTest {
     testUtils.addPartitions(topic, newCount)
     eventually(timeout(streamingTimeout)) {
       assert(
-        query.lastExecution.executedPlan.collectFirst {
-          case scan: ContinuousScanExec
-              if scan.stream.isInstanceOf[KafkaContinuousStream] =>
-            scan.stream.asInstanceOf[KafkaContinuousStream]
-        }.exists(_.knownPartitions.size == newCount),
+        query.lastExecution.executedPlan.exists {

Review comment:
       ditto




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on pull request #35833: [SPARK-38360][SQL][AVRO][SS][FOLLOWUP] Replace `TreeNode.collectFirst` + `isDefined/isEmpty` with `exists`

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on pull request #35833:
URL: https://github.com/apache/spark/pull/35833#issuecomment-1066280566


   > hmm, there are some `case _ => false` added since collectFirst use a partialfunction.
   
   Yes, the code has not been reduced :(


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on pull request #35833: [SPARK-38360][SQL][AVRO][SS][FOLLOWUP] Replace `TreeNode.collectFirst` + `isDefined/isEmpty` with `exists`

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on pull request #35833:
URL: https://github.com/apache/spark/pull/35833#issuecomment-1066418740


   Thanks, merged to master!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on pull request #35833: [SPARK-38360][SQL][FOLLOWUP] Replace `TreeNode.collectFirst` + `isDefined/isEmpty` with `exists`

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on pull request #35833:
URL: https://github.com/apache/spark/pull/35833#issuecomment-1066058444


   @zhengruifeng 
   
   this is the follow of SPARK-38360 refer to https://github.com/apache/spark/pull/35694#pullrequestreview-905784925
   
   also cc @cloud-fan @viirya @HyukjinKwon 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on pull request #35833: [SPARK-38360][SQL][AVRO][SS][FOLLOWUP] Replace `TreeNode.collectFirst` + `isDefined/isEmpty` with `exists`

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on pull request #35833:
URL: https://github.com/apache/spark/pull/35833#issuecomment-1066250236


   hmm, there are some `case _ => false` added since collectFirst use a partialfunction.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org