You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/01/10 05:45:35 UTC

[GitHub] [spark] HyukjinKwon opened a new pull request #31110: [SPARK-34059][SQL][CORE] Use for/foreach rather than map to make sure execute it eagerly

HyukjinKwon opened a new pull request #31110:
URL: https://github.com/apache/spark/pull/31110


   ### What changes were proposed in this pull request?
   
   This PR is basically a followup of https://github.com/apache/spark/pull/14332. 
   Calling `map` alone might leave it not executed due to lazy evaluation, e.g.)
   
   ```
   scala> val foo = Seq(1,2,3)
   foo: Seq[Int] = List(1, 2, 3)
   
   scala> foo.map(println)
   1
   2
   3
   res0: Seq[Unit] = List((), (), ())
   
   scala> foo.view.map(println)
   res1: scala.collection.SeqView[Unit,Seq[_]] = SeqViewM(...)
   
   scala> foo.view.foreach(println)
   1
   2
   3
   ```
   
   We should better use `foreach` to make sure it's executed where the output is unused or `Unit`.
   
   ### Why are the changes needed?
   
   To prevent the potential issues by not executing `map`.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No, the current codes look not causing any problem for now.
   
   ### How was this patch tested?
   
   I found these item by running IntelliJ inspection, double checked one by one, and fixed them. These should be all instances across the codebase ideally.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #31110: [SPARK-34059][SQL][CORE] Use for/foreach rather than map to make sure execute it eagerly

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #31110:
URL: https://github.com/apache/spark/pull/31110#issuecomment-758253914


   Sure I'll prepare backport


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #31110: [SPARK-34059][SQL][CORE] Use for/foreach rather than map to make sure execute it eagerly

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31110:
URL: https://github.com/apache/spark/pull/31110#issuecomment-757444327


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/133886/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31110: [SPARK-34059][SQL][CORE] Use for/foreach rather than map to make sure execute it eagerly

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31110:
URL: https://github.com/apache/spark/pull/31110#issuecomment-757456458


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38479/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #31110: [SPARK-34059][SQL][CORE] Use for/foreach rather than map to make sure execute it eagerly

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #31110:
URL: https://github.com/apache/spark/pull/31110#issuecomment-757425590


   **[Test build #133886 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133886/testReport)** for PR 31110 at commit [`f3e864e`](https://github.com/apache/spark/commit/f3e864e07873dee75765ac2a4c16ecbd4a6471c8).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] mridulm commented on pull request #31110: [SPARK-34059][SQL][CORE] Use for/foreach rather than map to make sure execute it eagerly

Posted by GitBox <gi...@apache.org>.
mridulm commented on pull request #31110:
URL: https://github.com/apache/spark/pull/31110#issuecomment-758060932


   Thanks for fixing this @HyukjinKwon ! Surprised these did not cause problems earlier.
   Do we want to backport this to older branches as well ?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] viirya closed pull request #31110: [SPARK-34059][SQL][CORE] Use for/foreach rather than map to make sure execute it eagerly

Posted by GitBox <gi...@apache.org>.
viirya closed pull request #31110:
URL: https://github.com/apache/spark/pull/31110


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31110: [SPARK-34059][SQL][CORE] Use for/foreach rather than map to make sure execute it eagerly

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31110:
URL: https://github.com/apache/spark/pull/31110#issuecomment-757425590


   **[Test build #133886 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133886/testReport)** for PR 31110 at commit [`f3e864e`](https://github.com/apache/spark/commit/f3e864e07873dee75765ac2a4c16ecbd4a6471c8).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31110: [SPARK-34059][SQL][CORE] Use for/foreach rather than map to make sure execute it eagerly

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31110:
URL: https://github.com/apache/spark/pull/31110#issuecomment-757438894


   **[Test build #133886 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133886/testReport)** for PR 31110 at commit [`f3e864e`](https://github.com/apache/spark/commit/f3e864e07873dee75765ac2a4c16ecbd4a6471c8).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] viirya commented on pull request #31110: [SPARK-34059][SQL][CORE] Use for/foreach rather than map to make sure execute it eagerly

Posted by GitBox <gi...@apache.org>.
viirya commented on pull request #31110:
URL: https://github.com/apache/spark/pull/31110#issuecomment-757562263


   Thanks! Merging to master.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31110: [SPARK-34059][SQL][CORE] Use for/foreach rather than map to make sure execute it eagerly

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31110:
URL: https://github.com/apache/spark/pull/31110#issuecomment-757446322


   **[Test build #133890 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133890/testReport)** for PR 31110 at commit [`74bc183`](https://github.com/apache/spark/commit/74bc18322fe258aaed7ccb961cdd7f8fc0ad6e76).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #31110: [SPARK-34059][SQL][CORE] Use for/foreach rather than map to make sure execute it eagerly

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31110:
URL: https://github.com/apache/spark/pull/31110#issuecomment-757437590


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/38475/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #31110: [SPARK-34059][SQL][CORE] Use for/foreach rather than map to make sure execute it eagerly

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31110:
URL: https://github.com/apache/spark/pull/31110#issuecomment-757467620


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/133890/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #31110: [SPARK-34059][SQL][CORE] Use for/foreach rather than map to make sure execute it eagerly

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31110:
URL: https://github.com/apache/spark/pull/31110#issuecomment-757459511


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/38479/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #31110: [SPARK-34059][SQL][CORE] Use for/foreach rather than map to make sure execute it eagerly

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31110:
URL: https://github.com/apache/spark/pull/31110#issuecomment-757437590


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/38475/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #31110: [SPARK-34059][SQL][CORE] Use for/foreach rather than map to make sure execute it eagerly

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31110:
URL: https://github.com/apache/spark/pull/31110#issuecomment-757444327


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/133886/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31110: [SPARK-34059][SQL][CORE] Use for/foreach rather than map to make sure execute it eagerly

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31110:
URL: https://github.com/apache/spark/pull/31110#issuecomment-757431928


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38475/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #31110: [SPARK-34059][SQL][CORE] Use for/foreach rather than map to make sure execute it eagerly

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #31110:
URL: https://github.com/apache/spark/pull/31110#issuecomment-757572667


   Thanks @srowen and @viirya!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31110: [SPARK-34059][SQL][CORE] Use for/foreach rather than map to make sure execute it eagerly

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31110:
URL: https://github.com/apache/spark/pull/31110#issuecomment-757451857


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38479/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #31110: [SPARK-34059][SQL][CORE] Use for/foreach rather than map to make sure execute it eagerly

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31110:
URL: https://github.com/apache/spark/pull/31110#issuecomment-757459511


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/38479/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31110: [SPARK-34059][SQL][CORE] Use for/foreach rather than map to make sure execute it eagerly

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31110:
URL: https://github.com/apache/spark/pull/31110#issuecomment-757466404


   **[Test build #133890 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133890/testReport)** for PR 31110 at commit [`74bc183`](https://github.com/apache/spark/commit/74bc18322fe258aaed7ccb961cdd7f8fc0ad6e76).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #31110: [SPARK-34059][SQL][CORE] Use for/foreach rather than map to make sure execute it eagerly

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #31110:
URL: https://github.com/apache/spark/pull/31110#issuecomment-757446322


   **[Test build #133890 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133890/testReport)** for PR 31110 at commit [`74bc183`](https://github.com/apache/spark/commit/74bc18322fe258aaed7ccb961cdd7f8fc0ad6e76).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on pull request #31110: [SPARK-34059][SQL][CORE] Use for/foreach rather than map to make sure execute it eagerly

Posted by GitBox <gi...@apache.org>.
srowen commented on pull request #31110:
URL: https://github.com/apache/spark/pull/31110#issuecomment-758254779


   It will definitely have the same effect if the collection is not lazy, so I doubt it currently is the cause of a bug. But yeah it'd be fine to backport in case there is any chance the collection that something returns suddenly gets lazy.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #31110: [SPARK-34059][SQL][CORE] Use for/foreach rather than map to make sure execute it eagerly

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31110:
URL: https://github.com/apache/spark/pull/31110#issuecomment-757467620


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/133890/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #31110: [SPARK-34059][SQL][CORE] Use for/foreach rather than map to make sure execute it eagerly

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #31110:
URL: https://github.com/apache/spark/pull/31110#issuecomment-757421658


   cc @srowen FYI


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31110: [SPARK-34059][SQL][CORE] Use for/foreach rather than map to make sure execute it eagerly

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31110:
URL: https://github.com/apache/spark/pull/31110#issuecomment-757428528


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38475/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org