Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/02/23 21:53:00 UTC

[GitHub] [spark] zsxwing opened a new pull request #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery

zsxwing opened a new pull request #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery
URL: https://github.com/apache/spark/pull/27678
 
 
   ### What changes were proposed in this pull request?
   
   Right now `StreamingQueryManager` keeps a reference to the last terminated query until `resetTerminated` is called. When that query holds a lot of state (a large SQL plan, cached RDDs, etc.), this wastes memory unnecessarily. All `StreamingQueryManager` really needs is the exception of the last failed query.
   
   This PR changes the internal field of `StreamingQueryManager` to remember only the last exception instead of the query itself, which saves that memory.
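
The idea can be illustrated with a minimal sketch. It is not the code from this PR: `ManagerSketch`, `TerminatedQuery`, `QueryFailure`, `lastTermination`, and the accessor names are hypothetical stand-ins chosen for illustration. It only shows the general pattern of copying the failure out of a terminated query so the query object, along with any heavy state it holds, is not retained.

```scala
// Hypothetical stand-ins for illustration only; these are not Spark's classes.
final class QueryFailure(message: String) extends Exception(message)

// A terminated query that may carry heavy state and, optionally, the failure that ended it.
final case class TerminatedQuery(
    name: String,
    heavyState: Array[Byte],
    failure: Option[QueryFailure])

final class ManagerSketch {
  private val lock = new Object

  // None:          no query has terminated since the last reset.
  // Some(None):    a query terminated without a failure.
  // Some(Some(e)): a query terminated with failure e.
  private var lastTermination: Option[Option[QueryFailure]] = None

  // Record only the outcome; the query object itself is not stored anywhere.
  def notifyTerminated(query: TerminatedQuery): Unit = lock.synchronized {
    lastTermination = Some(query.failure)
    lock.notifyAll()
  }

  // Forget the recorded outcome.
  def resetTerminated(): Unit = lock.synchronized {
    lastTermination = None
  }

  def hasTerminated: Boolean = lock.synchronized {
    lastTermination.isDefined
  }

  def lastFailure: Option[QueryFailure] = lock.synchronized {
    lastTermination match {
      case Some(failure) => failure
      case None => None
    }
  }
}
```

Because the field holds at most one exception, a terminated query in this sketch becomes unreachable from the manager as soon as `notifyTerminated` returns.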
   
   ### Why are the changes needed?
   
   To avoid holding on to memory unnecessarily.
   
   ### Does this PR introduce any user-facing change?
   
   No
   
   ### How was this patch tested?
   
   This PR doesn't change any public behavior. The existing tests already cover the touched code.
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery
URL: https://github.com/apache/spark/pull/27678#issuecomment-590144164
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery
URL: https://github.com/apache/spark/pull/27678#issuecomment-590121050
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] zsxwing commented on issue #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery

Posted by GitBox <gi...@apache.org>.
zsxwing commented on issue #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery
URL: https://github.com/apache/spark/pull/27678#issuecomment-590120775
 
 
   cc @brkyvz 



[GitHub] [spark] HyukjinKwon commented on issue #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on issue #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery
URL: https://github.com/apache/spark/pull/27678#issuecomment-590239785
 
 
   Merged to master and branch-3.0.



[GitHub] [spark] AmplabJenkins commented on issue #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery
URL: https://github.com/apache/spark/pull/27678#issuecomment-590144164
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery
URL: https://github.com/apache/spark/pull/27678#issuecomment-590121050
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery
URL: https://github.com/apache/spark/pull/27678#issuecomment-590144166
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118837/
   Test PASSed.



[GitHub] [spark] HeartSaVioR commented on a change in pull request #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on a change in pull request #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery
URL: https://github.com/apache/spark/pull/27678#discussion_r383070001
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryManager.scala
 ##########
 @@ -125,11 +134,11 @@ class StreamingQueryManager private[sql] (sparkSession: SparkSession) extends Lo
   @throws[StreamingQueryException]
   def awaitAnyTermination(): Unit = {
     awaitTerminationLock.synchronized {
 
 Review comment:
   minor: While we're here, how about simply calling `awaitAnyTermination(Long.MAX_VALUE)`, which is effectively the same?
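
A rough sketch of what this suggestion amounts to is shown here. It is an illustration under assumptions, not the merged code: `AwaitSketch`, `markTerminated`, and the `terminated` flag are hypothetical names, and the wait loop is a generic monitor wait rather than the implementation in `StreamingQueryManager`. The untimed overload simply delegates to the timed one with the largest possible timeout (`Long.MaxValue` in Scala).

```scala
// Hypothetical class for illustration only; not part of Spark.
final class AwaitSketch {
  private val lock = new Object
  private var terminated = false

  // Signal that some query has terminated and wake up all waiters.
  def markTerminated(): Unit = lock.synchronized {
    terminated = true
    lock.notifyAll()
  }

  // Timed variant: waits up to timeoutMs milliseconds and reports
  // whether a termination was observed.
  def awaitAnyTermination(timeoutMs: Long): Boolean = {
    val startNanos = System.nanoTime()
    lock.synchronized {
      var remainingMs = timeoutMs
      // Loop to guard against spurious wakeups; stop once the budget is spent.
      while (!terminated && remainingMs > 0) {
        lock.wait(remainingMs)
        remainingMs = timeoutMs - (System.nanoTime() - startNanos) / 1000000L
      }
      terminated
    }
  }

  // Untimed variant expressed as a delegation to the timed one.
  def awaitAnyTermination(): Unit = {
    awaitAnyTermination(Long.MaxValue)
    ()
  }
}
```

The delegation removes the duplicated wait loop at the cost of discarding the Boolean result, which the untimed variant does not need.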



[GitHub] [spark] SparkQA removed a comment on issue #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery
URL: https://github.com/apache/spark/pull/27678#issuecomment-590120946
 
 
   **[Test build #118837 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118837/testReport)** for PR 27678 at commit [`e8e77f8`](https://github.com/apache/spark/commit/e8e77f80082ff5853ee014bbc3aa07cd8917ce67).



[GitHub] [spark] AmplabJenkins removed a comment on issue #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery
URL: https://github.com/apache/spark/pull/27678#issuecomment-590121051
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23586/
   Test PASSed.



[GitHub] [spark] HyukjinKwon closed pull request #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery

Posted by GitBox <gi...@apache.org>.
HyukjinKwon closed pull request #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery
URL: https://github.com/apache/spark/pull/27678
 
 
   



[GitHub] [spark] AmplabJenkins removed a comment on issue #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery
URL: https://github.com/apache/spark/pull/27678#issuecomment-590144166
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118837/
   Test PASSed.



[GitHub] [spark] SparkQA commented on issue #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery
URL: https://github.com/apache/spark/pull/27678#issuecomment-590143979
 
 
   **[Test build #118837 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118837/testReport)** for PR 27678 at commit [`e8e77f8`](https://github.com/apache/spark/commit/e8e77f80082ff5853ee014bbc3aa07cd8917ce67).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] SparkQA commented on issue #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery
URL: https://github.com/apache/spark/pull/27678#issuecomment-590120946
 
 
   **[Test build #118837 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118837/testReport)** for PR 27678 at commit [`e8e77f8`](https://github.com/apache/spark/commit/e8e77f80082ff5853ee014bbc3aa07cd8917ce67).



[GitHub] [spark] HeartSaVioR commented on a change in pull request #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on a change in pull request #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery
URL: https://github.com/apache/spark/pull/27678#discussion_r383070001
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryManager.scala
 ##########
 @@ -125,11 +134,11 @@ class StreamingQueryManager private[sql] (sparkSession: SparkSession) extends Lo
   @throws[StreamingQueryException]
   def awaitAnyTermination(): Unit = {
     awaitTerminationLock.synchronized {
 
 Review comment:
   While we're here, how about simply calling `awaitAnyTermination(Long.MAX_VALUE)`, which is effectively the same?



[GitHub] [spark] AmplabJenkins commented on issue #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27678: [SPARK-30927][SS]StreamingQueryManager should avoid keeping reference to terminated StreamingQuery
URL: https://github.com/apache/spark/pull/27678#issuecomment-590121051
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23586/
   Test PASSed.
