Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/09/18 08:48:42 UTC

[GitHub] [spark] BOOTMGR opened a new pull request #34039: [SPARK-36798]: Wait for listeners to finish before flushing metrics

BOOTMGR opened a new pull request #34039:
URL: https://github.com/apache/spark/pull/34039


   ### What changes were proposed in this pull request?
   When `SparkContext` is shutting down, wait for the listener bus to finish, and only then flush `MetricsSystem`.
   
   
   ### Why are the changes needed?
   In the current implementation, when `SparkContext.stop()` is called, `metricsSystem.report()` is called before `listenerBus.stop()`. As a result, any metrics that a listener produces after that report never reach the sink.
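
The ordering problem can be illustrated with a toy model. This is only a sketch: `ToyMetricsSystem`, `ToyListenerBus` and `ShutdownOrder` are hypothetical stand-ins, not Spark classes, and the bus here drains its queue synchronously inside `stop()`, which is an assumption made to keep the example small.

```scala
import scala.collection.mutable

// Toy stand-in for a metrics system: gauges are copied to a "sink" on report().
final class ToyMetricsSystem {
  private val gauges = mutable.Map.empty[String, Long]
  private var flushed = Map.empty[String, Long]

  def update(name: String, value: Long): Unit = { gauges(name) = value }
  def report(): Unit = { flushed = gauges.toMap }
  def sinkContents: Map[String, Long] = flushed
}

// Toy stand-in for a listener bus: stop() delivers every queued event first.
final class ToyListenerBus(onEvent: String => Unit) {
  private val queue = mutable.Queue.empty[String]

  def post(event: String): Unit = queue.enqueue(event)
  def stop(): Unit = while (queue.nonEmpty) onEvent(queue.dequeue())
}

object ShutdownOrder {
  // Runs one simulated shutdown and returns what reached the sink.
  def run(reportFirst: Boolean): Map[String, Long] = {
    val metrics = new ToyMetricsSystem
    var seen = 0L
    // The listener updates a metric each time it handles an event.
    val bus = new ToyListenerBus(_ => { seen += 1; metrics.update("eventsSeen", seen) })
    bus.post("batch-1")
    bus.post("batch-2")
    if (reportFirst) { metrics.report(); bus.stop() } // old order: updates arrive too late
    else { bus.stop(); metrics.report() }             // proposed order: drain, then flush
    metrics.sinkContents
  }
}
```

In this model, `ShutdownOrder.run(reportFirst = true)` returns an empty map, while `ShutdownOrder.run(reportFirst = false)` returns a map holding `eventsSeen -> 2`.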
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   NA
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] skonto commented on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
skonto commented on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-922871546


   One of the use cases of a listener is to ship metrics and other info when a specific event occurs, e.g. batch completed. I was wondering (thinking out loud) what kind of metrics are produced from a listener that a) are not core metrics and b) are not shipped from the listener to some external storage.
   




[GitHub] [spark] skonto commented on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
skonto commented on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-926435483


   @BOOTMGR imho triggerExecution time looks like a good candidate to be part of the metrics that should be exposed to the metrics system by default (what else is available from the query reporting info?) without having to depend on the listener.




[GitHub] [spark] skonto edited a comment on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
skonto edited a comment on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-926435483


   @BOOTMGR imho triggerExecution time looks like a good candidate to be part of the metrics that should be exposed to the metrics system automatically, if needed (what else is available from the query reporting info?), without having to depend on the listener.
   
   >I do not want to push it at every batch end (expensive sink), rather I would like to have them flushed along with other metrics via MetricsSystem. 
   
   Makes sense if the sink is too expensive. However, in a pull model things are not much different: metrics will be sampled anyway at some interval (e.g. by Prometheus, via some metrics exporter), independently of flushing. So the equivalent, in terms of cost, for a push model would be to push at some appropriate rate.
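
One way to read "push at some appropriate rate" is a simple throttle in front of the sink. The sketch below is an assumption-laden illustration, not anything from Spark: `ThrottledPusher`, its `push` callback and the injected `now` clock are all hypothetical names.

```scala
// Hypothetical throttle: forwards a value to `push` at most once per `intervalMs`.
final class ThrottledPusher(intervalMs: Long, push: Long => Unit, now: () => Long) {
  private var lastPushAt: Option[Long] = None

  // Intended to be called on every batch; returns true if the value was pushed.
  def offer(value: Long): Boolean = {
    val t = now()
    // Due when nothing has been pushed yet, or when the interval has elapsed.
    val due = lastPushAt.forall(prev => t - prev >= intervalMs)
    if (due) {
      push(value)
      lastPushAt = Some(t)
    }
    due
  }
}
```

A caller could construct it as `new ThrottledPusher(60000L, v => println(v), () => System.currentTimeMillis())`, so that an expensive sink sees roughly one write per minute regardless of how often batches complete. Values offered between pushes are simply dropped in this sketch.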




[GitHub] [spark] skonto edited a comment on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
skonto edited a comment on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-926435483


   @BOOTMGR imho triggerExecution time looks like a good candidate to be part of the metrics that should be exposed to the metrics system automatically, if needed (what else is available from the query reporting info?), without having to depend on the listener.
   
   >I do not want to push it at every batch end (expensive sink), rather I would like to have them flushed along with other metrics via MetricsSystem. 
   
   Makes sense if the sink is too expensive. However, in a pull model things are not much different: metrics will be sampled anyway at some interval (e.g. by Prometheus, via some metrics exporter), independently of flushing. So the equivalent for a push model would be to push at some appropriate rate.




[GitHub] [spark] skonto edited a comment on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
skonto edited a comment on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-926435483


   @BOOTMGR imho triggerExecution time looks like a good candidate to be part of the metrics that should be exposed to the metrics system automatically, if needed (what else is available from the query reporting info?), without having to depend on the listener.
   I think it would be nice to have that in the long term.
   
   >I do not want to push it at every batch end (expensive sink), rather I would like to have them flushed along with other metrics via MetricsSystem. 
   
   Makes sense if the sink is too expensive. However, in a pull model things are not much different: metrics will be sampled anyway at some interval (e.g. by Prometheus, via some metrics exporter), independently of flushing. So the equivalent, in terms of cost, for a push model would be to push at some appropriate rate.




[GitHub] [spark] skonto edited a comment on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
skonto edited a comment on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-926435483


   @BOOTMGR imho triggerExecution time looks like a good candidate to be part of the metrics that should be exposed to the metrics system automatically, if needed (what else is available from the query reporting info?), without having to depend on the listener.
   
   >I do not want to push it at every batch end (expensive sink), rather I would like to have them flushed along with other metrics via MetricsSystem. 
   
   Makes sense if the sink is too expensive. However, in a pull model things are not much different: metrics will be sampled anyway at some interval (e.g. by Prometheus, via some metrics exporter), independently of flushing. So the equivalent, in terms of cost, for a push model would be to push at some appropriate rate.




[GitHub] [spark] AmplabJenkins commented on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-937867485


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48464/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-937983513


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48472/
   




[GitHub] [spark] BOOTMGR commented on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
BOOTMGR commented on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-923667197


   @skonto I don't have a use case for creating a custom metric in the listener; what I want to achieve is to collect the p99 and the absolute value of triggerExecution time. I do not want to push it at every batch end (expensive sink), rather I would like to have them flushed along with other metrics via MetricsSystem. If there is any other way to collect triggerExecution, kindly let me know.
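
As a rough sketch of the accumulation described here, the class below keeps every reported duration in memory and computes a percentile only when asked, for example at flush time. `TriggerExecutionStats` is a hypothetical helper, not a Spark API; it assumes a listener would call `record` once per batch with the triggerExecution duration in milliseconds, and it uses the nearest-rank percentile definition.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical accumulator for per-batch durations (milliseconds).
final class TriggerExecutionStats {
  private val samples = ArrayBuffer.empty[Long]

  // Stores one duration; cheap enough to call at every batch end.
  def record(durationMs: Long): Unit = synchronized {
    samples += durationMs
    ()
  }

  // Most recently recorded duration, if any.
  def last: Option[Long] = synchronized { samples.lastOption }

  // Nearest-rank percentile (p in 1..100) over everything recorded so far.
  def percentile(p: Int): Option[Long] = synchronized {
    if (samples.isEmpty) None
    else {
      val sorted = samples.sorted
      // Integer ceiling of p * n / 100 avoids floating-point rounding.
      val rank = (p * sorted.length + 99) / 100
      Some(sorted(math.max(rank, 1) - 1))
    }
  }
}
```

Only `percentile(99)` and `last` would need to be read when metrics are flushed, so nothing is pushed to the sink per batch. The sample buffer grows without bound here; a long-running query would need a bounded reservoir or histogram instead.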




[GitHub] [spark] SparkQA commented on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-937983474


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48472/
   




[GitHub] [spark] mridulm commented on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
mridulm commented on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-937777752


   Merged to master.
   Thanks for working on this @BOOTMGR !
   
   Thanks for the reviews @Ngone51, @AngersZhuuuu :-)




[GitHub] [spark] SparkQA removed a comment on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-927222515


   **[Test build #143629 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143629/testReport)** for PR 34039 at commit [`5e6b359`](https://github.com/apache/spark/commit/5e6b3596da38ed0a98ef47c97169faf3ce52fa70).




[GitHub] [spark] SparkQA commented on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-927232425


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48141/
   




[GitHub] [spark] skonto edited a comment on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
skonto edited a comment on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-926435483


   @BOOTMGR imho triggerExecution time looks like a good candidate to be part of the metrics that should be exposed to the metrics system if needed (what else is available from the query reporting info?) without having to depend on the listener.




[GitHub] [spark] asfgit closed pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #34039:
URL: https://github.com/apache/spark/pull/34039


   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-937728315


   Can one of the admins verify this patch?




[GitHub] [spark] AmplabJenkins commented on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-937728315


   Can one of the admins verify this patch?




[GitHub] [spark] SparkQA commented on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-927227822


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48141/
   




[GitHub] [spark] SparkQA commented on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-927240723


   **[Test build #143629 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143629/testReport)** for PR 34039 at commit [`5e6b359`](https://github.com/apache/spark/commit/5e6b3596da38ed0a98ef47c97169faf3ce52fa70).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] HyukjinKwon commented on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-927222094


   ok to test




[GitHub] [spark] AmplabJenkins commented on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-927233298


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48141/
   




[GitHub] [spark] AmplabJenkins commented on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-927241164


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143629/
   








[GitHub] [spark] AmplabJenkins removed a comment on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-937867485


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48464/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-922243538


   Can one of the admins verify this patch?




[GitHub] [spark] skonto edited a comment on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
skonto edited a comment on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-922871546


   One of the basic use cases of a listener is to ship metrics and other info when a specific event occurs, e.g. batch completed. I was wondering (thinking out loud) what kind of metrics are produced from a listener that a) are not core Spark metrics and b) are not shipped from the listener to some external storage instead of being exported locally.
   




[GitHub] [spark] AmplabJenkins commented on pull request #34039: [SPARK-36798]: Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-922243538


   Can one of the admins verify this patch?




[GitHub] [spark] AmplabJenkins commented on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-937983513


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48472/
   




[GitHub] [spark] AmplabJenkins commented on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-937728315








[GitHub] [spark] asfgit closed pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #34039:
URL: https://github.com/apache/spark/pull/34039


   




[GitHub] [spark] skonto edited a comment on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
skonto edited a comment on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-926435483


   @BOOTMGR imho triggerExecution time looks like a good candidate for the metrics that Spark exposes to the metrics system automatically, if needed, without having to depend on a listener (what else is missing from the query reporting info?). I think that would be nice to have in the long term. It is fine to make the metrics system work as expected with listeners (that is this issue, and it should be fixed), but the listener use case is a better match for custom metrics (that is why I mentioned it).
   
   >I do not want to push it at every batch end (expensive sink), rather I would like to have them flushed along with other metrics via MetricsSystem. 
   
   Makes sense if the sink is too expensive. However, in a pull model things are not much different: metrics are sampled at some interval anyway (e.g. by Prometheus, via some metrics exporter), independently of flushing. So the equivalent for a push model, in terms of cost, would be to push at some appropriate rate.
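
   To make "push at some appropriate rate" concrete, here is a minimal sketch in plain Java. It is not Spark code: `ThrottledPusher`, `onBatchEnd` and `flush` are hypothetical names, the list stands in for the expensive sink, and the sketch assumes that only the most recent value matters (gauge-like behaviour). Timestamps are passed in explicitly so the behaviour is deterministic.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: buffer the latest value and push it at most once per interval. */
public class ThrottledPusher {
    private final long minIntervalMs;
    private final List<Long> pushed = new ArrayList<>(); // stands in for the expensive sink
    private long latest;              // most recent value seen
    private boolean dirty = false;    // true if 'latest' has not been pushed yet
    private boolean everPushed = false;
    private long lastPushMs;

    public ThrottledPusher(long minIntervalMs) {
        this.minIntervalMs = minIntervalMs;
    }

    /** Called at every batch end; pushes only if the interval has elapsed. */
    public synchronized void onBatchEnd(long triggerExecutionMs, long nowMs) {
        latest = triggerExecutionMs;
        dirty = true;
        if (!everPushed || nowMs - lastPushMs >= minIntervalMs) {
            flush(nowMs);
        }
    }

    /** Pushes the pending value, if any; should be called once more at shutdown. */
    public synchronized void flush(long nowMs) {
        if (!dirty) {
            return;
        }
        pushed.add(latest);
        lastPushMs = nowMs;
        everPushed = true;
        dirty = false;
    }

    /** Returns a copy of everything that reached the sink so far. */
    public synchronized List<Long> pushedValues() {
        return new ArrayList<>(pushed);
    }
}
```

   With a 1000 ms interval, batches ending at 0, 500, 900 and 1000 ms result in two pushes (the first value and the one at 1000 ms). A final `flush` at shutdown is still needed so that the last buffered value is not lost.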
   
   Anyway, I was thinking out loud to see whether there is more work to be done here than was initially captured in this ticket.
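
   For reference, the ordering problem this ticket captures (`metricsSystem.report()` running before `listenerBus.stop()`) can be illustrated with a small stand-alone sketch. This is not Spark's implementation: `ShutdownOrder` and `run` are hypothetical names, a single-thread executor stands in for the listener bus, an `AtomicLong` stands in for a listener-produced metric, and the latch simulates a listener that is still busy when shutdown begins.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of the shutdown-ordering issue; not Spark code. */
public class ShutdownOrder {

    /** Simulates one shutdown and returns the metric value seen at report time. */
    public static long run(boolean stopBusBeforeReport) {
        AtomicLong metric = new AtomicLong();                   // written by the listener
        ExecutorService bus = Executors.newSingleThreadExecutor(); // stand-in for the listener bus
        CountDownLatch stillBusy = new CountDownLatch(1);       // listener is mid-event at shutdown

        bus.execute(() -> {
            try {
                stillBusy.await();                              // event still being processed
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            metric.set(42);                                     // listener produces its metric
        });

        try {
            long reported;
            if (stopBusBeforeReport) {
                stillBusy.countDown();                          // let the listener finish
                bus.shutdown();
                bus.awaitTermination(5, TimeUnit.SECONDS);      // wait for the bus to drain
                reported = metric.get();                        // report sees the listener's value
            } else {
                reported = metric.get();                        // report runs too early
                stillBusy.countDown();
                bus.shutdown();
                bus.awaitTermination(5, TimeUnit.SECONDS);
            }
            return reported;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted during simulated shutdown", e);
        }
    }
}
```

   In this sketch, `run(true)` reports the listener's value, while `run(false)` reports the initial value because the report happens before the listener has finished.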
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-927241164


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143629/
   




[GitHub] [spark] SparkQA commented on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-927222515


   **[Test build #143629 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143629/testReport)** for PR 34039 at commit [`5e6b359`](https://github.com/apache/spark/commit/5e6b3596da38ed0a98ef47c97169faf3ce52fa70).




[GitHub] [spark] HyukjinKwon commented on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-922595787


   cc @mridulm, @tgravescs and @Ngone51 FYI




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-927233298


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48141/
   




[GitHub] [spark] skonto edited a comment on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
skonto edited a comment on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-922871546








[GitHub] [spark] SparkQA commented on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-937977382


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48472/
   



