You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/12/24 10:24:16 UTC

[GitHub] [spark] liupc opened a new pull request #27002: [SPARK-30346]Improve logging when events dropped

liupc opened a new pull request #27002: [SPARK-30346]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002
 
 
   
   ### What changes were proposed in this pull request?
   
   1. Make logging events dropping every 60s works fine, the orignal implementaion some times not working due to susequent events comming and updating the DroppedEventCounter
   2. Logging thread dump when the events dropping was logged every 60s.
   
   ### Why are the changes needed?
   
   Currenly, the logging may be skipped and delayed a long time under high concurrency, that make debugging hard. So This PR will try to fix it.
   
   Also this PR added logging for thread dump of dispatcher thread to help debugging performance issue that may causing events dropped.
   
   ### Does this PR introduce any user-facing change?
   
   No
   
   
   ### How was this patch tested?
   
   NA
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] liupc commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
liupc commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r368448171
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -167,20 +170,27 @@ private class AsyncEventQueue(
     }
     logTrace(s"Dropping event $event")
 
-    val droppedCount = droppedEventsCounter.get
+    val droppedCount = droppedEventsCounter.get - lastDroppedEventsCounter
+    val lastReportTime = lastReportTimestamp.get
+    val curTime = System.currentTimeMillis()
     if (droppedCount > 0) {
       // Don't log too frequently
-      if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
-        // There may be multiple threads trying to decrease droppedEventsCounter.
-        // Use "compareAndSet" to make sure only one thread can win.
-        // And if another thread is increasing droppedEventsCounter, "compareAndSet" will fail and
-        // then that thread will update it.
-        if (droppedEventsCounter.compareAndSet(droppedCount, 0)) {
-          val prevLastReportTimestamp = lastReportTimestamp
-          lastReportTimestamp = System.currentTimeMillis()
-          val previous = new java.util.Date(prevLastReportTimestamp)
+      if (curTime - lastReportTime >= LOGGING_INTERVAL) {
+        // There may be multiple threads trying to logging dropped events,
+        // Use 'compareAndSet' to make sure only one thread can win.
+        if (lastReportTimestamp.compareAndSet(lastReportTime, curTime)) {
+          val lastReportTime = lastReportTimestamp.get
 
 Review comment:
   Good catching, I'll update it.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] jiangxb1987 commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
jiangxb1987 commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r379206377
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -167,20 +170,29 @@ private class AsyncEventQueue(
     }
     logTrace(s"Dropping event $event")
 
-    val droppedCount = droppedEventsCounter.get
+    val droppedCount = droppedEventsCounter.get - lastDroppedEventsCounter
+    val lastReportTime = lastReportTimestamp.get
+    val curTime = System.currentTimeMillis()
     if (droppedCount > 0) {
       // Don't log too frequently
-      if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
-        // There may be multiple threads trying to decrease droppedEventsCounter.
-        // Use "compareAndSet" to make sure only one thread can win.
-        // And if another thread is increasing droppedEventsCounter, "compareAndSet" will fail and
-        // then that thread will update it.
-        if (droppedEventsCounter.compareAndSet(droppedCount, 0)) {
 
 Review comment:
   Can we just remove this check and use `droppedEventsCounter.getAndSet()` instead?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] seanli-rallyhealth commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
seanli-rallyhealth commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r361224019
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -167,20 +170,29 @@ private class AsyncEventQueue(
     }
     logTrace(s"Dropping event $event")
 
-    val droppedCount = droppedEventsCounter.get
+    val droppedCount = droppedEventsCounter.get - lastDroppedEventsCounter
+    val lastReportTime = lastReportTimestamp.get
+    val curTime = System.currentTimeMillis()
     if (droppedCount > 0) {
       // Don't log too frequently
-      if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
-        // There may be multiple threads trying to decrease droppedEventsCounter.
-        // Use "compareAndSet" to make sure only one thread can win.
-        // And if another thread is increasing droppedEventsCounter, "compareAndSet" will fail and
-        // then that thread will update it.
-        if (droppedEventsCounter.compareAndSet(droppedCount, 0)) {
-          val prevLastReportTimestamp = lastReportTimestamp
-          lastReportTimestamp = System.currentTimeMillis()
+      if (curTime - lastReportTime >= 60 * 1000) {
 
 Review comment:
   define constant for 60 * 1000

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-576238002
 
 
   Merged build finished. Test FAILed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] vanzin commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
vanzin commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r363990340
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -167,20 +170,27 @@ private class AsyncEventQueue(
     }
     logTrace(s"Dropping event $event")
 
-    val droppedCount = droppedEventsCounter.get
+    val droppedCount = droppedEventsCounter.get - lastDroppedEventsCounter
+    val lastReportTime = lastReportTimestamp.get
+    val curTime = System.currentTimeMillis()
     if (droppedCount > 0) {
       // Don't log too frequently
-      if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
-        // There may be multiple threads trying to decrease droppedEventsCounter.
-        // Use "compareAndSet" to make sure only one thread can win.
-        // And if another thread is increasing droppedEventsCounter, "compareAndSet" will fail and
-        // then that thread will update it.
-        if (droppedEventsCounter.compareAndSet(droppedCount, 0)) {
-          val prevLastReportTimestamp = lastReportTimestamp
-          lastReportTimestamp = System.currentTimeMillis()
-          val previous = new java.util.Date(prevLastReportTimestamp)
+      if (curTime - lastReportTime >= LOGGING_INTERVAL) {
+        // There may be multiple threads trying to logging dropped events,
+        // Use 'compareAndSet' to make sure only one thread can win.
+        if (lastReportTimestamp.compareAndSet(lastReportTime, curTime)) {
+          val lastReportTime = lastReportTimestamp.get
 
 Review comment:
   Actually this is masking `lastReportTime` from outside the condition, and making the log always print the current time instead of the intended timestamp...

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r361574397
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -213,4 +225,6 @@ private object AsyncEventQueue {
 
   val POISON_PILL = new SparkListenerEvent() { }
 
+  val LOGGING_INTERVAL = 60 * 1000
+
 
 Review comment:
   nit: unnecessary empty line

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-576237689
 
 
   **[Test build #117111 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117111/testReport)** for PR 27002 at commit [`a34d020`](https://github.com/apache/spark/commit/a34d0207b6b763f9ff9732cf5d6646c1d5359152).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27002: [SPARK-30346]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27002: [SPARK-30346]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-568717959
 
 
   Can one of the admins verify this patch?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-586798746
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23276/
   Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r364008558
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -167,20 +170,27 @@ private class AsyncEventQueue(
     }
     logTrace(s"Dropping event $event")
 
-    val droppedCount = droppedEventsCounter.get
+    val droppedCount = droppedEventsCounter.get - lastDroppedEventsCounter
+    val lastReportTime = lastReportTimestamp.get
+    val curTime = System.currentTimeMillis()
     if (droppedCount > 0) {
       // Don't log too frequently
-      if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
-        // There may be multiple threads trying to decrease droppedEventsCounter.
-        // Use "compareAndSet" to make sure only one thread can win.
-        // And if another thread is increasing droppedEventsCounter, "compareAndSet" will fail and
-        // then that thread will update it.
-        if (droppedEventsCounter.compareAndSet(droppedCount, 0)) {
-          val prevLastReportTimestamp = lastReportTimestamp
-          lastReportTimestamp = System.currentTimeMillis()
-          val previous = new java.util.Date(prevLastReportTimestamp)
+      if (curTime - lastReportTime >= LOGGING_INTERVAL) {
+        // There may be multiple threads trying to logging dropped events,
+        // Use 'compareAndSet' to make sure only one thread can win.
+        if (lastReportTimestamp.compareAndSet(lastReportTime, curTime)) {
+          val lastReportTime = lastReportTimestamp.get
 
 Review comment:
   Nice finding. Looks like this line should be just removed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r361573462
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -67,8 +67,11 @@ private class AsyncEventQueue(
   /** A counter for dropped events. It will be reset every time we log it. */
   private val droppedEventsCounter = new AtomicLong(0L)
 
+  /** A counter to keep number of dropped events last time it was logged */
+  private var lastDroppedEventsCounter: Long = 0L
 
 Review comment:
   should have `@volatile` to safely access this.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r361573232
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -167,20 +170,29 @@ private class AsyncEventQueue(
     }
     logTrace(s"Dropping event $event")
 
-    val droppedCount = droppedEventsCounter.get
+    val droppedCount = droppedEventsCounter.get - lastDroppedEventsCounter
+    val lastReportTime = lastReportTimestamp.get
+    val curTime = System.currentTimeMillis()
     if (droppedCount > 0) {
       // Don't log too frequently
-      if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
-        // There may be multiple threads trying to decrease droppedEventsCounter.
-        // Use "compareAndSet" to make sure only one thread can win.
-        // And if another thread is increasing droppedEventsCounter, "compareAndSet" will fail and
-        // then that thread will update it.
-        if (droppedEventsCounter.compareAndSet(droppedCount, 0)) {
 
 Review comment:
   So if I understand correctly, droppedEventsCounter can be changed between L170 to L178 even without contention of L178, as L161 allows other threads to increase the value as well. Do I understand correctly?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-576189254
 
 
   Merged build finished. Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-571344786
 
 
   Merged build finished. Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-571382954
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116192/
   Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-586833013
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118521/
   Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-576191992
 
 
   **[Test build #117111 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117111/testReport)** for PR 27002 at commit [`a34d020`](https://github.com/apache/spark/commit/a34d0207b6b763f9ff9732cf5d6646c1d5359152).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-571382954
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116192/
   Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR edited a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
HeartSaVioR edited a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-571408053
 
 
   Quoting my comment again:
   
   > Spark will introduce a new configuration to log the event and listener name if the execution takes more than configured threshold; it would be more helpful. The commit is 0346afa
   
   This configuration will directly point out which listener & event contributes the lag.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-586833013
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118521/
   Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r361573232
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -167,20 +170,29 @@ private class AsyncEventQueue(
     }
     logTrace(s"Dropping event $event")
 
-    val droppedCount = droppedEventsCounter.get
+    val droppedCount = droppedEventsCounter.get - lastDroppedEventsCounter
+    val lastReportTime = lastReportTimestamp.get
+    val curTime = System.currentTimeMillis()
     if (droppedCount > 0) {
       // Don't log too frequently
-      if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
-        // There may be multiple threads trying to decrease droppedEventsCounter.
-        // Use "compareAndSet" to make sure only one thread can win.
-        // And if another thread is increasing droppedEventsCounter, "compareAndSet" will fail and
-        // then that thread will update it.
-        if (droppedEventsCounter.compareAndSet(droppedCount, 0)) {
 
 Review comment:
   So if I understand correctly, droppedEventsCounter can be changed between L170 to L178 even without contention of L178, as L161 allows other threads to increase the value as well. So if the contention occurs continuously, effectively no one will pass this if statement. Do I understand correctly?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r361573515
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -67,8 +67,11 @@ private class AsyncEventQueue(
   /** A counter for dropped events. It will be reset every time we log it. */
   private val droppedEventsCounter = new AtomicLong(0L)
 
+  /** A counter to keep number of dropped events last time it was logged */
+  private var lastDroppedEventsCounter: Long = 0L
+
   /** When `droppedEventsCounter` was logged last time in milliseconds. */
-  @volatile private var lastReportTimestamp = 0L
+  @volatile private val lastReportTimestamp = new AtomicLong(0L)
 
 Review comment:
   No longer need to have `@volatile` as it becomes val.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-568718244
 
 
   Can one of the admins verify this patch?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r361574151
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -167,20 +170,29 @@ private class AsyncEventQueue(
     }
     logTrace(s"Dropping event $event")
 
-    val droppedCount = droppedEventsCounter.get
+    val droppedCount = droppedEventsCounter.get - lastDroppedEventsCounter
+    val lastReportTime = lastReportTimestamp.get
+    val curTime = System.currentTimeMillis()
     if (droppedCount > 0) {
       // Don't log too frequently
-      if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
-        // There may be multiple threads trying to decrease droppedEventsCounter.
-        // Use "compareAndSet" to make sure only one thread can win.
-        // And if another thread is increasing droppedEventsCounter, "compareAndSet" will fail and
-        // then that thread will update it.
-        if (droppedEventsCounter.compareAndSet(droppedCount, 0)) {
-          val prevLastReportTimestamp = lastReportTimestamp
-          lastReportTimestamp = System.currentTimeMillis()
+      if (curTime - lastReportTime >= LOGGING_INTERVAL) {
+        // There may be multiple threads trying to logging dropped events,
+        // Use 'compareAndSet' to make sure only one thread can win.
+        // After set the 'lastReportTimestamp', the next time we come here will
 
 Review comment:
   Comments in L181-182 might be just redundant as above if statement is clearly showing the intention.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] jiangxb1987 commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
jiangxb1987 commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r379709740
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -167,22 +170,19 @@ private class AsyncEventQueue(
     }
     logTrace(s"Dropping event $event")
 
-    val droppedCount = droppedEventsCounter.get
-    if (droppedCount > 0) {
-      // Don't log too frequently
-      if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
-        // There may be multiple threads trying to decrease droppedEventsCounter.
-        // Use "compareAndSet" to make sure only one thread can win.
-        // And if another thread is increasing droppedEventsCounter, "compareAndSet" will fail and
-        // then that thread will update it.
-        if (droppedEventsCounter.compareAndSet(droppedCount, 0)) {
-          val prevLastReportTimestamp = lastReportTimestamp
-          lastReportTimestamp = System.currentTimeMillis()
-          val previous = new java.util.Date(prevLastReportTimestamp)
+    val droppedCount = droppedEventsCounter.get - lastDroppedEventsCounter
+    val lastReportTime = lastReportTimestamp.get
+    val curTime = System.currentTimeMillis()
+    // Don't log too frequently
+    if (droppedCount > 0 && curTime - lastReportTime >= LOGGING_INTERVAL) {
+        // There may be multiple threads trying to logging dropped events,
+        // Use 'compareAndSet' to make sure only one thread can win.
+        if (lastReportTimestamp.compareAndSet(lastReportTime, curTime)) {
+          val previous = new java.util.Date(lastReportTime)
+          lastDroppedEventsCounter = droppedCount
 
 Review comment:
   IIUC this should be reset to `droppedEventsCounter.get` ?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r361574381
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -167,20 +170,29 @@ private class AsyncEventQueue(
     }
     logTrace(s"Dropping event $event")
 
-    val droppedCount = droppedEventsCounter.get
+    val droppedCount = droppedEventsCounter.get - lastDroppedEventsCounter
+    val lastReportTime = lastReportTimestamp.get
+    val curTime = System.currentTimeMillis()
     if (droppedCount > 0) {
       // Don't log too frequently
-      if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
-        // There may be multiple threads trying to decrease droppedEventsCounter.
-        // Use "compareAndSet" to make sure only one thread can win.
-        // And if another thread is increasing droppedEventsCounter, "compareAndSet" will fail and
-        // then that thread will update it.
-        if (droppedEventsCounter.compareAndSet(droppedCount, 0)) {
-          val prevLastReportTimestamp = lastReportTimestamp
-          lastReportTimestamp = System.currentTimeMillis()
+      if (curTime - lastReportTime >= LOGGING_INTERVAL) {
+        // There may be multiple threads trying to logging dropped events,
+        // Use 'compareAndSet' to make sure only one thread can win.
+        // After set the 'lastReportTimestamp', the next time we come here will
+        // be 60s later.
+        if (lastReportTimestamp.compareAndSet(lastReportTime, curTime)) {
+          val prevLastReportTimestamp = lastReportTimestamp.get
 
 Review comment:
   Here you'll get the updated timestamp (curTime) instead of previous one. use `lastReportTime`.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-576238002
 
 
   Merged build finished. Test FAILed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-586959387
 
 
   thanks, merging to master!

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-586061828
 
 
   @liupc Any update here?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] liupc commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
liupc commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r368448689
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -167,20 +170,27 @@ private class AsyncEventQueue(
     }
     logTrace(s"Dropping event $event")
 
-    val droppedCount = droppedEventsCounter.get
+    val droppedCount = droppedEventsCounter.get - lastDroppedEventsCounter
+    val lastReportTime = lastReportTimestamp.get
+    val curTime = System.currentTimeMillis()
     if (droppedCount > 0) {
       // Don't log too frequently
-      if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
-        // There may be multiple threads trying to decrease droppedEventsCounter.
-        // Use "compareAndSet" to make sure only one thread can win.
-        // And if another thread is increasing droppedEventsCounter, "compareAndSet" will fail and
-        // then that thread will update it.
-        if (droppedEventsCounter.compareAndSet(droppedCount, 0)) {
-          val prevLastReportTimestamp = lastReportTimestamp
-          lastReportTimestamp = System.currentTimeMillis()
-          val previous = new java.util.Date(prevLastReportTimestamp)
+      if (curTime - lastReportTime >= LOGGING_INTERVAL) {
 
 Review comment:
   Done

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-586833003
 
 
   Merged build finished. Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] liupc commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
liupc commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r361300152
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -167,20 +170,29 @@ private class AsyncEventQueue(
     }
     logTrace(s"Dropping event $event")
 
-    val droppedCount = droppedEventsCounter.get
+    val droppedCount = droppedEventsCounter.get - lastDroppedEventsCounter
+    val lastReportTime = lastReportTimestamp.get
+    val curTime = System.currentTimeMillis()
     if (droppedCount > 0) {
       // Don't log too frequently
-      if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
-        // There may be multiple threads trying to decrease droppedEventsCounter.
-        // Use "compareAndSet" to make sure only one thread can win.
-        // And if another thread is increasing droppedEventsCounter, "compareAndSet" will fail and
-        // then that thread will update it.
-        if (droppedEventsCounter.compareAndSet(droppedCount, 0)) {
-          val prevLastReportTimestamp = lastReportTimestamp
-          lastReportTimestamp = System.currentTimeMillis()
+      if (curTime - lastReportTime >= 60 * 1000) {
 
 Review comment:
   Done

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] vanzin commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
vanzin commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r363511842
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -167,20 +170,29 @@ private class AsyncEventQueue(
     }
     logTrace(s"Dropping event $event")
 
-    val droppedCount = droppedEventsCounter.get
+    val droppedCount = droppedEventsCounter.get - lastDroppedEventsCounter
+    val lastReportTime = lastReportTimestamp.get
+    val curTime = System.currentTimeMillis()
     if (droppedCount > 0) {
       // Don't log too frequently
-      if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
-        // There may be multiple threads trying to decrease droppedEventsCounter.
-        // Use "compareAndSet" to make sure only one thread can win.
-        // And if another thread is increasing droppedEventsCounter, "compareAndSet" will fail and
-        // then that thread will update it.
-        if (droppedEventsCounter.compareAndSet(droppedCount, 0)) {
-          val prevLastReportTimestamp = lastReportTimestamp
-          lastReportTimestamp = System.currentTimeMillis()
+      if (curTime - lastReportTime >= LOGGING_INTERVAL) {
+        // There may be multiple threads trying to logging dropped events,
+        // Use 'compareAndSet' to make sure only one thread can win.
+        // After set the 'lastReportTimestamp', the next time we come here will
+        // be 60s later.
+        if (lastReportTimestamp.compareAndSet(lastReportTime, curTime)) {
+          val prevLastReportTimestamp = lastReportTimestamp.get
           val previous = new java.util.Date(prevLastReportTimestamp)
+          lastDroppedEventsCounter = droppedCount
           logWarning(s"Dropped $droppedCount events from $name since " +
             s"${if (prevLastReportTimestamp == 0) "the application started" else s"$previous"}.")
+          // Logging thread dump when events from appStatus was dropped
 
 Review comment:
   Agree that this log is less useful than it might seem at first.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-586799832
 
 
   **[Test build #118521 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118521/testReport)** for PR 27002 at commit [`a216f58`](https://github.com/apache/spark/commit/a216f58ca2eb991408097d3f9aadc6ce07b31f8d).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-571344300
 
 
   **[Test build #116192 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116192/testReport)** for PR 27002 at commit [`5bf09c2`](https://github.com/apache/spark/commit/5bf09c264a9b147b8b1eda5bb32dfc16640e5b58).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on issue #27002: [SPARK-30346]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on issue #27002: [SPARK-30346]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-569179996
 
 
   cc. @vanzin 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] liupc commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
liupc commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r363199041
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -167,20 +170,29 @@ private class AsyncEventQueue(
     }
     logTrace(s"Dropping event $event")
 
-    val droppedCount = droppedEventsCounter.get
+    val droppedCount = droppedEventsCounter.get - lastDroppedEventsCounter
+    val lastReportTime = lastReportTimestamp.get
+    val curTime = System.currentTimeMillis()
     if (droppedCount > 0) {
       // Don't log too frequently
-      if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
-        // There may be multiple threads trying to decrease droppedEventsCounter.
-        // Use "compareAndSet" to make sure only one thread can win.
-        // And if another thread is increasing droppedEventsCounter, "compareAndSet" will fail and
-        // then that thread will update it.
-        if (droppedEventsCounter.compareAndSet(droppedCount, 0)) {
-          val prevLastReportTimestamp = lastReportTimestamp
-          lastReportTimestamp = System.currentTimeMillis()
+      if (curTime - lastReportTime >= LOGGING_INTERVAL) {
+        // There may be multiple threads trying to logging dropped events,
+        // Use 'compareAndSet' to make sure only one thread can win.
+        // After set the 'lastReportTimestamp', the next time we come here will
+        // be 60s later.
+        if (lastReportTimestamp.compareAndSet(lastReportTime, curTime)) {
+          val prevLastReportTimestamp = lastReportTimestamp.get
 
 Review comment:
   Done

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-586799832
 
 
   **[Test build #118521 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118521/testReport)** for PR 27002 at commit [`a216f58`](https://github.com/apache/spark/commit/a216f58ca2eb991408097d3f9aadc6ce07b31f8d).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-571344786
 
 
   Merged build finished. Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-571344802
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20986/
   Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r361572976
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -167,20 +170,29 @@ private class AsyncEventQueue(
     }
     logTrace(s"Dropping event $event")
 
-    val droppedCount = droppedEventsCounter.get
+    val droppedCount = droppedEventsCounter.get - lastDroppedEventsCounter
+    val lastReportTime = lastReportTimestamp.get
+    val curTime = System.currentTimeMillis()
     if (droppedCount > 0) {
       // Don't log too frequently
-      if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
-        // There may be multiple threads trying to decrease droppedEventsCounter.
-        // Use "compareAndSet" to make sure only one thread can win.
-        // And if another thread is increasing droppedEventsCounter, "compareAndSet" will fail and
-        // then that thread will update it.
-        if (droppedEventsCounter.compareAndSet(droppedCount, 0)) {
-          val prevLastReportTimestamp = lastReportTimestamp
-          lastReportTimestamp = System.currentTimeMillis()
+      if (curTime - lastReportTime >= LOGGING_INTERVAL) {
+        // There may be multiple threads trying to logging dropped events,
+        // Use 'compareAndSet' to make sure only one thread can win.
+        // After set the 'lastReportTimestamp', the next time we come here will
+        // be 60s later.
+        if (lastReportTimestamp.compareAndSet(lastReportTime, curTime)) {
+          val prevLastReportTimestamp = lastReportTimestamp.get
           val previous = new java.util.Date(prevLastReportTimestamp)
+          lastDroppedEventsCounter = droppedCount
           logWarning(s"Dropped $droppedCount events from $name since " +
             s"${if (prevLastReportTimestamp == 0) "the application started" else s"$previous"}.")
+          // Logging thread dump when events from appStatus was dropped
 
 Review comment:
   I'm not sure how much this helps; as current event being processed may not be culprit to lag the overall executions. 
   
   Spark will introduce a new configuration to log the event and listener name if the execution takes more than configured threshold; it would be more helpful. The commit is 
   https://github.com/apache/spark/commit/0346afa8fc348aa1b3f5110df747a64e3b2da388

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #27002: [SPARK-30346]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27002: [SPARK-30346]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-568718244
 
 
   Can one of the admins verify this patch?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] seanli-rallyhealth commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
seanli-rallyhealth commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r361223862
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -67,8 +67,11 @@ private class AsyncEventQueue(
   /** A counter for dropped events. It will be reset every time we log it. */
   private val droppedEventsCounter = new AtomicLong(0L)
 
+  /** A counter to keep dropped events count last time it was logged */
 
 Review comment:
   hint: "keep number of dropped events"

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-588013199
 
 
   also merged to branch-3.0.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] liupc commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
liupc commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-571377672
 
 
   Thanks for reply @vanzin, I think this log can still do much help even after it was logged at first, in real cluster, the logging always configured as rolling by size, and the log is rolling fast, expecially for SparkSQL thriftserver case. With this log we can still know what happened and probably find out what make it slow with the threaddump even after the log was rolled.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] liupc commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
liupc commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r363198001
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -67,8 +67,11 @@ private class AsyncEventQueue(
   /** A counter for dropped events. It will be reset every time we log it. */
   private val droppedEventsCounter = new AtomicLong(0L)
 
+  /** A counter to keep number of dropped events last time it was logged */
+  private var lastDroppedEventsCounter: Long = 0L
+
   /** When `droppedEventsCounter` was logged last time in milliseconds. */
-  @volatile private var lastReportTimestamp = 0L
+  @volatile private val lastReportTimestamp = new AtomicLong(0L)
 
 Review comment:
   Yes, I'ill update it.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-586832445
 
 
   **[Test build #118521 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118521/testReport)** for PR 27002 at commit [`a216f58`](https://github.com/apache/spark/commit/a216f58ca2eb991408097d3f9aadc6ce07b31f8d).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-576192761
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21876/
   Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] vanzin commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
vanzin commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r363990881
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -167,20 +170,27 @@ private class AsyncEventQueue(
     }
     logTrace(s"Dropping event $event")
 
-    val droppedCount = droppedEventsCounter.get
+    val droppedCount = droppedEventsCounter.get - lastDroppedEventsCounter
+    val lastReportTime = lastReportTimestamp.get
+    val curTime = System.currentTimeMillis()
     if (droppedCount > 0) {
       // Don't log too frequently
-      if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
-        // There may be multiple threads trying to decrease droppedEventsCounter.
-        // Use "compareAndSet" to make sure only one thread can win.
-        // And if another thread is increasing droppedEventsCounter, "compareAndSet" will fail and
-        // then that thread will update it.
-        if (droppedEventsCounter.compareAndSet(droppedCount, 0)) {
-          val prevLastReportTimestamp = lastReportTimestamp
-          lastReportTimestamp = System.currentTimeMillis()
-          val previous = new java.util.Date(prevLastReportTimestamp)
+      if (curTime - lastReportTime >= LOGGING_INTERVAL) {
+        // There may be multiple threads trying to logging dropped events,
+        // Use 'compareAndSet' to make sure only one thread can win.
+        if (lastReportTimestamp.compareAndSet(lastReportTime, curTime)) {
+          val lastReportTime = lastReportTimestamp.get
+          val previous = new java.util.Date(lastReportTime)
+          lastDroppedEventsCounter = droppedCount
           logWarning(s"Dropped $droppedCount events from $name since " +
-            s"${if (prevLastReportTimestamp == 0) "the application started" else s"$previous"}.")
+            s"${if (lastReportTime == 0) "the application started" else s"$previous"}.")
+          // Logging thread dump when events from appStatus was dropped
+          Utils.getThreadDumpForThread(dispatchThread.getId).foreach { thread =>
+            if (thread.threadName.contains(LiveListenerBus.APP_STATUS_QUEUE)) {
+              logWarning(s"Event dropped!!!Thread dump from dispatch thread " +
 
 Review comment:
   If you want to keep this I'd demote it from warning to info or even debug. The separate functionality to measure each event's processing time is really the better option for debugging things here.
   
   (And also, in that case, gate the whole block with e.g. `log.isDebugEnabled`.)

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r361573462
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -67,8 +67,11 @@ private class AsyncEventQueue(
   /** A counter for dropped events. It will be reset every time we log it. */
   private val droppedEventsCounter = new AtomicLong(0L)
 
+  /** A counter to keep number of dropped events last time it was logged */
+  private var lastDroppedEventsCounter: Long = 0L
 
 Review comment:
   should have `@volatile` to safely get/update this across multiple threads.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-571379982
 
 
   The thing is that the current event dispatcher is processing now may not be the culprit of slowdown, and the log could give wrong information which points out innocent one which would lead us wasting time to investigate. What's worse, we cannot determine whether the information is correct or not. Please correct me if I'm missing here.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan closed pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
cloud-fan closed pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-586798741
 
 
   Merged build finished. Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-571408053
 
 
   Quoting my comment again:
   
   > Spark will introduce a new configuration to log the event and listener name if the execution takes more than configured threshold; it would be more helpful. The commit is 0346afa

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] jiangxb1987 commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
jiangxb1987 commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r379206377
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -167,20 +170,29 @@ private class AsyncEventQueue(
     }
     logTrace(s"Dropping event $event")
 
-    val droppedCount = droppedEventsCounter.get
+    val droppedCount = droppedEventsCounter.get - lastDroppedEventsCounter
+    val lastReportTime = lastReportTimestamp.get
+    val curTime = System.currentTimeMillis()
     if (droppedCount > 0) {
       // Don't log too frequently
-      if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
-        // There may be multiple threads trying to decrease droppedEventsCounter.
-        // Use "compareAndSet" to make sure only one thread can win.
-        // And if another thread is increasing droppedEventsCounter, "compareAndSet" will fail and
-        // then that thread will update it.
-        if (droppedEventsCounter.compareAndSet(droppedCount, 0)) {
 
 Review comment:
   Can we just remove this check and use `droppedEventsCounter.getAndSet()` instead?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-586798746
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23276/
   Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] liupc commented on issue #27002: [SPARK-30346]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
liupc commented on issue #27002: [SPARK-30346]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-568886889
 
 
   @seanli-rallyhealth Thanks for reply, I've updated as commented.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] vanzin commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
vanzin commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-571342214
 
 
   ok to test

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-576189263
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21874/
   Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-576192761
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21876/
   Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-571382948
 
 
   Merged build finished. Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] liupc commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
liupc commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r361300181
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -67,8 +67,11 @@ private class AsyncEventQueue(
   /** A counter for dropped events. It will be reset every time we log it. */
   private val droppedEventsCounter = new AtomicLong(0L)
 
+  /** A counter to keep dropped events count last time it was logged */
 
 Review comment:
   Done

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] seanli-rallyhealth commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
seanli-rallyhealth commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r361223967
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -67,8 +67,11 @@ private class AsyncEventQueue(
   /** A counter for dropped events. It will be reset every time we log it. */
   private val droppedEventsCounter = new AtomicLong(0L)
 
+  /** A counter to keep dropped events count last time it was logged */
+  private var lastDroppedEventsCounter: Long = 0L
+
   /** When `droppedEventsCounter` was logged last time in milliseconds. */
-  @volatile private var lastReportTimestamp = 0L
+  @volatile private var lastReportTimestamp = new AtomicLong(0L)
 
 Review comment:
   no need to be var, use val set()

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-576189254
 
 
   Merged build finished. Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-576192753
 
 
   Merged build finished. Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r379828940
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -167,22 +170,19 @@ private class AsyncEventQueue(
     }
     logTrace(s"Dropping event $event")
 
-    val droppedCount = droppedEventsCounter.get
-    if (droppedCount > 0) {
-      // Don't log too frequently
-      if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
-        // There may be multiple threads trying to decrease droppedEventsCounter.
-        // Use "compareAndSet" to make sure only one thread can win.
-        // And if another thread is increasing droppedEventsCounter, "compareAndSet" will fail and
-        // then that thread will update it.
-        if (droppedEventsCounter.compareAndSet(droppedCount, 0)) {
-          val prevLastReportTimestamp = lastReportTimestamp
-          lastReportTimestamp = System.currentTimeMillis()
-          val previous = new java.util.Date(prevLastReportTimestamp)
+    val droppedCount = droppedEventsCounter.get - lastDroppedEventsCounter
+    val lastReportTime = lastReportTimestamp.get
+    val curTime = System.currentTimeMillis()
+    // Don't log too frequently
+    if (droppedCount > 0 && curTime - lastReportTime >= LOGGING_INTERVAL) {
+        // There may be multiple threads trying to logging dropped events,
+        // Use 'compareAndSet' to make sure only one thread can win.
+        if (lastReportTimestamp.compareAndSet(lastReportTime, curTime)) {
+          val previous = new java.util.Date(lastReportTime)
+          lastDroppedEventsCounter = droppedCount
 
 Review comment:
   Nice finding. I'd use a variable to store `droppedEventsCounter.get` before calculating droppedCount and use it for consistency.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-586798741
 
 
   Merged build finished. Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-571344802
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20986/
   Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-576189263
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21874/
   Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-571156244
 
 
   adding cc. @cloud-fan and @squito for expanding visibility

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-571344300
 
 
   **[Test build #116192 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116192/testReport)** for PR 27002 at commit [`5bf09c2`](https://github.com/apache/spark/commit/5bf09c264a9b147b8b1eda5bb32dfc16640e5b58).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] liupc commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
liupc commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r363197377
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -167,20 +170,29 @@ private class AsyncEventQueue(
     }
     logTrace(s"Dropping event $event")
 
-    val droppedCount = droppedEventsCounter.get
+    val droppedCount = droppedEventsCounter.get - lastDroppedEventsCounter
+    val lastReportTime = lastReportTimestamp.get
+    val curTime = System.currentTimeMillis()
     if (droppedCount > 0) {
       // Don't log too frequently
-      if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
-        // There may be multiple threads trying to decrease droppedEventsCounter.
-        // Use "compareAndSet" to make sure only one thread can win.
-        // And if another thread is increasing droppedEventsCounter, "compareAndSet" will fail and
-        // then that thread will update it.
-        if (droppedEventsCounter.compareAndSet(droppedCount, 0)) {
 
 Review comment:
   @HeartSaVioR  Yes, that's what I mean. In some cases, the log can never be logged and it's hard to know what happened that may causing it slow.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-576238007
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117111/
   Test FAILed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #27002: [SPARK-30346]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27002: [SPARK-30346]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-568717959
 
 
   Can one of the admins verify this patch?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] liupc commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
liupc commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-571405141
 
 
   @HeartSaVioR I agree that this log can not definitely tell us where the slow must be, but at least can provide some clues, it's no worse than no clues. Moreover, I think if there are some hot points are really slow, then there are larger probability that it will appear in the periodic log. 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR edited a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
HeartSaVioR edited a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-571379982
 
 
   The thing is that the current event dispatcher is processing now may not be the culprit of slowdown, and the log could give wrong information which points out innocent one which would lead us wasting time to investigate. What's worse, there's no way to determine whether the information is correct or not. Please correct me if I'm missing here.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r364008558
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -167,20 +170,27 @@ private class AsyncEventQueue(
     }
     logTrace(s"Dropping event $event")
 
-    val droppedCount = droppedEventsCounter.get
+    val droppedCount = droppedEventsCounter.get - lastDroppedEventsCounter
+    val lastReportTime = lastReportTimestamp.get
+    val curTime = System.currentTimeMillis()
     if (droppedCount > 0) {
       // Don't log too frequently
-      if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
-        // There may be multiple threads trying to decrease droppedEventsCounter.
-        // Use "compareAndSet" to make sure only one thread can win.
-        // And if another thread is increasing droppedEventsCounter, "compareAndSet" will fail and
-        // then that thread will update it.
-        if (droppedEventsCounter.compareAndSet(droppedCount, 0)) {
-          val prevLastReportTimestamp = lastReportTimestamp
-          lastReportTimestamp = System.currentTimeMillis()
-          val previous = new java.util.Date(prevLastReportTimestamp)
+      if (curTime - lastReportTime >= LOGGING_INTERVAL) {
+        // There may be multiple threads trying to logging dropped events,
+        // Use 'compareAndSet' to make sure only one thread can win.
+        if (lastReportTimestamp.compareAndSet(lastReportTime, curTime)) {
+          val lastReportTime = lastReportTimestamp.get
 
 Review comment:
   Nice finding. Looks like this line should be removed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-576238007
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117111/
   Test FAILed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-576191992
 
 
   **[Test build #117111 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117111/testReport)** for PR 27002 at commit [`a34d020`](https://github.com/apache/spark/commit/a34d0207b6b763f9ff9732cf5d6646c1d5359152).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] liupc commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
liupc commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r379969140
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -167,22 +170,19 @@ private class AsyncEventQueue(
     }
     logTrace(s"Dropping event $event")
 
-    val droppedCount = droppedEventsCounter.get
-    if (droppedCount > 0) {
-      // Don't log too frequently
-      if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
-        // There may be multiple threads trying to decrease droppedEventsCounter.
-        // Use "compareAndSet" to make sure only one thread can win.
-        // And if another thread is increasing droppedEventsCounter, "compareAndSet" will fail and
-        // then that thread will update it.
-        if (droppedEventsCounter.compareAndSet(droppedCount, 0)) {
-          val prevLastReportTimestamp = lastReportTimestamp
-          lastReportTimestamp = System.currentTimeMillis()
-          val previous = new java.util.Date(prevLastReportTimestamp)
+    val droppedCount = droppedEventsCounter.get - lastDroppedEventsCounter
+    val lastReportTime = lastReportTimestamp.get
+    val curTime = System.currentTimeMillis()
+    // Don't log too frequently
+    if (droppedCount > 0 && curTime - lastReportTime >= LOGGING_INTERVAL) {
+        // There may be multiple threads trying to logging dropped events,
 
 Review comment:
   Done

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] liupc commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
liupc commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r363199734
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -167,20 +170,29 @@ private class AsyncEventQueue(
     }
     logTrace(s"Dropping event $event")
 
-    val droppedCount = droppedEventsCounter.get
+    val droppedCount = droppedEventsCounter.get - lastDroppedEventsCounter
+    val lastReportTime = lastReportTimestamp.get
+    val curTime = System.currentTimeMillis()
     if (droppedCount > 0) {
       // Don't log too frequently
-      if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
-        // There may be multiple threads trying to decrease droppedEventsCounter.
-        // Use "compareAndSet" to make sure only one thread can win.
-        // And if another thread is increasing droppedEventsCounter, "compareAndSet" will fail and
-        // then that thread will update it.
-        if (droppedEventsCounter.compareAndSet(droppedCount, 0)) {
-          val prevLastReportTimestamp = lastReportTimestamp
-          lastReportTimestamp = System.currentTimeMillis()
+      if (curTime - lastReportTime >= LOGGING_INTERVAL) {
+        // There may be multiple threads trying to logging dropped events,
+        // Use 'compareAndSet' to make sure only one thread can win.
+        // After set the 'lastReportTimestamp', the next time we come here will
 
 Review comment:
   Done

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r361573593
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -67,8 +67,11 @@ private class AsyncEventQueue(
   /** A counter for dropped events. It will be reset every time we log it. */
 
 Review comment:
   Let's update this comment as well since we no longer reset the counter.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-576192753
 
 
   Merged build finished. Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] liupc commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
liupc commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r368445746
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -167,20 +170,27 @@ private class AsyncEventQueue(
     }
     logTrace(s"Dropping event $event")
 
-    val droppedCount = droppedEventsCounter.get
+    val droppedCount = droppedEventsCounter.get - lastDroppedEventsCounter
+    val lastReportTime = lastReportTimestamp.get
+    val curTime = System.currentTimeMillis()
     if (droppedCount > 0) {
       // Don't log too frequently
-      if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
-        // There may be multiple threads trying to decrease droppedEventsCounter.
-        // Use "compareAndSet" to make sure only one thread can win.
-        // And if another thread is increasing droppedEventsCounter, "compareAndSet" will fail and
-        // then that thread will update it.
-        if (droppedEventsCounter.compareAndSet(droppedCount, 0)) {
-          val prevLastReportTimestamp = lastReportTimestamp
-          lastReportTimestamp = System.currentTimeMillis()
-          val previous = new java.util.Date(prevLastReportTimestamp)
+      if (curTime - lastReportTime >= LOGGING_INTERVAL) {
+        // There may be multiple threads trying to logging dropped events,
+        // Use 'compareAndSet' to make sure only one thread can win.
+        if (lastReportTimestamp.compareAndSet(lastReportTime, curTime)) {
+          val lastReportTime = lastReportTimestamp.get
+          val previous = new java.util.Date(lastReportTime)
+          lastDroppedEventsCounter = droppedCount
           logWarning(s"Dropped $droppedCount events from $name since " +
-            s"${if (prevLastReportTimestamp == 0) "the application started" else s"$previous"}.")
+            s"${if (lastReportTime == 0) "the application started" else s"$previous"}.")
+          // Logging thread dump when events from appStatus was dropped
+          Utils.getThreadDumpForThread(dispatchThread.getId).foreach { thread =>
+            if (thread.threadName.contains(LiveListenerBus.APP_STATUS_QUEUE)) {
+              logWarning(s"Event dropped!!!Thread dump from dispatch thread " +
 
 Review comment:
   @vanzin  Sorry for late reply, I looked into [#25702](https://github.com/apache/spark/pull/25702), I think you are right, measuring each event's processing time will be better. I'll remove this.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r368493390
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -167,22 +170,19 @@ private class AsyncEventQueue(
     }
     logTrace(s"Dropping event $event")
 
-    val droppedCount = droppedEventsCounter.get
-    if (droppedCount > 0) {
-      // Don't log too frequently
-      if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
-        // There may be multiple threads trying to decrease droppedEventsCounter.
-        // Use "compareAndSet" to make sure only one thread can win.
-        // And if another thread is increasing droppedEventsCounter, "compareAndSet" will fail and
-        // then that thread will update it.
-        if (droppedEventsCounter.compareAndSet(droppedCount, 0)) {
-          val prevLastReportTimestamp = lastReportTimestamp
-          lastReportTimestamp = System.currentTimeMillis()
-          val previous = new java.util.Date(prevLastReportTimestamp)
+    val droppedCount = droppedEventsCounter.get - lastDroppedEventsCounter
+    val lastReportTime = lastReportTimestamp.get
+    val curTime = System.currentTimeMillis()
+    // Don't log too frequently
+    if (droppedCount > 0 && curTime - lastReportTime >= LOGGING_INTERVAL) {
+        // There may be multiple threads trying to logging dropped events,
 
 Review comment:
   Please fix indent, as we've consolidated two nested `if` into one because of reducing necessary indentation.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] liupc commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
liupc commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r363199682
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -213,4 +225,6 @@ private object AsyncEventQueue {
 
   val POISON_PILL = new SparkListenerEvent() { }
 
+  val LOGGING_INTERVAL = 60 * 1000
+
 
 Review comment:
   Done

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] vanzin commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
vanzin commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r363511770
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -167,20 +170,27 @@ private class AsyncEventQueue(
     }
     logTrace(s"Dropping event $event")
 
-    val droppedCount = droppedEventsCounter.get
+    val droppedCount = droppedEventsCounter.get - lastDroppedEventsCounter
+    val lastReportTime = lastReportTimestamp.get
+    val curTime = System.currentTimeMillis()
     if (droppedCount > 0) {
       // Don't log too frequently
-      if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
-        // There may be multiple threads trying to decrease droppedEventsCounter.
-        // Use "compareAndSet" to make sure only one thread can win.
-        // And if another thread is increasing droppedEventsCounter, "compareAndSet" will fail and
-        // then that thread will update it.
-        if (droppedEventsCounter.compareAndSet(droppedCount, 0)) {
-          val prevLastReportTimestamp = lastReportTimestamp
-          lastReportTimestamp = System.currentTimeMillis()
-          val previous = new java.util.Date(prevLastReportTimestamp)
+      if (curTime - lastReportTime >= LOGGING_INTERVAL) {
 
 Review comment:
   This can be merged with the previous condition to avoid extra indentation.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-586833003
 
 
   Merged build finished. Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-571382519
 
 
   **[Test build #116192 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116192/testReport)** for PR 27002 at commit [`5bf09c2`](https://github.com/apache/spark/commit/5bf09c264a9b147b8b1eda5bb32dfc16640e5b58).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-571382948
 
 
   Merged build finished. Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] liupc commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
liupc commented on issue #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#issuecomment-586795164
 
 
   > @liupc Any update here?
   
   Sorry for late reply, already fixed the indent.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] liupc commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
liupc commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r379969435
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -167,22 +170,19 @@ private class AsyncEventQueue(
     }
     logTrace(s"Dropping event $event")
 
-    val droppedCount = droppedEventsCounter.get
-    if (droppedCount > 0) {
-      // Don't log too frequently
-      if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
-        // There may be multiple threads trying to decrease droppedEventsCounter.
-        // Use "compareAndSet" to make sure only one thread can win.
-        // And if another thread is increasing droppedEventsCounter, "compareAndSet" will fail and
-        // then that thread will update it.
-        if (droppedEventsCounter.compareAndSet(droppedCount, 0)) {
-          val prevLastReportTimestamp = lastReportTimestamp
-          lastReportTimestamp = System.currentTimeMillis()
-          val previous = new java.util.Date(prevLastReportTimestamp)
+    val droppedCount = droppedEventsCounter.get - lastDroppedEventsCounter
+    val lastReportTime = lastReportTimestamp.get
+    val curTime = System.currentTimeMillis()
+    // Don't log too frequently
+    if (droppedCount > 0 && curTime - lastReportTime >= LOGGING_INTERVAL) {
+        // There may be multiple threads trying to logging dropped events,
+        // Use 'compareAndSet' to make sure only one thread can win.
+        if (lastReportTimestamp.compareAndSet(lastReportTime, curTime)) {
+          val previous = new java.util.Date(lastReportTime)
+          lastDroppedEventsCounter = droppedCount
 
 Review comment:
   @jiangxb1987 @HeartSaVioR Thanks, I've fixed it and I think it's more clear to store `droppedEventsCounter.get` before calculating droppedCount.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] liupc commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
liupc commented on a change in pull request #27002: [SPARK-30346][CORE]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r363197914
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -67,8 +67,11 @@ private class AsyncEventQueue(
   /** A counter for dropped events. It will be reset every time we log it. */
   private val droppedEventsCounter = new AtomicLong(0L)
 
+  /** A counter to keep number of dropped events last time it was logged */
+  private var lastDroppedEventsCounter: Long = 0L
 
 Review comment:
   Good comments, I will update.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] liupc commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped

Posted by GitBox <gi...@apache.org>.
liupc commented on a change in pull request #27002: [SPARK-30346]Improve logging when events dropped
URL: https://github.com/apache/spark/pull/27002#discussion_r361300167
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala
 ##########
 @@ -67,8 +67,11 @@ private class AsyncEventQueue(
   /** A counter for dropped events. It will be reset every time we log it. */
   private val droppedEventsCounter = new AtomicLong(0L)
 
+  /** A counter to keep dropped events count last time it was logged */
+  private var lastDroppedEventsCounter: Long = 0L
+
   /** When `droppedEventsCounter` was logged last time in milliseconds. */
-  @volatile private var lastReportTimestamp = 0L
+  @volatile private var lastReportTimestamp = new AtomicLong(0L)
 
 Review comment:
   Done

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org