Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/10/01 08:12:03 UTC

[GitHub] [spark] HeartSaVioR commented on a change in pull request #24936: [SPARK-24634][SS] Add a new metric regarding number of rows later than watermark plus allowed delay

HeartSaVioR commented on a change in pull request #24936: [SPARK-24634][SS] Add a new metric regarding number of rows later than watermark plus allowed delay
URL: https://github.com/apache/spark/pull/24936#discussion_r329928176
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ProgressReporter.scala
 ##########
 @@ -201,26 +201,32 @@ trait ProgressReporter extends Logging {
   }
 
   /** Extract statistics about stateful operators from the executed query plan. */
-  private def extractStateOperatorMetrics(hasNewData: Boolean): Seq[StateOperatorProgress] = {
+  private def extractStateOperatorMetrics(
+      hasNewData: Boolean,
+      runBatch: Boolean): Seq[StateOperatorProgress] = {
     if (lastExecution == null) return Nil
-    // lastExecution could belong to one of the previous triggers if `!hasNewData`.
+    // lastExecution could belong to one of the previous triggers if `!hasNewData && !runBatch`.
     // Walking the plan again should be inexpensive.
     lastExecution.executedPlan.collect {
       case p if p.isInstanceOf[StateStoreWriter] =>
         val progress = p.asInstanceOf[StateStoreWriter].getProgress()
-        if (hasNewData) progress else progress.copy(newNumRowsUpdated = 0)
+        if (hasNewData || runBatch) {
 
 Review comment:
  `runBatch` is needed here because we don't want to reset the value of `newNumLateInputRows` if a batch ran, even if that batch ran with empty data.
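
  To illustrate the intent, here is a minimal standalone sketch rather than the actual code in this PR. `ProgressSketch` and `ProgressReportSketch.report` are hypothetical stand-ins for `StateOperatorProgress` and the body of `extractStateOperatorMetrics`. The `else` branch, which zeroes both per-batch counters, is an assumption, since the hunk above is cut off before that branch.

```scala
// Hypothetical stand-in for StateOperatorProgress, reduced to the two
// per-batch counters discussed in this comment.
final case class ProgressSketch(newNumRowsUpdated: Long, newNumLateInputRows: Long)

object ProgressReportSketch {
  // Keep the counters when the last execution really belongs to this trigger,
  // i.e. there was new data or a batch ran (possibly with empty data).
  // Otherwise lastExecution is stale, so the per-batch counters are zeroed
  // (assumed behavior of the else branch).
  def report(last: ProgressSketch, hasNewData: Boolean, runBatch: Boolean): ProgressSketch =
    if (hasNewData || runBatch) last
    else last.copy(newNumRowsUpdated = 0L, newNumLateInputRows = 0L)
}
```

  With this shape, a batch that ran without new data still reports its late-row count, while a trigger that ran no batch at all reports zeros instead of repeating the previous batch's values.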
