Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/11/30 08:05:33 UTC

[GitHub] [spark] uncleGen opened a new pull request #30545: [SPARK-33596][SS] NPE when there is no watermark metrics

uncleGen opened a new pull request #30545:
URL: https://github.com/apache/spark/pull/30545


   ### What changes were proposed in this pull request?
   Add a check for whether watermark metrics are present.
   
   
   ### Why are the changes needed?
   An NPE is thrown when there are no watermark metrics, which makes some UTs flaky.
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   Existing UTs.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] uncleGen edited a comment on pull request #30545: [SPARK-33596][SS] NPE when there is no watermark metrics

Posted by GitBox <gi...@apache.org>.
uncleGen edited a comment on pull request #30545:
URL: https://github.com/apache/spark/pull/30545#issuecomment-735624193


   cc @HeartSaVioR @xuanyuanking 




[GitHub] [spark] AmplabJenkins commented on pull request #30545: [SPARK-33596][SS] NPE when there is no watermark metrics

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30545:
URL: https://github.com/apache/spark/pull/30545#issuecomment-735669733








[GitHub] [spark] uncleGen commented on a change in pull request #30545: [SPARK-33596][SS] NPE when there is no watermark metrics

Posted by GitBox <gi...@apache.org>.
uncleGen commented on a change in pull request #30545:
URL: https://github.com/apache/spark/pull/30545#discussion_r532411294



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/streaming/ui/StreamingQueryStatisticsPage.scala
##########
@@ -150,10 +150,10 @@ private[ui] class StreamingQueryStatisticsPage(parent: StreamingQueryTab)
     if (query.lastProgress.eventTime.containsKey("watermark")) {
       val watermarkData = query.recentProgress.flatMap { p =>
         val batchTimestamp = parseProgressTimestamp(p.timestamp)
-        val watermarkValue = parseProgressTimestamp(p.eventTime.get("watermark"))

Review comment:
       When there are no watermark metrics, `p.eventTime.get("watermark")` returns `null`, and an NPE is thrown in `parseProgressTimestamp`.
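
A minimal sketch of the kind of guard this comment describes, assuming `eventTime` is a `java.util.Map[String, String]` as the diff context suggests. `WatermarkSketch`, `parseTimestamp`, and `watermarkMillis` are hypothetical names, and the timestamp pattern is an assumption for illustration rather than something taken from the Spark source.

```scala
import java.text.SimpleDateFormat
import java.util.{Map => JMap, TimeZone}

object WatermarkSketch {
  // Hypothetical stand-in for parseProgressTimestamp: parses a UTC
  // timestamp such as "2020-11-30T08:05:33.000Z" into epoch milliseconds.
  def parseTimestamp(ts: String): Long = {
    val format = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'")
    format.setTimeZone(TimeZone.getTimeZone("UTC"))
    format.parse(ts).getTime
  }

  // java.util.Map.get returns null for a missing key. Option(null) is None,
  // so the parser is only called when a watermark value is present.
  def watermarkMillis(eventTime: JMap[String, String]): Option[Long] =
    Option(eventTime.get("watermark")).map(parseTimestamp)
}
```

Returning an `Option` lets the caller skip a batch that has no watermark instead of failing while parsing it.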






[GitHub] [spark] uncleGen commented on a change in pull request #30545: [SPARK-33596][SS] NPE when there is no watermark metrics

Posted by GitBox <gi...@apache.org>.
uncleGen commented on a change in pull request #30545:
URL: https://github.com/apache/spark/pull/30545#discussion_r532511631



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/streaming/ui/StreamingQueryStatisticsPage.scala
##########
@@ -150,10 +150,10 @@ private[ui] class StreamingQueryStatisticsPage(parent: StreamingQueryTab)
     if (query.lastProgress.eventTime.containsKey("watermark")) {
       val watermarkData = query.recentProgress.flatMap { p =>
         val batchTimestamp = parseProgressTimestamp(p.timestamp)
-        val watermarkValue = parseProgressTimestamp(p.eventTime.get("watermark"))

Review comment:
       @HeartSaVioR @xuanyuanking Ignore this false alarm






[GitHub] [spark] xuanyuanking commented on a change in pull request #30545: [SPARK-33596][SS] NPE when there is no watermark metrics

Posted by GitBox <gi...@apache.org>.
xuanyuanking commented on a change in pull request #30545:
URL: https://github.com/apache/spark/pull/30545#discussion_r532438591



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/streaming/ui/StreamingQueryStatisticsPage.scala
##########
@@ -150,10 +150,10 @@ private[ui] class StreamingQueryStatisticsPage(parent: StreamingQueryTab)
     if (query.lastProgress.eventTime.containsKey("watermark")) {
       val watermarkData = query.recentProgress.flatMap { p =>
         val batchTimestamp = parseProgressTimestamp(p.timestamp)
-        val watermarkValue = parseProgressTimestamp(p.eventTime.get("watermark"))

Review comment:
       https://github.com/apache/spark/blob/b665d5881915f042930f502bcc3c6ee3cb00c50d/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ProgressReporter.scala#L244-L246
   
   It's possible that `lastProgress.eventTime` contains `watermark` but some entries in `recentProgress` don't, when `hasNewData` = true.
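
To illustrate the scenario described here, a small sketch under stated assumptions: `Progress` is a hypothetical, simplified stand-in for a progress entry, not the real Spark class, and `watermarks` is an invented helper. It checks each entry individually, so an earlier batch without a `watermark` key is skipped even when the latest batch has one.

```scala
import java.util.{Map => JMap}

object RecentProgressSketch {
  // Hypothetical, simplified progress entry: a batch id plus event-time stats.
  final case class Progress(batchId: Long, eventTime: JMap[String, String])

  // flatMap drops every entry whose eventTime has no "watermark" key,
  // rather than relying on a single check against the last entry.
  def watermarks(recent: Seq[Progress]): Seq[(Long, String)] =
    recent.flatMap { p =>
      Option(p.eventTime.get("watermark")).map(w => (p.batchId, w))
    }
}
```

A per-entry check like this does not depend on whether every recent batch reported a watermark.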






[GitHub] [spark] uncleGen commented on pull request #30545: [SPARK-33596][SS] NPE when there is no watermark metrics

Posted by GitBox <gi...@apache.org>.
uncleGen commented on pull request #30545:
URL: https://github.com/apache/spark/pull/30545#issuecomment-735624193


   cc @HeartSaVioR 




[GitHub] [spark] SparkQA commented on pull request #30545: [SPARK-33596][SS] NPE when there is no watermark metrics

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30545:
URL: https://github.com/apache/spark/pull/30545#issuecomment-735635751


   **[Test build #131969 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131969/testReport)** for PR 30545 at commit [`c9c4978`](https://github.com/apache/spark/commit/c9c4978486b51f576b2fc4737f93e9fa272620c1).




[GitHub] [spark] uncleGen closed pull request #30545: [SPARK-33596][SS] NPE when there is no watermark metrics

Posted by GitBox <gi...@apache.org>.
uncleGen closed pull request #30545:
URL: https://github.com/apache/spark/pull/30545


   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30545: [SPARK-33596][SS] NPE when there is no watermark metrics

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30545:
URL: https://github.com/apache/spark/pull/30545#issuecomment-735730013








[GitHub] [spark] xuanyuanking commented on a change in pull request #30545: [SPARK-33596][SS] NPE when there is no watermark metrics

Posted by GitBox <gi...@apache.org>.
xuanyuanking commented on a change in pull request #30545:
URL: https://github.com/apache/spark/pull/30545#discussion_r532420800



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/streaming/ui/StreamingQueryStatisticsPage.scala
##########
@@ -150,10 +150,10 @@ private[ui] class StreamingQueryStatisticsPage(parent: StreamingQueryTab)
     if (query.lastProgress.eventTime.containsKey("watermark")) {
       val watermarkData = query.recentProgress.flatMap { p =>
         val batchTimestamp = parseProgressTimestamp(p.timestamp)
-        val watermarkValue = parseProgressTimestamp(p.eventTime.get("watermark"))

Review comment:
       Could you give more details about the NPE? We already check the `watermark` key in lastProgress.






[GitHub] [spark] SparkQA commented on pull request #30545: [SPARK-33596][SS] NPE when there is no watermark metrics

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30545:
URL: https://github.com/apache/spark/pull/30545#issuecomment-735729693


   **[Test build #131969 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131969/testReport)** for PR 30545 at commit [`c9c4978`](https://github.com/apache/spark/commit/c9c4978486b51f576b2fc4737f93e9fa272620c1).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins commented on pull request #30545: [SPARK-33596][SS] NPE when there is no watermark metrics

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30545:
URL: https://github.com/apache/spark/pull/30545#issuecomment-735730013








[GitHub] [spark] uncleGen commented on pull request #30545: [SPARK-33596][SS] NPE when there is no watermark metrics

Posted by GitBox <gi...@apache.org>.
uncleGen commented on pull request #30545:
URL: https://github.com/apache/spark/pull/30545#issuecomment-735644088


   I cannot reproduce this anymore, so I'm closing the PR for now.




[GitHub] [spark] SparkQA removed a comment on pull request #30545: [SPARK-33596][SS] NPE when there is no watermark metrics

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30545:
URL: https://github.com/apache/spark/pull/30545#issuecomment-735635751


   **[Test build #131969 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131969/testReport)** for PR 30545 at commit [`c9c4978`](https://github.com/apache/spark/commit/c9c4978486b51f576b2fc4737f93e9fa272620c1).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30545: [SPARK-33596][SS] NPE when there is no watermark metrics

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30545:
URL: https://github.com/apache/spark/pull/30545#issuecomment-735669733








[GitHub] [spark] HeartSaVioR commented on a change in pull request #30545: [SPARK-33596][SS] NPE when there is no watermark metrics

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on a change in pull request #30545:
URL: https://github.com/apache/spark/pull/30545#discussion_r532423401



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/streaming/ui/StreamingQueryStatisticsPage.scala
##########
@@ -150,10 +150,10 @@ private[ui] class StreamingQueryStatisticsPage(parent: StreamingQueryTab)
     if (query.lastProgress.eventTime.containsKey("watermark")) {
       val watermarkData = query.recentProgress.flatMap { p =>
         val batchTimestamp = parseProgressTimestamp(p.timestamp)
-        val watermarkValue = parseProgressTimestamp(p.eventTime.get("watermark"))

Review comment:
       Is there a case where an earlier batch doesn't have "watermark" while a later batch does? It'd be nice if you could point to where you encountered this.



