You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/10/13 11:45:37 UTC

[GitHub] [spark] akiyamaneko opened a new pull request #30030: [SPARK-33132] Fix The 'Shuffle Read Size / Records' field in Stage Summary metrics was shown as 'NaN Undefined'

akiyamaneko opened a new pull request #30030:
URL: https://github.com/apache/spark/pull/30030


   
   ### What changes were proposed in this pull request?
   when bytesRead metric was negative, ensure the `Shuffle Read Size / Records` shows normal.
   
   
   ### Why are the changes needed?
   Improve metric display on Summary Metrics o  Stage UI.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   It's a small change, just manual test.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gengliangwang commented on a change in pull request #30030: [SPARK-33132][WEBUI] Make `formatBytes` return `0.0 B` for negative input instead of `NaN`

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on a change in pull request #30030:
URL: https://github.com/apache/spark/pull/30030#discussion_r504095721



##########
File path: core/src/main/resources/org/apache/spark/ui/static/utils.js
##########
@@ -39,7 +39,7 @@ function formatDuration(milliseconds) {
 
 function formatBytes(bytes, type) {
     if (type !== 'display') return bytes;
-    if (bytes == 0) return '0.0 B';
+    if (bytes <= 0) return '0.0 B';

Review comment:
       @akiyamaneko  Thanks for the fix!
   BTW, when will the `bytes` be negative here? 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] akiyamaneko commented on a change in pull request #30030: [SPARK-33132][WEBUI] Make `formatBytes` return `0.0 B` for negative input instead of `NaN`

Posted by GitBox <gi...@apache.org>.
akiyamaneko commented on a change in pull request #30030:
URL: https://github.com/apache/spark/pull/30030#discussion_r508747975



##########
File path: core/src/main/resources/org/apache/spark/ui/static/utils.js
##########
@@ -39,7 +39,7 @@ function formatDuration(milliseconds) {
 
 function formatBytes(bytes, type) {
     if (type !== 'display') return bytes;
-    if (bytes == 0) return '0.0 B';
+    if (bytes <= 0) return '0.0 B';

Review comment:
       @gengliangwang sorry for the delay,I had tried to reproduce the problem 
   When many tasks  failed  in a  stage,  the `shuffleTotalReads` metric can be -1L sometime([storeTypes.scala](https://github.com/apache/spark/blob/46ad325e56abd95c0ffdbe64aad78582da8c725d/core/src/main/scala/org/apache/spark/status/storeTypes.scala#L344))
   
   the code: [AppStatusStore#taskSummary](https://github.com/apache/spark/blob/46ad325e56abd95c0ffdbe64aad78582da8c725d/core/src/main/scala/org/apache/spark/status/AppStatusStore.scala#L165) may has bug, It doesn't consider negative values: [AppStatusStore.toValues](https://github.com/apache/spark/blob/46ad325e56abd95c0ffdbe64aad78582da8c725d/core/src/main/scala/org/apache/spark/status/AppStatusStore.scala#L206) and [AppStatusStore.scanTasks](https://github.com/apache/spark/blob/46ad325e56abd95c0ffdbe64aad78582da8c725d/core/src/main/scala/org/apache/spark/status/AppStatusStore.scala#L267)
   
   do you think is it necessary to ensure the non-negative metric value ? 
   I have found the restAPI:`http://host:18081/api/v1/applications/application_Id/stages/stageId/attemptId/taskSummary` can return negative metrics values like `shuffleReadMetrics.readBytes`, `totalBlocksFetched`.
   
   
   
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] maropu commented on pull request #30030: [SPARK-33132][WEBUI] Fix The 'Shuffle Read Size / Records' field in Stage Summary metrics was shown as 'NaN Undefined'

Posted by GitBox <gi...@apache.org>.
maropu commented on pull request #30030:
URL: https://github.com/apache/spark/pull/30030#issuecomment-707690073


   cc: @sarutak @gengliangwang 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on pull request #30030: [SPARK-33132][WEBUI] Make `formatBytes` return `0.0 B` for negative input instead of `NaN`

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #30030:
URL: https://github.com/apache/spark/pull/30030#issuecomment-707862268


   +1, LGTM. Thank you for your first contribution, @akiyamaneko .
   - This is merged to `master` branch for Apache Spark 3.1.0.
   - I added you to the Apache Spark contributor group.
   - SPARK-33132 is assigned to you.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #30030: [SPARK-33132][WEBUI] Fix The 'Shuffle Read Size / Records' field in Stage Summary metrics was shown as 'NaN Undefined'

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #30030:
URL: https://github.com/apache/spark/pull/30030#discussion_r504091343



##########
File path: core/src/main/resources/org/apache/spark/ui/static/utils.js
##########
@@ -39,7 +39,7 @@ function formatDuration(milliseconds) {
 
 function formatBytes(bytes, type) {
     if (type !== 'display') return bytes;
-    if (bytes == 0) return '0.0 B';
+    if (bytes <= 0) return '0.0 B';

Review comment:
       Okay. We already prevent `-Infinity` at `Math.log(0)`. It looks safe if we prevent `NaN` at `Math.log(-1)`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun closed pull request #30030: [SPARK-33132][WEBUI] Make `formatBytes` return `0.0 B` for negative input instead of `NaN`

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun closed pull request #30030:
URL: https://github.com/apache/spark/pull/30030


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gengliangwang commented on a change in pull request #30030: [SPARK-33132][WEBUI] Make `formatBytes` return `0.0 B` for negative input instead of `NaN`

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on a change in pull request #30030:
URL: https://github.com/apache/spark/pull/30030#discussion_r509125870



##########
File path: core/src/main/resources/org/apache/spark/ui/static/utils.js
##########
@@ -39,7 +39,7 @@ function formatDuration(milliseconds) {
 
 function formatBytes(bytes, type) {
     if (type !== 'display') return bytes;
-    if (bytes == 0) return '0.0 B';
+    if (bytes <= 0) return '0.0 B';

Review comment:
       @akiyamaneko It seems that the negative value is made on purpose to represent that there is no such metrics. I think it's fine for now since there are no complaints about this.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] akiyamaneko commented on a change in pull request #30030: [SPARK-33132][WEBUI] Make `formatBytes` return `0.0 B` for negative input instead of `NaN`

Posted by GitBox <gi...@apache.org>.
akiyamaneko commented on a change in pull request #30030:
URL: https://github.com/apache/spark/pull/30030#discussion_r508747975



##########
File path: core/src/main/resources/org/apache/spark/ui/static/utils.js
##########
@@ -39,7 +39,7 @@ function formatDuration(milliseconds) {
 
 function formatBytes(bytes, type) {
     if (type !== 'display') return bytes;
-    if (bytes == 0) return '0.0 B';
+    if (bytes <= 0) return '0.0 B';

Review comment:
       @gengliangwang sorry for the delay,I had tried to reproduce the problem 
   When many tasks  failed  in a  stage,  the `shuffleTotalReads` metric can be -1L sometime([storeTypes.scala](https://github.com/apache/spark/blob/46ad325e56abd95c0ffdbe64aad78582da8c725d/core/src/main/scala/org/apache/spark/status/storeTypes.scala#L344))
   
   the code: [AppStatusStore#taskSummary](https://github.com/apache/spark/blob/46ad325e56abd95c0ffdbe64aad78582da8c725d/core/src/main/scala/org/apache/spark/status/AppStatusStore.scala#L165) may has bug, It doesn't consider negative values: [AppStatusStore.toValues](https://github.com/apache/spark/blob/46ad325e56abd95c0ffdbe64aad78582da8c725d/core/src/main/scala/org/apache/spark/status/AppStatusStore.scala#L206) and [AppStatusStore.scanTasks](https://github.com/apache/spark/blob/46ad325e56abd95c0ffdbe64aad78582da8c725d/core/src/main/scala/org/apache/spark/status/AppStatusStore.scala#L267)
   
   Do you think is it necessary to ensure the non-negative metric value ? 
   I have found the restAPI:`http://host:18081/api/v1/applications/application_Id/stages/stageId/attemptId/taskSummary` sometimes return negative metrics values like `shuffleReadMetrics.readBytes`, `totalBlocksFetched`.
   
   
   
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org