You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "thejdeep (via GitHub)" <gi...@apache.org> on 2023/05/02 16:38:18 UTC

[GitHub] [spark] thejdeep opened a new pull request, #41017: [SPARK-43334] [UI] Fix error while serializing ExecutorPeakMetricsDistributions into API response

thejdeep opened a new pull request, #41017:
URL: https://github.com/apache/spark/pull/41017

    When we calculate the quantile information from the peak executor metrics values for the distribution, there is a possibility of running into an `ArrayIndexOutOfBounds` exception when the metric values are empty. This PR addresses that and fixes it by returning an empty array if the values are empty.
   
    ### Why are the changes needed?
    Without these changes, when the withDetails query parameter is used to query the stages REST API, we encounter a partial JSON response since the peak executor metrics distribution cannot be serialized due to the above index error.
   
    ### Does this PR introduce _any_ user-facing change?
    No
   
    ### How was this patch tested?
    Added a unit test to test this behavior
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on a diff in pull request #41017: [SPARK-43334] [UI] Fix error while serializing ExecutorPeakMetricsDistributions into API response

Posted by "srowen (via GitHub)" <gi...@apache.org>.
srowen commented on code in PR #41017:
URL: https://github.com/apache/spark/pull/41017#discussion_r1195325477


##########
core/src/main/scala/org/apache/spark/util/Utils.scala:
##########
@@ -232,6 +232,18 @@ private[spark] object Utils extends Logging {
     // scalastyle:on classforname
   }
 
+  def getQuantilesValue(
+    values: IndexedSeq[Double],
+    quantiles: Array[Double]): IndexedSeq[Double] = {
+    val count = values.size
+    val indices = quantiles.map { q => math.min((q * count).toLong, count - 1) }
+    if (count > 0) {
+      indices.map(i => values(i.toInt)).toIndexedSeq
+    } else {
+      quantiles.map(_ => 0.0)

Review Comment:
   Just return a Seq of the right length filled with 0.0?



##########
core/src/main/scala/org/apache/spark/util/Utils.scala:
##########
@@ -232,6 +232,18 @@ private[spark] object Utils extends Logging {
     // scalastyle:on classforname
   }
 
+  def getQuantilesValue(
+    values: IndexedSeq[Double],
+    quantiles: Array[Double]): IndexedSeq[Double] = {
+    val count = values.size
+    val indices = quantiles.map { q => math.min((q * count).toLong, count - 1) }

Review Comment:
   Move this inside 'if' branch



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] thejdeep commented on pull request #41017: [SPARK-43334] [UI] Fix error while serializing ExecutorPeakMetricsDistributions into API response

Posted by "thejdeep (via GitHub)" <gi...@apache.org>.
thejdeep commented on PR #41017:
URL: https://github.com/apache/spark/pull/41017#issuecomment-1561383823

   @srowen Thanks for the review! Updated the PR after addressing the comments, can you please take another pass ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] thejdeep commented on a diff in pull request #41017: [SPARK-43334] [UI] Fix error while serializing ExecutorPeakMetricsDistributions into API response

Posted by "thejdeep (via GitHub)" <gi...@apache.org>.
thejdeep commented on code in PR #41017:
URL: https://github.com/apache/spark/pull/41017#discussion_r1202751611


##########
core/src/main/scala/org/apache/spark/status/api/v1/api.scala:
##########
@@ -454,13 +455,11 @@ class ExecutorMetricsDistributions private[spark](
 class ExecutorPeakMetricsDistributions private[spark](

Review Comment:
   Yes, this changes applies generally to all quantile calculations. Thanks.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] mridulm commented on pull request #41017: [SPARK-43334] [UI] Fix error while serializing ExecutorPeakMetricsDistributions into API response

Posted by "mridulm (via GitHub)" <gi...@apache.org>.
mridulm commented on PR #41017:
URL: https://github.com/apache/spark/pull/41017#issuecomment-1549099812

   Can you fix the conflicts please ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on a diff in pull request #41017: [SPARK-43334] [UI] Fix error while serializing ExecutorPeakMetricsDistributions into API response

Posted by "srowen (via GitHub)" <gi...@apache.org>.
srowen commented on code in PR #41017:
URL: https://github.com/apache/spark/pull/41017#discussion_r1204491448


##########
core/src/main/scala/org/apache/spark/util/Utils.scala:
##########
@@ -199,6 +199,18 @@ private[spark] object Utils extends Logging with SparkClassUtils {
     Try { classForName(clazz, initialize = false) }.isSuccess
   }
 
+  def getQuantilesValue(

Review Comment:
   I don't see a good reason to put this in Utils; it's not used more widely?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] thejdeep commented on a diff in pull request #41017: [SPARK-43334] [UI] Fix error while serializing ExecutorPeakMetricsDistributions into API response

Posted by "thejdeep (via GitHub)" <gi...@apache.org>.
thejdeep commented on code in PR #41017:
URL: https://github.com/apache/spark/pull/41017#discussion_r1202733724


##########
core/src/main/scala/org/apache/spark/util/Utils.scala:
##########
@@ -232,6 +232,18 @@ private[spark] object Utils extends Logging {
     // scalastyle:on classforname
   }
 
+  def getQuantilesValue(
+    values: IndexedSeq[Double],
+    quantiles: Array[Double]): IndexedSeq[Double] = {
+    val count = values.size
+    val indices = quantiles.map { q => math.min((q * count).toLong, count - 1) }

Review Comment:
   Addressed it, thanks.



##########
core/src/main/scala/org/apache/spark/util/Utils.scala:
##########
@@ -232,6 +232,18 @@ private[spark] object Utils extends Logging {
     // scalastyle:on classforname
   }
 
+  def getQuantilesValue(
+    values: IndexedSeq[Double],
+    quantiles: Array[Double]): IndexedSeq[Double] = {
+    val count = values.size
+    val indices = quantiles.map { q => math.min((q * count).toLong, count - 1) }
+    if (count > 0) {
+      indices.map(i => values(i.toInt)).toIndexedSeq
+    } else {
+      quantiles.map(_ => 0.0)

Review Comment:
   Thanks for the suggestion, addressed it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] mridulm commented on pull request #41017: [SPARK-43334] [UI] Fix error while serializing ExecutorPeakMetricsDistributions into API response

Posted by "mridulm (via GitHub)" <gi...@apache.org>.
mridulm commented on PR #41017:
URL: https://github.com/apache/spark/pull/41017#issuecomment-1549102618

   +CC @AngersZhuuuu who last worked on this.
   Also +CC @srowen who reviewed the previous changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] thejdeep commented on a diff in pull request #41017: [SPARK-43334] [UI] Fix error while serializing ExecutorPeakMetricsDistributions into API response

Posted by "thejdeep (via GitHub)" <gi...@apache.org>.
thejdeep commented on code in PR #41017:
URL: https://github.com/apache/spark/pull/41017#discussion_r1204496674


##########
core/src/main/scala/org/apache/spark/util/Utils.scala:
##########
@@ -199,6 +199,18 @@ private[spark] object Utils extends Logging with SparkClassUtils {
     Try { classForName(clazz, initialize = false) }.isSuccess
   }
 
+  def getQuantilesValue(

Review Comment:
   I was indifferent to putting it in Utils but since it is currently used in two different classes, decided to put it there.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen closed pull request #41017: [SPARK-43334] [UI] Fix error while serializing ExecutorPeakMetricsDistributions into API response

Posted by "srowen (via GitHub)" <gi...@apache.org>.
srowen closed pull request #41017: [SPARK-43334] [UI] Fix error while serializing ExecutorPeakMetricsDistributions into API response
URL: https://github.com/apache/spark/pull/41017


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on pull request #41017: [SPARK-43334] [UI] Fix error while serializing ExecutorPeakMetricsDistributions into API response

Posted by "srowen (via GitHub)" <gi...@apache.org>.
srowen commented on PR #41017:
URL: https://github.com/apache/spark/pull/41017#issuecomment-1562049449

   Merged to master


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] rmcyang commented on a diff in pull request #41017: [SPARK-43334] [UI] Fix error while serializing ExecutorPeakMetricsDistributions into API response

Posted by "rmcyang (via GitHub)" <gi...@apache.org>.
rmcyang commented on code in PR #41017:
URL: https://github.com/apache/spark/pull/41017#discussion_r1182988496


##########
core/src/main/scala/org/apache/spark/status/api/v1/api.scala:
##########
@@ -454,13 +455,11 @@ class ExecutorMetricsDistributions private[spark](
 class ExecutorPeakMetricsDistributions private[spark](

Review Comment:
   Could this change also apply to other kinds of MetricsDistributions?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] thejdeep commented on a diff in pull request #41017: [SPARK-43334] [UI] Fix error while serializing ExecutorPeakMetricsDistributions into API response

Posted by "thejdeep (via GitHub)" <gi...@apache.org>.
thejdeep commented on code in PR #41017:
URL: https://github.com/apache/spark/pull/41017#discussion_r1204652875


##########
core/src/main/scala/org/apache/spark/util/Utils.scala:
##########
@@ -199,6 +199,18 @@ private[spark] object Utils extends Logging with SparkClassUtils {
     Try { classForName(clazz, initialize = false) }.isSuccess
   }
 
+  def getQuantilesValue(

Review Comment:
   Moved it to `AppStatusUtils`, thanks.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on a diff in pull request #41017: [SPARK-43334] [UI] Fix error while serializing ExecutorPeakMetricsDistributions into API response

Posted by "srowen (via GitHub)" <gi...@apache.org>.
srowen commented on code in PR #41017:
URL: https://github.com/apache/spark/pull/41017#discussion_r1204511522


##########
core/src/main/scala/org/apache/spark/util/Utils.scala:
##########
@@ -199,6 +199,18 @@ private[spark] object Utils extends Logging with SparkClassUtils {
     Try { classForName(clazz, initialize = false) }.isSuccess
   }
 
+  def getQuantilesValue(

Review Comment:
   Can it go somewhere 'closer' to the places it is used? it may not be worth refactoring



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org