Posted to issues@spark.apache.org by "Mridul Muralidharan (Jira)" <ji...@apache.org> on 2022/03/08 18:47:00 UTC

[jira] [Resolved] (SPARK-38309) SHS has incorrect percentiles for shuffle read bytes and shuffle total blocks metrics

     [ https://issues.apache.org/jira/browse/SPARK-38309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mridul Muralidharan resolved SPARK-38309.
-----------------------------------------
    Fix Version/s: 3.3.0
                   3.2.2
                   3.1.3
       Resolution: Fixed

Issue resolved by pull request 35637
[https://github.com/apache/spark/pull/35637]

> SHS has incorrect percentiles for shuffle read bytes and shuffle total blocks metrics
> -------------------------------------------------------------------------------------
>
>                 Key: SPARK-38309
>                 URL: https://issues.apache.org/jira/browse/SPARK-38309
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.1.0
>            Reporter: Rob Reeves
>            Assignee: Rob Reeves
>            Priority: Major
>              Labels: correctness
>             Fix For: 3.3.0, 3.2.2, 3.1.3
>
>         Attachments: image-2022-02-23-14-19-33-255.png
>
>
> *Background*
> In [this PR|https://github.com/apache/spark/pull/26508] (SPARK-26260), the SHS stage metric percentiles were updated to include only successful tasks when disk storage is used. It does this by storing each metric value as a negative number when the task is not in a successful state. This approach was chosen to avoid breaking changes to the disk storage format. See [this comment|https://github.com/apache/spark/pull/26508#issuecomment-554540314] for context.
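> A minimal sketch of that encoding (a hypothetical helper, not the actual Spark code):
> {code:scala}
> // Hypothetical sketch of the SPARK-26260 trick: successful tasks index the
> // raw metric value (>= 0); all other tasks index a strictly negative value,
> // so an ascending scan that starts at 0 never reaches them.
> def indexedMetric(taskSucceeded: Boolean, value: Long): Long =
>   if (taskSucceeded) value
>   else math.min(-1L, -value) // stays below 0 even when the raw value is 0
> {code}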
> To compute the percentiles, it reads the metric values in ascending order, starting at 0. Because all non-successful tasks have values below 0, this filters them out. To find the value at each percentile, it scales the percentile to an index into the list of successful tasks. For example, with 200 tasks and percentiles [0, 25, 50, 75, 100], the lookup indexes in the task collection are [0, 50, 100, 150, 199].
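> As an illustration, the scaling could be written like this (simplified; the real store walks an indexed view rather than a materialized list):
> {code:scala}
> // Map each requested quantile onto a position in the sorted collection of
> // successful tasks. With successCount = 200 and quantiles 0, 0.25, 0.5,
> // 0.75, 1.0 this yields indexes 0, 50, 100, 150, 199, as in the example.
> def lookupIndices(quantiles: Seq[Double], successCount: Int): Seq[Int] =
>   quantiles.map(q => math.min((q * successCount).toInt, successCount - 1))
> {code}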
> *Issue*
> For two metrics, 1) shuffle total reads and 2) shuffle total blocks, the above PR incorrectly makes the indexed metric values positive. This means tasks that are not successful are included in the percentile calculations. The percentile lookup index calculation is still based on the number of successful tasks, so the wrong task metric is returned for a given percentile. This was not caught because the unit test only verified values for one metric, executorRunTime.
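> A hypothetical illustration of the effect (not the literal code from the PR):
> {code:scala}
> // Correctly handled metrics index a stored value that is negative for
> // non-successful tasks, so the ascending scan starting at 0 skips them.
> def correctIndex(stored: Long): Long = stored
>
> // For shuffle total reads and shuffle total blocks the indexed value ended
> // up positive, so non-successful tasks re-entered the percentile scan.
> def buggyIndex(stored: Long): Long = math.abs(stored)
> {code}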
> *Steps to Reproduce*
> _SHS UI_
>  # Find a Spark application in the SHS that has failed tasks for a stage with shuffle read.
>  # Navigate to the stage UI.
>  # Look at the max shuffle read size in the summary metrics.
>  # Sort the tasks by shuffle read size, descending. You'll see the top value doesn't match the max from step 3.
>  
> !image-2022-02-23-14-19-33-255.png|width=789,height=389!
> _API_
>  # For the same stage in the above repro steps, make a request to the task summary endpoint (e.g. /api/v1/applications/application_1632281309592_21294517/1/stages/6/0/taskSummary?quantiles=0,0.25,0.5,0.75,1.0)
>  # Look at the shuffleReadMetrics.readBytes and shuffleReadMetrics.totalBlocksFetched values. You will see -2 for at least some of the lower percentiles, and the positive values will also be wrong. (A request sketch follows below.)
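> A minimal Scala sketch of the API repro (history-server host and application/stage IDs are placeholders for your own environment):
> {code:scala}
> import scala.io.Source
>
> // Fetch the task summary for stage 6, attempt 0, from a local history
> // server and print the raw JSON for inspection.
> object TaskSummaryRepro {
>   def main(args: Array[String]): Unit = {
>     val url = "http://localhost:18080/api/v1/applications/<appId>/1/" +
>       "stages/6/0/taskSummary?quantiles=0,0.25,0.5,0.75,1.0"
>     val json = Source.fromURL(url).mkString
>     println(json) // check shuffleReadMetrics.readBytes and totalBlocksFetched
>   }
> }
> {code}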


