Posted to issues@spark.apache.org by "Wenchen Fan (JIRA)" <ji...@apache.org> on 2018/01/11 11:44:00 UTC

[jira] [Resolved] (SPARK-20657) Speed up Stage page

     [ https://issues.apache.org/jira/browse/SPARK-20657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan resolved SPARK-20657.
---------------------------------
       Resolution: Fixed
    Fix Version/s: 2.3.0

Issue resolved by pull request 20013
[https://github.com/apache/spark/pull/20013]

> Speed up Stage page
> -------------------
>
>                 Key: SPARK-20657
>                 URL: https://issues.apache.org/jira/browse/SPARK-20657
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Web UI
>    Affects Versions: 2.3.0
>            Reporter: Marcelo Vanzin
>            Assignee: Marcelo Vanzin
>             Fix For: 2.3.0
>
>
> The Stage page in the UI is very slow when a large number of tasks exist (tens of thousands). The new work being done in SPARK-18085 makes that worse, since it adds potential disk access to the mix.
> Much of the slowness comes from the code loading all the tasks into memory, sorting a very large list, and then running many calculations over all the data. Both the sort and the calculations can be avoided with the new app state store: smarter indices let data be read from the store already sorted in the desired order, and statistics about metrics can be kept pre-calculated instead of being recomputed on every page access.
> Then only the tasks on the current page (100 items by default) actually need to be loaded. This also saves a lot of memory, not just CPU time.
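
For readers who want a concrete picture of the approach described above, here is a minimal sketch in Python. It is not the code from the pull request and not Spark's actual state store API. `TaskStore`, `MetricSummary`, `add_task` and `page` are hypothetical names, and a sorted in-memory list stands in for a real indexed store. It only illustrates the three ideas: keep an index sorted by the metric, update summary statistics as tasks arrive, and read just one page of 100 items per request.

```python
import bisect
from dataclasses import dataclass


@dataclass
class MetricSummary:
    """Running statistics, updated on write instead of recomputed per page view."""
    count: int = 0
    total: int = 0
    maximum: int = 0

    def add(self, value: int) -> None:
        self.count += 1
        self.total += value
        self.maximum = max(self.maximum, value)

    @property
    def mean(self) -> float:
        return self.total / self.count if self.count else 0.0


class TaskStore:
    """Hypothetical stand-in for a store with an index sorted by task duration."""

    def __init__(self) -> None:
        # Index entries are (duration_ms, task_id), always kept in sorted order.
        self._by_duration: list[tuple[int, int]] = []
        self.duration_summary = MetricSummary()

    def add_task(self, task_id: int, duration_ms: int) -> None:
        # Insert at the sorted position so reads never need a full sort.
        bisect.insort(self._by_duration, (duration_ms, task_id))
        self.duration_summary.add(duration_ms)

    def page(self, page_number: int, page_size: int = 100,
             descending: bool = False) -> list[tuple[int, int]]:
        """Return only the entries for one page (page_number starts at 1)."""
        start = (page_number - 1) * page_size
        if descending:
            n = len(self._by_duration)
            lo = max(n - start - page_size, 0)
            hi = max(n - start, 0)
            return self._by_duration[lo:hi][::-1]
        return self._by_duration[start:start + page_size]


demo = TaskStore()
for task_id, ms in [(1, 30), (2, 10), (3, 20)]:
    demo.add_task(task_id, ms)

print(demo.page(1, page_size=2))                    # [(10, 2), (20, 3)]
print(demo.page(1, page_size=2, descending=True))   # [(30, 1), (20, 3)]
print(demo.duration_summary.mean)                   # 20.0
```

In this sketch the cost of rendering a page depends on the page size rather than on the total number of tasks, and the summary is available without touching the task list at all. A real store would keep the index on disk or in a tree structure rather than in a Python list, where each insert shifts elements.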



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)