Posted to issues@spark.apache.org by "Carlos Fuertes (JIRA)" <ji...@apache.org> on 2014/08/01 00:29:38 UTC

[jira] [Commented] (SPARK-2017) web ui stage page becomes unresponsive when the number of tasks is large

    [ https://issues.apache.org/jira/browse/SPARK-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081598#comment-14081598 ] 

Carlos Fuertes commented on SPARK-2017:
---------------------------------------

I ran some tests with the approach that sends the task data as JSON. If you run a job with 50k tasks:

sc.parallelize(1 to 1000000, 50000).count()

the JSON [/stages/stage/tasks/json/?id=0] that represents the tasks table is ~15 MB if you download it. You can fetch the JSON in a few seconds, but the UI [/stages/stage/?id=0] still takes forever to render it (the summary still shows up, though).
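
To put that figure in perspective, here is a rough back-of-the-envelope estimate. It assumes the payload grows linearly with the number of tasks and reads "~15Mb" as 15,000,000 bytes; both are assumptions, and the object and method names are made up for illustration.

```scala
object PayloadEstimate {
  // Measured in the test above: roughly 15 MB of JSON for 50,000 tasks.
  // Treating "~15Mb" as 15,000,000 bytes is an assumption.
  val measuredBytes: Long = 15L * 1000 * 1000
  val measuredTasks: Long = 50000L

  // Average JSON size of one task row, in bytes.
  val bytesPerTask: Long = measuredBytes / measuredTasks

  // Assumes the payload size grows linearly with the number of tasks.
  def estimateBytes(tasks: Long): Long = bytesPerTask * tasks
}
```

That works out to about 300 bytes per task, so the one-million-task case from the issue description would be on the order of 300 MB of JSON if the linear assumption holds.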

I did not change how the table is rendered (no pagination or anything else), and I am still using sorttable to make the table sortable.

Maybe just converting to JSON is too simple, and you still have to stream the data if you want to handle 50k tasks and more while keeping the browser responsive. And/or incorporate pagination directly, with a global index for the tasks on the back end.
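
A minimal sketch of what back-end pagination over a global task index could look like. This is not the existing Spark UI code: TaskRow, TaskPage, TaskPaging and pageOf are hypothetical names, and the real task table carries many more columns.

```scala
// Hypothetical row type; the real UI would carry many more task metrics.
final case class TaskRow(index: Int, taskId: Long, durationMs: Long)

// One page of the tasks table, plus what the client needs to ask for another page.
final case class TaskPage(rows: Seq[TaskRow], page: Int, totalPages: Int, totalTasks: Int)

object TaskPaging {
  // Returns page `page` (1-based) of `tasks`, which is assumed to be already
  // ordered by the global task index. Out-of-range pages are clamped.
  def pageOf(tasks: IndexedSeq[TaskRow], page: Int, pageSize: Int): TaskPage = {
    require(pageSize > 0, "pageSize must be positive")
    // Ceiling division; an empty table still has one (empty) page.
    val totalPages = math.max(1, (tasks.size + pageSize - 1) / pageSize)
    val clamped = math.min(math.max(page, 1), totalPages)
    val from = (clamped - 1) * pageSize
    // slice is bounds-safe, so the last page may simply be shorter.
    TaskPage(tasks.slice(from, from + pageSize), clamped, totalPages, tasks.size)
  }
}
```

With 50k tasks and a page size of 100, TaskPaging.pageOf(tasks, 3, 100) would return only the rows with index 200 to 299, so the browser never has to render the full table at once. Sorting by another column would then have to happen on the back end before slicing, rather than in sorttable.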











> web ui stage page becomes unresponsive when the number of tasks is large
> ------------------------------------------------------------------------
>
>                 Key: SPARK-2017
>                 URL: https://issues.apache.org/jira/browse/SPARK-2017
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Web UI
>            Reporter: Reynold Xin
>              Labels: starter
>
> {code}
> sc.parallelize(1 to 1000000, 1000000).count()
> {code}
> The above code creates one million tasks to be executed. The stage detail web ui page takes forever to load (if it ever completes).
> There are again a few different alternatives:
> 0. Limit the number of tasks we show.
> 1. Pagination
> 2. By default only show the aggregate metrics and failed tasks, and hide the successful ones.



--
This message was sent by Atlassian JIRA
(v6.2#6252)