Posted to issues@spark.apache.org by "Lantao Jin (JIRA)" <ji...@apache.org> on 2019/06/27 12:09:00 UTC

[jira] [Updated] (SPARK-28183) Add a task status filter for taskList in REST API

     [ https://issues.apache.org/jira/browse/SPARK-28183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lantao Jin updated SPARK-28183:
-------------------------------
    Description: 
We have a scenario in which our application needs to query failed tasks through the REST API {{/applications/[app-id]/stages/[stage-id]/[stage-attempt-id]/taskList}} while a Spark job is running. In a large stage, it may need to pick out a few dozen failed tasks from hundreds of thousands of tasks in total. Fetching every task only to discard most of them consumes unnecessary memory and time on both the Spark side and the application side.




  was:
We have a scenario in which our application needs to query failed tasks through the REST API {{/applications/[app-id]/stages/[stage-id]/[stage-attempt-id]/taskList}} while a Spark job is running. A large stage may contain hundreds of thousands of tasks in total. Although the API offers paginated queries via {{?offset=[offset]&length=[len]}}, this still has two disadvantages:
1. The application still needs to fetch all tasks, which consumes unnecessary memory and time on both the Spark side and the application side.
2. Paginating via {{?offset=[offset]&length=[len]}} makes the client logic much more complex, and the client still has to process all tasks.
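
To illustrate the two points above, here is a minimal Python sketch of the client-side workaround: page through taskList with offset/length and keep only the failed tasks. The endpoint path and the offset/length parameters come from the description; the base URL, the "status" field name, the "FAILED" value and the helper names (task_list_url, fetch_json, failed_tasks) are assumptions made for illustration only.

```python
import json
from urllib.request import urlopen


def task_list_url(base_url, app_id, stage_id, attempt_id, offset, length):
    """Build the taskList URL for one page (path taken from the issue text)."""
    return (
        f"{base_url}/applications/{app_id}/stages/{stage_id}/{attempt_id}"
        f"/taskList?offset={offset}&length={length}"
    )


def fetch_json(url):
    """Fetch one page and decode it as JSON (expected to be a list of tasks)."""
    with urlopen(url) as resp:
        return json.load(resp)


def failed_tasks(base_url, app_id, stage_id, attempt_id,
                 page_size=1000, fetch=fetch_json):
    """Walk every page of the task list and keep only the failed tasks.

    Assumption: each task is a JSON object with a "status" field whose
    value is "FAILED" for failed tasks.
    """
    failed = []
    offset = 0
    while True:
        page = fetch(task_list_url(base_url, app_id, stage_id, attempt_id,
                                   offset, page_size))
        # Every task is transferred and inspected, even though only a
        # handful are kept.
        failed.extend(t for t in page if t.get("status") == "FAILED")
        if len(page) < page_size:
            break  # a short (or empty) page means there are no more tasks
        offset += page_size
    return failed
```

Even in this short form the client has to download and inspect every task in the stage, and it carries its own paging loop; a server-side status filter would remove both costs.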





> Add a task status filter for taskList in REST API
> -------------------------------------------------
>
>                 Key: SPARK-28183
>                 URL: https://issues.apache.org/jira/browse/SPARK-28183
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, Web UI
>    Affects Versions: 3.0.0
>            Reporter: Lantao Jin
>            Priority: Major
>
> We have a scenario in which our application needs to query failed tasks through the REST API {{/applications/[app-id]/stages/[stage-id]/[stage-attempt-id]/taskList}} while a Spark job is running. In a large stage, it may need to pick out a few dozen failed tasks from hundreds of thousands of tasks in total. Fetching every task only to discard most of them consumes unnecessary memory and time on both the Spark side and the application side.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)