
Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2019/07/10 14:53:00 UTC

[jira] [Updated] (SPARK-26097) Show partitioning details in DAG UI

     [ https://issues.apache.org/jira/browse/SPARK-26097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated SPARK-26097:
------------------------------
    Priority: Minor  (was: Major)

This can be reopened with a PR that takes the different approach described in the last PR.

> Show partitioning details in DAG UI
> -----------------------------------
>
>                 Key: SPARK-26097
>                 URL: https://issues.apache.org/jira/browse/SPARK-26097
>             Project: Spark
>          Issue Type: Improvement
>          Components: Web UI
>    Affects Versions: 2.2.0, 2.2.1, 2.2.2, 2.3.0, 2.3.1, 2.3.2, 2.4.0
>            Reporter: Idan Zalzberg
>            Priority: Minor
>         Attachments: image (8).png
>
>
> We run complex SQL queries using Spark SQL, and we often have to tackle join skew or an incorrect partition count. The problem is that while the Spark UI shows that the problem exists and which *stage* it belongs to, it's hard to trace it back to the original SQL query (e.g. which specific join operation is actually skewed).
> One way to resolve this is to relate the Exchange nodes in the DAG to the partitioning that they represent. This is actually a trivial change in code (less than one line) that we believe can greatly benefit the investigation of performance issues.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
