Posted to issues@spark.apache.org by "Idan Zalzberg (JIRA)" <ji...@apache.org> on 2018/11/17 04:52:00 UTC

[jira] [Updated] (SPARK-26097) Show partitioning details in DAG UI

     [ https://issues.apache.org/jira/browse/SPARK-26097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Idan Zalzberg updated SPARK-26097:
----------------------------------
    Attachment: image (8).png

> Show partitioning details in DAG UI
> -----------------------------------
>
>                 Key: SPARK-26097
>                 URL: https://issues.apache.org/jira/browse/SPARK-26097
>             Project: Spark
>          Issue Type: Improvement
>          Components: Web UI
>    Affects Versions: 2.2.0, 2.2.1, 2.2.2, 2.3.0, 2.3.1, 2.3.2, 2.4.0
>            Reporter: Idan Zalzberg
>            Priority: Major
>         Attachments: image (8).png
>
>
> We run complex SQL queries using Spark SQL, and we often have to tackle join skew or an incorrect partition count. While the Spark UI shows that the problem exists and which *stage* it belongs to, it is hard to trace it back to the original SQL query (e.g. which specific join operation is actually skewed).
> One way to resolve this is to relate the Exchange nodes in the DAG to the partitioning that they represent. This is a trivial code change (less than one line) that we believe can greatly benefit the investigation of performance issues.
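
The idea described above can be sketched with a toy model. This is not Spark's code and not the proposed patch: PlanNode, dag_label and render are hypothetical names, and the table names, join key and partition count are made up for illustration. It only shows how appending the partitioning to an Exchange node's label could let a reader map a skewed stage back to the join key that produced it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PlanNode:
    """Hypothetical stand-in for a physical plan node; not a Spark class."""
    name: str
    partitioning: Optional[str] = None  # only set on Exchange nodes here
    children: List["PlanNode"] = field(default_factory=list)


def dag_label(node: PlanNode) -> str:
    """Label shown in the DAG: Exchange nodes also carry their partitioning."""
    if node.name == "Exchange" and node.partitioning:
        return f"{node.name} {node.partitioning}"
    return node.name


def render(node: PlanNode, depth: int = 0) -> List[str]:
    """Return one indented label per node, parents before children."""
    lines = ["  " * depth + dag_label(node)]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Illustrative plan: a join whose two inputs are shuffled on the same key.
plan = PlanNode("SortMergeJoin", children=[
    PlanNode("Exchange", "hashpartitioning(customer_id, 200)",
             [PlanNode("Scan orders")]),
    PlanNode("Exchange", "hashpartitioning(customer_id, 200)",
             [PlanNode("Scan customers")]),
])

print("\n".join(render(plan)))
```

With labels like these, two Exchange nodes that would otherwise look identical in the DAG can be told apart by their partitioning expression and partition count.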



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org