You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Takeshi Yamamuro (JIRA)" <ji...@apache.org> on 2018/01/12 06:03:00 UTC
[jira] [Commented] (SPARK-23021) AnalysisBarrier should not cut off
the explain output for Parsed Logical Plan
[ https://issues.apache.org/jira/browse/SPARK-23021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16323599#comment-16323599 ]
Takeshi Yamamuro commented on SPARK-23021:
------------------------------------------
hi, kris, you're working on it?
I think we just forget to override innerChildren in AnalysisBarrier.
{code}
>> Before
scala> Seq((1, 1)).toDF("a", "b").groupBy("a").count().sample(0.1).explain(true)
== Parsed Logical Plan ==
Sample 0.0, 0.1, false, -7661439431999668039
+- AnalysisBarrier Aggregate [a#5], [a#5, count(1) AS count#14L]
>> After
scala> Seq((1, 1)).toDF("a", "b").groupBy("a").count().sample(0.1).explain(true)
== Parsed Logical Plan ==
Sample 0.0, 0.1, false, -5086223488015741426
+- AnalysisBarrier
+- Aggregate [a#5], [a#5, count(1) AS count#14L]
+- Project [_1#2 AS a#5, _2#3 AS b#6]
+- LocalRelation [_1#2, _2#3]
{code}
> AnalysisBarrier should not cut off the explain output for Parsed Logical Plan
> -----------------------------------------------------------------------------
>
> Key: SPARK-23021
> URL: https://issues.apache.org/jira/browse/SPARK-23021
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.3.0
> Reporter: Kris Mok
>
> In PR#20094 as a follow up to SPARK-20392, there were some fixes to the handling of {{AnalysisBarrier}}, but there seem to be more cases that need to be fixed.
> One such case is that right now the Parsed Logical Plan in explain output would be cutoff by {{AnalysisBarrier}}, e.g.
> {code:none}
> scala> val df1 = spark.range(1).select('id as 'x, 'id + 1 as 'y).repartition(1).select('x === 'y)
> df1: org.apache.spark.sql.DataFrame = [(x = y): boolean]
> scala> df1.explain(true)
> == Parsed Logical Plan ==
> 'Project [('x = 'y) AS (x = y)#22]
> +- AnalysisBarrier Repartition 1, true
> == Analyzed Logical Plan ==
> (x = y): boolean
> Project [(x#16L = y#17L) AS (x = y)#22]
> +- Repartition 1, true
> +- Project [id#13L AS x#16L, (id#13L + cast(1 as bigint)) AS y#17L]
> +- Range (0, 1, step=1, splits=Some(8))
> == Optimized Logical Plan ==
> Project [(x#16L = y#17L) AS (x = y)#22]
> +- Repartition 1, true
> +- Project [id#13L AS x#16L, (id#13L + 1) AS y#17L]
> +- Range (0, 1, step=1, splits=Some(8))
> == Physical Plan ==
> *Project [(x#16L = y#17L) AS (x = y)#22]
> +- Exchange RoundRobinPartitioning(1)
> +- *Project [id#13L AS x#16L, (id#13L + 1) AS y#17L]
> +- *Range (0, 1, step=1, splits=8)
> {code}
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org