Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/08/08 07:33:00 UTC
[jira] [Created] (SPARK-28654) Move "Extract Python UDFs" to the last in optimizer
Hyukjin Kwon created SPARK-28654:
------------------------------------
Summary: Move "Extract Python UDFs" to the last in optimizer
Key: SPARK-28654
URL: https://issues.apache.org/jira/browse/SPARK-28654
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.0.0
Reporter: Hyukjin Kwon
Plans produced after "Extract Python UDFs" are fragile and easily broken by other rules. For instance,
if we add some other rules, such as {{PushDownPredicates}},
the optimization is rolled back as below:
{code}
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.PushDownPredicates ===
!Filter (dummyUDF(a#7, c#18) = dummyUDF(d#19, c#18))   Join Cross, (dummyUDF(a#7, c#18) = dummyUDF(d#19, c#18))
!+- Join Cross                                         :- Project [_1#2 AS a#7, _2#3 AS b#8]
!   :- Project [_1#2 AS a#7, _2#3 AS b#8]              :  +- LocalRelation [_1#2, _2#3]
!   :  +- LocalRelation [_1#2, _2#3]                   +- Project [_1#13 AS c#18, _2#14 AS d#19]
!   +- Project [_1#13 AS c#18, _2#14 AS d#19]             +- LocalRelation [_1#13, _2#14]
!      +- LocalRelation [_1#13, _2#14]
{code}
It seems we should handle the Python UDF cases last, even after the post hoc rules.
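
As a rough illustration of why the ordering matters, here is a toy model of the rollback in the trace above. It is only a sketch: the Join and Filter classes, the two rule functions, and the run helper are hypothetical stand-ins written for this example, not Catalyst classes or Spark APIs. It assumes the extraction step leaves the UDF comparison in a Filter above the join, and that a later pushdown rule folds that Filter back into the join condition.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Union


@dataclass(frozen=True)
class Join:
    # Toy cross join; the condition is just a string in this sketch.
    condition: Optional[str] = None


@dataclass(frozen=True)
class Filter:
    condition: str
    child: Join


Plan = Union[Join, Filter]
Rule = Callable[[Plan], Plan]


def pull_out_udf(plan: Plan) -> Plan:
    """Hypothetical stand-in for the UDF extraction step: move a join
    condition that calls a UDF into a Filter above the join."""
    if isinstance(plan, Join) and plan.condition and "dummyUDF" in plan.condition:
        return Filter(plan.condition, Join(None))
    return plan


def push_down_predicates(plan: Plan) -> Plan:
    """Hypothetical stand-in for predicate pushdown: fold a Filter that sits
    directly on an unconditioned join back into the join condition."""
    if isinstance(plan, Filter) and plan.child.condition is None:
        return Join(plan.condition)
    return plan


def run(plan: Plan, rules: List[Rule]) -> Plan:
    # Apply each rule once, in the given order.
    for rule in rules:
        plan = rule(plan)
    return plan


start = Join("dummyUDF(a, c) = dummyUDF(d, c)")

# Extraction first, pushdown afterwards: the extraction is undone.
rolled_back = run(start, [pull_out_udf, push_down_predicates])

# Extraction last: the Filter above the join survives.
kept = run(start, [push_down_predicates, pull_out_udf])
```

In this toy, rolled_back equals the starting plan, while kept is a Filter over an unconditioned Join. This mirrors the proposal to run the Python UDF handling after every other rule.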