Posted to issues@spark.apache.org by "Nikita Konda (Jira)" <ji...@apache.org> on 2019/09/10 00:49:00 UTC

[jira] [Updated] (SPARK-29029) PhysicalOperation.collectProjectsAndFilters should use AttributeMap while substituting aliases

     [ https://issues.apache.org/jira/browse/SPARK-29029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nikita Konda updated SPARK-29029:
---------------------------------
    Description: 
We have a specific use case in which we insert a custom logical operator into our logical plan to keep some of Spark’s optimization rules from applying. We then remove this logical operator in a custom optimization rule before the plan is sent to SparkStrategies.
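
The insert-then-remove pattern described above can be pictured with a small toy model. This is only a sketch in Python, not Catalyst code: the `Node` class and the `remove_barriers` and `operators` helpers are hypothetical names introduced for illustration, and a real implementation would be a Catalyst rule over `LogicalPlan` nodes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    """Toy logical-plan node: an operator name plus at most one child."""
    name: str
    child: Optional["Node"] = None


def remove_barriers(plan: Optional[Node]) -> Optional[Node]:
    """Return a copy of `plan` with every CustomBarrier node spliced out."""
    if plan is None:
        return None
    child = remove_barriers(plan.child)
    if plan.name == "CustomBarrier":
        # Drop the barrier itself but keep the subtree below it.
        return child
    return Node(plan.name, child)


def operators(plan: Optional[Node]) -> List[str]:
    """List operator names from the root down to the leaf."""
    names = []
    while plan is not None:
        names.append(plan.name)
        plan = plan.child
    return names


# A shortened version of the reported plan shape (hypothetical, for illustration).
plan = Node("Filter", Node("SubqueryAlias", Node("CustomBarrier",
            Node("Project", Node("Relation")))))
print(operators(remove_barriers(plan)))
```

The point of the sketch is that, once the barrier is removed, operators that were previously separated (here the outer Filter and the inner Project) become adjacent again and are seen together by later planning steps.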

Analyzed plan:
{code:java}
[1] Project [userid#0]
+- [2] SubqueryAlias tmp6
   +- [3] Project [videoid#47L, avebitrate#2, userid#0]
      +- [4] Filter NOT (videoid#47L = cast(30 as bigint))
         +- [5] SubqueryAlias tmp5
            +- [6] CustomBarrier
               +- [7] Project [videoid#47L, avebitrate#2, userid#0]
                  +- [8] Filter (avebitrate#2 < 10)
                     +- [9] SubqueryAlias tmp3
                        +- [10] Project [avebitrate#2, factorial(videoid#1) AS videoid#47L, userid#0]
                           +- [11] SubqueryAlias tmp2
                              +- [12] Project [userid#0, videoid#1, avebitrate#2]
                                 +- [13] SubqueryAlias tmp1
                                    +- [14] Project [userid#0, videoid#1, avebitrate#2]
                                       +- [15] SubqueryAlias views
                                          +- [16] Relation[userid#0,videoid#1,avebitrate#2] 
{code}
 

Optimized Plan:
{code:java}
[1] Project [userid#0]
+- [2] Filter (isnotnull(videoid#47L) && NOT (videoid#47L = 30))
   +- [3] Project [factorial(videoid#1) AS videoid#47L, userid#0]
      +- [4] Filter (isnotnull(avebitrate#2) && (avebitrate#2 < 10))
         +- [5] Relation[userid#0,videoid#1,avebitrate#2]
{code}
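
To make the request in the title concrete: in Catalyst, `AttributeMap` looks attributes up by `exprId`, whereas an ordinary `Map[Attribute, Expression]` relies on the attribute's full equality, which also covers fields such as the qualifier and nullability. The toy model below (Python, not Spark code) sketches why that difference can matter when aliases are substituted. The `Attr` class and both helper functions are hypothetical, and the differing qualifier (`tmp3` vs `tmp5`) is an assumption for illustration; the report does not say which field differs in practice.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Attr:
    """Toy attribute reference; equality and hashing cover all three fields."""
    name: str
    expr_id: int
    qualifier: Optional[str] = None


def substitute_plain(aliases: Dict[Attr, str], attr: Attr) -> str:
    """Plain map lookup: matches only if every field of the key is equal."""
    return aliases.get(attr, f"{attr.name}#{attr.expr_id}")


def substitute_by_id(aliases: Dict[Attr, str], attr: Attr) -> str:
    """Lookup keyed on the expression id only, in the spirit of AttributeMap."""
    by_id = {key.expr_id: expr for key, expr in aliases.items()}
    return by_id.get(attr.expr_id, f"{attr.name}#{attr.expr_id}")


# Alias recorded from: Project [factorial(videoid#1) AS videoid#47L, ...]
aliases = {Attr("videoid", 47, qualifier="tmp3"): "factorial(videoid#1)"}

# The same attribute as referenced by a filter higher up, under a different
# qualifier (assumed here purely to trigger the mismatch).
ref = Attr("videoid", 47, qualifier="tmp5")

print(substitute_plain(aliases, ref))   # videoid#47
print(substitute_by_id(aliases, ref))   # factorial(videoid#1)
```

With the plain map the reference is left unsubstituted even though it carries the same expression id as the alias; keying on the id alone resolves it.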
 


> PhysicalOperation.collectProjectsAndFilters should use AttributeMap while substituting aliases
> ----------------------------------------------------------------------------------------------
>
>                 Key: SPARK-29029
>                 URL: https://issues.apache.org/jira/browse/SPARK-29029
>             Project: Spark
>          Issue Type: Bug
>          Components: Optimizer
>    Affects Versions: 2.3.0
>            Reporter: Nikita Konda
>            Priority: Major
>


