You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Yuming Wang (Jira)" <ji...@apache.org> on 2020/12/16 01:43:00 UTC

[jira] [Updated] (SPARK-33798) Simplify EqualTo(CaseWhen/If, Literal) always false

     [ https://issues.apache.org/jira/browse/SPARK-33798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuming Wang updated SPARK-33798:
--------------------------------
    Description: 
Simplify CaseWhen/If with EqualTo if all values are Literal and always false, this is a real case from production:
{code:sql}
create table t1 using parquet as select * from range(100);
create table t2 using parquet as select * from range(200);

create temp view v1 as                                                             
select 'a' as event_type, * from t1                                                
union all                                                                          
select CASE WHEN id = 1 THEN 'b' WHEN id = 3 THEN 'c' end as event_type, * from t2 

explain select * from v1 where event_type = 'a';
{code}

Before this PR:


{noformat}
== Physical Plan ==
Union
:- *(1) Project [a AS event_type#30533, id#30535L]
:  +- *(1) ColumnarToRow
:     +- FileScan parquet default.t1[id#30535L] Batched: true, DataFilters: [], Format: Parquet
+- *(2) Project [CASE WHEN (id#30536L = 1) THEN b WHEN (id#30536L = 3) THEN c END AS event_type#30534, id#30536L]
   +- *(2) Filter (CASE WHEN (id#30536L = 1) THEN b WHEN (id#30536L = 3) THEN c END = a)
      +- *(2) ColumnarToRow
         +- FileScan parquet default.t2[id#30536L] Batched: true, DataFilters: [(CASE WHEN (id#30536L = 1) THEN b WHEN (id#30536L = 3) THEN c END = a)], Format: Parquet
{noformat}

After this PR:


{noformat}
== Physical Plan ==
*(1) Project [a AS event_type#8, id#4L]
+- *(1) ColumnarToRow
   +- FileScan parquet default.t1[id#4L] Batched: true, DataFilters: [], Format: Parquet
{noformat}


> Simplify EqualTo(CaseWhen/If, Literal) always false
> ---------------------------------------------------
>
>                 Key: SPARK-33798
>                 URL: https://issues.apache.org/jira/browse/SPARK-33798
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: Yuming Wang
>            Priority: Major
>
> Simplify CaseWhen/If with EqualTo if all values are Literal and always false, this is a real case from production:
> {code:sql}
> create table t1 using parquet as select * from range(100);
> create table t2 using parquet as select * from range(200);
> create temp view v1 as                                                             
> select 'a' as event_type, * from t1                                                
> union all                                                                          
> select CASE WHEN id = 1 THEN 'b' WHEN id = 3 THEN 'c' end as event_type, * from t2 
> explain select * from v1 where event_type = 'a';
> {code}
> Before this PR:
> {noformat}
> == Physical Plan ==
> Union
> :- *(1) Project [a AS event_type#30533, id#30535L]
> :  +- *(1) ColumnarToRow
> :     +- FileScan parquet default.t1[id#30535L] Batched: true, DataFilters: [], Format: Parquet
> +- *(2) Project [CASE WHEN (id#30536L = 1) THEN b WHEN (id#30536L = 3) THEN c END AS event_type#30534, id#30536L]
>    +- *(2) Filter (CASE WHEN (id#30536L = 1) THEN b WHEN (id#30536L = 3) THEN c END = a)
>       +- *(2) ColumnarToRow
>          +- FileScan parquet default.t2[id#30536L] Batched: true, DataFilters: [(CASE WHEN (id#30536L = 1) THEN b WHEN (id#30536L = 3) THEN c END = a)], Format: Parquet
> {noformat}
> After this PR:
> {noformat}
> == Physical Plan ==
> *(1) Project [a AS event_type#8, id#4L]
> +- *(1) ColumnarToRow
>    +- FileScan parquet default.t1[id#4L] Batched: true, DataFilters: [], Format: Parquet
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org