Posted to issues@spark.apache.org by "Apache Spark (Jira)" <ji...@apache.org> on 2022/04/29 13:14:00 UTC

[jira] [Commented] (SPARK-39069) Simplify another conditionals case in predicate

    [ https://issues.apache.org/jira/browse/SPARK-39069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17529984#comment-17529984 ] 

Apache Spark commented on SPARK-39069:
--------------------------------------

User 'wangyum' has created a pull request for this issue:
https://github.com/apache/spark/pull/36410

> Simplify another conditionals case in predicate
> -----------------------------------------------
>
>                 Key: SPARK-39069
>                 URL: https://issues.apache.org/jira/browse/SPARK-39069
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.4.0
>            Reporter: Yuming Wang
>            Priority: Major
>
> {code:scala}
> sql(
>   """
>     |CREATE TABLE t1 (
>     |  id DECIMAL(18,0),
>     |  event_dt DATE,
>     |  cmpgn_run_dt DATE)
>     |USING parquet
>     |PARTITIONED BY (cmpgn_run_dt)
>   """.stripMargin)
> sql(
>   """
>     |select count(*)
>     |from t1
>     |where CMPGN_RUN_DT >= date_sub(EVENT_DT,2) and CMPGN_RUN_DT <= EVENT_DT
>     |and EVENT_DT ='2022-04-05'
>     |;
>   """.stripMargin).explain(true)
> {code}
> Expected:
> {noformat}
> == Optimized Logical Plan ==
> Aggregate [count(1) AS count(1)#4L]
> +- Project
>    +- Filter (((isnotnull(CMPGN_RUN_DT#3) AND (CMPGN_RUN_DT#3 >= 2022-04-03)) AND (CMPGN_RUN_DT#3 <= 2022-04-05)) AND (EVENT_DT#2 = 2022-04-05))
>       +- Relation default.t1[id#1,event_dt#2,cmpgn_run_dt#3] parquet
> == Physical Plan ==
> *(2) HashAggregate(keys=[], functions=[count(1)], output=[count(1)#4L])
> +- Exchange SinglePartition, ENSURE_REQUIREMENTS, [id=#31]
>    +- *(1) HashAggregate(keys=[], functions=[partial_count(1)], output=[count#7L])
>       +- *(1) Project
>          +- *(1) Filter (EVENT_DT#2 = 2022-04-05)
>             +- *(1) ColumnarToRow
>                +- FileScan parquet default.t1[event_dt#2,cmpgn_run_dt#3] Batched: true, DataFilters: [(event_dt#2 = 2022-04-05)], Format: Parquet, Location: InMemoryFileIndex[], PartitionFilters: [isnotnull(cmpgn_run_dt#3), (cmpgn_run_dt#3 >= 2022-04-03), (cmpgn_run_dt#3 <= 2022-04-05)], PushedFilters: [EqualTo(event_dt,2022-04-05)], ReadSchema: struct<event_dt:date>, UsedIndexes: []
> {noformat}


