Posted to issues@spark.apache.org by "eaton (JIRA)" <ji...@apache.org> on 2018/09/27 01:59:00 UTC
[jira] [Created] (SPARK-25548) In the PruneFileSourcePartitions
optimizer, replace the nonPartitionOps field with true in the
And(partitionOps, nonPartitionOps) to make the partition can be pruned
eaton created SPARK-25548:
-----------------------------
Summary: In the PruneFileSourcePartitions optimizer, replace the nonPartitionOps field with true in the And(partitionOps, nonPartitionOps) to make the partition can be pruned
Key: SPARK-25548
URL: https://issues.apache.org/jira/browse/SPARK-25548
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 2.3.2
Reporter: eaton
In the PruneFileSourcePartitions optimizer rule, partition files are not pruned when a filter combines partition-column predicates with non-partition-column predicates in the same expression. For example, given this table:
sql("CREATE TABLE IF NOT EXISTS src_par (key INT, value STRING) partitioned by(p_d int) stored as parquet ")
sql("insert overwrite table src_par partition(p_d=2) select 2 as key, '4' as value")
sql("insert overwrite table src_par partition(p_d=3) select 3 as key, '4' as value")
sql("insert overwrite table src_par partition(p_d=4) select 4 as key, '4' as value")
The query below scans all the partition files, even though partition p_d=4 cannot match the filter and should be pruned:
sql("select * from src_par where (p_d=2 and key=2) or (p_d=3 and key=3)").show
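The rewrite proposed in the summary can be illustrated with a small, self-contained sketch. It does not use Spark's Catalyst classes: the expression types (Expr, ColEq, And, Or, True) and the helper names partitionOnly and eval are hypothetical, introduced only for illustration. The sketch assumes the filter contains no negation, since replacing a predicate under a NOT with true would not be safe.

```scala
// Toy expression tree for illustration only; these are NOT Catalyst classes.
sealed trait Expr
case class ColEq(column: String, value: Int) extends Expr
case class And(left: Expr, right: Expr) extends Expr
case class Or(left: Expr, right: Expr) extends Expr
case object True extends Expr

object PartitionFilterSketch {
  // Replace every predicate on a non-partition column with True and simplify.
  // The result is implied by the original filter (assuming no negation), so
  // any partition it rejects cannot contain matching rows.
  def partitionOnly(e: Expr, partitionCols: Set[String]): Expr = e match {
    case ColEq(c, _) if partitionCols(c) => e
    case ColEq(_, _) => True
    case And(l, r) =>
      (partitionOnly(l, partitionCols), partitionOnly(r, partitionCols)) match {
        case (True, x) => x          // And(true, x) simplifies to x
        case (x, True) => x
        case (x, y)    => And(x, y)
      }
    case Or(l, r) =>
      (partitionOnly(l, partitionCols), partitionOnly(r, partitionCols)) match {
        case (True, _) | (_, True) => True   // Or(true, x) is always true
        case (x, y)                => Or(x, y)
      }
    case True => True
  }

  // Evaluate a predicate against a row given as column -> value.
  def eval(e: Expr, row: Map[String, Int]): Boolean = e match {
    case ColEq(c, v) => row.get(c).contains(v)
    case And(l, r)   => eval(l, row) && eval(r, row)
    case Or(l, r)    => eval(l, row) || eval(r, row)
    case True        => true
  }

  def main(args: Array[String]): Unit = {
    // (p_d=2 and key=2) or (p_d=3 and key=3)
    val filter = Or(
      And(ColEq("p_d", 2), ColEq("key", 2)),
      And(ColEq("p_d", 3), ColEq("key", 3)))
    val partitionFilter = partitionOnly(filter, Set("p_d"))
    // Keep only the partitions that satisfy the derived partition filter.
    val kept = Seq(2, 3, 4).filter(p => eval(partitionFilter, Map("p_d" -> p)))
    println(kept)
  }
}
```

For the query above, the derived partition filter is (p_d=2) or (p_d=3), so partition p_d=4 is dropped before any files are listed. The original filter would still be applied to the rows that are read.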