Posted to issues@spark.apache.org by "xuanzhiang (Jira)" <ji...@apache.org> on 2023/02/02 12:09:00 UTC

[jira] [Created] (SPARK-42292) Spark SQL not use hive partition info

xuanzhiang created SPARK-42292:
----------------------------------

             Summary: Spark SQL not use hive partition info
                 Key: SPARK-42292
                 URL: https://issues.apache.org/jira/browse/SPARK-42292
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.2.1
            Reporter: xuanzhiang


I use Spark 3 to count the number of partitions, for example:

Table a is an external Parquet table with 3 partition columns (year, month, day).

Query: "select distinct month, day from a where year = '2022'"

I expected Spark to look up the Hive metastore and answer the query from the partition info alone, but it loads the data of all "year = '2022'" partitions.

In Spark 2.4 the plan uses LocalTableScanExec, but in Spark 3 it uses HiveTableRelation and scans the Hive Parquet files.
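
To make the expected behavior concrete, here is a minimal sketch in plain Python (not Spark code) of how the query above could be answered from partition metadata alone. The partition specs are made-up examples in the form that SHOW PARTITIONS prints, and the helper names parse_spec and distinct_month_day are hypothetical.

```python
# Hypothetical partition specs for table a, in the form the Hive metastore
# reports them (e.g. via SHOW PARTITIONS a). The values are invented.
PARTITIONS = [
    "year=2021/month=12/day=31",
    "year=2022/month=01/day=01",
    "year=2022/month=01/day=02",
    "year=2022/month=02/day=01",
]


def parse_spec(spec):
    """Turn 'year=2022/month=01/day=02' into {'year': '2022', 'month': '01', 'day': '02'}."""
    return dict(part.split("=", 1) for part in spec.split("/"))


def distinct_month_day(specs, year):
    """Answer 'select distinct month, day ... where year = ?' from partition
    specs only, without opening any data file."""
    rows = {
        (p["month"], p["day"])
        for p in map(parse_spec, specs)
        if p["year"] == year
    }
    return sorted(rows)


result = distinct_month_day(PARTITIONS, "2022")
print(result)
```

One caveat with this approach: a partition that is registered in the metastore but contains no rows would still appear in the result, so a metadata-only answer can differ from what a real scan returns.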
 


