Posted to issues@spark.apache.org by "Yuming Wang (JIRA)" <ji...@apache.org> on 2018/07/26 14:51:00 UTC

[jira] [Updated] (SPARK-24937) Datasource partition table should load empty partitions

     [ https://issues.apache.org/jira/browse/SPARK-24937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuming Wang updated SPARK-24937:
--------------------------------
    Description: 
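How to reproduce (SHOW PARTITIONS returns nothing for the datasource table tbl1, but lists the partition for the Hive table tbl2):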
{code:sql}
spark-sql> CREATE TABLE tbl AS SELECT 1;
18/07/26 22:48:11 WARN HiveMetaStore: Location: file:/Users/yumwang/tmp/spark/spark-warehouse/tbl specified for non-external table:tbl
18/07/26 22:48:15 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
spark-sql> CREATE TABLE tbl1 (c1 BIGINT, day STRING, hour STRING)
         > USING parquet
         > PARTITIONED BY (day, hour);
spark-sql> INSERT INTO TABLE tbl1 PARTITION (day = '2018-07-25', hour='01') SELECT * FROM tbl where 1=0;
spark-sql> SHOW PARTITIONS tbl1;
spark-sql> CREATE TABLE tbl2 (c1 BIGINT)
         > PARTITIONED BY (day STRING, hour STRING);
18/07/26 22:49:20 WARN HiveMetaStore: Location: file:/Users/yumwang/tmp/spark/spark-warehouse/tbl2 specified for non-external table:tbl2
spark-sql> INSERT INTO TABLE tbl2 PARTITION (day = '2018-07-25', hour='01') SELECT * FROM tbl where 1=0;
18/07/26 22:49:36 WARN log: Updating partition stats fast for: tbl2
18/07/26 22:49:36 WARN log: Updated size to 0
spark-sql> SHOW PARTITIONS tbl2;
day=2018-07-25/hour=01
spark-sql> 
{code}
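
One possible workaround, offered as a sketch and not part of the original report: register the empty partition explicitly on the datasource table tbl1 with standard Spark SQL DDL. This assumes tbl1 tracks its partitions in the catalog, and it has not been verified against the affected build.
{code:sql}
-- Assumption: tbl1 is the partitioned parquet table created in the reproduction.
-- Register the partition explicitly, since the zero-row INSERT did not add it.
ALTER TABLE tbl1 ADD IF NOT EXISTS PARTITION (day = '2018-07-25', hour = '01');
-- Check whether the partition is now listed.
SHOW PARTITIONS tbl1;
{code}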

  was:
{code:sql}
spark-sql> CREATE TABLE tbl AS SELECT 1;
18/07/26 22:48:11 WARN HiveMetaStore: Location: file:/Users/yumwang/tmp/spark/spark-warehouse/tbl specified for non-external table:tbl
18/07/26 22:48:15 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
spark-sql> CREATE TABLE tbl1 (c1 BIGINT, day STRING, hour STRING)
         > USING parquet
         > PARTITIONED BY (day, hour);
spark-sql> INSERT INTO TABLE tbl1 PARTITION (day = '2018-07-25', hour='01') SELECT * FROM tbl where 1=0;
spark-sql> SHOW PARTITIONS tbl1;
spark-sql> CREATE TABLE tbl2 (c1 BIGINT)
         > PARTITIONED BY (day STRING, hour STRING);
18/07/26 22:49:20 WARN HiveMetaStore: Location: file:/Users/yumwang/tmp/spark/spark-warehouse/tbl2 specified for non-external table:tbl2
spark-sql> INSERT INTO TABLE tbl2 PARTITION (day = '2018-07-25', hour='01') SELECT * FROM tbl where 1=0;
18/07/26 22:49:36 WARN log: Updating partition stats fast for: tbl2
18/07/26 22:49:36 WARN log: Updated size to 0
spark-sql> SHOW PARTITIONS tbl2;
day=2018-07-25/hour=01
spark-sql> 
{code}


> Datasource partition table should load empty partitions
> -------------------------------------------------------
>
>                 Key: SPARK-24937
>                 URL: https://issues.apache.org/jira/browse/SPARK-24937
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Yuming Wang
>            Priority: Major


