Posted to issues@spark.apache.org by "Wenchen Fan (Jira)" <ji...@apache.org> on 2021/08/02 16:09:00 UTC
[jira] [Resolved] (SPARK-36086) The case of the delta table is inconsistent with parquet

     [ https://issues.apache.org/jira/browse/SPARK-36086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan resolved SPARK-36086.
---------------------------------
    Fix Version/s: 3.2.0
       Resolution: Fixed

Issue resolved by pull request 33576
[https://github.com/apache/spark/pull/33576]

> The case of the delta table is inconsistent with parquet
> --------------------------------------------------------
>
>                 Key: SPARK-36086
>                 URL: https://issues.apache.org/jira/browse/SPARK-36086
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.1.1
>            Reporter: Yuming Wang
>            Priority: Major
>             Fix For: 3.2.0
>
>
> How to reproduce this issue:
> {noformat}
> 1. Add delta-core_2.12-1.0.0-SNAPSHOT.jar to ${SPARK_HOME}/jars.
> 2. bin/spark-shell --conf spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension --conf spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog
> {noformat}
> {code:scala}
> spark.sql("create table t1 using parquet as select id, id as lower_id from range(5)")
> spark.sql("CREATE VIEW v1 as SELECT * FROM t1")
> spark.sql("CREATE TABLE t2 USING DELTA PARTITIONED BY (LOWER_ID) SELECT LOWER_ID, ID FROM v1")
> spark.sql("CREATE TABLE t3 USING PARQUET PARTITIONED BY (LOWER_ID) SELECT LOWER_ID, ID FROM v1")
> spark.sql("desc extended t2").show(false)
> spark.sql("desc extended t3").show(false)
> {code}
> Result: the Delta table t2 reports its columns in lower case (lower_id, id), while the Parquet table t3 reports them in upper case (ID, LOWER_ID), as written in the SELECT list:
> {noformat}
> scala> spark.sql("desc extended t2").show(false)
> +----------------------------+--------------------------------------------------------------------------+-------+
> |col_name                    |data_type                                                                 |comment|
> +----------------------------+--------------------------------------------------------------------------+-------+
> |lower_id                    |bigint                                                                    |       |
> |id                          |bigint                                                                    |       |
> |                            |                                                                          |       |
> |# Partitioning              |                                                                          |       |
> |Part 0                      |lower_id                                                                  |       |
> |                            |                                                                          |       |
> |# Detailed Table Information|                                                                          |       |
> |Name                        |default.t2                                                                |       |
> |Location                    |file:/Users/yumwang/Downloads/spark-3.1.1-bin-hadoop2.7/spark-warehouse/t2|       |
> |Provider                    |delta                                                                     |       |
> |Table Properties            |[Type=MANAGED,delta.minReaderVersion=1,delta.minWriterVersion=2]          |       |
> +----------------------------+--------------------------------------------------------------------------+-------+
> scala> spark.sql("desc extended t3").show(false)
> +----------------------------+--------------------------------------------------------------------------+-------+
> |col_name                    |data_type                                                                 |comment|
> +----------------------------+--------------------------------------------------------------------------+-------+
> |ID                          |bigint                                                                    |null   |
> |LOWER_ID                    |bigint                                                                    |null   |
> |# Partition Information     |                                                                          |       |
> |# col_name                  |data_type                                                                 |comment|
> |LOWER_ID                    |bigint                                                                    |null   |
> |                            |                                                                          |       |
> |# Detailed Table Information|                                                                          |       |
> |Database                    |default                                                                   |       |
> |Table                       |t3                                                                        |       |
> |Owner                       |yumwang                                                                   |       |
> |Created Time                |Mon Jul 12 14:07:16 CST 2021                                              |       |
> |Last Access                 |UNKNOWN                                                                   |       |
> |Created By                  |Spark 3.1.1                                                               |       |
> |Type                        |MANAGED                                                                   |       |
> |Provider                    |PARQUET                                                                   |       |
> |Location                    |file:/Users/yumwang/Downloads/spark-3.1.1-bin-hadoop2.7/spark-warehouse/t3|       |
> |Serde Library               |org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe               |       |
> |InputFormat                 |org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat             |       |
> |OutputFormat                |org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat            |       |
> |Partition Provider          |Catalog                                                                   |       |
> +----------------------------+--------------------------------------------------------------------------+-------+
> {noformat}
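
A minimal sketch of the mismatch shown in the two outputs above, assuming only the column names printed by "desc extended t2" and "desc extended t3". The object and method names (ColumnCaseCheck, caseMismatches) are hypothetical and are not part of Spark or Delta. It pairs up names that are equal when case is ignored but spelled differently.

```scala
object ColumnCaseCheck {
  // Column names copied from the DESC EXTENDED outputs quoted above.
  val deltaCols: Seq[String] = Seq("lower_id", "id")     // t2, USING DELTA
  val parquetCols: Seq[String] = Seq("ID", "LOWER_ID")   // t3, USING PARQUET

  // Hypothetical helper: pairs of names that match case-insensitively
  // but differ in their exact spelling.
  def caseMismatches(a: Seq[String], b: Seq[String]): Seq[(String, String)] =
    for {
      x <- a
      y <- b
      if x.equalsIgnoreCase(y) && x != y
    } yield (x, y)

  def main(args: Array[String]): Unit =
    caseMismatches(deltaCols, parquetCols).foreach { case (d, p) =>
      println(s"delta: $d  parquet: $p")
    }
}
```

In a spark-shell session with both tables present, the hard-coded lists could presumably be replaced by spark.table("t2").columns.toSeq and spark.table("t3").columns.toSeq; that variant has not been run against the setup in this report.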


