Posted to issues@spark.apache.org by "Wenchen Fan (Jira)" <ji...@apache.org> on 2021/08/02 16:09:00 UTC
[jira] [Resolved] (SPARK-36086) The case of the delta table is inconsistent with parquet
[ https://issues.apache.org/jira/browse/SPARK-36086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wenchen Fan resolved SPARK-36086.
---------------------------------
Fix Version/s: 3.2.0
Resolution: Fixed
Issue resolved by pull request 33576
[https://github.com/apache/spark/pull/33576]
> The case of the delta table is inconsistent with parquet
> --------------------------------------------------------
>
> Key: SPARK-36086
> URL: https://issues.apache.org/jira/browse/SPARK-36086
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.1.1
> Reporter: Yuming Wang
> Priority: Major
> Fix For: 3.2.0
>
>
> How to reproduce this issue:
> {noformat}
> 1. Add delta-core_2.12-1.0.0-SNAPSHOT.jar to ${SPARK_HOME}/jars.
> 2. bin/spark-shell --conf spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension --conf spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog
> {noformat}
> {code:scala}
> spark.sql("create table t1 using parquet as select id, id as lower_id from range(5)")
> spark.sql("CREATE VIEW v1 as SELECT * FROM t1")
> spark.sql("CREATE TABLE t2 USING DELTA PARTITIONED BY (LOWER_ID) SELECT LOWER_ID, ID FROM v1")
> spark.sql("CREATE TABLE t3 USING PARQUET PARTITIONED BY (LOWER_ID) SELECT LOWER_ID, ID FROM v1")
> spark.sql("desc extended t2").show(false)
> spark.sql("desc extended t3").show(false)
> {code}
> {noformat}
> scala> spark.sql("desc extended t2").show(false)
> +----------------------------+--------------------------------------------------------------------------+-------+
> |col_name                    |data_type                                                                 |comment|
> +----------------------------+--------------------------------------------------------------------------+-------+
> |lower_id                    |bigint                                                                    |       |
> |id                          |bigint                                                                    |       |
> |                            |                                                                          |       |
> |# Partitioning              |                                                                          |       |
> |Part 0                      |lower_id                                                                  |       |
> |                            |                                                                          |       |
> |# Detailed Table Information|                                                                          |       |
> |Name                        |default.t2                                                                |       |
> |Location                    |file:/Users/yumwang/Downloads/spark-3.1.1-bin-hadoop2.7/spark-warehouse/t2|       |
> |Provider                    |delta                                                                     |       |
> |Table Properties            |[Type=MANAGED,delta.minReaderVersion=1,delta.minWriterVersion=2]          |       |
> +----------------------------+--------------------------------------------------------------------------+-------+
> scala> spark.sql("desc extended t3").show(false)
> +----------------------------+--------------------------------------------------------------------------+-------+
> |col_name                    |data_type                                                                 |comment|
> +----------------------------+--------------------------------------------------------------------------+-------+
> |ID                          |bigint                                                                    |null   |
> |LOWER_ID                    |bigint                                                                    |null   |
> |# Partition Information     |                                                                          |       |
> |# col_name                  |data_type                                                                 |comment|
> |LOWER_ID                    |bigint                                                                    |null   |
> |                            |                                                                          |       |
> |# Detailed Table Information|                                                                          |       |
> |Database                    |default                                                                   |       |
> |Table                       |t3                                                                        |       |
> |Owner                       |yumwang                                                                   |       |
> |Created Time                |Mon Jul 12 14:07:16 CST 2021                                              |       |
> |Last Access                 |UNKNOWN                                                                   |       |
> |Created By                  |Spark 3.1.1                                                               |       |
> |Type                        |MANAGED                                                                   |       |
> |Provider                    |PARQUET                                                                   |       |
> |Location                    |file:/Users/yumwang/Downloads/spark-3.1.1-bin-hadoop2.7/spark-warehouse/t3|       |
> |Serde Library               |org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe               |       |
> |InputFormat                 |org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat             |       |
> |OutputFormat                |org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat            |       |
> |Partition Provider          |Catalog                                                                   |       |
> +----------------------------+--------------------------------------------------------------------------+-------+
> {noformat}
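> The two listings above contain the same two columns, but t2 (delta) reports them in lower case while t3 (parquet) keeps the upper-case spelling used in the CREATE TABLE statements. A minimal sketch of that comparison in plain Scala, assuming the column names are copied by hand from the output above rather than read from Spark; the value names deltaCols and parquetCols are illustrative only:
> {code:scala}
> import java.util.Locale
>
> // Column names as printed by `desc extended` above (copied by hand).
> val deltaCols: Seq[String] = Seq("lower_id", "id")     // t2, USING DELTA
> val parquetCols: Seq[String] = Seq("ID", "LOWER_ID")   // t3, USING PARQUET
>
> // Compare as sets because the two listings order the columns differently.
> val sameIgnoringCase: Boolean =
>   deltaCols.map(_.toLowerCase(Locale.ROOT)).toSet ==
>     parquetCols.map(_.toLowerCase(Locale.ROOT)).toSet
> val sameExactly: Boolean = deltaCols.toSet == parquetCols.toSet
>
> println(s"same names ignoring case: $sameIgnoringCase")  // prints true
> println(s"same names with exact case: $sameExactly")     // prints false
> {code}
> In a live spark-shell session the two sequences could instead be taken from spark.table("t2").schema.fieldNames and spark.table("t3").schema.fieldNames.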
--
This message was sent by Atlassian Jira
(v8.3.4#803005)