Posted to issues@spark.apache.org by "weiliang hao (Jira)" <ji...@apache.org> on 2023/02/01 11:37:00 UTC
[jira] [Updated] (SPARK-41241) Use Hive and Spark SQL to modify table field comment, the modified results of Hive cannot be queried using Spark SQL
[ https://issues.apache.org/jira/browse/SPARK-41241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
weiliang hao updated SPARK-41241:
---------------------------------
Description:
-- Hive
> create table table_test(id int);
> alter table table_test change column id id int comment "hive comment";
> desc formatted table_test;
{code:java}
+-------------------------------+----------------------------------------------------+----------------------------------------------------+
| col_name | data_type | comment |
+-------------------------------+----------------------------------------------------+----------------------------------------------------+
| # col_name | data_type | comment |
| id | int | hive comment |
| | NULL | NULL |
| # Detailed Table Information | NULL | NULL |
| Database: | default | NULL |
| OwnerType: | USER | NULL |
| Owner: | anonymous | NULL |
| CreateTime: | Wed Nov 23 23:06:41 CST 2022 | NULL |
| LastAccessTime: | UNKNOWN | NULL |
| Retention: | 0 | NULL |
| Location: | hdfs://localhost:8020/warehouse/tablespace/managed/hive/table_test | NULL |
| Table Type: | MANAGED_TABLE | NULL |
| Table Parameters: | NULL | NULL |
| | COLUMN_STATS_ACCURATE | {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"id\":\"true\"}} |
| | bucketing_version | 2 |
| | last_modified_by | anonymous |
| | last_modified_time | 1669216665 |
| | numFiles | 0 |
| | numRows | 0 |
| | rawDataSize | 0 |
| | totalSize | 0 |
| | transactional | true |
| | transactional_properties | default |
| | transient_lastDdlTime | 1669216665 |
| | NULL | NULL |
| # Storage Information | NULL | NULL |
| SerDe Library: | org.apache.hadoop.hive.ql.io.orc.OrcSerde | NULL |
| InputFormat: | org.apache.hadoop.hive.ql.io.orc.OrcInputFormat | NULL |
| OutputFormat: | org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat | NULL |
| Compressed: | No | NULL |
| Num Buckets: | -1 | NULL |
| Bucket Columns: | [] | NULL |
| Sort Columns: | [] | NULL |
| Storage Desc Params: | NULL | NULL |
| | serialization.format | 1 |
+-------------------------------+----------------------------------------------------+----------------------------------------------------+ {code}
-- Spark SQL
> alter table table_test change column id id int comment "spark comment";
> desc formatted table_test;
{code:java}
+-------------------------------+----------------------------------------------------+--------------+
| col_name | data_type | comment |
+-------------------------------+----------------------------------------------------+--------------+
| id | int | spark comment |
| | | |
| # Detailed Table Information | | |
| Catalog | spark_catalog | |
| Database | default | |
| Table | table_test | |
| Owner | anonymous | |
| Created Time | Wed Nov 23 23:06:41 CST 2022 | |
| Last Access | UNKNOWN | |
| Created By | Spark 2.2 or prior | |
| Type | MANAGED | |
| Provider | hive | |
| Table Properties | [bucketing_version=2, last_modified_by=anonymous, last_modified_time=1669216665, transactional=true, transactional_properties=default, transient_lastDdlTime=1669216711] | |
| Location | hdfs://localhost:8020/warehouse/tablespace/managed/hive/table_test | |
| Serde Library | org.apache.hadoop.hive.ql.io.orc.OrcSerde | |
| InputFormat | org.apache.hadoop.hive.ql.io.orc.OrcInputFormat | |
| OutputFormat | org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat | |
| Storage Properties | [serialization.format=1] | |
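| Partition Provider | Catalog | |
+-------------------------------+----------------------------------------------------+--------------+ {code}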
-- Hive
> alter table table_test change column id id int comment "hive new comment";
-- Spark SQL
> desc formatted table_test;
{code:java}
+-------------------------------+----------------------------------------------------+--------------+
| col_name | data_type | comment |
+-------------------------------+----------------------------------------------------+--------------+
| id | int | spark comment |
| | | |
| # Detailed Table Information | | |
| Catalog | spark_catalog | |
| Database | default | |
| Table | table_test | |
| Owner | anonymous | |
| Created Time | Wed Nov 23 23:06:41 CST 2022 | |
| Last Access | UNKNOWN | |
| Created By | Spark 2.2 or prior | |
| Type | MANAGED | |
| Provider | hive | |
| Table Properties | [bucketing_version=2, last_modified_by=anonymous, last_modified_time=1669216736, transactional=true, transactional_properties=default, transient_lastDdlTime=1669216736] | |
| Location | hdfs://localhost:8020/warehouse/tablespace/managed/hive/table_test | |
| Serde Library | org.apache.hadoop.hive.ql.io.orc.OrcSerde | |
| InputFormat | org.apache.hadoop.hive.ql.io.orc.OrcInputFormat | |
| OutputFormat | org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat | |
| Storage Properties | [serialization.format=1] | |
| Partition Provider | Catalog | |
+-------------------------------+----------------------------------------------------+--------------+ {code}
When a column comment is modified alternately from Hive and Spark SQL, a change made in Hive is not visible from Spark SQL: Spark SQL still returns the comment it last set.
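To check the reproduction without reading the tables by eye, the column section of a `desc formatted` result can be parsed and compared with the comment that was last written from Hive. The following is only a minimal sketch: `column_comments` and the `expected` mapping are hypothetical names introduced here, and the input is assumed to be the pipe-delimited text output shown above (shortened), not anything returned by a Spark or Hive API.

```python
def column_comments(desc_output: str) -> dict:
    """Map column name -> comment from the column rows of `desc formatted` text output."""
    comments = {}
    for line in desc_output.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +-----+ border lines and empty lines
        cells = [c.strip() for c in line.strip("|").split("|")]
        if len(cells) != 3:
            continue  # not a col_name / data_type / comment row
        name, _data_type, comment = cells
        if name.startswith("# Detailed Table Information"):
            break  # the column section ends here
        if not name or name.startswith("#") or name == "col_name":
            continue  # header rows and the blank separator row
        comments[name] = comment
    return comments


# Shortened copy of the last Spark SQL output in this report.
spark_output = """
+----------+-----------+---------------+
| col_name | data_type | comment |
+----------+-----------+---------------+
| id | int | spark comment |
| | | |
| # Detailed Table Information | | |
| Catalog | spark_catalog | |
"""

# The comment last written from Hive (assumed to be the value Spark SQL should return).
expected = {"id": "hive new comment"}

actual = column_comments(spark_output)
stale = {
    col: (actual.get(col), want)
    for col, want in expected.items()
    if actual.get(col) != want
}
print(stale)  # {'id': ('spark comment', 'hive new comment')}
```

A non-empty `stale` mapping means Spark SQL is still returning an older comment for those columns, which is the behavior reported here.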
was:
---HIVE---
> create table table_test(id int);
> alter table table_test change column id id int comment "hive comment";
> desc formatted table_test;
{code:java}
+-------------------------------+----------------------------------------------------+----------------------------------------------------+
| col_name | data_type | comment |
+-------------------------------+----------------------------------------------------+----------------------------------------------------+
| # col_name | data_type | comment |
| id | int | hive comment |
| | NULL | NULL |
| # Detailed Table Information | NULL | NULL |
| Database: | default | NULL |
| OwnerType: | USER | NULL |
| Owner: | anonymous | NULL |
| CreateTime: | Wed Nov 23 23:06:41 CST 2022 | NULL |
| LastAccessTime: | UNKNOWN | NULL |
| Retention: | 0 | NULL |
| Location: | hdfs://localhost:8020/warehouse/tablespace/managed/hive/table_test | NULL |
| Table Type: | MANAGED_TABLE | NULL |
| Table Parameters: | NULL | NULL |
| | COLUMN_STATS_ACCURATE | {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"id\":\"true\"}} |
| | bucketing_version | 2 |
| | last_modified_by | anonymous |
| | last_modified_time | 1669216665 |
| | numFiles | 0 |
| | numRows | 0 |
| | rawDataSize | 0 |
| | totalSize | 0 |
| | transactional | true |
| | transactional_properties | default |
| | transient_lastDdlTime | 1669216665 |
| | NULL | NULL |
| # Storage Information | NULL | NULL |
| SerDe Library: | org.apache.hadoop.hive.ql.io.orc.OrcSerde | NULL |
| InputFormat: | org.apache.hadoop.hive.ql.io.orc.OrcInputFormat | NULL |
| OutputFormat: | org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat | NULL |
| Compressed: | No | NULL |
| Num Buckets: | -1 | NULL |
| Bucket Columns: | [] | NULL |
| Sort Columns: | [] | NULL |
| Storage Desc Params: | NULL | NULL |
| | serialization.format | 1 |
+-------------------------------+----------------------------------------------------+----------------------------------------------------+ {code}
---SPARK---
> alter table table_test change column id id int comment "spark comment";
> desc formatted table_test;
{code:java}
+-------------------------------+----------------------------------------------------+--------------+
| col_name | data_type | comment |
+-------------------------------+----------------------------------------------------+--------------+
| id | int | spark comment |
| | | |
| # Detailed Table Information | | |
| Catalog | spark_catalog | |
| Database | default | |
| Table | table_test | |
| Owner | anonymous | |
| Created Time | Wed Nov 23 23:06:41 CST 2022 | |
| Last Access | UNKNOWN | |
| Created By | Spark 2.2 or prior | |
| Type | MANAGED | |
| Provider | hive | |
| Table Properties | [bucketing_version=2, last_modified_by=anonymous, last_modified_time=1669216665, transactional=true, transactional_properties=default, transient_lastDdlTime=1669216711] | |
| Location | hdfs://localhost:8020/warehouse/tablespace/managed/hive/table_test | |
| Serde Library | org.apache.hadoop.hive.ql.io.orc.OrcSerde | |
| InputFormat | org.apache.hadoop.hive.ql.io.orc.OrcInputFormat | |
| OutputFormat | org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat | |
| Storage Properties | [serialization.format=1] | |
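| Partition Provider | Catalog | |
+-------------------------------+----------------------------------------------------+--------------+ {code}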
---HIVE---
> alter table table_test change column id id int comment "hive new comment";
---SPARK---
> desc formatted table_test;
{code:java}
+-------------------------------+----------------------------------------------------+--------------+
| col_name | data_type | comment |
+-------------------------------+----------------------------------------------------+--------------+
| id | int | spark comment |
| | | |
| # Detailed Table Information | | |
| Catalog | spark_catalog | |
| Database | default | |
| Table | table_test | |
| Owner | anonymous | |
| Created Time | Wed Nov 23 23:06:41 CST 2022 | |
| Last Access | UNKNOWN | |
| Created By | Spark 2.2 or prior | |
| Type | MANAGED | |
| Provider | hive | |
| Table Properties | [bucketing_version=2, last_modified_by=anonymous, last_modified_time=1669216736, transactional=true, transactional_properties=default, transient_lastDdlTime=1669216736] | |
| Location | hdfs://localhost:8020/warehouse/tablespace/managed/hive/table_test | |
| Serde Library | org.apache.hadoop.hive.ql.io.orc.OrcSerde | |
| InputFormat | org.apache.hadoop.hive.ql.io.orc.OrcInputFormat | |
| OutputFormat | org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat | |
| Storage Properties | [serialization.format=1] | |
| Partition Provider | Catalog | |
+-------------------------------+----------------------------------------------------+--------------+ {code}
When a column comment is modified alternately from Hive and Spark SQL, a change made in Hive is not visible from Spark SQL: Spark SQL still returns the comment it last set.
> Use Hive and Spark SQL to modify table field comment, the modified results of Hive cannot be queried using Spark SQL
> --------------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-41241
> URL: https://issues.apache.org/jira/browse/SPARK-41241
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.0.0, 3.1.0, 3.2.0, 3.3.0
> Reporter: weiliang hao
> Priority: Major
>
> -- Hive
> > create table table_test(id int);
> > alter table table_test change column id id int comment "hive comment";
> > desc formatted table_test;
> {code:java}
> +-------------------------------+----------------------------------------------------+----------------------------------------------------+
> | col_name | data_type | comment |
> +-------------------------------+----------------------------------------------------+----------------------------------------------------+
> | # col_name | data_type | comment |
> | id | int | hive comment |
> | | NULL | NULL |
> | # Detailed Table Information | NULL | NULL |
> | Database: | default | NULL |
> | OwnerType: | USER | NULL |
> | Owner: | anonymous | NULL |
> | CreateTime: | Wed Nov 23 23:06:41 CST 2022 | NULL |
> | LastAccessTime: | UNKNOWN | NULL |
> | Retention: | 0 | NULL |
> | Location: | hdfs://localhost:8020/warehouse/tablespace/managed/hive/table_test | NULL |
> | Table Type: | MANAGED_TABLE | NULL |
> | Table Parameters: | NULL | NULL |
> | | COLUMN_STATS_ACCURATE | {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"id\":\"true\"}} |
> | | bucketing_version | 2 |
> | | last_modified_by | anonymous |
> | | last_modified_time | 1669216665 |
> | | numFiles | 0 |
> | | numRows | 0 |
> | | rawDataSize | 0 |
> | | totalSize | 0 |
> | | transactional | true |
> | | transactional_properties | default |
> | | transient_lastDdlTime | 1669216665 |
> | | NULL | NULL |
> | # Storage Information | NULL | NULL |
> | SerDe Library: | org.apache.hadoop.hive.ql.io.orc.OrcSerde | NULL |
> | InputFormat: | org.apache.hadoop.hive.ql.io.orc.OrcInputFormat | NULL |
> | OutputFormat: | org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat | NULL |
> | Compressed: | No | NULL |
> | Num Buckets: | -1 | NULL |
> | Bucket Columns: | [] | NULL |
> | Sort Columns: | [] | NULL |
> | Storage Desc Params: | NULL | NULL |
> | | serialization.format | 1 |
> +-------------------------------+----------------------------------------------------+----------------------------------------------------+ {code}
> -- Spark SQL
> > alter table table_test change column id id int comment "spark comment";
> > desc formatted table_test;
> {code:java}
> +-------------------------------+----------------------------------------------------+--------------+
> | col_name | data_type | comment |
> +-------------------------------+----------------------------------------------------+--------------+
> | id | int | spark comment |
> | | | |
> | # Detailed Table Information | | |
> | Catalog | spark_catalog | |
> | Database | default | |
> | Table | table_test | |
> | Owner | anonymous | |
> | Created Time | Wed Nov 23 23:06:41 CST 2022 | |
> | Last Access | UNKNOWN | |
> | Created By | Spark 2.2 or prior | |
> | Type | MANAGED | |
> | Provider | hive | |
> | Table Properties | [bucketing_version=2, last_modified_by=anonymous, last_modified_time=1669216665, transactional=true, transactional_properties=default, transient_lastDdlTime=1669216711] | |
> | Location | hdfs://localhost:8020/warehouse/tablespace/managed/hive/table_test | |
> | Serde Library | org.apache.hadoop.hive.ql.io.orc.OrcSerde | |
> | InputFormat | org.apache.hadoop.hive.ql.io.orc.OrcInputFormat | |
> | OutputFormat | org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat | |
> | Storage Properties | [serialization.format=1] | |
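> | Partition Provider | Catalog | |
> +-------------------------------+----------------------------------------------------+--------------+ {code}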
> -- Hive
> > alter table table_test change column id id int comment "hive new comment";
>
> -- Spark SQL
> > desc formatted table_test;
> {code:java}
> +-------------------------------+----------------------------------------------------+--------------+
> | col_name | data_type | comment |
> +-------------------------------+----------------------------------------------------+--------------+
> | id | int | spark comment |
> | | | |
> | # Detailed Table Information | | |
> | Catalog | spark_catalog | |
> | Database | default | |
> | Table | table_test | |
> | Owner | anonymous | |
> | Created Time | Wed Nov 23 23:06:41 CST 2022 | |
> | Last Access | UNKNOWN | |
> | Created By | Spark 2.2 or prior | |
> | Type | MANAGED | |
> | Provider | hive | |
> | Table Properties | [bucketing_version=2, last_modified_by=anonymous, last_modified_time=1669216736, transactional=true, transactional_properties=default, transient_lastDdlTime=1669216736] | |
> | Location | hdfs://localhost:8020/warehouse/tablespace/managed/hive/table_test | |
> | Serde Library | org.apache.hadoop.hive.ql.io.orc.OrcSerde | |
> | InputFormat | org.apache.hadoop.hive.ql.io.orc.OrcInputFormat | |
> | OutputFormat | org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat | |
> | Storage Properties | [serialization.format=1] | |
> | Partition Provider | Catalog | |
> +-------------------------------+----------------------------------------------------+--------------+ {code}
>
> When a column comment is modified alternately from Hive and Spark SQL, a change made in Hive is not visible from Spark SQL: Spark SQL still returns the comment it last set.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)