You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hudi.apache.org by "Zhaojing Yu (Jira)" <ji...@apache.org> on 2022/10/01 12:22:00 UTC
[jira] [Updated] (HUDI-3646) The Hudi update syntax should not modify the nullability attribute of a column
[ https://issues.apache.org/jira/browse/HUDI-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zhaojing Yu updated HUDI-3646:
------------------------------
Fix Version/s: 0.13.0
(was: 0.12.1)
> The Hudi update syntax should not modify the nullability attribute of a column
> ------------------------------------------------------------------------------
>
> Key: HUDI-3646
> URL: https://issues.apache.org/jira/browse/HUDI-3646
> Project: Apache Hudi
> Issue Type: Bug
> Components: spark-sql
> Affects Versions: 0.10.1
> Environment: spark3.1.2
> Reporter: Tao Meng
> Assignee: Alexey Kudinkin
> Priority: Minor
> Fix For: 0.13.0
>
>
> now, when we use sparksql to update hudi table, we find that hudi will change the nullability attribute of a column
> eg:
> {code:java}
> // code placeholder
> val tableName = generateTableName
> val tablePath = s"${new Path(tmp.getCanonicalPath, tableName).toUri.toString}"
> // create table
> spark.sql(
> s"""
> |create table $tableName (
> | id int,
> | name string,
> | price double,
> | ts long
> |) using hudi
> | location '$tablePath'
> | options (
> | type = '$tableType',
> | primaryKey = 'id',
> | preCombineField = 'ts'
> | )
> """.stripMargin)
> // insert data to table
> spark.sql(s"insert into $tableName select 1, 'a1', 10, 1000")
> spark.sql(s"select * from $tableName").printSchema()
> // update data
> spark.sql(s"update $tableName set price = 20 where id = 1")
> spark.sql(s"select * from $tableName").printSchema() {code}
>
> |-- _hoodie_commit_time: string (nullable = true)
> |-- _hoodie_commit_seqno: string (nullable = true)
> |-- _hoodie_record_key: string (nullable = true)
> |-- _hoodie_partition_path: string (nullable = true)
> |-- _hoodie_file_name: string (nullable = true)
> |-- id: integer (nullable = true)
> |-- name: string (nullable = true)
> *|-- price: double (nullable = true)*
> |-- ts: long (nullable = true)
>
> |-- _hoodie_commit_time: string (nullable = true)
> |-- _hoodie_commit_seqno: string (nullable = true)
> |-- _hoodie_record_key: string (nullable = true)
> |-- _hoodie_partition_path: string (nullable = true)
> |-- _hoodie_file_name: string (nullable = true)
> |-- id: integer (nullable = true)
> |-- name: string (nullable = true)
> *|-- price: double (nullable = false )*
> |-- ts: long (nullable = true)
>
> the nullable attribute of price has been changed to false, This is not the result we want
--
This message was sent by Atlassian Jira
(v8.20.10#820010)