Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/10/22 00:44:55 UTC

[GitHub] [hudi] danny0405 commented on a diff in pull request #7009: [HUDI-5058]Fix flink catalog read spark table error : primary key col can not be nullable

danny0405 commented on code in PR #7009:
URL: https://github.com/apache/hudi/pull/7009#discussion_r1002288141


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/catalog/HoodieHiveCatalog.java:
##########
@@ -397,17 +399,22 @@ public CatalogBaseTable getTable(ObjectPath tablePath) throws TableNotExistExcep
     String path = hiveTable.getSd().getLocation();
     Map<String, String> parameters = hiveTable.getParameters();
     Schema latestTableSchema = StreamerUtil.getLatestTableSchema(path, hiveConf);
+    String pkColumnsStr = parameters.get(FlinkOptions.RECORD_KEY_FIELD.key());
+    List<String> pkColumns = StringUtils.isNullOrEmpty(pkColumnsStr)
+        ? null : StringUtils.split(pkColumnsStr, ",");
     org.apache.flink.table.api.Schema schema;
     if (latestTableSchema != null) {
+      // if the table is initialized from spark, the write schema is nullable for pk columns.
+      DataType tableDataType = DataTypeUtils.ensureColumnsAsNonNullable(

Review Comment:
   This is common behavior: a column is nullable by default if the user does not declare its nullability in the DDL. Primary key columns, however, must be forced to non-nullable.
   
   Flink generates the correct Avro schema when the table is initialized from a Flink app. What we fix here is a table created by Spark, so I guess Spark does not factor the primary key constraint into nullability somewhere.
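   To illustrate the rule above, here is a minimal, dependency-free sketch. It is not the Hudi or Flink implementation: `Column` and `forcePkNonNullable` are hypothetical names, and a real catalog would operate on Flink `DataType` fields rather than this stand-in class. It assumes the primary key list may be null, as in the diff when the record key option is unset.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PkNullabilitySketch {

  /** Hypothetical stand-in for a table column: a name plus a nullability flag. */
  static final class Column {
    final String name;
    final boolean nullable;

    Column(String name, boolean nullable) {
      this.name = name;
      this.nullable = nullable;
    }

    @Override
    public String toString() {
      return name + (nullable ? " NULL" : " NOT NULL");
    }
  }

  /**
   * Returns a column list in which every primary key column is non-nullable.
   * Columns that are not part of the key keep their declared nullability.
   * If no primary key is configured, the input list is returned unchanged.
   */
  static List<Column> forcePkNonNullable(List<Column> columns, List<String> pkColumns) {
    if (pkColumns == null || pkColumns.isEmpty()) {
      return columns;
    }
    Set<String> pk = new HashSet<>(pkColumns);
    List<Column> result = new ArrayList<>(columns.size());
    for (Column c : columns) {
      // Only rewrite columns named in the key; leave the rest untouched.
      result.add(pk.contains(c.name) ? new Column(c.name, false) : c);
    }
    return result;
  }

  public static void main(String[] args) {
    // A schema as a Spark-created table might expose it: every column nullable.
    List<Column> fromSpark = Arrays.asList(
        new Column("uuid", true),
        new Column("name", true));
    System.out.println(forcePkNonNullable(fromSpark, Arrays.asList("uuid")));
  }
}
```

   Returning the input unchanged when no key is configured mirrors the `pkColumns == null` case in the diff; whether the real helper behaves the same way is an assumption.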


