Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/06/05 22:47:54 UTC

[GitHub] [hudi] parisni commented on issue #5751: [SUPPORT] Bulk insert + OCC = NullPointerException

parisni commented on issue #5751:
URL: https://github.com/apache/hudi/issues/5751#issuecomment-1146897023

   As a workaround, I am able to use both bulk-insert and OCC by setting:
   ```
       {"hoodie.datasource.write.row.writer.enable", "false"},
   ```
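
   For context, here is a minimal sketch of how that flag might sit alongside the other write options. The class name `BulkInsertOccOptions`, the `build` helper and the table name `my_table` are hypothetical, and the lock provider settings that OCC also requires are left out because they depend on the deployment:

   ```java
   import java.util.LinkedHashMap;
   import java.util.Map;

   // Hypothetical helper: collects the Hudi write options for the workaround.
   class BulkInsertOccOptions {

       static Map<String, String> build(String tableName) {
           Map<String, String> opts = new LinkedHashMap<>();
           opts.put("hoodie.table.name", tableName);
           opts.put("hoodie.datasource.write.operation", "bulk_insert");
           // The workaround: disable the row writer so bulkInsert is used
           // instead of bulkInsertAsRow.
           opts.put("hoodie.datasource.write.row.writer.enable", "false");
           // OCC settings (lock provider options omitted on purpose).
           opts.put("hoodie.write.concurrency.mode", "optimistic_concurrency_control");
           opts.put("hoodie.cleaner.policy.failed.writes", "LAZY");
           return opts;
       }

       public static void main(String[] args) {
           // Print the options; "my_table" is a placeholder table name.
           build("my_table").forEach((k, v) -> System.out.println(k + "=" + v));
       }
   }
   ```

   With Spark on the classpath, the resulting map would typically be passed to the writer via `df.write().format("hudi").options(opts)`.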
   
   
   From the code, there are two ways of bulk-inserting: `bulkInsertAsRow` (used when the row writer is enabled), which leads to an NPE with OCC,
   and `bulkInsert`, which handles OCC correctly.
   
   ```
         if (hoodieConfig.getBoolean(ENABLE_ROW_WRITER) &&
           operation == WriteOperationType.BULK_INSERT) {
           val (success, commitTime: common.util.Option[String]) = bulkInsertAsRow(sqlContext, parameters, df, tblName,
             basePath, path, instantTime, partitionColumns)
   
   ```
   
   
   ```
     public JavaRDD<WriteStatus> bulkInsert(JavaRDD<HoodieRecord<T>> records, String instantTime, Option<BulkInsertPartitioner> userDefinedBulkInsertPartitioner) {
       HoodieTable<T, HoodieData<HoodieRecord<T>>, HoodieData<HoodieKey>, HoodieData<WriteStatus>> table =
           initTable(WriteOperationType.BULK_INSERT, Option.ofNullable(instantTime));
       table.validateInsertSchema();
       preWrite(instantTime, WriteOperationType.BULK_INSERT, table.getMetaClient());
       HoodieWriteMetadata<HoodieData<WriteStatus>> result = table.bulkInsert(context,instantTime, HoodieJavaRDD.of(records), userDefinedBulkInsertPartitioner);
       HoodieWriteMetadata<JavaRDD<WriteStatus>> resultRDD = result.clone(HoodieJavaRDD.getJavaRDD(result.getWriteStatuses()));
       return postWrite(resultRDD, instantTime, table);
     }
   ```
   
   We should make the former (`bulkInsertAsRow`) aware of OCC to fix this bug.

