Posted to commits@hudi.apache.org by "xi chaomin (Jira)" <ji...@apache.org> on 2022/06/28 10:20:00 UTC

[jira] [Comment Edited] (HUDI-4330) NPE when trying to upsert into a dataset with no Meta Fields

[xi chaomin](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=xichaomin) **edited a comment** on [HUDI-4330](https://issues.apache.org/jira/browse/HUDI-4330)

[Re: NPE when trying to upsert into a dataset with no Meta Fields](https://issues.apache.org/jira/browse/HUDI-4330)

Currently, the bloom filter depends on "hoodie.populate.meta.fields". If "hoodie.populate.meta.fields" is false, we won't write the bloom filter to the footer.
  
Some code:  
  
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/storage/HoodieAvroParquetWriter.java
{code:java}  
  @Override
  public void writeAvroWithMetadata(HoodieKey key, R avroRecord) throws IOException {
    if (populateMetaFields) {
      prepRecordWithMetadata(key, avroRecord, instantTime,
          taskContextSupplier.getPartitionIdSupplier().get(), getWrittenRecordCount(), fileName);
      super.write(avroRecord);
      writeSupport.add(key.getRecordKey());
    } else {
      super.write(avroRecord);
    }
  }

  @Override
  public void writeAvro(String key, IndexedRecord object) throws IOException {
    super.write(object);
    if (populateMetaFields) {
      writeSupport.add(key);
    }
  }
{code}
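
The effect of the writer code above can be illustrated with a small self-contained sketch. It is not Hudi code: `ToyKeyFilter`, `ToyWriter`, and `ToyWriterDemo` are hypothetical stand-ins, and the "filter" is just an exact set of keys. It only models the `populateMetaFields` gate, assuming that a key which is never added can never be found later.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for the bloom filter: an exact set of record keys. */
final class ToyKeyFilter {
  private final Set<String> keys = new HashSet<>();

  void add(String key) {
    keys.add(key);
  }

  boolean mightContain(String key) {
    return keys.contains(key);
  }
}

/** Hypothetical stand-in for the writer; it models only the populateMetaFields gate. */
final class ToyWriter {
  private final boolean populateMetaFields;
  private final ToyKeyFilter filter = new ToyKeyFilter();
  private int writtenRecords = 0;

  ToyWriter(boolean populateMetaFields) {
    this.populateMetaFields = populateMetaFields;
  }

  /** Mirrors writeAvro above: the record is always written, the key only conditionally. */
  void writeAvro(String key, String record) {
    writtenRecords++; // stands in for super.write(object)
    if (populateMetaFields) {
      filter.add(key); // stands in for writeSupport.add(key)
    }
  }

  int writtenRecords() {
    return writtenRecords;
  }

  ToyKeyFilter filter() {
    return filter;
  }
}

final class ToyWriterDemo {
  public static void main(String[] args) {
    ToyWriter withMeta = new ToyWriter(true);
    ToyWriter withoutMeta = new ToyWriter(false);
    withMeta.writeAvro("key-1", "payload");
    withoutMeta.writeAvro("key-1", "payload");
    System.out.println(withMeta.filter().mightContain("key-1"));    // true
    System.out.println(withoutMeta.filter().mightContain("key-1")); // false
  }
}
```

With the flag off, the record is still written but its key never reaches the filter, which is the dependency described in this comment.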
  
  
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/storage/HoodieFileWriterFactory.java
{code:java}  
private static <T extends HoodieRecordPayload, R extends IndexedRecord> HoodieFileWriter<R> newParquetFileWriter(
    String instantTime, Path path, HoodieWriteConfig config, Schema schema, HoodieTable hoodieTable,
    TaskContextSupplier taskContextSupplier, boolean populateMetaFields) throws IOException {
  boolean enableBloomFilter = config.getIndexType().name().equals(BLOOM.name())
      || config.getIndexType().name().equals(GLOBAL_BLOOM.name());
  return newParquetFileWriter(instantTime, path, config, schema, hoodieTable.getHadoopConf(),
      taskContextSupplier, populateMetaFields, enableBloomFilter);
}  
  
private static <T extends HoodieRecordPayload, R extends IndexedRecord> HoodieFileWriter<R> newParquetFileWriter(
    String instantTime, Path path, HoodieWriteConfig config, Schema schema, Configuration conf,
    TaskContextSupplier taskContextSupplier, boolean populateMetaFields, boolean enableBloomFilter) throws IOException {
  Option<BloomFilter> filter = enableBloomFilter ? Option.of(createBloomFilter(config)) : Option.empty();
  HoodieAvroWriteSupport writeSupport = new HoodieAvroWriteSupport(
      new AvroSchemaConverter(conf).convert(schema), schema, filter);

  HoodieParquetConfig<HoodieAvroWriteSupport> parquetConfig = new HoodieParquetConfig<>(
      writeSupport, config.getParquetCompressionCodec(),
      config.getParquetBlockSize(), config.getParquetPageSize(), config.getParquetMaxFileSize(),
      conf, config.getParquetCompressionRatio(), config.parquetDictionaryEnabled());

  return new HoodieAvroParquetWriter<>(path, parquetConfig, instantTime, taskContextSupplier, populateMetaFields);
}
{code}
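
The index-type check in the factory snippet above can be tried in isolation with the following sketch. Everything in it is an assumption made for illustration: `ToyIndexType` is a hypothetical three-value enum rather than Hudi's index type enum, `java.util.Optional` stands in for Hudi's `Option`, and the placeholder string stands in for a real bloom filter.

```java
import java.util.Optional;

/** Hypothetical subset of index types, for illustration only. */
enum ToyIndexType { BLOOM, GLOBAL_BLOOM, SIMPLE }

final class ToyBloomPolicy {
  /** True only for the two bloom-based index types, as in the snippet above. */
  static boolean enableBloomFilter(ToyIndexType indexType) {
    return indexType == ToyIndexType.BLOOM || indexType == ToyIndexType.GLOBAL_BLOOM;
  }

  /** Mirrors the ternary above: a filter is created only when the policy enables it. */
  static Optional<String> maybeCreateFilter(ToyIndexType indexType) {
    return enableBloomFilter(indexType)
        ? Optional.of("bloom-filter-placeholder")
        : Optional.empty();
  }
}

final class ToyBloomPolicyDemo {
  public static void main(String[] args) {
    for (ToyIndexType type : ToyIndexType.values()) {
      // Prints each index type and whether a filter would be created for it.
      System.out.println(type + " -> " + ToyBloomPolicy.maybeCreateFilter(type).isPresent());
    }
  }
}
```

The sketch compares enum constants directly. The snippet above compares `name()` strings instead, and the two forms agree only when both sides come from the same enum.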

This message was sent by Atlassian Jira (v8.20.10#820010-sha1:ace47f9)