Posted to commits@hudi.apache.org by "xi chaomin (Jira)" <ji...@apache.org> on 2022/06/28 10:20:00 UTC
[jira] [Comment Edited] (HUDI-4330) NPE when trying to upsert into a dataset with no Meta Fields
| [xi chaomin](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=xichaomin) **edited a comment** on [HUDI-4330](https://issues.apache.org/jira/browse/HUDI-4330)
---|---
|
---
| [Re: NPE when trying to upsert into a dataset with no Meta Fields](https://issues.apache.org/jira/browse/HUDI-4330)
---
| Currently, the bloom filter depends on "hoodie.populate.meta.fields". If
"hoodie.populate.meta.fields" is false, no bloom filter is written to the
Parquet file footer.
Relevant code:
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/storage/HoodieAvroParquetWriter.java
{code:java}
@Override
public void writeAvroWithMetadata(HoodieKey key, R avroRecord) throws IOException {
  if (populateMetaFields) {
    prepRecordWithMetadata(key, avroRecord, instantTime,
        taskContextSupplier.getPartitionIdSupplier().get(), getWrittenRecordCount(), fileName);
    super.write(avroRecord);
    // The record key is handed to the write support only when meta fields are populated.
    writeSupport.add(key.getRecordKey());
  } else {
    super.write(avroRecord);
  }
}

@Override
public void writeAvro(String key, IndexedRecord object) throws IOException {
  super.write(object);
  // Same gating here: without meta fields, the key is never added.
  if (populateMetaFields) {
    writeSupport.add(key);
  }
}
{code}
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/storage/HoodieFileWriterFactory.java
{code:java}
private static <T extends HoodieRecordPayload, R extends IndexedRecord> HoodieFileWriter<R> newParquetFileWriter(
    String instantTime, Path path, HoodieWriteConfig config, Schema schema, HoodieTable hoodieTable,
    TaskContextSupplier taskContextSupplier, boolean populateMetaFields) throws IOException {
  boolean enableBloomFilter = config.getIndexType().name().equals(BLOOM.name())
      || config.getIndexType().name().equals(GLOBAL_BLOOM.name());
  // The extracted text read "enableBloomFilter populateMetaFields" here;
  // enableBloomFilter is assumed to be the intended last argument.
  return newParquetFileWriter(instantTime, path, config, schema, hoodieTable.getHadoopConf(),
      taskContextSupplier, populateMetaFields, enableBloomFilter);
}

private static <T extends HoodieRecordPayload, R extends IndexedRecord> HoodieFileWriter<R> newParquetFileWriter(
    String instantTime, Path path, HoodieWriteConfig config, Schema schema, Configuration conf,
    TaskContextSupplier taskContextSupplier, boolean populateMetaFields, boolean enableBloomFilter) throws IOException {
  // A bloom filter is created only when enableBloomFilter is true.
  Option<BloomFilter> filter = enableBloomFilter ? Option.of(createBloomFilter(config)) : Option.empty();
  HoodieAvroWriteSupport writeSupport = new HoodieAvroWriteSupport(
      new AvroSchemaConverter(conf).convert(schema), schema, filter);
  HoodieParquetConfig<HoodieAvroWriteSupport> parquetConfig = new HoodieParquetConfig<>(
      writeSupport, config.getParquetCompressionCodec(),
      config.getParquetBlockSize(), config.getParquetPageSize(), config.getParquetMaxFileSize(),
      conf, config.getParquetCompressionRatio(), config.parquetDictionaryEnabled());
  return new HoodieAvroParquetWriter<>(path, parquetConfig, instantTime, taskContextSupplier, populateMetaFields);
}
{code}
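
To make the dependency easier to see in isolation, here is a minimal, self-contained sketch. It is not Hudi code: the `IndexType` enum, the `SketchWriter` class, and the `HashSet` standing in for a bloom filter are all hypothetical. It only assumes that the decision to keep a filter is taken from the index type rather than from "hoodie.populate.meta.fields", so a writer without meta fields can still record keys.

```java
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

// Illustrative stand-ins only: none of these types are Hudi classes.
public class BloomFilterToggleSketch {

    // Hypothetical subset of index types, for illustration.
    enum IndexType { BLOOM, GLOBAL_BLOOM, SIMPLE }

    // Assumption: only bloom-based indexes need a filter in the file footer.
    static boolean enableBloomFilter(IndexType indexType) {
        return indexType == IndexType.BLOOM || indexType == IndexType.GLOBAL_BLOOM;
    }

    static final class SketchWriter {
        private final boolean populateMetaFields;
        // A plain set of keys stands in for a real bloom filter.
        private final Optional<Set<String>> filter;

        SketchWriter(boolean populateMetaFields, boolean enableBloomFilter) {
            this.populateMetaFields = populateMetaFields;
            Set<String> keys = new HashSet<>();
            this.filter = enableBloomFilter ? Optional.of(keys) : Optional.empty();
        }

        // Keys go into the filter whenever one exists,
        // independent of the populateMetaFields flag.
        void write(String recordKey) {
            filter.ifPresent(f -> f.add(recordKey));
        }

        boolean populatesMetaFields() {
            return populateMetaFields;
        }

        boolean hasFilter() {
            return filter.isPresent();
        }

        // Returns false when no filter was created at all.
        boolean mightContain(String recordKey) {
            return filter.map(f -> f.contains(recordKey)).orElse(false);
        }
    }

    public static void main(String[] args) {
        // Meta fields disabled, but a bloom-based index is configured.
        SketchWriter writer = new SketchWriter(false, enableBloomFilter(IndexType.BLOOM));
        writer.write("key-1");
        System.out.println(writer.hasFilter());           // true
        System.out.println(writer.mightContain("key-1")); // true
    }
}
```

With `IndexType.SIMPLE` the sketch creates no filter at all, so readers of such a file would have to handle the missing filter explicitly instead of dereferencing it.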