Posted to commits@hudi.apache.org by "xi chaomin (Jira)" <ji...@apache.org> on 2022/06/28 10:19:00 UTC

[jira] [Commented] (HUDI-4330) NPE when trying to upsert into a dataset with no Meta Fields

[xi chaomin](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=xichaomin) **commented** on [HUDI-4330](https://issues.apache.org/jira/browse/HUDI-4330)

Currently, the bloom filter depends on "hoodie.populate.meta.fields". If
"hoodie.populate.meta.fields" is false, we don't write the bloom filter to the
Parquet file footer.

The relevant code:

hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/storage/HoodieAvroParquetWriter.java

    
    
      @Override
      public void writeAvroWithMetadata(HoodieKey key, R avroRecord) throws IOException {
        if (populateMetaFields) {
          prepRecordWithMetadata(key, avroRecord, instantTime,
              taskContextSupplier.getPartitionIdSupplier().get(), getWrittenRecordCount(), fileName);
          super.write(avroRecord);
          writeSupport.add(key.getRecordKey());
        } else {
          super.write(avroRecord);
        }
      }

      @Override
      public void writeAvro(String key, IndexedRecord object) throws IOException {
        super.write(object);
        if (populateMetaFields) {
          writeSupport.add(key);
        }
      } 



hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/storage/HoodieFileWriterFactory.java

    
    
    private static <T extends HoodieRecordPayload, R extends IndexedRecord> HoodieFileWriter<R> newParquetFileWriter(
        String instantTime, Path path, HoodieWriteConfig config, Schema schema, HoodieTable hoodieTable,
        TaskContextSupplier taskContextSupplier, boolean populateMetaFields) throws IOException {
      boolean enableBloomFilter = config.getIndexType().name().equals(BLOOM.name()) || config.getIndexType().name().equals(GLOBAL_BLOOM.name());
      return newParquetFileWriter(instantTime, path, config, schema, hoodieTable.getHadoopConf(),
          taskContextSupplier, populateMetaFields, enableBloomFilter);
    }
    
    private static <T extends HoodieRecordPayload, R extends IndexedRecord> HoodieFileWriter<R> newParquetFileWriter(
        String instantTime, Path path, HoodieWriteConfig config, Schema schema, Configuration conf,
        TaskContextSupplier taskContextSupplier, boolean populateMetaFields, boolean enableBloomFilter) throws IOException {
      Option<BloomFilter> filter = enableBloomFilter ? Option.of(createBloomFilter(config)) : Option.empty();
      HoodieAvroWriteSupport writeSupport = new HoodieAvroWriteSupport(new AvroSchemaConverter(conf).convert(schema), schema, filter);
    
      HoodieParquetConfig<HoodieAvroWriteSupport> parquetConfig = new HoodieParquetConfig<>(writeSupport, config.getParquetCompressionCodec(),
          config.getParquetBlockSize(), config.getParquetPageSize(), config.getParquetMaxFileSize(),
          conf, config.getParquetCompressionRatio(), config.parquetDictionaryEnabled());
    
      return new HoodieAvroParquetWriter<>(path, parquetConfig, instantTime, taskContextSupplier, populateMetaFields);
    }  
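
To make the interaction between the two snippets easier to see, here is a minimal, self-contained sketch. It is not Hudi code: `MetaFieldsBloomSketch`, `SketchWriteSupport`, `SketchWriter`, and `trackedKeysAfterWrites` are hypothetical names, and a `HashSet` stands in for the real bloom filter. It only models the two guards quoted above, assuming the filter is created from the index type and that keys are added only when meta fields are populated.

```java
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

public class MetaFieldsBloomSketch {

  /** Hypothetical stand-in for the write support that owns the optional filter. */
  static final class SketchWriteSupport {
    // A HashSet stands in for a real bloom filter (assumption for illustration).
    private final Optional<Set<String>> filter;

    SketchWriteSupport(boolean enableBloomFilter) {
      // Mirrors the factory: the filter exists only for BLOOM / GLOBAL_BLOOM index types.
      this.filter = enableBloomFilter ? Optional.of(new HashSet<>()) : Optional.empty();
    }

    void add(String recordKey) {
      filter.ifPresent(f -> f.add(recordKey));
    }

    int trackedKeys() {
      return filter.map(Set::size).orElse(0);
    }
  }

  /** Hypothetical stand-in for the writer: records are always written, keys only sometimes. */
  static final class SketchWriter {
    private final SketchWriteSupport writeSupport;
    private final boolean populateMetaFields;
    private int writtenRecords = 0;

    SketchWriter(SketchWriteSupport writeSupport, boolean populateMetaFields) {
      this.writeSupport = writeSupport;
      this.populateMetaFields = populateMetaFields;
    }

    void writeAvro(String key, String record) {
      writtenRecords++; // the record itself is written in both branches
      if (populateMetaFields) {
        writeSupport.add(key); // same guard as in the quoted writeAvro
      }
    }

    int writtenRecords() {
      return writtenRecords;
    }
  }

  /** Writes one record per key and reports how many keys reached the filter. */
  static int trackedKeysAfterWrites(boolean populateMetaFields, boolean enableBloomFilter, String... keys) {
    SketchWriteSupport support = new SketchWriteSupport(enableBloomFilter);
    SketchWriter writer = new SketchWriter(support, populateMetaFields);
    for (String key : keys) {
      writer.writeAvro(key, "record-for-" + key);
    }
    return support.trackedKeys();
  }

  public static void main(String[] args) {
    System.out.println("meta fields on:  " + trackedKeysAfterWrites(true, true, "k1", "k2"));
    System.out.println("meta fields off: " + trackedKeysAfterWrites(false, true, "k1", "k2"));
  }
}
```

In this model, a bloom index with meta fields disabled still creates a filter, but no key is ever added to it, so anything that later expects key information from the file has nothing to read.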
  