Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/01/09 05:01:33 UTC

[GitHub] [hudi] nsivabalan commented on issue #4541: [SUPPORT] NullPointerException while writing Bulk ingest table

nsivabalan commented on issue #4541:
URL: https://github.com/apache/hudi/issues/4541#issuecomment-1008230765


   Let's remove some of the advanced configs and check whether a simple job succeeds. Then we can add configs back a few at a time to narrow down which one triggers the issue.
   
   - I see you have added a lot of custom configs for the index. Can we remove them for now?
   ```
   'hoodie.bloom.index.bucketized.checking': True,
   'hoodie.bloom.index.keys.per.bucket': 50000000,
   'hoodie.index.bloom.num_entries': 1000000,
   'hoodie.bloom.index.use.caching': True,
   'hoodie.bloom.index.use.treebased.filter': True,
   'hoodie.bloom.index.filter.type': 'DYNAMIC_V0',
   'hoodie.bloom.index.filter.dynamic.max.entries': 1000000,
   'hoodie.bloom.index.prune.by.ranges': True,
   ```
   - 'write.parquet.block.size': 256 seems very low. Can we remove this for now?
   - I see the exception arises from the clustering code. Let's remove the clustering configs for now as well.
   ```
   'hoodie.clustering.inline': True,
   'hoodie.clustering.inline.max.commits': '1',
   'hoodie.clustering.plan.strategy.small.file.limit': '1073741824',
   'hoodie.clustering.plan.strategy.target.file.max.bytes': '2147483648',
   'hoodie.clustering.execution.strategy.class':
       'org.apache.hudi.client.clustering.run.strategy'
       '.SparkSortAndSizeExecutionStrategy',
   'hoodie.clustering.plan.strategy.sort.columns': sort_cols,
   ```
   
   Let's see if the job succeeds after making the above modifications, and we can go from there.
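
To make the remove-then-add-back approach concrete, here is a minimal sketch. It assumes the writer options are held in a plain Python dict, as the snippets above suggest. The names `CONFIG_GROUPS`, `strip_groups`, `add_back`, and `hudi_options` are hypothetical helpers for illustration and are not part of Hudi. The key lists are copied from the snippets above. The table name is a placeholder.

```python
# Hypothetical helpers (not part of Hudi): split writer options into a
# simplified set plus the removed groups, so groups can be restored one by one.

CONFIG_GROUPS = {
    "index": [
        "hoodie.bloom.index.bucketized.checking",
        "hoodie.bloom.index.keys.per.bucket",
        "hoodie.index.bloom.num_entries",
        "hoodie.bloom.index.use.caching",
        "hoodie.bloom.index.use.treebased.filter",
        "hoodie.bloom.index.filter.type",
        "hoodie.bloom.index.filter.dynamic.max.entries",
        "hoodie.bloom.index.prune.by.ranges",
    ],
    "parquet": ["write.parquet.block.size"],
    "clustering": [
        "hoodie.clustering.inline",
        "hoodie.clustering.inline.max.commits",
        "hoodie.clustering.plan.strategy.small.file.limit",
        "hoodie.clustering.plan.strategy.target.file.max.bytes",
        "hoodie.clustering.execution.strategy.class",
        "hoodie.clustering.plan.strategy.sort.columns",
    ],
}


def strip_groups(options, groups=CONFIG_GROUPS):
    """Return (kept, removed_by_group); the input dict is not modified."""
    kept = {}
    removed = {name: {} for name in groups}
    for key, value in options.items():
        for name, keys in groups.items():
            if key in keys:
                removed[name][key] = value
                break
        else:
            # Key belongs to no group, so it stays in the simplified job.
            kept[key] = value
    return kept, removed


def add_back(kept, removed, group_name):
    """Return a new dict: the simplified options plus one removed group."""
    return {**kept, **removed[group_name]}


# Example with a small subset of options; the table name is a placeholder.
hudi_options = {
    "hoodie.table.name": "example_table",
    "hoodie.bloom.index.use.caching": True,
    "write.parquet.block.size": 256,
    "hoodie.clustering.inline": True,
}

kept, removed = strip_groups(hudi_options)
with_index = add_back(kept, removed, "index")
```

Run the job with `kept` first. If it succeeds, retry with `add_back(kept, removed, "index")`, then `"parquet"`, then `"clustering"`, and note which group brings the exception back.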
   

