Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/04/19 00:42:02 UTC

[GitHub] [hudi] moustafaalaa commented on issue #5291: [SUPPORT] How to use hudi-defaults.conf with Glue

moustafaalaa commented on issue #5291:
URL: https://github.com/apache/hudi/issues/5291#issuecomment-1101884862

   @zhedoubushishi I updated the issue. `args['HUDI_CONF_DIR']` is the S3 URI of the Hudi config file. For example:
   ```
   HUDI_CONF_DIR='s3://glue-development-bucket/scripts/hudi-conf/hudi-default.conf'
   ```
   
   This file contains the following settings:
   
   ```
   # Default system properties included when running Hudi jobs.
   # This is useful for setting default environmental settings.
   
   # Example:
   hoodie.datasource.write.table.type                      COPY_ON_WRITE
   hoodie.datasource.write.hive_style_partitioning         false
   
   # commonConfig      
   className                                               org.apache.hudi
   hoodie.datasource.hive_sync.use_jdbc                    false
   hoodie.datasource.write.precombine.field                tpep_pickup_datetime
   hoodie.datasource.write.recordkey.field                 pk_col
   hoodie.table.name                                       ny_yellow_trip_data_paritioned
   hoodie.consistency.check.enabled                        true
   hoodie.datasource.hive_sync.database                    hudi10
   hoodie.datasource.hive_sync.table                       ny_yellow_trip_data_paritioned
   hoodie.datasource.hive_sync.enable                      true
   hoodie.metrics.on                                       true
   hoodie.metrics.reporter.type                            CLOUDWATCH
   path                                                    s3://hudi-update/hudi10/ny_yellow_trip_data_partitioned
   
   #partitionDataConfig    
   hoodie.datasource.write.partitionpath.field             payment_type
   hoodie.datasource.hive_sync.partition_extractor_class   org.apache.hudi.hive.MultiPartKeysValueExtractor
   hoodie.datasource.hive_sync.partition_fields            payment_type
   hoodie.datasource.write.hive_style_partitioning         true
   
   # incrementalConfig
   hoodie.datasource.write.operation                       upsert
   hoodie.cleaner.policy                                   KEEP_LATEST_COMMITS
   hoodie.cleaner.commits.retained                         1     
   ```
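
For illustration only, here is a minimal sketch of how a file in this whitespace-separated `key value` layout could be turned into an options dict inside a Python Glue job. The helper name `parse_hudi_conf` and the inline sample text are hypothetical. This is not how Hudi itself loads `hudi-defaults.conf`; it only shows one way to read the same layout by hand.

```python
from typing import Dict


def parse_hudi_conf(text: str) -> Dict[str, str]:
    """Parse whitespace-separated `key value` lines into a dict.

    Blank lines and lines starting with '#' are skipped.
    If a key appears more than once, the last value wins.
    """
    options: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first run of whitespace: key, then the rest as value.
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        options[key] = value
    return options


# A shortened sample in the same layout as the file above.
sample = """
# Example:
hoodie.datasource.write.table.type                      COPY_ON_WRITE
hoodie.datasource.write.hive_style_partitioning         false

# partitionDataConfig
hoodie.datasource.write.hive_style_partitioning         true
hoodie.cleaner.commits.retained                         1
"""

options = parse_hudi_conf(sample)
```

With this approach, `hoodie.datasource.write.hive_style_partitioning`, which is set twice in the file above (first `false`, then `true`), would resolve to `true`. The resulting dict could then be passed to a Spark writer, for example through `.options(**options)`, assuming the file text has already been fetched from the S3 URI.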


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.