Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/10/14 22:43:23 UTC

[GitHub] [hudi] afeldman1 removed a comment on issue #933: Support for multiple level partitioning in Hudi

afeldman1 removed a comment on issue #933:
URL: https://github.com/apache/hudi/issues/933#issuecomment-708698195


   Hey @LeoHsu0802,
   I believe you have not specified the key generator class. I'm assuming you have actual columns named year, month, and day, rather than a single timestamp column. Either way, you need to specify the class using:
   DataSourceWriteOptions.KEYGENERATOR_CLASS_OPT_KEY: classOf[ComplexKeyGenerator].getName
   
   (If DataSourceWriteOptions.KEYGENERATOR_CLASS_OPT_KEY is not recognized in Python, use the equivalent string key "hoodie.datasource.write.keygenerator.class".)
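
   For the Python side, here is a minimal sketch of what the write options could look like with the string keys. The table name, record key column ("id"), and precombine column ("ts") are hypothetical placeholders, and I'm assuming the fully qualified class name is org.apache.hudi.keygen.ComplexKeyGenerator, so check that against the Hudi version you are running:

```python
# String form of DataSourceWriteOptions.KEYGENERATOR_CLASS_OPT_KEY.
KEYGEN_CLASS_KEY = "hoodie.datasource.write.keygenerator.class"

# Assumed fully qualified name of ComplexKeyGenerator; verify for your Hudi version.
COMPLEX_KEYGEN = "org.apache.hudi.keygen.ComplexKeyGenerator"


def build_hudi_options(table_name, record_key, partition_columns, precombine_field):
    """Build Hudi write options for multi-column partitioning."""
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.recordkey.field": record_key,
        # With ComplexKeyGenerator, partition columns go in as a comma-separated list.
        "hoodie.datasource.write.partitionpath.field": ",".join(partition_columns),
        "hoodie.datasource.write.precombine.field": precombine_field,
        KEYGEN_CLASS_KEY: COMPLEX_KEYGEN,
    }


# Hypothetical table and column names, for illustration only.
hudi_options = build_hudi_options("my_table", "id", ["year", "month", "day"], "ts")

# With a Spark DataFrame `df` and a target path `base_path` (neither defined here):
# df.write.format("hudi").options(**hudi_options).mode("append").save(base_path)
```

   The commented-out write call is only an outline of how the dictionary would be passed to the writer; the format name and save mode may need adjusting for your setup.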
   
   This is where the CustomKeyGenerator that I mentioned previously comes in. I believe CustomKeyGenerator is the preferred key generator class as of Hudi 0.6.0; however, I'm still using ComplexKeyGenerator because I haven't been able to get CustomKeyGenerator to work.
   
   Details on the key generator classes can be found here: https://hudi.apache.org/docs/writing_data.html#key-generation
   
   Also, are you using the Glue metastore with EMR and S3?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org