Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/10/17 20:23:34 UTC

[GitHub] [hudi] reenarosid opened a new issue #2186: [SUPPORT]

reenarosid opened a new issue #2186:
URL: https://github.com/apache/hudi/issues/2186


   We are assessing Apache Hudi for GDPR compliance. In the process, I have a few questions. Please have a look at them and help me understand.
   
   1. How does Hudi deal with schema evolution:
      a. when a field's type changes?
      b. when new fields are added?
      c. when a field is absent?
   2. Athena currently supports Apache Hudi. Is the same true for AWS Glue, or is there a workaround?
   3. To be GDPR compliant I would like to keep my file versions at 1. With this configuration, will I be able to roll back (to the last commit)?
   4. I tried an incremental pull and was able to get the last set of records that were inserted or updated. Is there a way to get the records that were deleted?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] reenarosid closed issue #2186: [SUPPORT] General question on Hudi about GDPR, Glue

Posted by GitBox <gi...@apache.org>.
reenarosid closed issue #2186:
URL: https://github.com/apache/hudi/issues/2186


   





[GitHub] [hudi] bvaradar commented on issue #2186: [SUPPORT] General question on Hudi about GDPR, Glue

Posted by GitBox <gi...@apache.org>.
bvaradar commented on issue #2186:
URL: https://github.com/apache/hudi/issues/2186#issuecomment-713732423


   Good point. Added FAQ sections for these questions in https://cwiki.apache.org/confluence/display/HUDI/FAQ





[GitHub] [hudi] vinothchandar commented on issue #2186: [SUPPORT]

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on issue #2186:
URL: https://github.com/apache/hudi/issues/2186#issuecomment-713308713


   It would be good to update our FAQ with these questions. :)





[GitHub] [hudi] reenarosid commented on issue #2186: [SUPPORT] General question on Hudi about GDPR, Glue

Posted by GitBox <gi...@apache.org>.
reenarosid commented on issue #2186:
URL: https://github.com/apache/hudi/issues/2186#issuecomment-714215694


   Thanks a lot @bvaradar and @vinothchandar for taking the time to answer my questions; it helped a lot. Closing this issue.





[GitHub] [hudi] bvaradar commented on issue #2186: [SUPPORT]

Posted by GitBox <gi...@apache.org>.
bvaradar commented on issue #2186:
URL: https://github.com/apache/hudi/issues/2186#issuecomment-713305298


   cc @n3nash who is rethinking schema management in Hudi.
   
   1. Regarding schema evolution, Hudi uses Avro schema compatibility rules. For example, new fields must be nullable, columns cannot be deleted, and so on.
   2. https://github.com/apache/hudi/issues/1977 should give context around AWS Glue support (cc @umehrot2).
   3. Yes. I am assuming you are referring to cleaner retention. Commits happen before cleaning, and failed commits do not cause any side effects.
   4. For incremental pull to work, you need to use soft deletes only, following the general tombstone pattern: retain the Hudi key but null out the other (nullable) columns, or obfuscate them.
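
To illustrate the soft-delete pattern in point 4, here is a minimal sketch in plain Python. It does not call any Hudi API; it only shows the record transformation (keep the key fields, null out everything else). The field names `uuid`, `ts`, `name` and `email`, and the helper `soft_delete`, are hypothetical and chosen for illustration only. In a real pipeline the resulting tombstone would presumably be written back as an ordinary upsert so that an incremental pull picks it up.

```python
from typing import Any, Dict, Set


def soft_delete(record: Dict[str, Any], key_fields: Set[str]) -> Dict[str, Any]:
    """Return a tombstone copy of `record`.

    Fields listed in `key_fields` (hypothetical: whatever the table uses as
    its Hudi key fields) are kept; every other column is set to None.
    This assumes the non-key columns are nullable in the table schema.
    """
    return {
        name: (value if name in key_fields else None)
        for name, value in record.items()
    }


# Hypothetical record; the field names are illustrative assumptions.
record = {
    "uuid": "u-1",
    "ts": 1602966214,
    "name": "Alice",
    "email": "alice@example.com",
}

# Keep only the key fields; the personal data columns become None.
tombstone = soft_delete(record, key_fields={"uuid", "ts"})
```

Because the tombstone is an update rather than a hard delete, downstream consumers can recognise a deleted record by its key being present with the other columns null.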

