Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/08/24 09:51:59 UTC

[GitHub] [hudi] hughfdjackson edited a comment on issue #1979: [SUPPORT]: Is it possible to incrementally read only upserted rows where a material change has occurred?

hughfdjackson edited a comment on issue #1979:
URL: https://github.com/apache/hudi/issues/1979#issuecomment-679026628


   @bvaradar - As a follow-up question: your reply confirms that what we're ideally looking for isn't currently a Hudi feature.  Is it something you might be interested in supporting?
   
   In many use cases, the behaviour would likely be nearly identical to the current behaviour* - for snapshot queries, or for incrementally reading tables where the writer ensures only material changes** are written (e.g. some stream processing, or insert-only batch processes).  In the remaining use cases, like ours, it would cut out a lot of noise and unnecessary processing.
   
   If so, I can talk to my team about contributing towards the project, since it would be valuable to us.  
   
   ----
   
   \* Implementation dependent, of course!  It may be that it'd require another metadata field to be added to support that sort of behaviour, for instance.  
   
   \** I'm using 'material changes' here to describe an upsert that affects the non-`_hoodie` columns: either a deletion, or a change in the value of one of those columns.
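
To make that definition concrete, here is a minimal sketch in plain Python. It is not Hudi API: the function names, the dict-per-row representation, and the choice to count a brand-new insert as material are all assumptions made for illustration. The only thing taken from the definition above is that columns prefixed `_hoodie` are ignored when comparing.

```python
# Illustrative only: rows are plain dicts, and "before"/"after" are the
# versions of one record on either side of an upsert (None = absent).
HOODIE_PREFIX = "_hoodie"  # metadata columns are assumed to share this prefix


def data_columns(row):
    """Return only the non-`_hoodie` columns of a row."""
    return {k: v for k, v in row.items() if not k.startswith(HOODIE_PREFIX)}


def is_material_change(before, after):
    """True if the upsert changed anything outside the `_hoodie` columns."""
    if before is None and after is None:
        return False
    if before is None or after is None:
        # Deletion (after is None) or, by assumption, a new insert.
        return True
    return data_columns(before) != data_columns(after)


old = {"_hoodie_commit_time": "001", "id": 1, "name": "a"}
rewritten = {"_hoodie_commit_time": "002", "id": 1, "name": "a"}
updated = {"_hoodie_commit_time": "002", "id": 1, "name": "b"}

print(is_material_change(old, rewritten))  # only metadata differs
print(is_material_change(old, updated))    # a data column differs
print(is_material_change(old, None))       # deletion
```

An incremental reader that applied a check like this would skip the `rewritten` row and surface the other two.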


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org