Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/04/23 05:04:34 UTC

[GitHub] [incubator-hudi] vinothchandar commented on issue #1549: Potential issue when using Deltastreamer with DMS

vinothchandar commented on issue #1549:
URL: https://github.com/apache/incubator-hudi/issues/1549#issuecomment-618179797


   
   > When I look into the specific S3 folder, I see that the insert and delete into the partition actually  create a new .parquet file with no log file.
   
   Inserts in MOR still go to a parquet file; only updates go to a log file. (Merging is much more expensive than just writing parquet, since it reads, merges, and rewrites the parquet file.) So what you saw is expected behavior.
   
   
   > it does not reflect the deletes. Based on my understanding, the _rt table should reflect the deletes
   
   True, it should reflect the deletes. MOR would have logged a delete block into the log file, and the keys should be listed there. Do you know which log files these are? If so, you can use the CLI to see what's inside them; it has a command for inspecting log files.
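   
   To make the expected behavior concrete, here is a toy Python model of one MOR file group. It is only a sketch of the routing described above, not Hudi's actual implementation: `ToyFileGroup` and all of its methods are hypothetical names, the "base file" and "log file" are plain in-memory structures, and the two read methods only loosely mirror the read-optimized and realtime (`_rt`) views.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFileGroup:
    """Hypothetical stand-in for one merge-on-read file group (not a Hudi class)."""

    base: dict = field(default_factory=dict)  # plays the role of the parquet base file
    log: list = field(default_factory=list)   # plays the role of the log file (ordered blocks)

    def insert(self, key, value):
        # Inserts are written straight to the base file; no log block is produced.
        self.base[key] = value

    def update(self, key, value):
        # Updates are appended to the log instead of rewriting the base file.
        self.log.append(("data", key, value))

    def delete(self, key):
        # Deletes are appended as a delete block that lists the key.
        self.log.append(("delete", key, None))

    def read_optimized(self):
        # Base file only: log blocks are ignored.
        return dict(self.base)

    def snapshot(self):
        # Realtime-style read: replay the log blocks over the base file, in order.
        merged = dict(self.base)
        for kind, key, value in self.log:
            if kind == "delete":
                merged.pop(key, None)
            else:
                merged[key] = value
        return merged


fg = ToyFileGroup()
fg.insert("k1", "a")
fg.insert("k2", "b")
fg.update("k1", "a2")
fg.delete("k2")

print(fg.read_optimized())  # {'k1': 'a', 'k2': 'b'}
print(fg.snapshot())        # {'k1': 'a2'}
```

   In this model a delete only becomes visible to a reader that merges the log, which is why the presence and contents of the log files are the first thing to check.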
   
   Happy to get this ironed out.
   
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org