Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/04/24 06:19:02 UTC

[GitHub] [incubator-hudi] HariprasadAllaka1612 commented on issue #1556: [SUPPORT] Input path in s3 doesn't exist if the write multiple datasets to s3 in a single execution

HariprasadAllaka1612 commented on issue #1556:
URL: https://github.com/apache/incubator-hudi/issues/1556#issuecomment-618825491


   @vinothchandar No, let me be clearer. Below is the complete process I am following:
   
   1. Read the CDC table from Hive (a hoodie table) to get the latest marker.
   2. Read the files from S3 based on the latest marker read in step 1.
   3. Process the files, which results in 2 data frames.
   4. Write both data frames to S3 in hoodie format and sync them to Hive.
   5. Update the marker with the latest end time.
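
   For reference, the five steps above can be sketched roughly as follows. This is only an illustration of the flow, not the actual job: the function names (`run_once`, `hudi_write_options`), the field names `id` and `updated_at`, and the `base_root` layout are all hypothetical, and the Spark/S3/Hive work is passed in as plain callables so the sketch stays self-contained. The option keys themselves are standard Hudi datasource options.

```python
from typing import Any, Callable, Dict, List, Tuple


def hudi_write_options(table_name: str,
                       record_key: str = "id",              # hypothetical field name
                       precombine_field: str = "updated_at",  # hypothetical field name
                       operation: str = "upsert") -> Dict[str, str]:
    """Build Hudi datasource write options for a single dataset."""
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.operation": operation,
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.precombine.field": precombine_field,
        "hoodie.datasource.hive_sync.enable": "true",
        "hoodie.datasource.hive_sync.table": table_name,
    }


def run_once(read_marker: Callable[[], str],
             list_new_files: Callable[[str], Tuple[List[str], str]],
             process: Callable[[List[str]], Dict[str, Any]],
             write: Callable[[Any, str, Dict[str, str]], None],
             update_marker: Callable[[str], None],
             base_root: str) -> str:
    """Run one pass of the pipeline and return the marker in effect afterwards."""
    marker = read_marker()                     # step 1: latest marker from the CDC table
    files, end_time = list_new_files(marker)   # step 2: files newer than the marker
    if not files:
        return marker                          # nothing new, leave the marker untouched
    frames = process(files)                    # step 3: table name -> data frame
    for table_name, frame in frames.items():   # step 4: one base path per dataset
        base_path = f"{base_root.rstrip('/')}/{table_name}"
        write(frame, base_path, hudi_write_options(table_name))
    update_marker(end_time)                    # step 5: only after every write succeeded
    return end_time
```

   With Spark, the `write` callable would typically wrap something like `frame.write.format("hudi").options(**options).mode("append").save(base_path)`; older releases may need the full `org.apache.hudi` format name. The sketch gives each data frame its own table name and base path, which is an assumption about the setup here and not a diagnosis of the error.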
   
   The problem is that the first write of the datasets works, but when I try to UPSERT the data in the 2nd run, it gives this error.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org