
Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/05/07 15:07:04 UTC

[GitHub] [incubator-hudi] reste85 edited a comment on issue #1598: [SUPPORT] Slow upsert time reading from Kafka

reste85 edited a comment on issue #1598:
URL: https://github.com/apache/incubator-hudi/issues/1598#issuecomment-625312293


   Just a note:
   We had 16 million records in the topic. With version 0.5.2-inc, Deltastreamer reads 5 million records per iteration. The first three runs were fine (so we correctly ingested 15 million records). The last run seemed stuck for 1.8 hours: no resource usage, no network usage, etc. So I asked for some new data to be pumped into the topic, and the job suddenly completed.
   Does this mean that we need at least some amount X of data in Kafka for the computation to run? Does this depend on how KafkaRDD is designed?
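
   To make the numbers concrete, here is a minimal sketch of the arithmetic, assuming the 5 mln figure acts as a simple per-run cap (that is my reading of the behaviour, not something taken from the Hudi code). `plan_batches` is a hypothetical helper for illustration, not a Hudi or Kafka API:

```python
def plan_batches(total_records, max_per_run):
    """Split a backlog into per-run batch sizes, each capped at max_per_run."""
    batches = []
    remaining = total_records
    while remaining > 0:
        # Each run takes whatever is left, up to the assumed per-run cap.
        batch = min(remaining, max_per_run)
        batches.append(batch)
        remaining -= batch
    return batches


# 16 mln records in the topic, assumed cap of 5 mln records per run.
batches = plan_batches(16_000_000, 5_000_000)
print(batches)  # [5000000, 5000000, 5000000, 1000000]
```

   Under that assumption the fourth run has only 1 mln records left to read, and that is the run that appeared stuck.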
   
   Thank you!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org