Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/01/21 17:03:30 UTC

[GitHub] [hudi] jtmzheng opened a new issue #2470: [SUPPORT] Heavy skew in ListingBasedRollbackHelper

jtmzheng opened a new issue #2470:
URL: https://github.com/apache/hudi/issues/2470


   **Describe the problem you faced**
   We're seeing heavy skew in our Spark Streaming job when processing rollbacks (`mapToPair at ListingBasedRollbackHelper.java:100`). The stage in the example below took 2.2h to complete, with a p75 task time of 15 minutes and a max of 2.2h (100 tasks, with a long tail of 8 tasks that took > 1 hour).
   
   <img width="1666" alt="Screen Shot 2021-01-20 at 11 36 04 AM" src="https://user-images.githubusercontent.com/3466206/105383583-4184f800-5bdf-11eb-8368-c345d452a6eb.png">
   
   What can cause this skew and what can we do to alleviate/investigate this?
   
   We have Hudi configured with:
   
   ```
   hudi_options = {
           "hoodie.table.name": "transactions",
           "hoodie.datasource.write.recordkey.field": "id.value",
           "hoodie.datasource.write.keygenerator.class": "org.apache.hudi.keygen.ComplexKeyGenerator",
           "hoodie.datasource.write.partitionpath.field": "year,month,day",
           "hoodie.datasource.write.table.name": "transactions",
           "hoodie.datasource.write.table.type": "MERGE_ON_READ",
           "hoodie.datasource.write.operation": "upsert",
           "hoodie.consistency.check.enabled": "true",
           "hoodie.datasource.write.precombine.field": "publishedAtUnixNano",
           "hoodie.compact.inline": True,
           "hoodie.compact.inline.max.delta.commits": 10,
           "hoodie.cleaner.commits.retained": 1,
   }
   ```
   
   **Environment Description**
   
   * Hudi version : 0.6.0
   
   * Spark version : 2.4.6 (EMR 5.31)
   
   * Hive version : 2.3.7
   
   * Hadoop version : Amazon 2.10.0
   
   * Storage (HDFS/S3/GCS..) : S3
   
   * Running on Docker? (yes/no) : no
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org




[GitHub] [hudi] n3nash commented on issue #2470: [SUPPORT] Heavy skew in ListingBasedRollbackHelper

Posted by GitBox <gi...@apache.org>.
n3nash commented on issue #2470:
URL: https://github.com/apache/hudi/issues/2470#issuecomment-771409047


   @jtmzheng Does using 0.7.0 and `hoodie.metadata.enable=true` solve the issue?





[GitHub] [hudi] jtmzheng commented on issue #2470: [SUPPORT] Heavy skew in ListingBasedRollbackHelper

Posted by GitBox <gi...@apache.org>.
jtmzheng commented on issue #2470:
URL: https://github.com/apache/hudi/issues/2470#issuecomment-771421761


   @n3nash I have not had a chance to look at the 0.7.0 migration yet. Which EMR versions is 0.7.0 compatible with?
   
   If my dataset is already on 0.6.0, do I just need to update the Hudi jar version for my Spark job? I'm unsure if there's anything else I need to do based on the migration guide https://hudi.apache.org/releases.html. Thanks!
   
   





[GitHub] [hudi] vinothchandar commented on issue #2470: [SUPPORT] Heavy skew in ListingBasedRollbackHelper

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on issue #2470:
URL: https://github.com/apache/hudi/issues/2470#issuecomment-771803301


   If you are already on 0.6.0, the upgrade step would have run anyway from 0.5.x. No additional steps are needed to migrate to 0.7.0.





[GitHub] [hudi] vinothchandar commented on issue #2470: [SUPPORT] Heavy skew in ListingBasedRollbackHelper

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on issue #2470:
URL: https://github.com/apache/hudi/issues/2470#issuecomment-770029110


   Thanks @jtmzheng. With 0.7.0 and `hoodie.metadata.enable=true`, going over the file listings should be much faster. Marker-based rollback avoids listing altogether and can efficiently find just the files that were written to.
   
   If you feel we need to add a new compaction strategy or enhance an existing one, please feel free to raise a JIRA and also mention @nsivabalan.





[GitHub] [hudi] jtmzheng commented on issue #2470: [SUPPORT] Heavy skew in ListingBasedRollbackHelper

Posted by GitBox <gi...@apache.org>.
jtmzheng commented on issue #2470:
URL: https://github.com/apache/hudi/issues/2470#issuecomment-853177797


   Unfortunately no, we ran into https://github.com/apache/hudi/issues/2995
   
   I think this issue is fine to close out. The change described in https://github.com/apache/hudi/issues/2470#issuecomment-769948718 got our rollback performance to an acceptable state even without the metadata table.





[GitHub] [hudi] nsivabalan commented on issue #2470: [SUPPORT] Heavy skew in ListingBasedRollbackHelper

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #2470:
URL: https://github.com/apache/hudi/issues/2470#issuecomment-908912277


   thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org




[GitHub] [hudi] nsivabalan closed issue #2470: [SUPPORT] Heavy skew in ListingBasedRollbackHelper

Posted by GitBox <gi...@apache.org>.
nsivabalan closed issue #2470:
URL: https://github.com/apache/hudi/issues/2470


   





[GitHub] [hudi] vinothchandar commented on issue #2470: [SUPPORT] Heavy skew in ListingBasedRollbackHelper

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on issue #2470:
URL: https://github.com/apache/hudi/issues/2470#issuecomment-764870104


   @jtmzheng can you give marker-based rollbacks a shot? We intend to make them the default in the next release. If the issue comes from listing, this should help a lot.
   `hoodie.rollback.using.markers=true`
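
A minimal sketch of how that option could be added to the writer configuration from the original report. The option dict is abbreviated, and the DataFrame `df` and the base path in the trailing comment are assumptions for illustration, not values from this thread.

```python
# Base writer options, abbreviated from the configuration posted in the issue.
hudi_options = {
    "hoodie.table.name": "transactions",
    "hoodie.datasource.write.table.type": "MERGE_ON_READ",
    "hoodie.datasource.write.operation": "upsert",
}

# Option suggested in the comment above, passed as a string value.
rollback_options = {"hoodie.rollback.using.markers": "true"}

# Merge into a new dict so the original options stay untouched.
hudi_options_with_markers = {**hudi_options, **rollback_options}

# Assumed usage with an existing PySpark DataFrame `df` and a placeholder path:
# df.write.format("hudi").options(**hudi_options_with_markers) \
#     .mode("append").save("s3://<bucket>/<base-path>")
```

Keeping the extra option in its own dict makes it easy to toggle while comparing rollback times with and without markers.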



[GitHub] [hudi] n3nash commented on issue #2470: [SUPPORT] Heavy skew in ListingBasedRollbackHelper

Posted by GitBox <gi...@apache.org>.
n3nash commented on issue #2470:
URL: https://github.com/apache/hudi/issues/2470#issuecomment-852790007


   @jtmzheng Were you able to migrate to 0.7.0 and confirm that the skew goes away?





[GitHub] [hudi] jtmzheng commented on issue #2470: [SUPPORT] Heavy skew in ListingBasedRollbackHelper

Posted by GitBox <gi...@apache.org>.
jtmzheng commented on issue #2470:
URL: https://github.com/apache/hudi/issues/2470#issuecomment-769948718


   Sorry for the delay. I believe the slowness was because compaction wasn't keeping up with the number of files, so file count was growing faster than dataset size. We partition by date, and many partitions receive a small number of updates while most updates land in the current date. I saw that compaction is IO-bounded and that selection is based on log file size, so the smaller updates were never getting compacted.
   
   I've since increased the IO bound 3x, and performance is slowly improving as file count goes down (i.e. the getting-small-files stage is faster). I'll update once we test rollback performance, but I suspect it will also be better.
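
The effect described above can be illustrated with a simplified model. This is a sketch under stated assumptions, not Hudi's actual compaction code: file groups are ranked by log size, largest first, and picked until an IO budget is used up. `FileGroup`, `select_for_compaction`, and all sizes are hypothetical.

```python
from typing import List, NamedTuple


class FileGroup(NamedTuple):
    partition: str
    log_size_mb: int


def select_for_compaction(groups: List[FileGroup], target_io_mb: int) -> List[FileGroup]:
    """Pick file groups by log size, largest first, until the IO budget is spent."""
    selected = []
    used_mb = 0
    for group in sorted(groups, key=lambda g: g.log_size_mb, reverse=True):
        if used_mb + group.log_size_mb > target_io_mb:
            break  # budget exhausted; everything smaller is skipped this round
        selected.append(group)
        used_mb += group.log_size_mb
    return selected


# Hypothetical workload: large logs in the current date, tiny logs in older dates.
groups = [
    FileGroup("2021/01/20", 400),
    FileGroup("2021/01/20", 300),
    FileGroup("2021/01/05", 5),
    FileGroup("2021/01/04", 5),
    FileGroup("2021/01/03", 5),
]

tight = select_for_compaction(groups, target_io_mb=700)       # only the two large groups fit
relaxed = select_for_compaction(groups, target_io_mb=3 * 700)  # budget raised 3x: all groups fit
```

With the tight budget, the small log files in older partitions are never reached, so they accumulate. The thread does not name the setting that was changed; to my knowledge the bound is configured through `hoodie.compaction.target.io` (in MB), but check the configuration reference for your Hudi version.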





[GitHub] [hudi] vinothchandar commented on issue #2470: [SUPPORT] Heavy skew in ListingBasedRollbackHelper

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on issue #2470:
URL: https://github.com/apache/hudi/issues/2470#issuecomment-764870271


   We parallelize by partitions, so it should be fast. Not sure where the skew comes from.
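
One possible explanation, sketched under the assumption that each partition is handled by one task and that a task's cost grows with the number of files in its partition. The file counts and per-file cost are made up for illustration; none of these numbers come from the thread.

```python
import statistics


def task_cost_by_partition(file_counts, seconds_per_file):
    """One task per partition; cost assumed proportional to files in that partition."""
    return {partition: count * seconds_per_file for partition, count in file_counts.items()}


# Hypothetical file counts: the current date holds far more files than older dates.
file_counts = {
    "2021/01/17": 40,
    "2021/01/18": 55,
    "2021/01/19": 60,
    "2021/01/20": 9000,
}

costs = task_cost_by_partition(file_counts, seconds_per_file=0.5)
median_cost = statistics.median(costs.values())
slowest = max(costs, key=costs.get)
skew_ratio = costs[slowest] / median_cost  # how much longer the slowest task runs than a typical one
```

Under this model, parallelizing by partition keeps most tasks short but cannot shorten the task that owns the partition with the most files, which would show up as a long tail in the stage.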

