Posted to commits@hudi.apache.org by "Volodymyr Burenin (Jira)" <ji...@apache.org> on 2022/09/14 17:17:00 UTC

[jira] [Created] (HUDI-4845) HiveSync fails while scanning a table with a large number of partitions

Volodymyr Burenin created HUDI-4845:
---------------------------------------

             Summary: HiveSync fails while scanning a table with a large number of partitions
                 Key: HUDI-4845
                 URL: https://issues.apache.org/jira/browse/HUDI-4845
             Project: Apache Hudi
          Issue Type: Bug
            Reporter: Volodymyr Burenin


 

When I try to recreate a table in the metastore that has around 4k partitions, I get this exception during the org.apache.hudi.common.fs.FSUtils.getAllPartitionPaths call.


{code:java}
Caused by: java.util.concurrent.RejectedExecutionException: Thread limit exceeded replacing blocked worker
at java.base/java.util.concurrent.ForkJoinPool.tryCompensate(ForkJoinPool.java:1575)
at java.base/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3115)
at java.base/java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1823)
at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1998)
at org.apache.hadoop.util.functional.FutureIO.awaitFuture(FutureIO.java:73)
at org.apache.hadoop.fs.impl.FutureIOSupport.awaitFuture(FutureIOSupport.java:65)
at org.apache.hadoop.fs.s3a.Listing$ObjectListingIterator.next(Listing.java:821)
at org.apache.hadoop.fs.s3a.Listing$FileStatusListingIterator.requestNextBatch(Listing.java:612)
at org.apache.hadoop.fs.s3a.Listing$FileStatusListingIterator.<init>(Listing.java:536)
at org.apache.hadoop.fs.s3a.Listing.createFileStatusListingIterator(Listing.java:173)
at org.apache.hadoop.fs.s3a.Listing.createFileStatusListingIterator(Listing.java:148)
at org.apache.hadoop.fs.s3a.Listing.getFileStatusesAssumingNonEmptyDir(Listing.java:414){code}
I tracked the problem down to this file:

[https://github.com/apache/hudi/blob/master/hudi-common/src/main/java/org/apache/hudi/metadata/FileSystemBackedTableMetadata.java]

Apparently this value is too high:
private static final int DEFAULT_LISTING_PARALLELISM = 1500;
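
For illustration only, here is a minimal sketch of one possible mitigation: capping the number of concurrent listing calls with a small fixed-size thread pool, so that blocking S3A listing calls do not push a ForkJoinPool past its thread limit. This is not Hudi's actual code. The class BoundedListing, the constant MAX_LISTING_THREADS (and its value of 32), and the methods cappedParallelism and listAll are all hypothetical names introduced for this example; the lister function stands in for whatever performs the real per-partition listing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper, not part of Hudi.
final class BoundedListing {
    // Assumed upper bound on concurrent listing calls; the value is illustrative.
    static final int MAX_LISTING_THREADS = 32;

    // Never use more threads than requested, than there are paths, or than the cap.
    public static int cappedParallelism(int requested, int numPaths) {
        return Math.max(1, Math.min(MAX_LISTING_THREADS, Math.min(requested, numPaths)));
    }

    // Lists every path with a bounded pool; results keep the order of the input paths.
    public static <T> List<T> listAll(List<String> paths,
                                      Function<String, List<T>> lister,
                                      int requested) {
        ExecutorService pool =
            Executors.newFixedThreadPool(cappedParallelism(requested, paths.size()));
        try {
            List<Future<List<T>>> futures = new ArrayList<>();
            for (String path : paths) {
                Callable<List<T>> task = () -> lister.apply(path);
                futures.add(pool.submit(task));
            }
            List<T> results = new ArrayList<>();
            for (Future<List<T>> future : futures) {
                results.addAll(future.get()); // blocks the caller, not a pool worker
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

With this sketch, a requested parallelism of 1500 over roughly 4k partitions would run at most 32 listing calls at a time. Whether a lower default or a configurable limit is the right fix for Hudi itself is a separate question.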

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)