Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/08/04 04:26:34 UTC

[GitHub] [hudi] XuQianJin-Stars commented on a diff in pull request #6284: [HUDI-4526] Improve spillableMapBasePath disk directory is full

XuQianJin-Stars commented on code in PR #6284:
URL: https://github.com/apache/hudi/pull/6284#discussion_r937334947


##########
hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieMergedLogRecordScanner.java:
##########
@@ -92,11 +92,12 @@ protected HoodieMergedLogRecordScanner(FileSystem fs, String basePath, List<Stri
         forceFullScan, partitionName, internalSchema);
     try {
       // Store merged records for all versions for this log file, set the in-memory footprint to maxInMemoryMapSize
-      this.records = new ExternalSpillableMap<>(maxMemorySizeInBytes, spillableMapBasePath, new DefaultSizeEstimator(),
+      this.records = new ExternalSpillableMap<>(maxMemorySizeInBytes, basePath + spillableMapBasePath, new DefaultSizeEstimator(),
           new HoodieRecordSizeEstimator(readerSchema), diskMapType, isBitCaskDiskMapCompressionEnabled);
+

Review Comment:
   > So why is the dir full if it is cleaned in time?
   
   When Spark runs multiple write jobs concurrently, the tmp directory is shared not only by Hudi jobs but by many other jobs as well, so it can still fill up even if each job cleans up after itself.
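   
   A minimal sketch of that point in plain Java: the names basePath and spillableMapBasePath mirror the diff above, but the directory values and the Paths.get join are illustrative assumptions, not the PR's actual code. It only shows why one shared tmp location can fill up across jobs while a per-table location isolates each job's spill files.
   
       import java.nio.file.Path;
       import java.nio.file.Paths;
   
       public class SpillPathSketch {
         public static void main(String[] args) {
           // Hypothetical values; the variable names mirror the diff above.
           String spillableMapBasePath = "/tmp/hudi-spill";       // default spill dir shared by every job on the host
           String basePath = "/data/warehouse/my_hudi_table";     // the table's own base path
   
           // Today: all concurrent Spark jobs spill into the same shared directory,
           // so that single directory can fill up regardless of per-job cleanup.
           Path shared = Paths.get(spillableMapBasePath);
   
           // Direction of the PR: scope the spill directory to the table's base path,
           // so each table gets its own spill location instead of the shared tmp dir.
           Path perTable = Paths.get(basePath, spillableMapBasePath);
   
           System.out.println("shared   = " + shared);   // /tmp/hudi-spill
           System.out.println("perTable = " + perTable); // /data/warehouse/my_hudi_table/tmp/hudi-spill
         }
       }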



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org