Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2019/07/03 15:23:56 UTC

[GitHub] [incubator-hudi] smdahmed opened a new issue #776: Incorrect averageBytesPerRecord Causes OOM

smdahmed opened a new issue #776: Incorrect averageBytesPerRecord Causes OOM
URL: https://github.com/apache/incubator-hudi/issues/776
 
 
   There is an earlier issue that has since been closed: https://github.com/apache/incubator-hudi/issues/270. 
   
   I am not sure what the fix was for the above issue.
   
   I hit the issue today. Say there are thousands of incoming records but none get written (which can happen in my case, as we write records selectively). Total records written is then 0, so the division yields Infinity, and casting that to long makes avgSize 9223372036854775807 (Long.MAX_VALUE).
   
   ```
   scala> val l = Math.ceil( 1.0 / 0 )
   l: Double = Infinity
   
   scala> val l = Math.ceil( 1.0 / 0 ).toLong
   l: Long = 9223372036854775807
   ```
   This causes OOM.
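
To make the failure mode concrete, here is a small standalone Java sketch of the same arithmetic. The class and method names (`InfinityCastDemo`, `ceilRatio`) are made up for illustration and are not part of Hudi; only the expression itself mirrors the method quoted below. It also shows that the 0 bytes / 0 records case behaves differently: the quotient is NaN, which narrows to 0 rather than to `Long.MAX_VALUE`.

```java
public class InfinityCastDemo {

    // Same shape as the expression in averageBytesPerRecord():
    // a double quotient, rounded up, then narrowed to long.
    public static long ceilRatio(long totalBytes, long totalRecords) {
        return (long) Math.ceil((1.0 * totalBytes) / totalRecords);
    }

    public static void main(String[] args) {
        // bytes > 0, records == 0: quotient is +Infinity, cast saturates.
        System.out.println(ceilRatio(1024, 0));  // prints 9223372036854775807

        // bytes == 0, records == 0: quotient is NaN, cast yields 0.
        System.out.println(ceilRatio(0, 0));     // prints 0

        // Ordinary case: 102.4 rounds up to 103.
        System.out.println(ceilRatio(1024, 10)); // prints 103
    }
}
```

So the existing `avgSize <= 0L` check already covers the NaN case, but not the Infinity case.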
   
   ```
   protected long averageBytesPerRecord() {
     long avgSize = 0L;
     HoodieTimeline commitTimeline = metaClient.getActiveTimeline().getCommitTimeline()
         .filterCompletedInstants();
     try {
       if (!commitTimeline.empty()) {
         HoodieInstant latestCommitTime = commitTimeline.lastInstant().get();
         HoodieCommitMetadata commitMetadata = HoodieCommitMetadata
             .fromBytes(commitTimeline.getInstantDetails(latestCommitTime).get(), HoodieCommitMetadata.class);
         avgSize = (long) Math.ceil(
             (1.0 * commitMetadata.fetchTotalBytesWritten()) / commitMetadata
                 .fetchTotalRecordsWritten());
       }
     } catch (Throwable t) {
       // make this fail safe.
       logger.error("Error trying to compute average bytes/record ", t);
     }
     return avgSize <= 0L ? config.getCopyOnWriteRecordSizeEstimate() : avgSize;
   }
   ```
   
   I have worked around it for now by changing the last line of the method to the following:
   
   return (avgSize <= 0L  | avgSize >= Integer.MAX_VALUE) ? config.getCopyOnWriteRecordSizeEstimate() : avgSize;
   
   But I believe someone more knowledgeable about this should take a look at it.
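
An alternative to capping the result after the fact would be to check the denominator before dividing. The following is only a sketch under assumptions: `RecordSizeEstimator` and its parameters are hypothetical names, not Hudi code, and `fallbackEstimate` stands in for whatever `config.getCopyOnWriteRecordSizeEstimate()` returns. It uses integer ceiling division so that no Infinity or NaN can arise in the first place.

```java
public class RecordSizeEstimator {

    /**
     * Hypothetical helper (not part of Hudi): average bytes per record,
     * rounded up, or the fallback when the last commit wrote nothing.
     */
    public static long averageBytesPerRecord(long totalBytesWritten,
                                             long totalRecordsWritten,
                                             long fallbackEstimate) {
        // Nothing usable to average over: use the configured estimate.
        if (totalRecordsWritten <= 0 || totalBytesWritten <= 0) {
            return fallbackEstimate;
        }
        // Integer ceiling division, written to avoid overflow in the numerator.
        long quotient = totalBytesWritten / totalRecordsWritten;
        long remainder = totalBytesWritten % totalRecordsWritten;
        return remainder == 0 ? quotient : quotient + 1;
    }

    public static void main(String[] args) {
        // The fallback value 500 is an arbitrary number chosen for the demo.
        System.out.println(averageBytesPerRecord(1024, 0, 500));  // prints 500
        System.out.println(averageBytesPerRecord(1024, 10, 500)); // prints 103
    }
}
```

Whether falling back to the configured estimate is the right behavior when a commit writes zero records is a design decision for the maintainers; the sketch only shows that the division by zero can be avoided before it happens.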
