Posted to dev@drill.apache.org by GitBox <gi...@apache.org> on 2018/08/08 00:00:46 UTC

[GitHub] Ben-Zvi commented on a change in pull request #1409: DRILL-6644: Don't reserve space for incoming probe batches unnecessarily during the build phase.

URL: https://github.com/apache/drill/pull/1409#discussion_r208421033
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/HashJoinMemoryCalculatorImpl.java
 ##########
 @@ -387,9 +387,12 @@ private void calculateMemoryUsage()
         long incompletePartitionsBatchSizes = ((long) partitions) * partitionBuildBatchSize;
         // We need to reserve all the space for incomplete batches, and the incoming batch as well as the
         // probe batch we sniffed.
-        // TODO when batch sizing project is complete we won't have to sniff probe batches since
-        // they will have a well defined size.
-        reservedMemory = incompletePartitionsBatchSizes + maxBuildBatchSize + probeSizePredictor.getBatchSize();
+        reservedMemory = incompletePartitionsBatchSizes + maxBuildBatchSize;
+
+        if (!firstCycle) {
+          // If this is NOT the first cycle the HashJoin operator owns the probe batch and we need to reserve space for it.
+          reservedMemory += probeSizePredictor.getBatchSize();
 
 Review comment:
   Sounds correct, but note that this method, *calculateMemoryUsage()*, is invoked from *innerNext()*, which is called (recursively) after the first spilled batches of both the build side and the probe side have already been read.
    -  Line 363 calculates *maxProbeBatchSize*; how does it differ from *probeSizePredictor.getBatchSize()* here?
    -  Should the size of the build side's spilled "incoming" batch also be taken into account?
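
   To make the arithmetic in the hunk above concrete, here is a minimal stand-alone sketch of the reservation as changed by this diff. The class name, method signature, and sample sizes are hypothetical and chosen only for illustration; this is not the actual HashJoinMemoryCalculatorImpl code, and the probe batch size is passed in as a plain number rather than read from a predictor.

```java
public class ReservedMemorySketch {

  // Hypothetical helper mirroring the arithmetic in the diff:
  // one in-progress batch per partition, plus the largest build batch,
  // plus the probe batch only when this is not the first cycle.
  public static long reservedMemory(int partitions,
                                    long partitionBuildBatchSize,
                                    long maxBuildBatchSize,
                                    long probeBatchSize,
                                    boolean firstCycle) {
    // Cast before multiplying so the product is computed in 64 bits.
    long incompletePartitionsBatchSizes = ((long) partitions) * partitionBuildBatchSize;
    long reserved = incompletePartitionsBatchSizes + maxBuildBatchSize;

    if (!firstCycle) {
      // On later cycles the operator owns the probe batch, so it is counted.
      reserved += probeBatchSize;
    }
    return reserved;
  }

  public static void main(String[] args) {
    // Sample values are assumptions: 32 partitions, 1 KiB per partition batch,
    // a 4 KiB build batch, and a 2 KiB probe batch.
    System.out.println(reservedMemory(32, 1024L, 4096L, 2048L, true));  // 36864
    System.out.println(reservedMemory(32, 1024L, 4096L, 2048L, false)); // 38912
  }
}
```

   With these sample numbers the two cases differ by exactly the probe batch size (2048 bytes), which is the reservation the diff skips on the first cycle.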
   
