Posted to dev@drill.apache.org by GitBox <gi...@apache.org> on 2018/07/22 05:59:38 UTC

[GitHub] sachouche opened a new pull request #1391: DRILL-6622: Fixed a NullPointerException in a query with Union

URL: https://github.com/apache/drill/pull/1391
 
 
   @Ben-Zvi, can you please review the fix?
   Thanks!
   
   There were two bugs in the Aggregator batch sizing logic:
   Issue I
   - The aggregator runs in a loop to consume all input batches
   - The loop was updating the batch sizing stats after the batches were consumed
   - Assume output-row-count is 1 and we receive a batch with at least 32k + 1 records
   - The code would create 32k output batches (one per incoming record) and then fail because of overflow
   - Fix - The batch sizing stats are now updated when a non-empty batch is received, before the processing loop
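
The ordering problem in Issue I can be illustrated with a toy model. This is only a sketch and not the Drill code: the `Sizer` class, the `countOutputBatches` method, and the 4096-row cap are hypothetical names and values chosen for illustration, and the model simply assumes one output row per incoming record, as in the scenario above.

```java
// Hypothetical toy model of the ordering bug; none of these names are Drill classes.
final class BatchSizingSketch {

    /** Stand-in for the batch sizing stats. */
    static final class Sizer {
        // Stale starting value, matching the "output-row-count is 1" scenario.
        int outputRowCount = 1;

        void update(int incomingRecords) {
            // Assumption: a real sizer would derive this from row widths and memory limits.
            outputRowCount = Math.max(1, Math.min(incomingRecords, 4096));
        }
    }

    /** Counts the output batches produced for the given incoming batch sizes. */
    static int countOutputBatches(int[] incomingBatchSizes, boolean updateBeforeProcessing) {
        Sizer sizer = new Sizer();
        int outputBatches = 0;
        for (int records : incomingBatchSizes) {
            if (records == 0) {
                continue; // an empty batch carries no sizing information
            }
            if (updateBeforeProcessing) {
                sizer.update(records); // fixed ordering: refresh stats first
            }
            // Ceiling division: output batches needed for this incoming batch.
            outputBatches += (records + sizer.outputRowCount - 1) / sizer.outputRowCount;
            if (!updateBeforeProcessing) {
                sizer.update(records); // buggy ordering: stats refreshed too late
            }
        }
        return outputBatches;
    }
}
```

In this model, a single batch of 32769 records yields 32769 output batches with the late update and 9 with the early one.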
   
   Issue II
   - The Aggregator has two main modules: the AggregatorBatch and Aggregator objects
   - Both share the same "incoming" record batch instance
   - There is also logic to spill incoming batches when under memory pressure
   - The batch sizing logic was not aware that, once batches are spilled, the two "incoming" references diverge; that is, the Aggregator object changes its own "incoming" object
   - The batch sizer was being invoked with a stale "incoming" object (the one from the AggregatorBatch)
   - Fix - Update the Aggregator code to always pass the active incoming object explicitly
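
The stale-reference problem in Issue II can be sketched in the same spirit. Again, this is a simplified illustration under stated assumptions rather than the actual patch: `Batch`, `Sizer`, `switchToSpilled`, and the two update methods are hypothetical names, and the real classes carry far more state.

```java
// Hypothetical toy model of the stale "incoming" reference; not Drill's actual classes.
final class StaleIncomingSketch {

    /** Stand-in for a record batch. */
    static final class Batch {
        final String name;
        final int recordCount;

        Batch(String name, int recordCount) {
            this.name = name;
            this.recordCount = recordCount;
        }
    }

    /** Stand-in for the batch sizer; remembers the last batch it measured. */
    static final class Sizer {
        Batch lastMeasured;

        void update(Batch incoming) {
            lastMeasured = incoming;
        }
    }

    /** Plays the role of the inner Aggregator object. */
    static final class Aggregator {
        private Batch incoming;
        private final Sizer sizer;

        Aggregator(Batch incoming, Sizer sizer) {
            this.incoming = incoming;
            this.sizer = sizer;
        }

        // After a spill, the aggregator reads from a different batch.
        void switchToSpilled(Batch spilled) {
            this.incoming = spilled;
        }

        // Fixed pattern: pass the currently active batch explicitly.
        void updateSizing() {
            sizer.update(incoming);
        }
    }

    /** Plays the role of the outer AggregatorBatch. */
    static final class AggregatorBatch {
        final Batch incoming;
        final Sizer sizer = new Sizer();
        final Aggregator aggregator;

        AggregatorBatch(Batch incoming) {
            this.incoming = incoming;
            this.aggregator = new Aggregator(incoming, sizer);
        }

        // Buggy pattern: uses the outer reference, which is stale after a spill.
        void updateSizingFromOuterReference() {
            sizer.update(incoming);
        }
    }
}
```

Once `switchToSpilled` has been called, `updateSizingFromOuterReference` still measures the original upstream batch, while `updateSizing` measures the batch the aggregator is actually reading.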
