Posted to issues@drill.apache.org by "Paul Rogers (JIRA)" <ji...@apache.org> on 2017/02/17 22:28:44 UTC

[jira] [Commented] (DRILL-5100) External Sort does not manage memory requirements of a schema change

    [ https://issues.apache.org/jira/browse/DRILL-5100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872649#comment-15872649 ] 

Paul Rogers commented on DRILL-5100:
------------------------------------

This remains an issue after DRILL-5080. A schema change grows the in-memory batches, but no code exists to ensure that enough memory is available for that growth. Since schema changes are not fully supported in other operators, we elected to postpone work on this issue.

As a result, if someone enables union vectors and a schema change occurs while the buffer pool is nearly full, processing the schema change can cause an OOM error.
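
For illustration only, the missing check could look roughly like the sketch below. It is not code from Drill: the class SchemaChangeGuard, its memoryLimit field, and the method mustSpillFirst are hypothetical names, and the sketch assumes the sort knows its memory limit, its current allocation, and an estimate of the converted vector's size. It also assumes the original vector stays allocated until the converted copy is complete, so the converted vector's size is the extra memory needed.

```java
// Hypothetical sketch; none of these names exist in Drill's external sort.
final class SchemaChangeGuard {
    // Assumed: total bytes the sort may hold in memory before spilling.
    private final long memoryLimit;

    SchemaChangeGuard(long memoryLimit) {
        if (memoryLimit <= 0) {
            throw new IllegalArgumentException("memoryLimit must be positive");
        }
        this.memoryLimit = memoryLimit;
    }

    /**
     * Returns true when the sort should spill before converting a vector.
     *
     * @param allocatedBytes       bytes currently held by buffered batches
     * @param convertedVectorBytes estimated size of the new (converted) vector,
     *                             which must coexist with the original
     */
    boolean mustSpillFirst(long allocatedBytes, long convertedVectorBytes) {
        if (allocatedBytes < 0 || convertedVectorBytes < 0) {
            throw new IllegalArgumentException("sizes must be non-negative");
        }
        // Compare against the remaining headroom rather than summing the two
        // sizes, so very large inputs cannot overflow the long arithmetic.
        long headroom = memoryLimit - allocatedBytes;
        return convertedVectorBytes > headroom;
    }
}
```

A caller would evaluate mustSpillFirst for each vector about to be coerced and spill buffered batches first whenever it returns true.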

> External Sort does not manage memory requirements of a schema change
> --------------------------------------------------------------------
>
>                 Key: DRILL-5100
>                 URL: https://issues.apache.org/jira/browse/DRILL-5100
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.8.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>
> The external sort is given a fixed amount of memory to hold buffered in-memory batches prior to spilling. External sort also handles certain schema changes when union vectors are enabled. When a schema change occurs, existing vectors are coerced into the new schema format, perhaps replacing an existing vector with a new union vector.
> This conversion requires (direct) memory. If it occurs when the external sort has already nearly filled its in-memory buffer, the conversion can cause a memory overflow and failure.
> The following shows the allocated memory before and after schema changes in the unit test {{TestExternalSort.testNumericTypes}}:
> {code}
> Before: 134144
> After: 150528
> Before: 150528
> After: 166912
> {code}
> Union vectors appear to be larger than the original BIGINT vectors. External sort must anticipate this and perhaps spill to ensure sufficient room exists for the new, larger vectors.
> Further, the conversion process itself requires that two copies of each vector be in memory: the original and the new, converted one. The external sort does not check to ensure this much working memory is available, leading to potential OOM errors during each vector conversion.
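
To make the logged figures concrete: in both schema changes shown above, the allocation grew by the same amount, 16384. The small sketch below just does that arithmetic. The class and method names are hypothetical and exist only for this illustration, and the description does not state the unit of the logged values.

```java
// Hypothetical helper for illustration; not part of Drill.
final class SchemaChangeGrowth {
    // Absolute growth in allocated memory across one schema change.
    static long growth(long before, long after) {
        return after - before;
    }

    // Relative growth, as a percentage of the allocation before the change.
    static double growthPercent(long before, long after) {
        if (before <= 0) {
            throw new IllegalArgumentException("before must be positive");
        }
        return 100.0 * (after - before) / before;
    }
}
```

For the two logged pairs, growth returns 16384 both times, and growthPercent comes to roughly 12.2 and 10.9 respectively.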



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)