Posted to issues@drill.apache.org by "Paul Rogers (JIRA)" <ji...@apache.org> on 2017/05/16 20:05:04 UTC

[jira] [Commented] (DRILL-5517) Provide size-aware set operations in value vectors

    [ https://issues.apache.org/jira/browse/DRILL-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013023#comment-16013023 ] 

Paul Rogers commented on DRILL-5517:
------------------------------------

Highlights of changes:

* Provide a {{setBytesBounded()}} method in DrillBuf which sets bytes if they fit in the current buffer, else returns false. (Current versions throw exceptions.)
* Provide equivalent methods in the UDLE ({{UnsafeDirectLittleEndian}}) class.
* Add a constant that defines the maximum vector buffer size, along with an experimental system option to adjust this size at startup.
* For all vectors with data, define new {{setScalar()}} and {{setArrayItem()}} methods that implement the size-aware semantics described in the issue: behave like {{setSafe()}} while the vector is under the size limit, and return false if the write would exceed it. {{setScalar()}} is for individual values; {{setArrayItem()}} is for items that are members of an array. The semantics of the two cases are slightly different.
* For all vectors with data, provide a {{copyEntry()}} method to be used when handling "overflow" rows in record batch writers.
* For all vectors with data, provide an {{exchange()}} method that swaps the buffers between two vectors. Unlike the {{TransferPair}} mechanism, this exchange is for vectors within a single operator, using the same allocator, so no memory accounting is needed. Used when swapping between the "full" and "overflow" batches in the record batch writer.
* In each fixed-width vector, define a constant for the value width rather than using a (generated) "magic number".
* In each fixed-width vector, create a constant for the maximum number of values that fit in either 64K or the maximum vector length.
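
To illustrate the intended flow, here is a minimal, self-contained sketch of a size-aware set with overflow handling. It is not the actual Drill code: {{BoundedIntVector}}, its constructor arguments, the tiny 16-byte limit and the exact method signatures are assumptions made for illustration; only the names {{setScalar()}} and {{exchange()}} and the "return false instead of throwing" behavior come from the list above.

```java
import java.util.Arrays;

/** Toy stand-in for a fixed-width value vector (hypothetical, not a Drill class). */
class BoundedIntVector {
    /** Named constant for the value width instead of a magic number. */
    static final int VALUE_WIDTH = 4;

    private final int maxBufferSize; // assumed per-vector limit, in bytes
    private int[] data;

    BoundedIntVector(int initialCapacity, int maxBufferSize) {
        this.data = new int[initialCapacity];
        this.maxBufferSize = maxBufferSize;
    }

    /**
     * Sets a value, growing the buffer if needed (as setSafe() would).
     * Returns false, without writing, if growth would exceed the limit.
     */
    boolean setScalar(int index, int value) {
        if (index >= data.length) {
            long neededBytes = (long) (index + 1) * VALUE_WIDTH;
            if (neededBytes > maxBufferSize) {
                return false; // caller must treat this row as overflow
            }
            int newCapacity = Math.max(data.length, 1);
            while (newCapacity <= index) {
                newCapacity *= 2; // double until the index fits
            }
            // Never grow past the limit, even if doubling overshoots it.
            newCapacity = Math.min(newCapacity, maxBufferSize / VALUE_WIDTH);
            data = Arrays.copyOf(data, newCapacity);
        }
        data[index] = value;
        return true;
    }

    int get(int index) {
        return data[index];
    }

    /** Swaps buffers with another vector; no ownership transfer is modeled. */
    void exchange(BoundedIntVector other) {
        int[] tmp = data;
        data = other.data;
        other.data = tmp;
    }
}

/** Hypothetical writer loop that switches batches when a set fails. */
class OverflowSketch {
    public static void main(String[] args) {
        // A 16-byte limit holds four ints; real limits would be far larger.
        BoundedIntVector full = new BoundedIntVector(2, 16);
        BoundedIntVector overflow = new BoundedIntVector(2, 16);
        int row = 0;
        for (int value = 100; value < 106; value++) {
            if (!full.setScalar(row, value)) {
                // The current batch is full (it would be sent downstream here).
                overflow.setScalar(0, value); // save the row that did not fit
                full.exchange(overflow);      // overflow becomes the new batch
                row = 0;
            }
            row++;
        }
        System.out.println(full.get(0) + ", " + full.get(1));
    }
}
```

In this sketch the fifth value does not fit, so it is written to the overflow vector, the two vectors swap buffers, and writing continues in the new batch from row 1. A real writer would also ship and clear the full batch before reusing its buffers.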

This work required reviewing much existing code. A number of cosmetic cleanups were done as problems were noticed.

A unit test verifies the semantics of the new methods for typical generated required, optional and repeated vectors for both the fixed-width and variable-width cases.


> Provide size-aware set operations in value vectors
> --------------------------------------------------
>
>                 Key: DRILL-5517
>                 URL: https://issues.apache.org/jira/browse/DRILL-5517
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.11.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>             Fix For: 1.11.0
>
>
> DRILL-5211 describes a memory fragmentation issue in Drill. The resolution is to limit vector sizes to 16 MB (the size of Netty memory allocation "slabs"). The effort starts by providing "size-aware" set operations in value vectors, which:
> * Operate as {{setSafe()}} while vectors are below 16 MB.
> * Return false if setting the value (and growing the vector) would exceed the vector limit.
> The methods in value vectors then become the foundation on which we can construct size-aware record batch "writers."



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)