Posted to dev@drill.apache.org by Paul Rogers <pr...@mapr.com> on 2017/11/08 22:08:29 UTC

"Batch size control" project update

Hi All,

The “batch size control” project seeks to further improve Drill’s internal memory management by controlling the size of batches created in Drill’s scan and “combining” operators. We’ve provided updates in the Hangouts. This note is for folks who may have missed those discussions.

Drill currently faces two pesky problems in certain data sets:

* Readers consume a fixed number of records (often 1K, 4K, or 64K). If an individual column happens to be “wide”, then Drill may allocate a vector larger than 16 MB. Because of how Drill uses the Netty memory allocator, such “jumbo” vectors can lead to direct memory fragmentation and premature OOM (out-of-memory) errors.

* Even if no individual column is wide, the overall row may be, resulting in very large record batches, too large, in fact, for the new memory-limited sort and hash agg operators to consume. (We’ve seen 100 MB batches sent to a sort limited to use 30 MB of memory.) This leads to an OOM error because the downstream operator exceeds the memory limit set by the planner. A back-of-the-envelope sketch of both cases follows this list.
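To make the numbers concrete, here is a rough arithmetic sketch in Java. The record counts and column widths are illustrative assumptions, not measurements from Drill:

// Illustrative arithmetic only; the record counts and widths below
// are assumptions, not figures from any particular query.
public class BatchSizeMath {
  public static void main(String[] args) {
    // Case 1: one "wide" column. A reader that always takes 64K records
    // from a VARCHAR column averaging 512 bytes per value needs:
    long records = 64 * 1024;
    long valueWidth = 512;                   // bytes per value
    long vectorBytes = records * valueWidth; // 32 MB, double the 16 MB limit
    System.out.println("Vector: " + (vectorBytes >> 20) + " MB");

    // Case 2: no single column is wide, but the row is. 200 columns of
    // 8 bytes each over the same 64K records:
    long rowWidth = 200 * 8;                 // 1,600 bytes per row
    long batchBytes = records * rowWidth;    // exactly 100 MB per batch
    System.out.println("Batch: " + (batchBytes >> 20) + " MB");
  }
}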

The same problems can arise in “combining” operators such as flatten, project, and join. These operators take incoming batches and produce new, possibly larger, batches, which can cause the same memory issues described above for readers.

The goal of the “batch size control” project is to limit individual vectors to no more than 16 MB, and to limit overall batch size to some configured maximum. In practice, the maximum batch size will eventually be set by the Drill planner as part of its memory allocation planning.

The design is inspired by the set of “complex (vector) writers” that already exist in Drill. The existing vector writers (and thus the new set as well) are modeled after JSON structure builders. They write scalars, tuples and arrays. (“Tuple” is a fancy name for what Hive calls a “struct” and Drill calls a “map”. Both tuples and arrays occur in JSON-like data sources.)
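To give a feel for that builder style, here is a minimal sketch. The interface and method names are hypothetical stand-ins for illustration, not Drill’s actual writer API:

// Hypothetical interfaces in the JSON-builder style described above;
// the names are illustrative stand-ins, not Drill's actual API.
interface ScalarWriter {
  void setLong(long v);
  void setString(String v);
}
interface ArrayWriter {
  ScalarWriter scalar();            // writer for the array's elements
}
interface TupleWriter {
  ScalarWriter scalar(String col);  // scalar column
  TupleWriter tuple(String col);    // nested map ("tuple") column
  ArrayWriter array(String col);    // repeated column
}

class RowExample {
  // Writes one row shaped like the JSON document
  // {"id": 10, "address": {"city": "Santa Clara"}, "tags": ["a", "b"]}
  static void writeRow(TupleWriter row) {
    row.scalar("id").setLong(10L);
    row.tuple("address").scalar("city").setString("Santa Clara");
    ArrayWriter tags = row.array("tags");
    tags.scalar().setString("a");
    tags.scalar().setString("b");
  }
}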

Unlike the existing set, the new set also monitors memory use and acts when limits are reached. The key challenge in limiting size is that we don’t know ahead of time the size of the data we wish to write to a vector, nor can most readers “backtrack” and rewrite a record if we discover that one of its values causes a vector to overflow.

The key contribution of the new vector writers is to monitor vector and batch sizes. They employ an algorithm to handle “overflow” when a value is discovered to be too large for the remaining space. The vector writers roll the current row over to a new batch, allow the reader to continue writing the current record, then tell the reader that the batch is full and should be sent downstream. When the reader starts with the next batch, the batch will already contain the “overflow” row that ended the prior batch.
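In rough pseudo-Java, the overflow path looks something like the sketch below. The names and structure are hypothetical; the real logic lives in the new column writers (see PR #914):

// Rough sketch of the overflow handling described above. Names and
// structure are hypothetical; see PR #914 for the real implementation.
class ColumnWriterSketch {
  static final int VECTOR_LIMIT = 16 * 1024 * 1024; // 16 MB per-vector cap

  int usedBytes;      // bytes already written to the active vector
  boolean batchFull;  // reported to the reader after overflow

  void writeValue(byte[] value) {
    if (usedBytes + value.length > VECTOR_LIMIT) {
      rollOver();     // value won't fit; move the in-progress row
    }
    // ... append value to the active vector ...
    usedBytes += value.length;
  }

  void rollOver() {
    // 1. Allocate a fresh "look-ahead" batch.
    // 2. Copy the partially written current row into it.
    // 3. Retarget the writers at the new batch so the reader can
    //    finish the record without backtracking.
    // 4. Mark the old batch full; it is sent downstream, and the next
    //    batch begins with the copied "overflow" row already in place.
    usedBytes = 0;    // simplified; really the size of the copied row fragment
    batchFull = true; // signal: send the completed batch downstream
  }
}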

By using the new mechanism, readers and other operators gain control over their batch sizes without the need for each reader to implement its own solution.

There is much more to the solution (such as consolidating projection handling in the scan operator). These topics can be the subject of future updates.

For more information, see:

* DRILL-5211: https://issues.apache.org/jira/browse/DRILL-5211
* Drill PR #914: https://github.com/apache/drill/pull/914

We welcome your comments, questions and suggestions. Please post them to the JIRA ticket or PR above, or to this dev list.

Thanks,

- Paul