Posted to issues@drill.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2019/11/10 21:20:00 UTC

[jira] [Commented] (DRILL-7441) Fix issues with fillEmpties, offset vectors

    [ https://issues.apache.org/jira/browse/DRILL-7441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16971234#comment-16971234 ] 

ASF GitHub Bot commented on DRILL-7441:
---------------------------------------

paul-rogers commented on pull request #1896: DRILL-7441: Fix issues with fillEmpties, offset vectors
URL: https://github.com/apache/drill/pull/1896
 
 
   Fixes subtle issues with offset vectors and "fill empties"
   logic.
   
   Drill has an informal standard that if a batch has no rows, then
   offset vectors within that batch should have zero size. Contrast
   this with batches of size 1 that should have offset vectors of
   size 2. The code now enforces this rule throughout.
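
   As an illustration only, the sizing rule above can be sketched in
   plain Java. This is not Drill's vector code: the class and method
   names are hypothetical, and a plain int[] stands in for the offset
   vector.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the offset-vector sizing rule; not Drill's classes.
final class OffsetVectorSketch {

  /** Number of offset entries expected for a batch of rowCount rows. */
  static int offsetCount(int rowCount) {
    if (rowCount < 0) {
      throw new IllegalArgumentException("rowCount must be >= 0");
    }
    // Special case: an empty batch has an empty offset vector.
    // Otherwise there is one start offset per row plus one final end offset.
    return rowCount == 0 ? 0 : rowCount + 1;
  }

  /** Builds the offsets for a batch of variable-width (UTF-8) values. */
  static int[] buildOffsets(String[] values) {
    int[] offsets = new int[offsetCount(values.length)];
    int end = 0;
    for (int i = 0; i < values.length; i++) {
      // offsets[0] stays 0; offsets[i + 1] is the end of value i.
      end += values[i].getBytes(StandardCharsets.UTF_8).length;
      offsets[i + 1] = end;
    }
    return offsets;
  }
}
```

   With this sketch, a one-row batch holding "abc" yields the two
   offsets 0 and 3, while a batch with no rows yields no offsets at all.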
   
   Nullable, repeated and variable-width vectors have "fill empties"
   logic that is used in two places: when setting the value count and
   when preparing to write a new value. The current logic is not
   quite right for either case. Added tests and fixed the code to
   properly handle each case.
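
   The two uses of "fill empties" can be sketched roughly as follows.
   This is a simplified model under stated assumptions, not the code
   in this PR: the class, its fields other than the "lastSet" idea,
   and the method names are hypothetical, value bytes are reduced to a
   running length, and new buffer space is deliberately filled with
   garbage so the sketch cannot rely on zeroed memory.

```java
import java.util.Arrays;

// Hypothetical model of a variable-width writer; not Drill's implementation.
final class FillEmptiesSketch {
  private int[] offsets = new int[0]; // backing buffer, may exceed the valid size
  private int lastSet = -1;           // index of the last value written, -1 if none
  private int dataEnd = 0;            // end of the data written so far
  private int valueCount = 0;

  private void ensureCapacity(int entries) {
    if (offsets.length < entries) {
      int oldLength = offsets.length;
      offsets = Arrays.copyOf(offsets, Math.max(entries, oldLength * 2));
      // Simulate a non-pristine buffer: new space holds garbage, not zeros.
      Arrays.fill(offsets, oldLength, offsets.length, -1);
    }
  }

  /** Makes offsets[0..index] valid by treating skipped values as empty. */
  private void fillEmpties(int index) {
    ensureCapacity(index + 1);
    for (int i = lastSet + 1; i <= index; i++) {
      offsets[i] = dataEnd;
    }
    lastSet = index - 1;
  }

  /** Case 1: preparing to write a new value at the given index. */
  void write(int index, int length) {
    if (index <= lastSet) {
      throw new IllegalStateException("values must be written in order");
    }
    fillEmpties(index);
    ensureCapacity(index + 2);
    dataEnd += length;
    offsets[index + 1] = dataEnd;
    lastSet = index;
  }

  /** Case 2: setting the value count at the end of the batch. */
  void setValueCount(int count) {
    if (count < 0) {
      throw new IllegalArgumentException("count must be >= 0");
    }
    if (count == 0) {
      // Zero-size batch: the offset vector is logically empty.
      lastSet = -1;
      dataEnd = 0;
    } else if (count - 1 < lastSet) {
      // Truncation: drop the values at and beyond count.
      dataEnd = offsets[count];
      lastSet = count - 1;
    } else {
      fillEmpties(count);
    }
    valueCount = count;
  }

  /** Returns only the valid portion of the offsets. */
  int[] offsets() {
    return valueCount == 0 ? new int[0] : Arrays.copyOf(offsets, valueCount + 1);
  }
}
```

   In this model, writing a 3-byte value at index 0 and a 2-byte value
   at index 3, then setting the value count to 6, gives the offsets
   0, 3, 3, 3, 5, 5, 5: the skipped and trailing values are empty.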
   
   Revised the batch validator to enforce the rule that 0-sized batches
   have offset vectors of length 0. The result was much simpler code.
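
   A check of this kind might look like the sketch below. It is an
   assumption about the general shape of such a check, not the
   validator in this PR: the class and method names are hypothetical,
   and the checks beyond the length rule (first offset is 0, offsets
   do not decrease, last offset fits in the data buffer) are common
   invariants added for illustration.

```java
// Hypothetical offset-vector check; not Drill's batch validator.
final class OffsetValidatorSketch {

  /** Returns null if the offsets are consistent, else an error message. */
  static String validate(int rowCount, int[] offsets, int dataLength) {
    if (rowCount == 0) {
      // The zero-size rule: an empty batch must have no offsets at all.
      return offsets.length == 0
          ? null
          : "zero-row batch has " + offsets.length + " offsets, expected 0";
    }
    if (offsets.length != rowCount + 1) {
      return "expected " + (rowCount + 1) + " offsets, found " + offsets.length;
    }
    if (offsets[0] != 0) {
      return "first offset is " + offsets[0] + ", expected 0";
    }
    for (int i = 1; i <= rowCount; i++) {
      if (offsets[i] < offsets[i - 1]) {
        return "offset " + i + " decreases: " + offsets[i - 1] + " -> " + offsets[i];
      }
    }
    if (offsets[rowCount] > dataLength) {
      return "last offset " + offsets[rowCount] + " exceeds data length " + dataLength;
    }
    return null;
  }
}
```

   With the zero-size rule enforced everywhere, the empty-batch case
   reduces to a single length comparison.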
   
   Added tools to easily print a batch, restoring some code that
   was recently lost when the RowSet classes were moved.
   
   Code cleanup in all files touched.
   
   Added logic to "dirty" allocated buffers when testing to ensure
   logic is not sensitive to the "pristine" state of new buffers.
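
   The idea of "dirtying" a buffer can be sketched with a plain
   java.nio.ByteBuffer. This is only an illustration of the technique:
   Drill uses its own buffer allocator, and the class name, method
   name and fill pattern here are hypothetical.

```java
import java.nio.ByteBuffer;

// Hypothetical test helper; Drill's allocator and buffers differ.
final class DirtyBufferSketch {
  static final byte DIRTY = (byte) 0xDB; // arbitrary non-zero pattern

  /** Allocates a buffer whose bytes are all set to a non-zero pattern. */
  static ByteBuffer allocateDirty(int capacity) {
    ByteBuffer buf = ByteBuffer.allocate(capacity);
    while (buf.hasRemaining()) {
      buf.put(DIRTY);
    }
    buf.clear(); // reset position and limit; the contents stay dirty
    return buf;
  }
}
```

   Code that assumes a fresh buffer is zero-filled, for example by
   skipping the write of the initial 0 offset, then fails visibly in
   tests instead of passing by accident.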
   
   Added logic to the column writers to enforce the zero-size-batch rule
   for offset vectors. Added unit tests for this case.
   
   Fixed the column writers to set the "lastSet" mutator value for
   nullable types since other code relies on this value.
   
   Removed the "setCount" field in nullable vectors: it turns out
   that it is not actually used.
 


> Fix issues with fillEmpties, offset vectors
> -------------------------------------------
>
>                 Key: DRILL-7441
>                 URL: https://issues.apache.org/jira/browse/DRILL-7441
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Major
>
> Enable the vector validator with full testing of offset vectors. A number of operators trigger errors. Tracking down the issues and adding detailed tests showed that:
> * Drill has an informal standard that zero-length batches should have zero-length offset vectors, while a batch of size 1 will have offset vectors of size 2. Thus, zero-length is a special case.
> * Nullable, repeated and variable-width vectors have "fill empties" logic that is used in two places: when setting the value count and when preparing to write a new value. The current logic is not quite right for either case.
> Detailed vector checks fail due to inconsistencies in how the above works. This PR fixes those issues.


