Posted to dev@drill.apache.org by GitBox <gi...@apache.org> on 2019/05/31 01:55:54 UTC

[GitHub] [drill] paul-rogers opened a new pull request #1802: DRILL-7258: Remove field width limit for text reader

paul-rogers opened a new pull request #1802: DRILL-7258: Remove field width limit for text reader
URL: https://github.com/apache/drill/pull/1802
 
 
   The V2 text reader enforced a limit of 64K characters when using
   column headers, but not when using the columns[] array. The V3 reader
   enforced the 64K limit in both cases.
   
   This patch removes the limit in both cases. The only remaining limit
   is the 16MB vector size limit. With headers, no one column can exceed
   16MB. With the columns[] array, no one row can exceed 16MB. (The 16MB
   limit is set by the Netty memory allocator.)
   
   Added an appendBytes() method to the scalar column writer, which
   appends bytes to those already written for a specific column or
   array element value. The method is implemented for VarChar, Var16Char,
   and VarBinary vectors. It throws an exception for all other types.
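
   As an illustration only, the set-then-append pattern described above
   can be sketched with a toy writer. ToyVarCharWriter, writeField(),
   valueCount() and valueAsString() are hypothetical names invented for
   this sketch; they are not the classes changed by this patch, and the
   (byte[], int) method signatures are an assumption. The sketch ignores
   vector overflow and the 16MB limit.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy stand-in for a variable-width column writer; NOT Drill's actual class. */
class ToyVarCharWriter {
  // One growable buffer per value written so far.
  private final List<ByteArrayOutputStream> values = new ArrayList<>();

  /** Starts a new value holding the first len bytes of src. */
  void setBytes(byte[] src, int len) {
    ByteArrayOutputStream value = new ByteArrayOutputStream();
    value.write(src, 0, len);
    values.add(value);
  }

  /** Adds len more bytes to the value most recently started by setBytes(). */
  void appendBytes(byte[] src, int len) {
    if (values.isEmpty()) {
      throw new IllegalStateException("appendBytes() called before setBytes()");
    }
    values.get(values.size() - 1).write(src, 0, len);
  }

  int valueCount() {
    return values.size();
  }

  String valueAsString(int index) {
    return new String(values.get(index).toByteArray(), StandardCharsets.UTF_8);
  }

  /** Writes one field in chunks, as a reader with a fixed-size buffer might. */
  static void writeField(ToyVarCharWriter writer, byte[] field, int chunkSize) {
    // The first chunk starts the value; later chunks extend it.
    int first = Math.min(chunkSize, field.length);
    writer.setBytes(field, first);
    for (int offset = first; offset < field.length; offset += chunkSize) {
      int len = Math.min(chunkSize, field.length - offset);
      byte[] chunk = Arrays.copyOfRange(field, offset, offset + len);
      writer.appendBytes(chunk, len);
    }
  }
}
```

   In this sketch, a field longer than the chunk size still ends up as a
   single value: the caller never needs a buffer as large as the field.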
   
   When used with a type conversion shim, the appendBytes() method throws
   an exception. This should be OK because the preceding setBytes() call
   should already have failed: a huge value is not acceptable for numeric
   or date type conversions.
   
   Added unit tests for the append feature, including the batch overflow
   case (when appending bytes causes the vector or batch to overflow).
   Also added tests to verify that the text reader no longer limits
   column width, both with and without headers.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services