
Posted to issues@drill.apache.org by "benj (JIRA)" <ji...@apache.org> on 2019/04/16 12:26:00 UTC

[jira] [Commented] (DRILL-7020) big varchar doesn't work with extractHeader=true

    [ https://issues.apache.org/jira/browse/DRILL-7020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16818954#comment-16818954 ] 

benj commented on DRILL-7020:
-----------------------------

Note that this problem of course exists +for every csvh+ file that contains at least one field with more than 65536 characters.
{code:java}
SELECT * FROM ...`example_file_with_large_field.csvh`
Error: UNSUPPORTED_OPERATION ERROR: Trying to write something big in a column
{code}

The "TABLE()" workaround for this limitation is useful but, as already mentioned, it forces the use of "COLUMNS[0]" syntax instead of the real column names.
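
For illustration, here is a sketch of that workaround applied to the TEST file from the issue description. It assumes the same tmp workspace and the skipFirstLine option mentioned there. The aliases col1 and col2 are typed by hand from the header line, which is exactly the inconvenience: nothing is read from the file automatically. I have not re-run this exact query.
{code:sql}
-- Read the file without header extraction, skip the header line,
-- and re-attach the column names manually through aliases.
SELECT columns[0] AS col1,
       columns[1] AS col2
FROM TABLE(tmp.`TEST`(type => 'text', fieldDelimiter => ',', extractHeader => false, skipFirstLine => true));
{code}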

The error message is also somewhat misleading, because it says "write" although the problem occurs while reading the file.

> big varchar doesn't work with extractHeader=true
> ------------------------------------------------
>
>                 Key: DRILL-7020
>                 URL: https://issues.apache.org/jira/browse/DRILL-7020
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Text & CSV
>    Affects Versions: 1.15.0
>            Reporter: benj
>            Priority: Major
>
> with a TEST file of csv type like
> {code:java}
> col1,col2
> w,x
> ...y...,z
> {code}
> where ...y... is a string of more than 65536 characters (say 66000, for example)
> SELECT with +*extractHeader=false*+ is OK
> {code:java}
> SELECT * FROM TABLE(tmp.`TEST`(type => 'text', fieldDelimiter => ',', extractHeader => false));
>     col1  | col2
> +---------+------
> | w       | x
> | ...y... | z
> {code}
> But SELECT with +*extractHeader=true*+ gives an error
> {code:java}
> SELECT * FROM TABLE(tmp.`TEST`(type => 'text', fieldDelimiter => ',', extractHeader => true));
> Error: UNSUPPORTED_OPERATION ERROR: Trying to write something big in a column
> columnIndex 1
> Limit 65536
> Fragment 0:0
> {code}
> Note that it is possible to use extractHeader=false with skipFirstLine=true, but in that case the column names cannot be obtained automatically.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)