Posted to issues@drill.apache.org by "Paul Rogers (JIRA)" <ji...@apache.org> on 2018/05/31 23:40:00 UTC

[jira] [Commented] (DRILL-5762) CSV having header column by name "suffix" fails to load

    [ https://issues.apache.org/jira/browse/DRILL-5762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497336#comment-16497336 ] 

Paul Rogers commented on DRILL-5762:
------------------------------------

Actually, there is a solution: have the CSV reader rename the column. This works only for {{SELECT *}} queries.

The renaming is already done (in recent code) for some cases:

* Empty column
* Duplicate column name

It would also be possible to rename the column if its name matches a reserved word.
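A minimal sketch of the renaming idea, written in Python for brevity. It is not Drill's reader code: the function name {{rename_headers}}, the {{column_N}} and {{_N}} naming schemes, and the case-insensitive comparison are all assumptions made for illustration. The reserved set is assumed to be the default implicit column names.

```python
# Hypothetical illustration only; Drill's actual renaming rules may differ.
# Assumed default names of the implicit file columns.
RESERVED = frozenset({"fqn", "filepath", "filename", "suffix"})


def rename_headers(headers, reserved=RESERVED):
    """Return header names that are non-empty, unique, and not reserved."""
    seen = set()
    result = []
    for index, raw in enumerate(headers):
        base = raw.strip()
        if not base:
            # Empty column: synthesize a positional name (assumed scheme).
            base = f"column_{index + 1}"
        candidate = base
        counter = 1
        # Duplicate or reserved name: append a numeric suffix until it is free.
        while candidate.lower() in seen or candidate.lower() in reserved:
            counter += 1
            candidate = f"{base}_{counter}"
        seen.add(candidate.lower())
        result.append(candidate)
    return result


print(rename_headers(["id", "suffix", "", "id"]))
# prints ['id', 'suffix_2', 'column_3', 'id_2']
```

As with the existing renames, a query that names the original column explicitly would not find it after renaming, which is why this approach only suits {{SELECT *}}.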

There is another workaround. System/session options exist to give alternative names to the "intrinsic" columns. Use these options to tell Drill to use, say, "fileSuffix" in place of "suffix" so that the CSV file can use "suffix". This workaround is available today. (I have not tested it, but I have seen the code that gets the name from session/system options, so it is worth a shot.)
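A hedged sketch of how that option could be set, in Python. It assumes the option is named {{drill.exec.storage.implicit.suffix.column.label}} (check {{sys.options}} on your version), that the Drill web server listens on the default {{localhost:8047}}, and that the statement is submitted through the REST {{/query.json}} endpoint. The helper names are hypothetical, and like the workaround itself this is untested against a live Drillbit. The statement can equally be pasted into sqlline.

```python
import json
import urllib.request

# Assumed option name; confirm it in sys.options for your Drill version.
SUFFIX_LABEL_OPTION = "drill.exec.storage.implicit.suffix.column.label"


def build_relabel_statement(new_label, scope="SYSTEM"):
    """Build the ALTER statement that renames the implicit 'suffix' column."""
    if scope not in ("SYSTEM", "SESSION"):
        raise ValueError("scope must be SYSTEM or SESSION")
    if not new_label.isidentifier():
        raise ValueError("label must be a simple identifier")
    return f"ALTER {scope} SET `{SUFFIX_LABEL_OPTION}` = '{new_label}'"


def build_rest_request(statement, base_url="http://localhost:8047"):
    """Wrap a SQL statement in a POST request for Drill's REST query endpoint."""
    body = json.dumps({"queryType": "SQL", "query": statement}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/query.json",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


statement = build_relabel_statement("fileSuffix")
request = build_rest_request(statement)
# To submit it to a running Drillbit: urllib.request.urlopen(request)
print(statement)
```

The sketch defaults to {{ALTER SYSTEM}} because a session-level setting only helps if the later {{SELECT *}} runs in the same session; in sqlline or over JDBC, {{ALTER SESSION}} is the narrower choice.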

> CSV having header column by name "suffix" fails to load
> -------------------------------------------------------
>
>                 Key: DRILL-5762
>                 URL: https://issues.apache.org/jira/browse/DRILL-5762
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Text & CSV
>    Affects Versions: 1.11.0
>            Reporter: Praveen Yadav
>            Priority: Major
>
> Trying select * query on the below csv file using apache drill 1.11.0.
> {code:none}
> id,email,first_name,last_name,middle_name,suffix,work_phone,mobile_phone,gender,picture,speciality,taxonomy_code,education_details,experience_details,keywords,doctor_npi,wait_time,created_tstamp,created_by,last_updated_tstamp,last_updated_by,is_deleted
> 1,xxxx@gmail.com,XXXXX,XXXX,,Dr,912225711234,,M,assets/images/doctorIcon.png,Primary Care Physician,Primary Care Doctor,M.D,3 years,Primary Care Doctor,12349765,10,2015-04-22 17:20:48.0,,2015-12-16 12:06:27.0,,N
> 2,xxxx@gmail.com,XXXX,XXXX,,Dr,913345311234,,M,assets/images/doctorIcon.png,Eye Doctor,EYE Care Doctor,MD,5 years,,16456076,20,2015-04-30 11:07:57.0,,2015-11-07 08:49:57.0,,N
> {code}
> I get this error :
> {noformat}
> Error: DATA_READ ERROR: Error processing input: , line=1, char=286. Content parsed: [ ]
> Failure while reading file file:somepath/file.csv. Happened at or shortly before byte position 286.
> Fragment 0:0
> [Error Id: 1fff3645-e788-4ec3-b678-bea86a39003c on praveens-mbp.lan:31010] (state=,code=0)
> {noformat}
> Solution:
> Replacing column name "suffix" with any other text fixed the error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)