Posted to issues@drill.apache.org by "Hari Sekhon (JIRA)" <ji...@apache.org> on 2014/11/14 12:40:34 UTC

[jira] [Created] (DRILL-1712) Quoted CSV parsing

Hari Sekhon created DRILL-1712:
----------------------------------

             Summary: Quoted CSV parsing
                 Key: DRILL-1712
                 URL: https://issues.apache.org/jira/browse/DRILL-1712
             Project: Apache Drill
          Issue Type: Improvement
    Affects Versions: 0.6.0
         Environment: MapR 4.0.1 M5
            Reporter: Hari Sekhon


When querying CSV files, Drill does not handle quoted fields properly: it includes the surrounding quotes in the data. The directory /tmp/hari in MapR-FS contains two simple CSV files, one quoted and one unquoted, so you can see the difference.
{code}
0: jdbc:drill:> select * from dfs.`/tmp/hari` limit 10;
+------------+
|  columns   |
+------------+
| ["1","2","3"] |
| ["4","5","6"] |
| ["7","8","9"] |
| ["\"1\"","\"2\"","\"3\""] |
| ["\"4\"","\"5\"","\"6\""] |
| ["\"7\"","\"8\"","\"9\""] |
+------------+
6 rows selected (0.238 seconds)

 cat hari/hari.csv
1,2,3
4,5,6
7,8,9
cat hari/hari2.csv
"1","2","3"
"4","5","6"
"7","8","9"
{code}
Drill shouldn't include the quotes as data; they are only delimiters around the field values.
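
For illustration only, the expected behaviour can be sketched outside Drill with Python's standard csv module. This is not Drill code; the variable names are hypothetical, and the input is the content of hari2.csv shown above. A naive comma split reproduces the problem, while a quote-aware parser returns the bare values.

```python
import csv
import io

# Contents of the quoted file from the report (hari2.csv).
quoted = '"1","2","3"\n"4","5","6"\n"7","8","9"\n'

# Naive comma split: the quote characters stay in each value,
# which matches what Drill returns in the query above.
naive_rows = [line.split(",") for line in quoted.splitlines()]

# Quote-aware parsing: the quotes are treated as field delimiters
# and stripped, which is the behaviour requested in this issue.
parsed_rows = list(csv.reader(io.StringIO(quoted)))

print(naive_rows[0])   # ['"1"', '"2"', '"3"']
print(parsed_rows[0])  # ['1', '2', '3']
```

With quote-aware parsing, both files in /tmp/hari would yield the same rows.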


