Posted to issues@drill.apache.org by "Khurram Faraaz (JIRA)" <ji...@apache.org> on 2017/07/07 12:01:04 UTC
[jira] [Updated] (DRILL-4648) select count(*) on csv file fails with UNSUPPORTED_OPERATION

     [ https://issues.apache.org/jira/browse/DRILL-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Khurram Faraaz updated DRILL-4648:
----------------------------------
    Component/s: Storage - Text & CSV

> select count(*) on csv file fails with UNSUPPORTED_OPERATION
> ------------------------------------------------------------
>
>                 Key: DRILL-4648
>                 URL: https://issues.apache.org/jira/browse/DRILL-4648
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Data Types, Functions - Drill, Storage - Text & CSV
>    Affects Versions: 1.6.0
>            Reporter: Peter McTaggart
>
> Running select count(*) on a CSV file fails with the following error:
> 0: jdbc:drill:drillbit=10.1.101.10> select count(*) from `views/db/test.csv`;
> Error: UNSUPPORTED_OPERATION ERROR: With extractHeader enabled, only header names are supported
> column name columns
> column index
> Fragment 0:0
> [Error Id: b38a1e44-c2f5-44a3-9960-6062debc6b50 on xxxxxx.compute.internal:31010] (state=,code=0)
> Referring to a column in the file by name works, e.g.:
> 0: jdbc:drill:drillbit=10.1.101.10> select count(COLUMN_ONE) from `views/db/test.csv`;
> +---------+
> | EXPR$0  |
> +---------+
> | 1       |
> +---------+
> 1 row selected (0.144 seconds)
> 0: jdbc:drill:drillbit=10.1.101.10>
> The test.csv file contents:
> ~/D❯❯❯ cat test.csv
> "COLUMN_ONE","COLUMN_TWO"
> "Hello","World"
> ~/D❯❯❯
> Drill is reading the file through an Alluxio mount.
> More info:
> Mounting S3 directly gives the following results:
> With extractHeader NOT turned on:
> 0: jdbc:drill:drillbit=10.1.101.10> select count(*) from `src/db/test.csv`;
> +---------+
> | EXPR$0  |
> +---------+
> | 2       |
> +---------+
> 1 row selected (0.951 seconds)
> 0: jdbc:drill:drillbit=10.1.101.10>
> With extractHeader = true:
> 0: jdbc:drill:drillbit=10.1.101.10> select count(*) from `src/db/test.csv`;
> Error: UNSUPPORTED_OPERATION ERROR: With extractHeader enabled, only header names are supported
> column name columns
> column index
> Fragment 0:0
> [Error Id: 5609cf0d-7553-44b5-bd90-40bce1c020a9 on ixxxxxx.compute.internal:31010] (state=,code=0)
> 0: jdbc:drill:drillbit=10.1.101.10>
> Storage plugin configuration (including workspaces):
> {
>   "type": "file",
>   "enabled": true,
>   "connection": "s3a://<my-bucket>",
>   "config": {
>     "fs.s3a.access.key": "xxx",
>     "fs.s3a.secret.key": "xxx"
>   },
>   "workspaces": {
>     "root": {
>       "location": "/",
>       "writable": false,
>       "defaultInputFormat": null
>     },
>     "tmp": {
>       "location": "/tmp",
>       "writable": true,
>       "defaultInputFormat": null
>     }
>   },
>   "formats": {
>     "psv": {
>       "type": "text",
>       "extensions": [
>         "tbl"
>       ],
>       "delimiter": "|"
>     },
>     "csv": {
>       "type": "text",
>       "extensions": [
>         "csv"
>       ],
>       "extractHeader": true,
>       "delimiter": ","
>     },
>     "tsv": {
>       "type": "text",
>       "extensions": [
>         "tsv"
>       ],
>       "delimiter": "\t"
>     },
>     "parquet": {
>       "type": "parquet"
>     },
>     "json": {
>       "type": "json",
>       "extensions": [
>         "json"
>       ]
>     },
>     "avro": {
>       "type": "avro"
>     },
>     "sequencefile": {
>       "type": "sequencefile",
>       "extensions": [
>         "seq"
>       ]
>     },
>     "csvh": {
>       "type": "text",
>       "extensions": [
>         "csvh"
>       ],
>       "extractHeader": true,
>       "delimiter": ","
>     }
>   }
> }
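
The row counts quoted in the description can be reproduced outside Drill. The following is a minimal sketch in plain Python, not Drill code: `TEST_CSV` and `count_rows` are hypothetical names introduced here for illustration, and the file contents are copied from the report. It only shows what the two settings imply for this two-line file. It does not model Drill's text reader or explain the UNSUPPORTED_OPERATION error.

```python
import csv
import io

# Contents of test.csv exactly as shown in the report.
TEST_CSV = '"COLUMN_ONE","COLUMN_TWO"\n"Hello","World"\n'


def count_rows(text, extract_header):
    """Count rows in CSV text.

    When extract_header is True, the first line is treated as column
    names and excluded from the count (an assumption about what
    header extraction implies, not Drill's actual implementation).
    """
    rows = list(csv.reader(io.StringIO(text)))
    if extract_header and rows:
        rows = rows[1:]
    return len(rows)


print(count_rows(TEST_CSV, extract_header=False))  # 2
print(count_rows(TEST_CSV, extract_header=True))   # 1
```

Under these assumptions, the count of 2 without header extraction matches the S3 result above, and the count of 1 with header extraction matches the count(COLUMN_ONE) result, which is presumably what count(*) would be expected to return once the error is resolved.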



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)