Posted to issues@drill.apache.org by "Peter McTaggart (JIRA)" <ji...@apache.org> on 2015/12/01 02:26:10 UTC

[jira] [Updated] (DRILL-4145) IndexOutOfBoundsException raised during select * query on S3 csv file

     [ https://issues.apache.org/jira/browse/DRILL-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter McTaggart updated DRILL-4145:
-----------------------------------
    Description: 
When querying a .csv file (via sqlline or the Web UI), I get an IndexOutOfBoundsException:
{noformat} 0: jdbc:drill:> select * from s3data.root.`staging/data/apps1-bad.csv` limit 1;
Error: SYSTEM ERROR: IndexOutOfBoundsException: index: 16384, length: 4 (expected: range(0, 16384))

Fragment 0:0

[Error Id: be9856d2-0b80-4b9c-94a4-a1ca38ec5db0 on ip-XXXXX.compute.internal:31010] (state=,code=0)
0: jdbc:drill:> select * from s3data.root.`staging/data/apps1.csv` limit 1;
+----------+----------------------+----------+----------+----------+------------+----------+------------+----------+--------------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
| FIELD_1  |       FIELD_2        | FIELD_3  | FIELD_4  | FIELD_5  |  FIELD_6   | FIELD_7  |  FIELD_8   | FIELD_9  |   FIELD_10   | FIELD_11  |       FIELD_12       | FIELD_13  | FIELD_14  | FIELD_15  | FIELD_16  | FIELD_17  | FIELD_18  | FIELD_19  |       FIELD_20       | FIELD_21  | FIELD_22  | FIELD_23  | FIELD_24  | FIELD_25  | FIELD_26  | FIELD_27  | FIELD_28  | FIELD_29  | FIELD_30  | FIELD_31  | FIELD_32  | FIELD_33  | FIELD_34  | FIELD_35  |
+----------+----------------------+----------+----------+----------+------------+----------+------------+----------+--------------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
| 489517   | 27/10/2015 02:05:27  | 261      | 1130232  | 0        | 925630488  | 0        | 925630488  | -1       | 19531580547  | 00000000  | 27/10/2015 02:00:00  |           | 30        | 300       | 0         | 0         | 00000000  | 00000000  | 27/10/2015 02:05:27  | 0         | 1         | 0         | 35.0      |           |           |           | 505       | 872.0     |           | aBc       |           |           |           |           |
+----------+----------------------+----------+----------+----------+------------+----------+------------+----------+--------------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
1 row selected (1.094 seconds)
0: jdbc:drill:>  {noformat}

The good file (apps1.csv) and the bad file (apps1-bad.csv) are attached.
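
The error text above reads like the output of a buffer bounds check: an access of length 4 starting at index 16384 was rejected because the valid range ends at 16384 (16 * 1024). The following is a minimal sketch of that arithmetic only. The names check_index and fits are hypothetical, and this is not Drill's code; it merely assumes that an access is valid when index + length does not exceed the capacity.

```python
def check_index(index: int, length: int, capacity: int) -> None:
    """Raise IndexError unless [index, index + length) lies within [0, capacity).

    Hypothetical helper: it mimics the wording of the reported error,
    but it is an illustration, not the check Drill performs.
    """
    if index < 0 or length < 0 or index + length > capacity:
        raise IndexError(
            f"index: {index}, length: {length} (expected: range(0, {capacity}))"
        )


def fits(index: int, length: int, capacity: int) -> bool:
    """Return True when the access passes check_index, False otherwise."""
    try:
        check_index(index, length, capacity)
        return True
    except IndexError:
        return False


# The last 4-unit access that fits starts at 16380; one starting at 16384 does not.
print(fits(16380, 4, 16384))
print(fits(16384, 4, 16384))
```

Under that assumption, the reported access begins exactly at the end of the buffer, so it overruns the buffer by its full length of 4.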


  was:
When querying a .csv file (via sqlline or the Web UI), I get an IndexOutOfBoundsException:
{{ 0: jdbc:drill:> select * from s3data.root.`staging/data/apps1-bad.csv` limit 1;
Error: SYSTEM ERROR: IndexOutOfBoundsException: index: 16384, length: 4 (expected: range(0, 16384))

Fragment 0:0

[Error Id: be9856d2-0b80-4b9c-94a4-a1ca38ec5db0 on ip-XXXXX.compute.internal:31010] (state=,code=0)
0: jdbc:drill:> select * from s3data.root.`staging/data/apps1.csv` limit 1;
+----------+----------------------+----------+----------+----------+------------+----------+------------+----------+--------------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
| FIELD_1  |       FIELD_2        | FIELD_3  | FIELD_4  | FIELD_5  |  FIELD_6   | FIELD_7  |  FIELD_8   | FIELD_9  |   FIELD_10   | FIELD_11  |       FIELD_12       | FIELD_13  | FIELD_14  | FIELD_15  | FIELD_16  | FIELD_17  | FIELD_18  | FIELD_19  |       FIELD_20       | FIELD_21  | FIELD_22  | FIELD_23  | FIELD_24  | FIELD_25  | FIELD_26  | FIELD_27  | FIELD_28  | FIELD_29  | FIELD_30  | FIELD_31  | FIELD_32  | FIELD_33  | FIELD_34  | FIELD_35  |
+----------+----------------------+----------+----------+----------+------------+----------+------------+----------+--------------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
| 489517   | 27/10/2015 02:05:27  | 261      | 1130232  | 0        | 925630488  | 0        | 925630488  | -1       | 19531580547  | 00000000  | 27/10/2015 02:00:00  |           | 30        | 300       | 0         | 0         | 00000000  | 00000000  | 27/10/2015 02:05:27  | 0         | 1         | 0         | 35.0      |           |           |           | 505       | 872.0     |           | aBc       |           |           |           |           |
+----------+----------------------+----------+----------+----------+------------+----------+------------+----------+--------------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
1 row selected (1.094 seconds)
0: jdbc:drill:>  }}

The good file (apps1.csv) and the bad file (apps1-bad.csv) are attached.



> IndexOutOfBoundsException raised during select * query on S3 csv file
> ---------------------------------------------------------------------
>
>                 Key: DRILL-4145
>                 URL: https://issues.apache.org/jira/browse/DRILL-4145
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Drill
>    Affects Versions: 1.3.0
>         Environment: Drill 1.3.0 on a 3-node distributed cluster on AWS.
> Data files on S3.
> S3 storage plugin configuration:
> {
>   "type": "file",
>   "enabled": true,
>   "connection": "s3a://<bucket-name-was-here>",
>   "workspaces": {
>     "root": {
>       "location": "/",
>       "writable": false,
>       "defaultInputFormat": null
>     },
>     "views": {
>       "location": "/processed",
>       "writable": true,
>       "defaultInputFormat": null
>     },
>     "tmp": {
>       "location": "/tmp",
>       "writable": true,
>       "defaultInputFormat": null
>     }
>   },
>   "formats": {
>     "psv": {
>       "type": "text",
>       "extensions": [
>         "tbl"
>       ],
>       "delimiter": "|"
>     },
>     "csv": {
>       "type": "text",
>       "extensions": [
>         "csv"
>       ],
>       "extractHeader": true,
>       "delimiter": ","
>     },
>     "tsv": {
>       "type": "text",
>       "extensions": [
>         "tsv"
>       ],
>       "delimiter": "\t"
>     },
>     "parquet": {
>       "type": "parquet"
>     },
>     "json": {
>       "type": "json"
>     },
>     "avro": {
>       "type": "avro"
>     },
>     "sequencefile": {
>       "type": "sequencefile",
>       "extensions": [
>         "seq"
>       ]
>     },
>     "csvh": {
>       "type": "text",
>       "extensions": [
>         "csvh",
>         "csv"
>       ],
>       "extractHeader": true,
>       "delimiter": ","
>     }
>   }
> }
>            Reporter: Peter McTaggart
>
> When querying a .csv file (via sqlline or the Web UI), I get an IndexOutOfBoundsException:
> {noformat} 0: jdbc:drill:> select * from s3data.root.`staging/data/apps1-bad.csv` limit 1;
> Error: SYSTEM ERROR: IndexOutOfBoundsException: index: 16384, length: 4 (expected: range(0, 16384))
> Fragment 0:0
> [Error Id: be9856d2-0b80-4b9c-94a4-a1ca38ec5db0 on ip-XXXXX.compute.internal:31010] (state=,code=0)
> 0: jdbc:drill:> select * from s3data.root.`staging/data/apps1.csv` limit 1;
> +----------+----------------------+----------+----------+----------+------------+----------+------------+----------+--------------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
> | FIELD_1  |       FIELD_2        | FIELD_3  | FIELD_4  | FIELD_5  |  FIELD_6   | FIELD_7  |  FIELD_8   | FIELD_9  |   FIELD_10   | FIELD_11  |       FIELD_12       | FIELD_13  | FIELD_14  | FIELD_15  | FIELD_16  | FIELD_17  | FIELD_18  | FIELD_19  |       FIELD_20       | FIELD_21  | FIELD_22  | FIELD_23  | FIELD_24  | FIELD_25  | FIELD_26  | FIELD_27  | FIELD_28  | FIELD_29  | FIELD_30  | FIELD_31  | FIELD_32  | FIELD_33  | FIELD_34  | FIELD_35  |
> +----------+----------------------+----------+----------+----------+------------+----------+------------+----------+--------------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
> | 489517   | 27/10/2015 02:05:27  | 261      | 1130232  | 0        | 925630488  | 0        | 925630488  | -1       | 19531580547  | 00000000  | 27/10/2015 02:00:00  |           | 30        | 300       | 0         | 0         | 00000000  | 00000000  | 27/10/2015 02:05:27  | 0         | 1         | 0         | 35.0      |           |           |           | 505       | 872.0     |           | aBc       |           |           |           |           |
> +----------+----------------------+----------+----------+----------+------------+----------+------------+----------+--------------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
> 1 row selected (1.094 seconds)
> 0: jdbc:drill:>  {noformat}
> The good file (apps1.csv) and the bad file (apps1-bad.csv) are attached.
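
In the storage plugin configuration quoted in the Environment field, the "csv" extension is listed under both the "csv" and the "csvh" format entries. Whether that overlap has anything to do with the exception is not established in this report. As a small aid for reviewing such configurations, the sketch below lists extensions claimed by more than one format. The function name duplicate_extensions is hypothetical, and the dictionary reproduces only the "formats" entries from the configuration that declare extensions.

```python
from collections import defaultdict


def duplicate_extensions(plugin_config: dict) -> dict:
    """Map each file extension to the format names that claim it,
    keeping only extensions claimed by more than one format."""
    owners = defaultdict(list)
    for name, fmt in plugin_config.get("formats", {}).items():
        for ext in fmt.get("extensions", []):
            owners[ext].append(name)
    return {ext: names for ext, names in owners.items() if len(names) > 1}


# Format entries copied from the reported configuration (those with extensions).
config = {
    "formats": {
        "psv": {"type": "text", "extensions": ["tbl"], "delimiter": "|"},
        "csv": {"type": "text", "extensions": ["csv"],
                "extractHeader": True, "delimiter": ","},
        "tsv": {"type": "text", "extensions": ["tsv"], "delimiter": "\t"},
        "sequencefile": {"type": "sequencefile", "extensions": ["seq"]},
        "csvh": {"type": "text", "extensions": ["csvh", "csv"],
                 "extractHeader": True, "delimiter": ","},
    }
}

print(duplicate_extensions(config))  # {'csv': ['csv', 'csvh']}
```

Both entries use the same delimiter and extractHeader settings here, so the overlap may be harmless; the check only makes it visible.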


