Posted to issues@drill.apache.org by "Peter McTaggart (JIRA)" <ji...@apache.org> on 2015/12/01 02:26:10 UTC
[jira] [Updated] (DRILL-4145) IndexOutOfBoundsException raised during select * query on S3 csv file
[ https://issues.apache.org/jira/browse/DRILL-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Peter McTaggart updated DRILL-4145:
-----------------------------------
Description:
When trying to query a .csv file (via sqlline or the Web UI), I get an IndexOutOfBoundsException:
{noformat} 0: jdbc:drill:> select * from s3data.root.`staging/data/apps1-bad.csv` limit 1;
Error: SYSTEM ERROR: IndexOutOfBoundsException: index: 16384, length: 4 (expected: range(0, 16384))
Fragment 0:0
[Error Id: be9856d2-0b80-4b9c-94a4-a1ca38ec5db0 on ip-XXXXX.compute.internal:31010] (state=,code=0)
0: jdbc:drill:> select * from s3data.root.`staging/data/apps1.csv` limit 1;
+----------+----------------------+----------+----------+----------+------------+----------+------------+----------+--------------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
| FIELD_1 | FIELD_2 | FIELD_3 | FIELD_4 | FIELD_5 | FIELD_6 | FIELD_7 | FIELD_8 | FIELD_9 | FIELD_10 | FIELD_11 | FIELD_12 | FIELD_13 | FIELD_14 | FIELD_15 | FIELD_16 | FIELD_17 | FIELD_18 | FIELD_19 | FIELD_20 | FIELD_21 | FIELD_22 | FIELD_23 | FIELD_24 | FIELD_25 | FIELD_26 | FIELD_27 | FIELD_28 | FIELD_29 | FIELD_30 | FIELD_31 | FIELD_32 | FIELD_33 | FIELD_34 | FIELD_35 |
+----------+----------------------+----------+----------+----------+------------+----------+------------+----------+--------------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
| 489517 | 27/10/2015 02:05:27 | 261 | 1130232 | 0 | 925630488 | 0 | 925630488 | -1 | 19531580547 | 00000000 | 27/10/2015 02:00:00 | | 30 | 300 | 0 | 0 | 00000000 | 00000000 | 27/10/2015 02:05:27 | 0 | 1 | 0 | 35.0 | | | | 505 | 872.0 | | aBc | | | | |
+----------+----------------------+----------+----------+----------+------------+----------+------------+----------+--------------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
1 row selected (1.094 seconds)
0: jdbc:drill:> {noformat}
Good file: apps1.csv, and
Bad file: apps1-bad.csv attached.
was:
When trying to query a .csv file (via sqlline or the Web UI), I get an IndexOutOfBoundsException:
{{ 0: jdbc:drill:> select * from s3data.root.`staging/data/apps1-bad.csv` limit 1;
Error: SYSTEM ERROR: IndexOutOfBoundsException: index: 16384, length: 4 (expected: range(0, 16384))
Fragment 0:0
[Error Id: be9856d2-0b80-4b9c-94a4-a1ca38ec5db0 on ip-XXXXX.compute.internal:31010] (state=,code=0)
0: jdbc:drill:> select * from s3data.root.`staging/data/apps1.csv` limit 1;
+----------+----------------------+----------+----------+----------+------------+----------+------------+----------+--------------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
| FIELD_1 | FIELD_2 | FIELD_3 | FIELD_4 | FIELD_5 | FIELD_6 | FIELD_7 | FIELD_8 | FIELD_9 | FIELD_10 | FIELD_11 | FIELD_12 | FIELD_13 | FIELD_14 | FIELD_15 | FIELD_16 | FIELD_17 | FIELD_18 | FIELD_19 | FIELD_20 | FIELD_21 | FIELD_22 | FIELD_23 | FIELD_24 | FIELD_25 | FIELD_26 | FIELD_27 | FIELD_28 | FIELD_29 | FIELD_30 | FIELD_31 | FIELD_32 | FIELD_33 | FIELD_34 | FIELD_35 |
+----------+----------------------+----------+----------+----------+------------+----------+------------+----------+--------------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
| 489517 | 27/10/2015 02:05:27 | 261 | 1130232 | 0 | 925630488 | 0 | 925630488 | -1 | 19531580547 | 00000000 | 27/10/2015 02:00:00 | | 30 | 300 | 0 | 0 | 00000000 | 00000000 | 27/10/2015 02:05:27 | 0 | 1 | 0 | 35.0 | | | | 505 | 872.0 | | aBc | | | | |
+----------+----------------------+----------+----------+----------+------------+----------+------------+----------+--------------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
1 row selected (1.094 seconds)
0: jdbc:drill:> }}
Good file: apps1.csv, and
Bad file: apps1-bad.csv attached.
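
This report does not say how apps1.csv and apps1-bad.csv differ. As a rough local diagnostic, the two attachments could be compared by shape before involving Drill at all. The sketch below is only an assumption about what might be worth checking (row count, fields per row, longest field); the helper names `summarize_csv` and `summarize_file` are hypothetical and not part of Drill.

```python
import csv
import io


def summarize_csv(text, delimiter=","):
    """Return basic shape statistics for a CSV document given as a string."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    # Distinct numbers of fields seen across all rows, in ascending order.
    field_counts = sorted({len(row) for row in rows})
    # Length (in characters) of the longest single field in the file.
    longest_field = max((len(field) for row in rows for field in row), default=0)
    return {
        "rows": len(rows),
        "field_counts": field_counts,
        "longest_field": longest_field,
    }


def summarize_file(path, delimiter=","):
    """Read a local CSV file and summarize it (the path is supplied by the caller)."""
    with open(path, newline="", encoding="utf-8", errors="replace") as handle:
        return summarize_csv(handle.read(), delimiter)
```

Assuming both attachments have been downloaded to the working directory, calling `summarize_file("apps1.csv")` and `summarize_file("apps1-bad.csv")` and comparing the two dictionaries would show whether the files differ in field count or field size. A difference found this way is a lead to investigate, not a confirmed cause of the exception.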
> IndexOutOfBoundsException raised during select * query on S3 csv file
> ---------------------------------------------------------------------
>
> Key: DRILL-4145
> URL: https://issues.apache.org/jira/browse/DRILL-4145
> Project: Apache Drill
> Issue Type: Bug
> Components: Functions - Drill
> Affects Versions: 1.3.0
> Environment: Drill 1.3.0 on a 3-node distributed cluster on AWS.
> Data files on S3.
> S3 storage plugin configuration:
> {
>   "type": "file",
>   "enabled": true,
>   "connection": "s3a://<bucket-name-was-here>",
>   "workspaces": {
>     "root": {
>       "location": "/",
>       "writable": false,
>       "defaultInputFormat": null
>     },
>     "views": {
>       "location": "/processed",
>       "writable": true,
>       "defaultInputFormat": null
>     },
>     "tmp": {
>       "location": "/tmp",
>       "writable": true,
>       "defaultInputFormat": null
>     }
>   },
>   "formats": {
>     "psv": {
>       "type": "text",
>       "extensions": [
>         "tbl"
>       ],
>       "delimiter": "|"
>     },
>     "csv": {
>       "type": "text",
>       "extensions": [
>         "csv"
>       ],
>       "extractHeader": true,
>       "delimiter": ","
>     },
>     "tsv": {
>       "type": "text",
>       "extensions": [
>         "tsv"
>       ],
>       "delimiter": "\t"
>     },
>     "parquet": {
>       "type": "parquet"
>     },
>     "json": {
>       "type": "json"
>     },
>     "avro": {
>       "type": "avro"
>     },
>     "sequencefile": {
>       "type": "sequencefile",
>       "extensions": [
>         "seq"
>       ]
>     },
>     "csvh": {
>       "type": "text",
>       "extensions": [
>         "csvh",
>         "csv"
>       ],
>       "extractHeader": true,
>       "delimiter": ","
>     }
>   }
> }
> Reporter: Peter McTaggart
>
> When trying to query a .csv file (via sqlline or the Web UI), I get an IndexOutOfBoundsException:
> {noformat} 0: jdbc:drill:> select * from s3data.root.`staging/data/apps1-bad.csv` limit 1;
> Error: SYSTEM ERROR: IndexOutOfBoundsException: index: 16384, length: 4 (expected: range(0, 16384))
> Fragment 0:0
> [Error Id: be9856d2-0b80-4b9c-94a4-a1ca38ec5db0 on ip-XXXXX.compute.internal:31010] (state=,code=0)
> 0: jdbc:drill:> select * from s3data.root.`staging/data/apps1.csv` limit 1;
> +----------+----------------------+----------+----------+----------+------------+----------+------------+----------+--------------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
> | FIELD_1 | FIELD_2 | FIELD_3 | FIELD_4 | FIELD_5 | FIELD_6 | FIELD_7 | FIELD_8 | FIELD_9 | FIELD_10 | FIELD_11 | FIELD_12 | FIELD_13 | FIELD_14 | FIELD_15 | FIELD_16 | FIELD_17 | FIELD_18 | FIELD_19 | FIELD_20 | FIELD_21 | FIELD_22 | FIELD_23 | FIELD_24 | FIELD_25 | FIELD_26 | FIELD_27 | FIELD_28 | FIELD_29 | FIELD_30 | FIELD_31 | FIELD_32 | FIELD_33 | FIELD_34 | FIELD_35 |
> +----------+----------------------+----------+----------+----------+------------+----------+------------+----------+--------------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
> | 489517 | 27/10/2015 02:05:27 | 261 | 1130232 | 0 | 925630488 | 0 | 925630488 | -1 | 19531580547 | 00000000 | 27/10/2015 02:00:00 | | 30 | 300 | 0 | 0 | 00000000 | 00000000 | 27/10/2015 02:05:27 | 0 | 1 | 0 | 35.0 | | | | 505 | 872.0 | | aBc | | | | |
> +----------+----------------------+----------+----------+----------+------------+----------+------------+----------+--------------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
> 1 row selected (1.094 seconds)
> 0: jdbc:drill:> {noformat}
> Good file: apps1.csv, and
> Bad file: apps1-bad.csv attached.
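
In the storage plugin configuration quoted above, the "csv" extension is listed under both the "csv" and the "csvh" formats. Whether that overlap has anything to do with the exception is not established in this report. The sketch below only shows one way to detect such overlaps in a plugin configuration; the function name `extensions_claimed_twice` is hypothetical, and `CONFIG` is a trimmed excerpt of the configuration above rather than the full document.

```python
import json
from collections import defaultdict


def extensions_claimed_twice(plugin_config):
    """Map each file extension to the format names that list it, keeping only overlaps."""
    claimed = defaultdict(list)
    for name, fmt in plugin_config.get("formats", {}).items():
        # Formats such as "parquet" or "json" above have no "extensions" key.
        for ext in fmt.get("extensions", []):
            claimed[ext].append(name)
    return {ext: names for ext, names in claimed.items() if len(names) > 1}


# Trimmed excerpt of the "formats" section from the configuration in this issue.
CONFIG = json.loads("""
{
  "formats": {
    "psv":  {"type": "text", "extensions": ["tbl"], "delimiter": "|"},
    "csv":  {"type": "text", "extensions": ["csv"], "extractHeader": true, "delimiter": ","},
    "csvh": {"type": "text", "extensions": ["csvh", "csv"], "extractHeader": true, "delimiter": ","}
  }
}
""")

print(extensions_claimed_twice(CONFIG))  # {'csv': ['csv', 'csvh']}
```

Running the same check against the complete configuration would confirm whether "csv" is the only extension claimed by more than one format.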
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)