Posted to dev@drill.apache.org by "Dave Challis (Jira)" <ji...@apache.org> on 2020/05/06 08:41:00 UTC

[jira] [Created] (DRILL-7735) Query against empty parquet file fails with: IndexOutOfBoundsException: Index: 0, Size: 0

Dave Challis created DRILL-7735:
-----------------------------------

             Summary: Query against empty parquet file fails with: IndexOutOfBoundsException: Index: 0, Size: 0
                 Key: DRILL-7735
                 URL: https://issues.apache.org/jira/browse/DRILL-7735
             Project: Apache Drill
          Issue Type: Bug
          Components:  Server, Storage - Parquet
    Affects Versions: 1.17.0
         Environment: 64 GB machine running on AWS.
            Reporter: Dave Challis
         Attachments: dispute.parquet, drillbit.log

Running a `SELECT *` query against an empty Parquet file (i.e. one with correct column metadata written, but no rows) triggers an `IndexOutOfBoundsException`.

I have an empty Parquet file with the following schema:
{noformat}
$ parquet-tools schema dispute.parquet
message parquet_go_root {
  required int32 dispute_id (INT_32) = 0;
  required binary title (UTF8) = 0;
  optional int32 start_date (DATE) = 0;
  optional int32 end_date (DATE) = 0;
  optional binary docket_number (UTF8) = 0;
  required binary route (UTF8) = 0;
  required binary jurisdiction (UTF8) = 0;
}
{noformat}
If I then run the following query via the Drill web UI:
{noformat}
SELECT * FROM dfs.`/data/dispute.parquet`
{noformat}
then I get the following error from Drill:
{noformat}
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: IndexOutOfBoundsException: Index: 0, Size: 0 Please, refer to logs for more information. 

[Error Id: a93e1aa1-a7e6-4bc9-9f11-c42b9f6fe108 on e531a6492cf4:31010]
{noformat}
The expected result is an empty result set (i.e. 0 rows), not an error.
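
To reproduce this outside the web UI, the same query can be submitted to Drill's REST endpoint. The following is only a minimal sketch, assuming a Drillbit whose web port is at the default (8047) on localhost; the helper names `build_request` and `run_query` are made up for illustration and are not part of Drill.

```python
import json
import urllib.request

# Assumption: Drill's web/REST port is the default 8047 on localhost.
DRILL_URL = "http://localhost:8047/query.json"
QUERY = "SELECT * FROM dfs.`/data/dispute.parquet`"


def build_request(sql, url=DRILL_URL):
    """Build a POST request for Drill's REST query endpoint (hypothetical helper)."""
    body = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql, url=DRILL_URL):
    """Submit the query and return the decoded JSON response (hypothetical helper)."""
    with urllib.request.urlopen(build_request(sql, url)) as resp:
        return json.load(resp)
```

Calling `run_query(QUERY)` against the attached file should, per the expectation above, yield a response with no rows. How the failure surfaces over REST (an HTTP error or an error message in the response body) may depend on the Drill version.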

 

I've attached the Parquet file in question and the relevant entries from drillbit.log.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)