Posted to issues@drill.apache.org by "Vitalii Diravka (JIRA)" <ji...@apache.org> on 2017/11/15 15:46:00 UTC
[jira] [Commented] (DRILL-3288) False "Hash aggregate does not support schema changes" error message in a query with merge join and hash aggregation

    [ https://issues.apache.org/jira/browse/DRILL-3288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253661#comment-16253661 ] 

Vitalii Diravka commented on DRILL-3288:
----------------------------------------

[~vicky] I compared the schemas of the two files you provided and found that they differ:
{code}
vitalii@vitalii-pc:~/parquet-tools/parquet-mr/parquet-tools/target$ java -jar parquet-tools-1.6.0rc3-SNAPSHOT.jar schema /home/vitalii/Downloads/j6.parquet
message root {
  optional binary c_varchar (UTF8);
  optional int32 c_integer;
  optional int64 c_bigint;
  optional float c_float;
  optional double c_double;
  optional int32 c_date (DATE);
  optional int32 c_time (TIME_MILLIS);
  optional int64 c_timestamp (TIMESTAMP_MILLIS);
  optional boolean c_boolean;
  optional double d9;
  optional double d18;
  optional double d28;
  optional double d38;
}

vitalii@vitalii-pc:~/parquet-tools/parquet-mr/parquet-tools/target$ java -jar parquet-tools-1.6.0rc3-SNAPSHOT.jar schema /home/vitalii/Downloads/j7.parquet
message root {
  required binary c_varchar (UTF8);
  required int32 c_integer;
  required int64 c_bigint;
  required float c_float;
  required double c_double;
  required int32 c_date (DATE);
  required int32 c_time (TIME_MILLIS);
  required int64 c_timestamp (TIMESTAMP_MILLIS);
  required boolean c_boolean;
  required double d9;
  required double d18;
  required double d28;
  required double d38;
}
{code}
"required" and "optional" are different Parquet "Type.Repetition" values. This means that a hard schema change occurs, which hash aggregation does not support for now.
So the error is the expected result.
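
To make the difference easier to spot on other file pairs, here is a minimal sketch that parses the text printed by the parquet-tools "schema" command (as shown above) and lists the columns whose repetition differs. It is only an illustration, not part of Drill or parquet-tools: the names FIELD_RE, parse_schema and repetition_mismatches are hypothetical, the regular expression assumes flat schemas with one field per line, and the sample strings are shortened copies of the two schemas above.

```python
import re

# Hypothetical helper: matches one flat field line of parquet-tools
# "schema" output, for example "  optional int32 c_date (DATE);".
FIELD_RE = re.compile(
    r"^\s*(required|optional|repeated)\s+(\S+)\s+(\w+)(?:\s+\(([^)]*)\))?;\s*$"
)


def parse_schema(text):
    """Return {column: (repetition, physical_type, annotation_or_None)}."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if match:
            repetition, physical, name, annotation = match.groups()
            fields[name] = (repetition, physical, annotation)
    return fields


def repetition_mismatches(schema_a, schema_b):
    """Return {column: (repetition_a, repetition_b)} for shared columns that differ."""
    a, b = parse_schema(schema_a), parse_schema(schema_b)
    return {
        name: (a[name][0], b[name][0])
        for name in a
        if name in b and a[name][0] != b[name][0]
    }


# Shortened copies of the j6.parquet and j7.parquet schemas shown above.
J6_SCHEMA = """message root {
  optional binary c_varchar (UTF8);
  optional int32 c_integer;
  optional int32 c_date (DATE);
}"""

J7_SCHEMA = """message root {
  required binary c_varchar (UTF8);
  required int32 c_integer;
  required int32 c_date (DATE);
}"""

mismatches = repetition_mismatches(J6_SCHEMA, J7_SCHEMA)
for column, (left, right) in sorted(mismatches.items()):
    print(f"{column}: {left} vs {right}")
```

Every column shared by the two sample schemas is reported, since j6 is "optional" throughout and j7 is "required" throughout. Nested groups are not handled by this sketch.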

> False "Hash aggregate does not support schema changes" error message in a query with merge join and hash aggregation
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-3288
>                 URL: https://issues.apache.org/jira/browse/DRILL-3288
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>    Affects Versions: 1.0.0
>            Reporter: Victoria Markman
>            Priority: Minor
>             Fix For: Future
>
>         Attachments: j6.parquet, j7.parquet
>
>
> This error seems to happen only when a query contains both a window function and a regular aggregate function. You will need to disable hash join to reproduce it: "alter session set `planner.enable_hashjoin` = false"
> All columns in table j6 are of "optional" type; all columns in j7 are of "required" type (a sample of each is attached).
> Here are two queries that are failing for me:
> Query 1 (aggregate function in the having clause):
> {code}
> 0: jdbc:drill:schema=dfs> select
> . . . . . . . . . . . . >         j6.c_integer,
> . . . . . . . . . . . . >         sum(j6.c_integer) over(partition by j6.c_date order by j6.c_time)
> . . . . . . . . . . . . > from
> . . . . . . . . . . . . >         j6, j7
> . . . . . . . . . . . . > where   j6.c_integer = j7.c_integer
> . . . . . . . . . . . . > group by
> . . . . . . . . . . . . >         j6.c_date, j6.c_time, j6.c_integer
> . . . . . . . . . . . . > having
> . . . . . . . . . . . . >         avg(j7.c_integer) > 0;
> java.lang.RuntimeException: java.sql.SQLException: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema changes
> Fragment 0:0
> [Error Id: ed0140d4-244c-4895-bf65-6ea1d085382e on atsqa4-133.qa.lab:31010]
> 	at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
> 	at sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:85)
> 	at sqlline.TableOutputFormat.print(TableOutputFormat.java:116)
> 	at sqlline.SqlLine.print(SqlLine.java:1583)
> 	at sqlline.Commands.execute(Commands.java:852)
> 	at sqlline.Commands.sql(Commands.java:751)
> 	at sqlline.SqlLine.dispatch(SqlLine.java:738)
> 	at sqlline.SqlLine.begin(SqlLine.java:612)
> 	at sqlline.SqlLine.start(SqlLine.java:366)
> 	at sqlline.SqlLine.main(SqlLine.java:259)
> {code}
> Query 2: (window function and aggregate function in projection list):
> {code}
> 0: jdbc:drill:schema=dfs> select
> . . . . . . . . . . . . >         j6.c_integer,
> . . . . . . . . . . . . >         avg(j7.c_integer),
> . . . . . . . . . . . . >         sum(j6.c_integer) over(partition by j6.c_date order by j6.c_time)
> . . . . . . . . . . . . > from    
> . . . . . . . . . . . . >         j6, j7 
> . . . . . . . . . . . . > where   j6.c_integer = j7.c_integer
> . . . . . . . . . . . . > group by 
> . . . . . . . . . . . . >         j6.c_date, j6.c_time, j6.c_integer;
> java.lang.RuntimeException: java.sql.SQLException: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema changes
> Fragment 0:0
> [Error Id: 370188bd-012d-4fc2-a365-fe9e482aaa0f on atsqa4-133.qa.lab:31010]
> 	at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
> 	at sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:85)
> 	at sqlline.TableOutputFormat.print(TableOutputFormat.java:116)
> 	at sqlline.SqlLine.print(SqlLine.java:1583)
> 	at sqlline.Commands.execute(Commands.java:852)
> 	at sqlline.Commands.sql(Commands.java:751)
> 	at sqlline.SqlLine.dispatch(SqlLine.java:738)
> 	at sqlline.SqlLine.begin(SqlLine.java:612)
> 	at sqlline.SqlLine.start(SqlLine.java:366)
> 	at sqlline.SqlLine.main(SqlLine.java:259)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)