Posted to issues@drill.apache.org by "Vitalii Diravka (JIRA)" <ji...@apache.org> on 2017/11/15 15:46:00 UTC
[jira] [Commented] (DRILL-3288) False "Hash aggregate does not
support schema changes" error message in a query with merge join and hash
aggregation
[ https://issues.apache.org/jira/browse/DRILL-3288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253661#comment-16253661 ]
Vitalii Diravka commented on DRILL-3288:
----------------------------------------
[~vicky] I compared the schemas of the two files you provided and found that they differ:
{code}
vitalii@vitalii-pc:~/parquet-tools/parquet-mr/parquet-tools/target$ java -jar parquet-tools-1.6.0rc3-SNAPSHOT.jar schema /home/vitalii/Downloads/j6.parquet
message root {
optional binary c_varchar (UTF8);
optional int32 c_integer;
optional int64 c_bigint;
optional float c_float;
optional double c_double;
optional int32 c_date (DATE);
optional int32 c_time (TIME_MILLIS);
optional int64 c_timestamp (TIMESTAMP_MILLIS);
optional boolean c_boolean;
optional double d9;
optional double d18;
optional double d28;
optional double d38;
}
vitalii@vitalii-pc:~/parquet-tools/parquet-mr/parquet-tools/target$ java -jar parquet-tools-1.6.0rc3-SNAPSHOT.jar schema /home/vitalii/Downloads/j7.parquet
message root {
required binary c_varchar (UTF8);
required int32 c_integer;
required int64 c_bigint;
required float c_float;
required double c_double;
required int32 c_date (DATE);
required int32 c_time (TIME_MILLIS);
required int64 c_timestamp (TIMESTAMP_MILLIS);
required boolean c_boolean;
required double d9;
required double d18;
required double d28;
required double d38;
}
{code}
"required" and "optional" are different Parquet "Type.Repetition" values. Reading j6 and j7 in the same query therefore causes a hard schema change, which hash aggregation does not support for now.
That is why this error is the expected result.
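The comparison above can be automated. The following is a minimal sketch, not part of Drill or parquet-tools: it parses the text printed by the parquet-tools "schema" command and lists the columns whose repetition differs between two files. The helper names (parse_schema, repetition_mismatches) and the shortened sample dumps are assumptions made for illustration.

```python
import re

# One field line of a parquet-tools "schema" dump, for example:
#   optional int32 c_date (DATE);
# Groups: repetition, physical type, column name.
FIELD = re.compile(r"^\s*(required|optional|repeated)\s+(\S+)\s+(\w+)")


def parse_schema(dump):
    """Return {column name: (repetition, physical type)} for a schema dump."""
    fields = {}
    for line in dump.splitlines():
        match = FIELD.match(line)
        if match:  # skips "message root {" and the closing "}"
            repetition, physical_type, name = match.groups()
            fields[name] = (repetition, physical_type)
    return fields


def repetition_mismatches(left, right):
    """Return {column: (left repetition, right repetition)} where they differ.

    Only columns present in both schemas with the same physical type
    are compared.
    """
    mismatches = {}
    for name, (left_rep, left_type) in left.items():
        if name not in right:
            continue
        right_rep, right_type = right[name]
        if left_type == right_type and left_rep != right_rep:
            mismatches[name] = (left_rep, right_rep)
    return mismatches


# Shortened copies of the j6 and j7 dumps shown above (hypothetical excerpt).
J6_DUMP = """message root {
optional binary c_varchar (UTF8);
optional int32 c_integer;
optional int32 c_date (DATE);
}"""

J7_DUMP = """message root {
required binary c_varchar (UTF8);
required int32 c_integer;
required int32 c_date (DATE);
}"""

if __name__ == "__main__":
    diff = repetition_mismatches(parse_schema(J6_DUMP), parse_schema(J7_DUMP))
    for column, (left_rep, right_rep) in diff.items():
        print(f"{column}: {left_rep} vs {right_rep}")
```

Any column this reports is a candidate cause of the "Hash aggregate does not support schema changes" error when both files feed the same hash aggregation.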
> False "Hash aggregate does not support schema changes" error message in a query with merge join and hash aggregation
> --------------------------------------------------------------------------------------------------------------------
>
> Key: DRILL-3288
> URL: https://issues.apache.org/jira/browse/DRILL-3288
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Relational Operators
> Affects Versions: 1.0.0
> Reporter: Victoria Markman
> Priority: Minor
> Fix For: Future
>
> Attachments: j6.parquet, j7.parquet
>
>
> This error seems to happen only when a query contains both a window function and a regular aggregate function. To reproduce it, you need to disable hash join: "alter session set `planner.enable_hashjoin` = false"
> All columns in table j6 are of "optional" type; all columns in j7 are of "required" type (a sample of each is attached).
> Here are two queries that are failing for me:
> Query 1 (aggregate function in the having clause):
> {code}
> 0: jdbc:drill:schema=dfs> select
> . . . . . . . . . . . . > j6.c_integer,
> . . . . . . . . . . . . > sum(j6.c_integer) over(partition by j6.c_date order by j6.c_time)
> . . . . . . . . . . . . > from
> . . . . . . . . . . . . > j6, j7
> . . . . . . . . . . . . > where j6.c_integer = j7.c_integer
> . . . . . . . . . . . . > group by
> . . . . . . . . . . . . > j6.c_date, j6.c_time, j6.c_integer
> . . . . . . . . . . . . > having
> . . . . . . . . . . . . > avg(j7.c_integer) > 0;
> java.lang.RuntimeException: java.sql.SQLException: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema changes
> Fragment 0:0
> [Error Id: ed0140d4-244c-4895-bf65-6ea1d085382e on atsqa4-133.qa.lab:31010]
> at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
> at sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:85)
> at sqlline.TableOutputFormat.print(TableOutputFormat.java:116)
> at sqlline.SqlLine.print(SqlLine.java:1583)
> at sqlline.Commands.execute(Commands.java:852)
> at sqlline.Commands.sql(Commands.java:751)
> at sqlline.SqlLine.dispatch(SqlLine.java:738)
> at sqlline.SqlLine.begin(SqlLine.java:612)
> at sqlline.SqlLine.start(SqlLine.java:366)
> at sqlline.SqlLine.main(SqlLine.java:259)
> {code}
> Query 2 (window function and aggregate function in projection list):
> {code}
> 0: jdbc:drill:schema=dfs> select
> . . . . . . . . . . . . > j6.c_integer,
> . . . . . . . . . . . . > avg(j7.c_integer),
> . . . . . . . . . . . . > sum(j6.c_integer) over(partition by j6.c_date order by j6.c_time)
> . . . . . . . . . . . . > from
> . . . . . . . . . . . . > j6, j7
> . . . . . . . . . . . . > where j6.c_integer = j7.c_integer
> . . . . . . . . . . . . > group by
> . . . . . . . . . . . . > j6.c_date, j6.c_time, j6.c_integer;
> java.lang.RuntimeException: java.sql.SQLException: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema changes
> Fragment 0:0
> [Error Id: 370188bd-012d-4fc2-a365-fe9e482aaa0f on atsqa4-133.qa.lab:31010]
> at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
> at sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:85)
> at sqlline.TableOutputFormat.print(TableOutputFormat.java:116)
> at sqlline.SqlLine.print(SqlLine.java:1583)
> at sqlline.Commands.execute(Commands.java:852)
> at sqlline.Commands.sql(Commands.java:751)
> at sqlline.SqlLine.dispatch(SqlLine.java:738)
> at sqlline.SqlLine.begin(SqlLine.java:612)
> at sqlline.SqlLine.start(SqlLine.java:366)
> at sqlline.SqlLine.main(SqlLine.java:259)
> {code}