You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@drill.apache.org by Divya Gehlot <di...@gmail.com> on 2017/08/14 06:44:26 UTC

UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema changes

Hi,
I am getting below error :

> UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema
> changes Fragment 0:0


*Some background to understand the error scenario : *

I have  data saved in parquet files in multidirectory structure.
One of the column of the table is empty values so I used below to handle
null values

> CASE WHEN `Capacity` = '' THEN CAST(0 AS INT) ELSE CAST(`Capacity` AS INT)
> END AS `Capacity`


My query does some aggregation like sum of the column values .My query
looks like below :
SELECT
`COUNTRY`,
`YEAR`,
`MONTH`,
`DAY`,
`Capacity`,
SUM(CAST(`UserCount` AS INT)) AS `Dailyusers`
FROM `dfs`.`default`.`./parquetfiles/countries/`
WHERE `YEAR` =2017 AND `MONTH` = 2
group by `COUNTRY`,`YEAR`,`MONTH`,`DAY`,`Office``Capacity`

When I include `Capacity` column in above query it gives me below error :

> UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema
> changes


When I exclude the `Capacity` column the query runs fine and I could view
the expected results .

Few approaches/verification  which I tried after browsing the internet
regarding the error whic i encounter :
1 : alter session set `planner.enable_streamagg` = false;
    alter session set `planner.enable_hashjoin` = false;

Still the same error



>  UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema
> changes Fragment


2 : somebody suggested that if  some of your files are empty you will
encounter the error but I checked and confirm that none of my parquet file
is empty.

3.I have same schema across all the files .

4. If I query the files individually there is no error and It works fine .
For instance if my query is
SELECT
`COUNTRY`,
`YEAR`,
`MONTH`,
`DAY`,
`Capacity`,
SUM(CAST(`UserCount` AS INT)) AS `Dailyusers`
FROM `dfs`.`default`.`./parquetfiles/countries/china/0_0_0.parquet`
WHERE `YEAR` =2017 AND `MONTH` = 2
group by `COUNTRY`,`YEAR`,`MONTH`,`DAY`,`Office``Capacity`

Above query works fine for all the countries which I have under countries
directory.
But if I query until directory countries its throws error .

Appreciate the help!

Thanks,
Divya

Re: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema changes

Posted by Khurram Faraaz <kf...@mapr.com>.
DRILL-4510 is open for the error message that you see.


Thanks,

Khurram

________________________________
From: Divya Gehlot <di...@gmail.com>
Sent: Tuesday, August 15, 2017 7:41:39 AM
To: user@drill.apache.org
Subject: Re: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema changes

Hi ,

As advised I set the config parameter

alter system set `drill.exec.functions.cast_empty_string_to_null` = true;

When I try to ran the query on `Capacity` column with aggregated values it
threw below error to me :
s:[
SYSTEM ERROR: IllegalStateException: Failure while reading vector.
Expected vector class of org.apache.drill.exec.vector.NullableIntVector but
was holding vector class
org.apache.drill.exec.vector.NullableVarCharVector, field= Capacity


Thanks,
Divya

On 14 August 2017 at 17:56, Khurram Faraaz <kf...@mapr.com> wrote:

> You should look at DRILL-3214 which is Open. And DRILL-4503 is open for
> Error: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema
> changes
>
>
> And you can try setting this config parameter to see if it helps in your
> case.
>
>
> alter system set `drill.exec.functions.cast_empty_string_to_null` = true;
>
>
> Thanks,
>
> Khurram
>
> ________________________________
> From: Divya Gehlot <di...@gmail.com>
> Sent: Monday, August 14, 2017 3:06:57 PM
> To: user@drill.apache.org
> Subject: Re: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support
> schema changes
>
> forgot to mention I tried this first but some post  in Apache Drill JIRA
> mentioned to use 0  instead of null to get rid of error .
> unfortunately it did not work in my case :(
>
>
> Thanks,
> Divya
>
> On 14 August 2017 at 14:56, Khurram Faraaz <kf...@mapr.com> wrote:
>
> > Can you please try this, since you mention column has empty values try
> > using null instead of zero, because 0 (zero) is  not null.
> >
> >
> > CASE WHEN `Capacity` = '' THEN CAST(null AS integer) ELSE CAST(`Capacity`
> > AS integer)
> >
> > END AS `Capacity`
> >
> >
> > Thanks,
> >
> > Khurram
> >
> > ________________________________
> > From: Divya Gehlot <di...@gmail.com>
> > Sent: Monday, August 14, 2017 12:14:26 PM
> > To: user@drill.apache.org
> > Subject: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support
> > schema changes
> >
> > Hi,
> > I am getting below error :
> >
> > > UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema
> > > changes Fragment 0:0
> >
> >
> > *Some background to understand the error scenario : *
> >
> > I have  data saved in parquet files in multidirectory structure.
> > One of the column of the table is empty values so I used below to handle
> > null values
> >
> > > CASE WHEN `Capacity` = '' THEN CAST(0 AS INT) ELSE CAST(`Capacity` AS
> > INT)
> > > END AS `Capacity`
> >
> >
> > My query does some aggregation like sum of the column values .My query
> > looks like below :
> > SELECT
> > `COUNTRY`,
> > `YEAR`,
> > `MONTH`,
> > `DAY`,
> > `Capacity`,
> > SUM(CAST(`UserCount` AS INT)) AS `Dailyusers`
> > FROM `dfs`.`default`.`./parquetfiles/countries/`
> > WHERE `YEAR` =2017 AND `MONTH` = 2
> > group by `COUNTRY`,`YEAR`,`MONTH`,`DAY`,`Office``Capacity`
> >
> > When I include `Capacity` column in above query it gives me below error :
> >
> > > UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema
> > > changes
> >
> >
> > When I exclude the `Capacity` column the query runs fine and I could view
> > the expected results .
> >
> > Few approaches/verification  which I tried after browsing the internet
> > regarding the error whic i encounter :
> > 1 : alter session set `planner.enable_streamagg` = false;
> >     alter session set `planner.enable_hashjoin` = false;
> >
> > Still the same error
> >
> >
> >
> > >  UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema
> > > changes Fragment
> >
> >
> > 2 : somebody suggested that if  some of your files are empty you will
> > encounter the error but I checked and confirm that none of my parquet
> file
> > is empty.
> >
> > 3.I have same schema across all the files .
> >
> > 4. If I query the files individually there is no error and It works fine
> .
> > For instance if my query is
> > SELECT
> > `COUNTRY`,
> > `YEAR`,
> > `MONTH`,
> > `DAY`,
> > `Capacity`,
> > SUM(CAST(`UserCount` AS INT)) AS `Dailyusers`
> > FROM `dfs`.`default`.`./parquetfiles/countries/china/0_0_0.parquet`
> > WHERE `YEAR` =2017 AND `MONTH` = 2
> > group by `COUNTRY`,`YEAR`,`MONTH`,`DAY`,`Office``Capacity`
> >
> > Above query works fine for all the countries which I have under countries
> > directory.
> > But if I query until directory countries its throws error .
> >
> > Appreciate the help!
> >
> > Thanks,
> > Divya
> >
>

Re: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema changes

Posted by Divya Gehlot <di...@gmail.com>.
Hi ,

As advised I set the config parameter

alter system set `drill.exec.functions.cast_empty_string_to_null` = true;

When I try to ran the query on `Capacity` column with aggregated values it
threw below error to me :
s:[
SYSTEM ERROR: IllegalStateException: Failure while reading vector.
Expected vector class of org.apache.drill.exec.vector.NullableIntVector but
was holding vector class
org.apache.drill.exec.vector.NullableVarCharVector, field= Capacity


Thanks,
Divya

On 14 August 2017 at 17:56, Khurram Faraaz <kf...@mapr.com> wrote:

> You should look at DRILL-3214 which is Open. And DRILL-4503 is open for
> Error: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema
> changes
>
>
> And you can try setting this config parameter to see if it helps in your
> case.
>
>
> alter system set `drill.exec.functions.cast_empty_string_to_null` = true;
>
>
> Thanks,
>
> Khurram
>
> ________________________________
> From: Divya Gehlot <di...@gmail.com>
> Sent: Monday, August 14, 2017 3:06:57 PM
> To: user@drill.apache.org
> Subject: Re: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support
> schema changes
>
> forgot to mention I tried this first but some post  in Apache Drill JIRA
> mentioned to use 0  instead of null to get rid of error .
> unfortunately it did not work in my case :(
>
>
> Thanks,
> Divya
>
> On 14 August 2017 at 14:56, Khurram Faraaz <kf...@mapr.com> wrote:
>
> > Can you please try this, since you mention column has empty values try
> > using null instead of zero, because 0 (zero) is  not null.
> >
> >
> > CASE WHEN `Capacity` = '' THEN CAST(null AS integer) ELSE CAST(`Capacity`
> > AS integer)
> >
> > END AS `Capacity`
> >
> >
> > Thanks,
> >
> > Khurram
> >
> > ________________________________
> > From: Divya Gehlot <di...@gmail.com>
> > Sent: Monday, August 14, 2017 12:14:26 PM
> > To: user@drill.apache.org
> > Subject: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support
> > schema changes
> >
> > Hi,
> > I am getting below error :
> >
> > > UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema
> > > changes Fragment 0:0
> >
> >
> > *Some background to understand the error scenario : *
> >
> > I have  data saved in parquet files in multidirectory structure.
> > One of the column of the table is empty values so I used below to handle
> > null values
> >
> > > CASE WHEN `Capacity` = '' THEN CAST(0 AS INT) ELSE CAST(`Capacity` AS
> > INT)
> > > END AS `Capacity`
> >
> >
> > My query does some aggregation like sum of the column values .My query
> > looks like below :
> > SELECT
> > `COUNTRY`,
> > `YEAR`,
> > `MONTH`,
> > `DAY`,
> > `Capacity`,
> > SUM(CAST(`UserCount` AS INT)) AS `Dailyusers`
> > FROM `dfs`.`default`.`./parquetfiles/countries/`
> > WHERE `YEAR` =2017 AND `MONTH` = 2
> > group by `COUNTRY`,`YEAR`,`MONTH`,`DAY`,`Office``Capacity`
> >
> > When I include `Capacity` column in above query it gives me below error :
> >
> > > UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema
> > > changes
> >
> >
> > When I exclude the `Capacity` column the query runs fine and I could view
> > the expected results .
> >
> > Few approaches/verification  which I tried after browsing the internet
> > regarding the error whic i encounter :
> > 1 : alter session set `planner.enable_streamagg` = false;
> >     alter session set `planner.enable_hashjoin` = false;
> >
> > Still the same error
> >
> >
> >
> > >  UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema
> > > changes Fragment
> >
> >
> > 2 : somebody suggested that if  some of your files are empty you will
> > encounter the error but I checked and confirm that none of my parquet
> file
> > is empty.
> >
> > 3.I have same schema across all the files .
> >
> > 4. If I query the files individually there is no error and It works fine
> .
> > For instance if my query is
> > SELECT
> > `COUNTRY`,
> > `YEAR`,
> > `MONTH`,
> > `DAY`,
> > `Capacity`,
> > SUM(CAST(`UserCount` AS INT)) AS `Dailyusers`
> > FROM `dfs`.`default`.`./parquetfiles/countries/china/0_0_0.parquet`
> > WHERE `YEAR` =2017 AND `MONTH` = 2
> > group by `COUNTRY`,`YEAR`,`MONTH`,`DAY`,`Office``Capacity`
> >
> > Above query works fine for all the countries which I have under countries
> > directory.
> > But if I query until directory countries its throws error .
> >
> > Appreciate the help!
> >
> > Thanks,
> > Divya
> >
>

Re: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema changes

Posted by Khurram Faraaz <kf...@mapr.com>.
You should look at DRILL-3214 which is Open. And DRILL-4503 is open for Error: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema changes


And you can try setting this config parameter to see if it helps in your case.


alter system set `drill.exec.functions.cast_empty_string_to_null` = true;


Thanks,

Khurram

________________________________
From: Divya Gehlot <di...@gmail.com>
Sent: Monday, August 14, 2017 3:06:57 PM
To: user@drill.apache.org
Subject: Re: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema changes

forgot to mention I tried this first but some post  in Apache Drill JIRA
mentioned to use 0  instead of null to get rid of error .
unfortunately it did not work in my case :(


Thanks,
Divya

On 14 August 2017 at 14:56, Khurram Faraaz <kf...@mapr.com> wrote:

> Can you please try this, since you mention column has empty values try
> using null instead of zero, because 0 (zero) is  not null.
>
>
> CASE WHEN `Capacity` = '' THEN CAST(null AS integer) ELSE CAST(`Capacity`
> AS integer)
>
> END AS `Capacity`
>
>
> Thanks,
>
> Khurram
>
> ________________________________
> From: Divya Gehlot <di...@gmail.com>
> Sent: Monday, August 14, 2017 12:14:26 PM
> To: user@drill.apache.org
> Subject: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support
> schema changes
>
> Hi,
> I am getting below error :
>
> > UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema
> > changes Fragment 0:0
>
>
> *Some background to understand the error scenario : *
>
> I have  data saved in parquet files in multidirectory structure.
> One of the column of the table is empty values so I used below to handle
> null values
>
> > CASE WHEN `Capacity` = '' THEN CAST(0 AS INT) ELSE CAST(`Capacity` AS
> INT)
> > END AS `Capacity`
>
>
> My query does some aggregation like sum of the column values .My query
> looks like below :
> SELECT
> `COUNTRY`,
> `YEAR`,
> `MONTH`,
> `DAY`,
> `Capacity`,
> SUM(CAST(`UserCount` AS INT)) AS `Dailyusers`
> FROM `dfs`.`default`.`./parquetfiles/countries/`
> WHERE `YEAR` =2017 AND `MONTH` = 2
> group by `COUNTRY`,`YEAR`,`MONTH`,`DAY`,`Office``Capacity`
>
> When I include `Capacity` column in above query it gives me below error :
>
> > UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema
> > changes
>
>
> When I exclude the `Capacity` column the query runs fine and I could view
> the expected results .
>
> Few approaches/verification  which I tried after browsing the internet
> regarding the error whic i encounter :
> 1 : alter session set `planner.enable_streamagg` = false;
>     alter session set `planner.enable_hashjoin` = false;
>
> Still the same error
>
>
>
> >  UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema
> > changes Fragment
>
>
> 2 : somebody suggested that if  some of your files are empty you will
> encounter the error but I checked and confirm that none of my parquet file
> is empty.
>
> 3.I have same schema across all the files .
>
> 4. If I query the files individually there is no error and It works fine .
> For instance if my query is
> SELECT
> `COUNTRY`,
> `YEAR`,
> `MONTH`,
> `DAY`,
> `Capacity`,
> SUM(CAST(`UserCount` AS INT)) AS `Dailyusers`
> FROM `dfs`.`default`.`./parquetfiles/countries/china/0_0_0.parquet`
> WHERE `YEAR` =2017 AND `MONTH` = 2
> group by `COUNTRY`,`YEAR`,`MONTH`,`DAY`,`Office``Capacity`
>
> Above query works fine for all the countries which I have under countries
> directory.
> But if I query until directory countries its throws error .
>
> Appreciate the help!
>
> Thanks,
> Divya
>

Re: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema changes

Posted by Divya Gehlot <di...@gmail.com>.
forgot to mention I tried this first but some post  in Apache Drill JIRA
mentioned to use 0  instead of null to get rid of error .
unfortunately it did not work in my case :(


Thanks,
Divya

On 14 August 2017 at 14:56, Khurram Faraaz <kf...@mapr.com> wrote:

> Can you please try this, since you mention column has empty values try
> using null instead of zero, because 0 (zero) is  not null.
>
>
> CASE WHEN `Capacity` = '' THEN CAST(null AS integer) ELSE CAST(`Capacity`
> AS integer)
>
> END AS `Capacity`
>
>
> Thanks,
>
> Khurram
>
> ________________________________
> From: Divya Gehlot <di...@gmail.com>
> Sent: Monday, August 14, 2017 12:14:26 PM
> To: user@drill.apache.org
> Subject: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support
> schema changes
>
> Hi,
> I am getting below error :
>
> > UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema
> > changes Fragment 0:0
>
>
> *Some background to understand the error scenario : *
>
> I have  data saved in parquet files in multidirectory structure.
> One of the column of the table is empty values so I used below to handle
> null values
>
> > CASE WHEN `Capacity` = '' THEN CAST(0 AS INT) ELSE CAST(`Capacity` AS
> INT)
> > END AS `Capacity`
>
>
> My query does some aggregation like sum of the column values .My query
> looks like below :
> SELECT
> `COUNTRY`,
> `YEAR`,
> `MONTH`,
> `DAY`,
> `Capacity`,
> SUM(CAST(`UserCount` AS INT)) AS `Dailyusers`
> FROM `dfs`.`default`.`./parquetfiles/countries/`
> WHERE `YEAR` =2017 AND `MONTH` = 2
> group by `COUNTRY`,`YEAR`,`MONTH`,`DAY`,`Office``Capacity`
>
> When I include `Capacity` column in above query it gives me below error :
>
> > UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema
> > changes
>
>
> When I exclude the `Capacity` column the query runs fine and I could view
> the expected results .
>
> Few approaches/verification  which I tried after browsing the internet
> regarding the error whic i encounter :
> 1 : alter session set `planner.enable_streamagg` = false;
>     alter session set `planner.enable_hashjoin` = false;
>
> Still the same error
>
>
>
> >  UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema
> > changes Fragment
>
>
> 2 : somebody suggested that if  some of your files are empty you will
> encounter the error but I checked and confirm that none of my parquet file
> is empty.
>
> 3.I have same schema across all the files .
>
> 4. If I query the files individually there is no error and It works fine .
> For instance if my query is
> SELECT
> `COUNTRY`,
> `YEAR`,
> `MONTH`,
> `DAY`,
> `Capacity`,
> SUM(CAST(`UserCount` AS INT)) AS `Dailyusers`
> FROM `dfs`.`default`.`./parquetfiles/countries/china/0_0_0.parquet`
> WHERE `YEAR` =2017 AND `MONTH` = 2
> group by `COUNTRY`,`YEAR`,`MONTH`,`DAY`,`Office``Capacity`
>
> Above query works fine for all the countries which I have under countries
> directory.
> But if I query until directory countries its throws error .
>
> Appreciate the help!
>
> Thanks,
> Divya
>

Re: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema changes

Posted by Khurram Faraaz <kf...@mapr.com>.
Can you please try this, since you mention column has empty values try using null instead of zero, because 0 (zero) is  not null.


CASE WHEN `Capacity` = '' THEN CAST(null AS integer) ELSE CAST(`Capacity` AS integer)

END AS `Capacity`


Thanks,

Khurram

________________________________
From: Divya Gehlot <di...@gmail.com>
Sent: Monday, August 14, 2017 12:14:26 PM
To: user@drill.apache.org
Subject: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema changes

Hi,
I am getting below error :

> UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema
> changes Fragment 0:0


*Some background to understand the error scenario : *

I have  data saved in parquet files in multidirectory structure.
One of the column of the table is empty values so I used below to handle
null values

> CASE WHEN `Capacity` = '' THEN CAST(0 AS INT) ELSE CAST(`Capacity` AS INT)
> END AS `Capacity`


My query does some aggregation like sum of the column values .My query
looks like below :
SELECT
`COUNTRY`,
`YEAR`,
`MONTH`,
`DAY`,
`Capacity`,
SUM(CAST(`UserCount` AS INT)) AS `Dailyusers`
FROM `dfs`.`default`.`./parquetfiles/countries/`
WHERE `YEAR` =2017 AND `MONTH` = 2
group by `COUNTRY`,`YEAR`,`MONTH`,`DAY`,`Office``Capacity`

When I include `Capacity` column in above query it gives me below error :

> UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema
> changes


When I exclude the `Capacity` column the query runs fine and I could view
the expected results .

Few approaches/verification  which I tried after browsing the internet
regarding the error whic i encounter :
1 : alter session set `planner.enable_streamagg` = false;
    alter session set `planner.enable_hashjoin` = false;

Still the same error



>  UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema
> changes Fragment


2 : somebody suggested that if  some of your files are empty you will
encounter the error but I checked and confirm that none of my parquet file
is empty.

3.I have same schema across all the files .

4. If I query the files individually there is no error and It works fine .
For instance if my query is
SELECT
`COUNTRY`,
`YEAR`,
`MONTH`,
`DAY`,
`Capacity`,
SUM(CAST(`UserCount` AS INT)) AS `Dailyusers`
FROM `dfs`.`default`.`./parquetfiles/countries/china/0_0_0.parquet`
WHERE `YEAR` =2017 AND `MONTH` = 2
group by `COUNTRY`,`YEAR`,`MONTH`,`DAY`,`Office``Capacity`

Above query works fine for all the countries which I have under countries
directory.
But if I query until directory countries its throws error .

Appreciate the help!

Thanks,
Divya