You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hive.apache.org by Neha Sood <ne...@gmail.com> on 2013/08/29 11:02:40 UTC

Cannot insert into a bucketized table from the same table

Hive version: 0.10.0-cdh4.2.1

Trying to insert into a bucketized table from the same table, throws
exception:
"FAILED: SemanticException [Error 10122]: Bucketized tables do not support
INSERT INTO: Table:"

To test the scenario, I create the following 3 test tables:
create table temp1
(a int)
PARTITIONED BY (
  b string,
  c string)
  clustered by (a) into 2 buckets
  stored as RCFILE;

create table temp2
(a int)
  stored as RCFILE;

create table temp3
(a int)
PARTITIONED BY (
  b string,
  c string)
  stored as RCFILE;

When I run the below insert queries, queries 1, 3 & 4 work fine, while
query 2 fails with the above mentioned exception.
1. insert into table temp1 partition(b='1', c='1') select * from temp2;

2. insert into table temp1 partition(b='1', c='1') select * from temp2 t2
join temp1 t1 on(t2.a=t1.a);

3. insert into table temp1 partition(b='1', c='1') select t3.a from (select
t1.a from temp2 t2 join temp1 t1 on(t2.a=t1.a)) tt join temp3 t3 on
(tt.a=t3.a);

4. insert into table temp3 partition(b='1', c='1') select t2.a from temp2
t2 join temp3 t1 on(t2.a=t1.a);

So, the above exception occurs only if the target table is bucketed and it
uses itself directly in the join tables. For some reason, case 3 works
fine, where its joining with
some other table finally.

Has anyone faced this issue earlier? Looks like a Hive bug. Is there any
workaround?

Re: Cannot insert into a bucketized table from the same table

Posted by Nitin Pawar <ni...@gmail.com>.
Neha, its not a bug. Hive does not support insert into bucketed table as of
now.

There is a patch available for same at
HIVE-3244<https://issues.apache.org/jira/browse/HIVE-3244>
You can also refer HIVE-3077<https://issues.apache.org/jira/browse/HIVE-3077>


On Thu, Aug 29, 2013 at 2:32 PM, Neha Sood <ne...@gmail.com>wrote:

> Hive version: 0.10.0-cdh4.2.1
>
> Trying to insert into a bucketized table from the same table, throws
> exception:
> "FAILED: SemanticException [Error 10122]: Bucketized tables do not support
> INSERT INTO: Table:"
>
> To test the scenario, I create the following 3 test tables:
> create table temp1
> (a int)
> PARTITIONED BY (
>   b string,
>   c string)
>   clustered by (a) into 2 buckets
>   stored as RCFILE;
>
> create table temp2
> (a int)
>   stored as RCFILE;
>
> create table temp3
> (a int)
> PARTITIONED BY (
>   b string,
>   c string)
>   stored as RCFILE;
>
> When I run the below insert queries, queries 1, 3 & 4 work fine, while
> query 2 fails with the above mentioned exception.
> 1. insert into table temp1 partition(b='1', c='1') select * from temp2;
>
> 2. insert into table temp1 partition(b='1', c='1') select * from temp2 t2
> join temp1 t1 on(t2.a=t1.a);
>
> 3. insert into table temp1 partition(b='1', c='1') select t3.a from
> (select t1.a from temp2 t2 join temp1 t1 on(t2.a=t1.a)) tt join temp3 t3 on
> (tt.a=t3.a);
>
> 4. insert into table temp3 partition(b='1', c='1') select t2.a from temp2
> t2 join temp3 t1 on(t2.a=t1.a);
>
> So, the above exception occurs only if the target table is bucketed and it
> uses itself directly in the join tables. For some reason, case 3 works
> fine, where its joining with
> some other table finally.
>
> Has anyone faced this issue earlier? Looks like a Hive bug. Is there any
> workaround?
>



-- 
Nitin Pawar