Posted to dev@hive.apache.org by Sukhendu Chakraborty <su...@gmail.com> on 2014/02/20 00:45:29 UTC

appending data to clustered tables.

Hi,

Is there a way to add data to a bucketed/clustered table in hive-0.11? I
have a clustered table with 32 buckets (no partitions) that already holds
some data. Can I append more data by running an "insert into <table>...."?
From http://osdir.com/ml/hive-user-hadoop-apache/2009-03/msg00094.html it
looks like the feature was not supported as of 2009.
When I experimented with it in hive-0.11, I saw that after the second
insert a new set of 32 files was created with the '000000_*.copy' notation,
so we had 64 files instead of the original 32. Is this expected behavior,
and does Hive know how to merge the 64 files back into 32, one per bucket,
before processing? How about sorted bucketed tables?
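
For concreteness, the experiment described above can be sketched roughly as
follows. This is only an illustration: the table name (events), its columns,
and the source tables (staging_a, staging_b) are hypothetical placeholders,
and the use of hive.enforce.bucketing is an assumption about how the inserts
were run.

```sql
-- Hypothetical table: 32 buckets, no partitions (names are placeholders).
CREATE TABLE events (
  id  INT,
  msg STRING
)
CLUSTERED BY (id) INTO 32 BUCKETS;

-- Assumption: bucketing is enforced so each insert writes one file per bucket.
SET hive.enforce.bucketing = true;

-- First load: the table directory ends up with 32 bucket files.
INSERT INTO TABLE events
SELECT id, msg FROM staging_a;

-- Second load (the append in question): in the experiment this produced
-- another 32 files with the copy notation, for 64 files in total.
INSERT INTO TABLE events
SELECT id, msg FROM staging_b;
```

For the sorted case, the same sketch would add a SORTED BY (id) clause after
CLUSTERED BY (id) in the table definition.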

Thanks,
-Sukhendu