You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hive.apache.org by Avram Aelony <aa...@mac.com> on 2010/04/12 20:05:00 UTC
adding code descriptions to partitioned table with many partitions
Hello Hive listers,
I have a Hive table to which I have added quite a few partitions. Each partition represents data for a given date. At one partition per day, I could easily have 60+, 365+ partitions.
What is the best way to create assign a description to the levels of a coded variable without having to do so for each partition?
Below is a contrived example:
--Create a table from existing data
create table x (
e_code string,
e_timestamp string,
id string
)
partitioned by (ds string)
row format delimited fields terminated by '\t'
stored as textfile location
's3://some.bucket.name/'
;
--Add data from the partitions
alter table x add partition( ds='20100401') location '20100401' ;
alter table x add partition( ds='20100402') location '20100402' ;
alter table x add partition( ds='20100403') location '20100403' ;
alter table x add partition( ds='20100404') location '20100404' ;
alter table x add partition( ds='20100405') location '20100405' ;
alter table x add partition( ds='20100406') location '20100406' ;
--Create a new table to have the codes and the descriptions
create table x2 like x;
alter table x2 add columns ( e_description string );
--What is the best way to apply the code-to-description logic to *all* dates I have as partitions?
insert overwrite table x2 partition( ds='20100401') --Populate the descriptions, do I really need to do this one date at a time?
select x.*,
case when x.e_code=1 then 'A'
when x.e_code=2 then 'B'
when x.e_code=3 then 'C'
else 'unknown' end as e_description
from x x
;
Thanks in advance!
-Avram