You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hive.apache.org by Avram Aelony <aa...@mac.com> on 2010/04/12 20:05:00 UTC

adding code descriptions to partitioned table with many partitions

Hello Hive listers,

I have a Hive table to which I have added quite a few partitions. Each partition represents data for a given date. At one partition per day, I could easily have 60+, 365+ partitions.
What is the best way to create assign a description to the levels of a coded variable without having to do so for each partition?


Below is a contrived example:

--Create a table from existing data
create table x (
 e_code string,
 e_timestamp string,
 id string
)
partitioned by (ds string)
row format delimited fields terminated by '\t'
stored as textfile location
's3://some.bucket.name/'
;


--Add data from the partitions
alter table x add partition( ds='20100401') location '20100401' ;
alter table x add partition( ds='20100402') location '20100402' ;
alter table x add partition( ds='20100403') location '20100403' ;
alter table x add partition( ds='20100404') location '20100404' ;
alter table x add partition( ds='20100405') location '20100405' ;
alter table x add partition( ds='20100406') location '20100406' ;

--Create a new table to have the codes and the descriptions
create table x2 like x;
alter table x2 add columns ( e_description string );



--What is the best way to apply the code-to-description logic to *all* dates I have as partitions?

insert overwrite table x2 partition( ds='20100401')  --Populate the descriptions,  do I really need to do this one date at a time?
select x.*,
case when x.e_code=1 then 'A'
             when x.e_code=2 then 'B'
     when x.e_code=3 then 'C'
else 'unknown' end as e_description
from x x
;


Thanks in advance!

-Avram