Posted to user@hive.apache.org by Travis Powell <tp...@tealeaf.com> on 2011/07/08 22:11:58 UTC

Partition by existing field?

Can I partition by an existing field?

I have a 10 GB file with a date field and an hour of day field. Can I
load this file into a table, then insert-overwrite into another
partitioned table that uses those fields as a partition? Would something
like the following work?

INSERT OVERWRITE TABLE tealeaf_event
PARTITION(dt=evt.datestring,hour=evt.hour) SELECT * FROM staging_event
evt;

Thanks!

Travis

Re: Partition by existing field?

Posted by be...@yahoo.com.

Hi Travis
From my understanding of your requirement, dynamic partitions in Hive are the most suitable solution.

I have written a blog post on this kind of requirement. Please refer to
http://kickstarthadoop.blogspot.com/2011/06/how-to-speed-up-your-hive-queries-in.html for details on the implementation. You can refer to the Hive wiki as well.
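
As a rough sketch of how a dynamic-partition insert could look for the tables in the question: the partition clause lists only the partition column names, and the values come from the last columns of the SELECT, in the same order. The table names, datestring and hour are from the original mail; the columns event_id and payload and the column types are hypothetical placeholders, so substitute the real schema.

```sql
-- Dynamic partitioning has to be enabled for the session.
-- nonstrict mode allows every partition column to be dynamic.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Target table: partition columns are declared in PARTITIONED BY,
-- not in the regular column list.
-- event_id and payload are hypothetical columns for illustration.
CREATE TABLE IF NOT EXISTS tealeaf_event (
  event_id STRING,
  payload  STRING
)
PARTITIONED BY (dt STRING, hour INT);

-- The PARTITION clause names the partition columns without values.
-- Hive takes the values from the trailing SELECT columns, matched
-- by position, so datestring and hour must come last, in this order.
INSERT OVERWRITE TABLE tealeaf_event
PARTITION (dt, hour)
SELECT
  evt.event_id,
  evt.payload,
  evt.datestring,
  evt.hour
FROM staging_event evt;
```

Note that SELECT * would only line up if datestring and hour happen to be the last two columns of staging_event, in that order, so listing the columns explicitly is the safer choice. Hive also caps the number of dynamic partitions a query may create, which may need raising if the data covers many dates.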

Please write back if you need any clarification.
Regards
Bejoy K S
