Posted to user@hive.apache.org by Mark <st...@gmail.com> on 2010/12/15 22:52:27 UTC

Hive Partitioning

Can someone explain what partitioning is and why it would be used? Is
there an example? Thanks

Re: Hive Partitioning

Posted by Edward Capriolo <ed...@gmail.com>.
On Wed, Dec 15, 2010 at 4:52 PM, Mark <st...@gmail.com> wrote:
> Can someone explain what partitioning is and why it would be used? Is
> there an example? Thanks
>

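A partition is both a physical and a logical division of a table's
data: each partition is stored in its own directory, and the partition
column can be queried like any other column. The query planner can use
partition columns in the WHERE clause to prune data that Hive does not
need to process.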

For example, if you partition your table by day, you can write queries
such as SELECT count(1) FROM table WHERE day=20100101. Hive will only
use that single partition as input, rather than the entire table.
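
A minimal sketch of that example in HiveQL might look like the
following. The table name page_views and its columns are hypothetical,
chosen only for illustration; the partition column is named day to
match the query above.

```sql
-- Hypothetical table and column names, for illustration only.
CREATE TABLE page_views (
  user_id STRING,
  url     STRING
)
PARTITIONED BY (day INT);

-- Register the partition for a single day. Its data is stored in its
-- own directory under the table's location (.../day=20100101).
ALTER TABLE page_views ADD PARTITION (day=20100101);

-- The filter is on the partition column, so only the day=20100101
-- partition is read as input, not the whole table.
SELECT count(1) FROM page_views WHERE day=20100101;
```

Note that day is declared only in PARTITIONED BY, not in the regular
column list, yet it can still be used in WHERE like an ordinary column.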

Generally, you do not want too many small partitions, and you do not
want too few partitions either.

http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL#Add_Partitions