Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2011/02/02 03:52:41 UTC

[Hadoop Wiki] Update of "Hive/Tutorial" by LarryOgrodnek

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hive/Tutorial" page has been changed by LarryOgrodnek.
The comment on this change is: small comment indicating that dynamic column values are taken from the end of the select clause.
http://wiki.apache.org/hadoop/Hive/Tutorial?action=diff&rev1=31&rev2=32

--------------------------------------------------

  
  There are several syntactic differences from the multi-insert statement: 
    * country appears in the PARTITION specification, but with no value associated. In this case, country is a ''dynamic partition column''. On the other hand, dt has a value associated with it, which means it is a ''static partition column''. If a column is a dynamic partition column, its value comes from the input column. Currently, dynamic partition columns may only be the last column(s) in the partition clause, because the order of the partition columns indicates their hierarchical order (dt is the root partition and country is the child partition). You cannot specify a partition clause with (dt, country='US'), because that would mean updating the country='US' sub-partition under every date.
-   * An additional pvs.country column is added in the select statement. This is the corresponding input column for the dynamic partition column. Note that you do not need to add an input column for the static partition column because its value is already known in the PARTITION clause. 
+   * An additional pvs.country column is added in the select statement. This is the corresponding input column for the dynamic partition column. You do not need to add an input column for the static partition column because its value is already known from the PARTITION clause. Note that dynamic partition values are selected by position, not by name: they are taken from the last columns of the select clause.
  
  Semantics of the dynamic partition insert statement:
    * When non-empty partitions already exist for the dynamic partition columns (e.g., country='CA' exists under some dt root partition), a partition is overwritten if the dynamic partition insert sees the same value (say 'CA') in the input data. This is in line with the 'insert overwrite' semantics. However, if the partition value 'CA' does not appear in the input data, the existing partition is not overwritten.
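
The two rules above (dynamic partition values are taken by position from the end of the select clause, and only partitions that appear in the input are overwritten) can be illustrated with a small toy model. This is a sketch in plain Python, not Hive code. The function name `dynamic_partition_insert`, the dictionary layout, and the sample rows and date are all hypothetical and chosen only for illustration.

```python
def dynamic_partition_insert(table, static_spec, dynamic_cols, rows):
    """Toy model of an 'insert overwrite' with dynamic partitions.

    table        -- dict: partition key tuple -> list of data rows (hypothetical layout)
    static_spec  -- ordered (column, value) pairs for the static partition columns
    dynamic_cols -- ordered names of the dynamic partition columns (used only for their count)
    rows         -- tuples as produced by the select clause
    """
    static_key = tuple(value for _, value in static_spec)
    staged = {}
    for row in rows:
        # The dynamic partition values are the LAST len(dynamic_cols) columns
        # of each row; they are matched by position, never by column name.
        split = len(row) - len(dynamic_cols)
        data, dynamic_values = row[:split], row[split:]
        staged.setdefault(static_key + tuple(dynamic_values), []).append(data)
    # Only partitions that received at least one input row are replaced;
    # every other existing partition is left untouched.
    table.update(staged)
    return table


# Existing partitions under the static root dt='2008-06-08' (made-up data).
table = {
    ("2008-06-08", "CA"): [("old_ca_row",)],
    ("2008-06-08", "UK"): [("old_uk_row",)],
}
# Select output: the last column plays the role of pvs.country.
rows = [("u1", "/home", "CA"), ("u2", "/about", "US")]
dynamic_partition_insert(table, [("dt", "2008-06-08")], ["country"], rows)
```

In this model the 'CA' partition is replaced because 'CA' appears in the input, a new 'US' partition is created, and the 'UK' partition keeps its old contents because no input row carries that value.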