Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2009/07/27 12:26:12 UTC

[Hadoop Wiki] Trivial Update of "Hive/LanguageManual/DDL" by SaurabhNanda

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by SaurabhNanda:
http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL

The comment on the change is:
Added pointer to CompressedStorage

------------------------------------------------------------------------------
   
  You must specify a list of columns for tables with a native SerDe. Refer to the Types part of the User Guide for the allowable column types. A list of columns for tables with a custom SerDe may be specified, but Hive will query the SerDe to determine the list of columns for the table.
  
- Use STORED AS TEXTFILE if the data needs to be stored as plain text files. Use STORED AS SEQUENCEFILE if the data needs to be compressed. 
+ Use STORED AS TEXTFILE if the data needs to be stored as plain text files. Use STORED AS SEQUENCEFILE if the data needs to be compressed. Please read more about CompressedStorage if you are planning to keep data compressed in your Hive tables.
  
  Partitioned tables can be created using the PARTITIONED BY clause. A table can have one or more partition columns, and a separate data directory is created for each distinct combination of partition column values. Further, tables or partitions can be bucketed using CLUSTERED BY columns, and data can be sorted within each bucket using SORTED BY columns. This can improve performance on certain kinds of queries.
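
As a rough illustration of how the storage, partitioning, and bucketing clauses fit together, here is a minimal sketch of a CREATE TABLE statement. The table name, column names, and bucket count are hypothetical and chosen only for this example; the DDL keywords are spelled PARTITIONED BY, CLUSTERED BY, and SORTED BY.

```sql
-- Hypothetical table and columns, for illustration only.
CREATE TABLE page_view (
  view_time INT,
  user_id   BIGINT,
  page_url  STRING
)
-- One data directory is created per distinct (dt, country) combination.
PARTITIONED BY (dt STRING, country STRING)
-- Rows are bucketed by user_id and sorted by view_time within each bucket.
-- The bucket count of 32 is an arbitrary assumption.
CLUSTERED BY (user_id) SORTED BY (view_time) INTO 32 BUCKETS
-- SEQUENCEFILE is the format to use when the data needs to be compressed.
STORED AS SEQUENCEFILE;
```

Note that the partition columns (dt and country here) are declared only in the PARTITIONED BY clause and are not repeated in the main column list.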