Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2010/07/10 04:29:39 UTC
[Hadoop Wiki] Update of "Hive/LanguageManual/DDL" by JohnSichi

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hive/LanguageManual/DDL" page has been changed by JohnSichi.
http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL?action=diff&rev1=65&rev2=66

--------------------------------------------------

    [COMMENT table_comment]
    [PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
    [CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]
-   [ROW FORMAT row_format]
+   [
+    [ROW FORMAT row_format] [STORED AS file_format]
+    | STORED BY 'storage.handler.class.name' [ WITH SERDEPROPERTIES (...) ]  (Note:  only available starting with 0.6.0)
+   ]
    [STORED AS file_format]
    [LOCATION hdfs_path]
-   [TBLPROPERTIES (property_name=property_value, ...)]  (Note:  only available on latest trunk or versions higher than 0.5.0)
+   [TBLPROPERTIES (property_name=property_value, ...)]  (Note:  only available starting with 0.6.0)
-   [AS select_statement]  (Note: this feature is only available on the latest trunk or versions higher than 0.4.0.)
+   [AS select_statement]  (Note: this feature is only available starting with 0.5.0.)
  
  CREATE [EXTERNAL] TABLE [IF NOT EXISTS] table_name
    LIKE existing_table_name
@@ -58, +61 @@

    | TEXTFILE
    | INPUTFORMAT input_format_classname OUTPUTFORMAT output_format_classname
  }}}
+ 
  CREATE TABLE creates a table with the given name. An error is thrown if a table or view with the same name already exists. You can use IF NOT EXISTS to skip the error.
  
  The EXTERNAL keyword lets you create a table and provide a LOCATION so that Hive does not use a default location for this table. This comes in handy if you already have data generated. When dropping an EXTERNAL table, data in the table is NOT deleted from the file system.
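
As a rough illustration of the EXTERNAL and IF NOT EXISTS behavior described above, the following sketch creates a table over data that already exists. The table name, columns, delimiter, and HDFS path are hypothetical and chosen only for this example.

```sql
-- Hypothetical table, columns, and path; adjust to match your own data.
-- IF NOT EXISTS skips the error if a table or view named web_logs already exists.
CREATE EXTERNAL TABLE IF NOT EXISTS web_logs (
  ip           STRING,
  request_time STRING,
  url          STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/data/web_logs';

-- Because the table is EXTERNAL, dropping it removes only the table definition;
-- the files under /data/web_logs are left in place.
DROP TABLE web_logs;
```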
@@ -69, +73 @@

  You must specify a list of columns for tables that use a native SerDe. Refer to the Types part of the User Guide for the allowable column types. A list of columns may also be specified for tables that use a custom SerDe, but Hive will query the SerDe to determine the actual list of columns for the table.
  
  Use STORED AS TEXTFILE if the data needs to be stored as plain text files. Use STORED AS SEQUENCEFILE if the data needs to be compressed. Please read more about [[Hive/CompressedStorage]] if you are planning to keep data compressed in your Hive tables.  Use INPUTFORMAT and OUTPUTFORMAT to specify the name of a corresponding InputFormat and OutputFormat class as a string literal, e.g. 'org.apache.hadoop.hive.contrib.fileformat.base64.Base64TextInputFormat'.
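
The file format choices above might look like the following sketch. The table and column names are hypothetical. The OUTPUTFORMAT class is assumed to be the contrib counterpart of the Base64TextInputFormat mentioned above, so verify that it is on your classpath before relying on it.

```sql
-- Hypothetical table stored as a SequenceFile, suitable for compressed data.
CREATE TABLE events_seq (id INT, payload STRING)
STORED AS SEQUENCEFILE;

-- Hypothetical table using explicit input/output format classes,
-- each given as a string literal.
CREATE TABLE events_base64 (id INT, payload STRING)
STORED AS
  INPUTFORMAT 'org.apache.hadoop.hive.contrib.fileformat.base64.Base64TextInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.contrib.fileformat.base64.Base64TextOutputFormat';
```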
+ 
+ Use STORED BY to create a non-native table, for example in HBase.  See [[Hive/StorageHandlers]] for more information on this option.
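
A minimal sketch of STORED BY, assuming the HBase storage handler is installed. The table name, the HBase table name, and the column mapping are illustrative only. The property names are specific to that handler, not part of the DDL grammar, so check [[Hive/StorageHandlers]] for what your handler expects.

```sql
-- Illustrative non-native table backed by HBase (requires 0.6.0 or later).
-- "hbase.columns.mapping" and "hbase.table.name" are handler-specific properties.
CREATE TABLE hbase_table_1 (key INT, value STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "xyz");
```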
  
  Partitioned tables can be created using the PARTITIONED BY clause. A table can have one or more partition columns, and a separate data directory is created for each distinct value combination in the partition columns. Further, tables or partitions can be bucketed using CLUSTERED BY columns, and data can be sorted within each bucket via SORTED BY columns. This can improve performance on certain kinds of queries.
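
The partitioning and bucketing clauses can be combined as in the sketch below. The table name, columns, bucket count, and delimiter are hypothetical values picked for illustration.

```sql
-- Hypothetical table: one data directory per (dt, country) combination,
-- with rows in each partition hashed on userid into 32 buckets
-- and sorted by viewTime within each bucket.
CREATE TABLE page_view (
  viewTime     INT,
  userid       BIGINT,
  page_url     STRING,
  referrer_url STRING,
  ip           STRING COMMENT 'IP address of the user'
)
COMMENT 'This is the page view table'
PARTITIONED BY (dt STRING, country STRING)
CLUSTERED BY (userid) SORTED BY (viewTime) INTO 32 BUCKETS
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001'
STORED AS SEQUENCEFILE;
```

Note that the partition columns (dt and country here) are declared only in PARTITIONED BY and are not repeated in the main column list.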