Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2007/08/08 13:32:13 UTC

[Lucene-hadoop Wiki] Update of "Hbase/HbaseShell/HQL" by InchulSong

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by InchulSong:
http://wiki.apache.org/lucene-hadoop/Hbase/HbaseShell/HQL

The comment on the change is:
reflect Hbase APIs 

------------------------------------------------------------------------------
  [[TableOfContents(4)]]
  
  ----
- == HBase Query Language (HQL) Introduction ==
+ == Hbase Query Language (HQL) Introduction ==
  HQL is an SQL-like query language for Hbase. You can use it to query and modify tables in Hbase. 
  HQL is not intended to fully support the SQL syntax and semantics. 
  HQL, instead, is developed to make it easy to manipulate tables in Hbase through the Hbase Shell command line,
@@ -14, +14 @@

  == Data Definition Statements ==
  
  === CREATE TABLE Syntax ===
- CREATE TABLE enables you to create a new table and set how many recent versions of values are kept in the table. 
+ CREATE TABLE enables you to create a new table and set various options for each column family.
  
  {{{
+ # Simple version 
  CREATE TABLE table_name
-   (column_family_name [, column_family_name] ...)
+   (column_family_name MAX_VERSIONS=n [, column_family_name MAX_VERSIONS=n] ...)
-   [NUM_VERSIONS n]
  }}}
  
- NUM_VERSIONS is for the management of versioned data. 
+ MAX_VERSIONS is for the management of versioned data. 
- NUM_VERSIONS makes a table keep only the last n versions of values in a cell. 
+ MAX_VERSIONS makes a table keep only the n most recent versions of a cell in a column family. 
- Its default value is 1, i.e., if NUM_VERSIONS is not specified, Hbase keeps 
+ Its default value is 1, i.e., if MAX_VERSIONS is not specified, Hbase keeps 
  only the latest version of the value in a cell.
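
The retention rule described above can be sketched as a toy in-memory model. This is only an illustration of the stated semantics, not how Hbase implements it; the class name VersionedCell and the integer timestamps are assumptions made for the example.

```python
class VersionedCell:
    """Toy model of a cell that keeps only the n most recent versions.

    Hypothetical helper for illustration; not part of Hbase.
    """

    def __init__(self, max_versions=1):
        # Default of 1 mirrors the documented MAX_VERSIONS default.
        if max_versions < 1:
            raise ValueError("max_versions must be at least 1")
        self.max_versions = max_versions
        self._versions = []  # list of (timestamp, value) pairs

    def put(self, timestamp, value):
        self._versions.append((timestamp, value))
        # Order newest first, then drop everything beyond the limit.
        self._versions.sort(key=lambda tv: tv[0], reverse=True)
        del self._versions[self.max_versions:]

    def versions(self):
        return list(self._versions)


# Without MAX_VERSIONS only the latest value survives.
default_cell = VersionedCell()
for ts, value in [(1, "a"), (2, "b"), (3, "c")]:
    default_cell.put(ts, value)

# With MAX_VERSIONS=3 the three newest values survive.
three_cell = VersionedCell(max_versions=3)
for ts, value in [(1, "a"), (2, "b"), (3, "c"), (4, "d")]:
    three_cell.put(ts, value)
```

In this sketch the oldest versions are discarded at write time; whether Hbase discards them eagerly or later is not stated on this page.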
  
- [http://labs.google.com/papers/bigtable.html Google's Bigtable] allows us 
- to specify that only new-enough versions be kept (e.g., only keep values that were 
- written in the last seven days). Bigtable also allows us to specify 
- a different versioning policy for each column family.
+ {{{
+ # Full version
+ CREATE TABLE table_name
+   (column_family_spec [, column_family_spec] ...)
+ 
+ column_family_spec:
+   column_family_name [MAX_VERSIONS=n] [COMPRESSION=no | block | record] 
+     [IN_MEMORY] [MAX_LENGTH=n] [BLOOMFILTER=bloom | counting | retouched]
+ }}}
+ 
+ See [http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/javadoc/org/apache/hadoop/hbase/HColumnDescriptor.html HColumnDescriptor API] for more information.
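
As a rough illustration of the full CREATE TABLE grammar, the sketch below builds a statement string from per-family options. It only assembles text that follows the grammar shown above; the names ColumnFamilySpec and create_table, and the table and family names in the usage note, are hypothetical and not part of the Hbase Shell.

```python
from dataclasses import dataclass
from typing import Optional

# Allowed option values, taken from the grammar on this page.
COMPRESSION_TYPES = ("no", "block", "record")
BLOOMFILTER_TYPES = ("bloom", "counting", "retouched")


@dataclass
class ColumnFamilySpec:
    """Hypothetical holder for one column_family_spec."""
    name: str
    max_versions: Optional[int] = None
    compression: Optional[str] = None
    in_memory: bool = False
    max_length: Optional[int] = None
    bloomfilter: Optional[str] = None

    def to_hql(self):
        if self.compression is not None and self.compression not in COMPRESSION_TYPES:
            raise ValueError("unknown COMPRESSION value: %r" % self.compression)
        if self.bloomfilter is not None and self.bloomfilter not in BLOOMFILTER_TYPES:
            raise ValueError("unknown BLOOMFILTER value: %r" % self.bloomfilter)
        # Emit options in the same order as the grammar lists them.
        parts = [self.name]
        if self.max_versions is not None:
            parts.append("MAX_VERSIONS=%d" % self.max_versions)
        if self.compression is not None:
            parts.append("COMPRESSION=%s" % self.compression)
        if self.in_memory:
            parts.append("IN_MEMORY")
        if self.max_length is not None:
            parts.append("MAX_LENGTH=%d" % self.max_length)
        if self.bloomfilter is not None:
            parts.append("BLOOMFILTER=%s" % self.bloomfilter)
        return " ".join(parts)


def create_table(table_name, specs):
    """Render a CREATE TABLE statement from a list of ColumnFamilySpec."""
    families = ", ".join(spec.to_hql() for spec in specs)
    return "CREATE TABLE %s (%s)" % (table_name, families)
```

For example, calling create_table("webtable", [...]) with a "contents" family (MAX_VERSIONS=3, COMPRESSION=block) and an "anchor" family (IN_MEMORY, BLOOMFILTER=bloom) yields one statement with two comma-separated family specs.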
+ 
+ === SHOW TABLES Syntax ===
+ SHOW TABLES shows all available tables.
+ 
+ {{{
+ SHOW TABLES
+ }}}
+ 
  
  === DROP TABLE Syntax ===
  DROP TABLE removes one or more tables. 
@@ -50, +65 @@

  alter_spec: 
      ADD column_family_name
    | ADD (column_family_name [, column_family_name] ...)
-   | DROP column_family_name
+   | DROP column_family_name # not supported yet
-   | CHANGE old_column_family_name new_column_family_name
+   | CHANGE old_column_family_name new_column_family_name # not supported yet
  }}}
  
  == Data Manipulation Statements ==
@@ -61, +76 @@

  {{{
  SELECT { column_name [, column_name] ... | * }
    FROM table_name
-   [WHERE row = 'row-key']
-   [VERSION_LIMIT {n | all}] [LIMIT n]
+   [WHERE row = 'row-key' | STARTING 'row-key']
+   [NUM_VERSIONS=n] [TIMESTAMP 'timestamp']
  
  column_name: 
      column_family_name:column_label_name
@@ -73, +88 @@

  
  If you specify only the column_family_name part of a column, you get values from all the column_label_names in that column family.
  
+ STARTING returns all the rows starting at 'row-key'.
- VERSION_LIMIT retrieves only the last n versions of values in a cell. 
- If you do not specify VERSION_LIMIT, you get only the latest version of value in a cell. 
  
- LIMIT returns only the last n rows in row-key order.
+ NUM_VERSIONS retrieves only the n most recent versions of values in a cell. 
+ 
+ TIMESTAMP returns only the values with the specified timestamp. 
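
The SELECT clauses above can be sketched against a toy in-memory table. This is a minimal model of the described behavior, not the shell's implementation. The function name select and the sample data are hypothetical, and two points are assumptions because this page does not state them: NUM_VERSIONS defaults to 1, and TIMESTAMP is applied before NUM_VERSIONS.

```python
def select(table, columns=None, row=None, starting=None,
           num_versions=1, timestamp=None):
    """Toy SELECT over {row_key: {column: [(timestamp, value), ...]}}."""
    if row is not None:
        # WHERE row = 'row-key': at most one row.
        row_keys = [row] if row in table else []
    else:
        # STARTING 'row-key': every row from that key onward, in key order.
        row_keys = sorted(k for k in table if starting is None or k >= starting)

    result = {}
    for key in row_keys:
        picked = {}
        for column, versions in table[key].items():
            # A bare family name such as "info" matches every "info:label".
            if columns is not None and not any(
                column == c or column.startswith(c.rstrip(":") + ":")
                for c in columns
            ):
                continue
            newest_first = sorted(versions, key=lambda tv: tv[0], reverse=True)
            if timestamp is not None:
                # TIMESTAMP: keep only values written at exactly this timestamp.
                newest_first = [tv for tv in newest_first if tv[0] == timestamp]
            kept = newest_first[:num_versions]
            if kept:
                picked[column] = kept
        if picked:
            result[key] = picked
    return result


# Hypothetical sample data with integer timestamps.
table = {
    "row1": {"info:name": [(1, "ann"), (2, "anna")], "info:age": [(1, "30")]},
    "row2": {"info:name": [(2, "bob")]},
    "row3": {"stats:hits": [(3, "7")]},
}
```

Here select(table, columns=["info"], starting="row2") plays the role of SELECT info FROM table_name STARTING 'row2'.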
  
  === INSERT Syntax ===
  INSERT inserts a set of values into a table. 
@@ -84, +100 @@

  {{{
  INSERT INTO table_name (column_name [, column_name] ...)
    VALUES ('value' [, 'value'] ...)
-   [WHERE row = 'row-key']
+   WHERE row = 'row-key'
-   [WITH TIMESTAMP 'value']
+   [TIMESTAMP 'timestamp']
  }}}
  
  If a specified column already exists, the specified value for the column is stored as a new version. 
  
- If WITH TIMESTAMP is not specified, the current time is used as the value of the timestamp key.
+ If TIMESTAMP is not specified, the current time is used as the value of the timestamp key.
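
A minimal sketch of the INSERT behavior described above: each value becomes a new version, and the current time is used when no timestamp is given. The function name insert and the millisecond timestamp unit are assumptions for illustration; the page does not specify the unit.

```python
import time


def insert(table, row, columns, values, timestamp=None):
    """Toy INSERT: store each value as a new version under the given row."""
    if len(columns) != len(values):
        raise ValueError("columns and values must have the same length")
    if timestamp is None:
        # Assumption: timestamps are milliseconds since the epoch.
        timestamp = int(time.time() * 1000)
    cells = table.setdefault(row, {})
    for column, value in zip(columns, values):
        # An existing column keeps its old versions; the new value is added.
        cells.setdefault(column, []).append((timestamp, value))
    return timestamp
```

The sketch ignores MAX_VERSIONS, so old versions are never trimmed here.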
  
  === DELETE Syntax ===
  DELETE removes a subset of data from a table. 
@@ -98, +114 @@

  {{{
  DELETE { column_name [, column_name] ... | * }
    FROM table_name
-   [WHERE row = 'row-key']
+   WHERE row = 'row-key'
  }}}
  
  === START TRANSACTION, COMMIT, and ROLLBACK Syntax ===
@@ -106, +122 @@

  
  {{{
  START TRANSACTION ON 'row-key' OF table_name | BEGIN ON 'row-key' OF table_name
- COMMIT 
+ COMMIT ['timestamp']
  ROLLBACK 
  }}}
  
- The START TRANSACTION and BEGIN statements begin a new single-row transaction. 
+ The START TRANSACTION and BEGIN statements begin a new single-row transaction
+ on the row 'row-key' of table_name. 
+ 
  COMMIT commits the current transaction, making its changes permanent. 
+ If a timestamp is specified on commit, all the modifications made in the single-row transaction
+ are stored with the specified timestamp. If not, they are stored with the current time.
+ 
  ROLLBACK rolls back the current transaction, canceling its changes. 
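
The single-row transaction semantics above can be sketched as a buffer that is applied on commit and discarded on rollback. This is a toy model under stated assumptions, not Hbase's locking or storage code: the class name RowTransaction, its put method, and the millisecond timestamp unit are all hypothetical.

```python
import time


class RowTransaction:
    """Toy single-row transaction over {row_key: {column: [(ts, value)]}}."""

    def __init__(self, table, row):
        self.table = table
        self.row = row
        self._pending = []  # buffered (column, value) pairs
        self._open = True

    def put(self, column, value):
        self._require_open()
        self._pending.append((column, value))

    def commit(self, timestamp=None):
        """Apply every buffered change with one shared timestamp."""
        self._require_open()
        if timestamp is None:
            # Assumption: the current time in milliseconds since the epoch.
            timestamp = int(time.time() * 1000)
        for column, value in self._pending:
            cells = self.table.setdefault(self.row, {})
            cells.setdefault(column, []).append((timestamp, value))
        self._close()
        return timestamp

    def rollback(self):
        """Discard every buffered change."""
        self._require_open()
        self._close()

    def _require_open(self):
        if not self._open:
            raise RuntimeError("transaction is already finished")

    def _close(self):
        self._pending = []
        self._open = False
```

Nothing reaches the table until commit, so a rollback leaves the table exactly as it was before the transaction began.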
  
  By default, for every statement execution that updates a table, 
- Hbase stores the update on disk.
+ Hbase immediately stores the update on disk.