Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2007/08/06 10:07:31 UTC

[Lucene-hadoop Wiki] Update of "HbaseShell/HQL" by InchulSong

The following page has been changed by InchulSong:
http://wiki.apache.org/lucene-hadoop/HbaseShell/HQL

The comment on the change is:
initial version of HQL

New page:
[[TableOfContents(4)]]

----
== HBase Query Language (HQL) Introduction ==
HQL is an SQL-like query language for Hbase. You can use it to query and modify tables in Hbase. 
HQL is not intended to support the full SQL syntax and semantics. 
Instead, it is designed to make it easy to manipulate Hbase tables from the Hbase Shell command line,
without using the programming APIs. 

We borrowed the syntax definition style from MySQL.

 ''-- See [:HBaseShell/Examples], fill it with this page!! ^^ [[BR]]udanax''

== Data Definition Statements ==

=== CREATE TABLE Syntax ===
CREATE TABLE enables you to create a new table and set how many recent versions of values are kept in the table. 

{{{
CREATE TABLE table_name
  (column_family_name [, column_family_name] ...)
  [NUM_VERSIONS n]
}}}

NUM_VERSIONS controls how versioned data is managed. 
It makes a table keep only the last n versions of the value in each cell. 
Its default value is 1, i.e., if NUM_VERSIONS is not specified, Hbase keeps 
only the latest version of the value in each cell.
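
As an illustration, here is a sketch of a statement that follows the grammar above. The table name webtable and the column families contents and anchor are hypothetical, borrowed from the running example in the Bigtable paper. Under the description above, the table would keep the last three versions of the value in each cell.

{{{
CREATE TABLE webtable
  (contents, anchor)
  NUM_VERSIONS 3
}}}

Leaving out the NUM_VERSIONS clause would give the default of one version per cell.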

[http://labs.google.com/papers/bigtable.html Google's Bigtable] allows us 
to specify that only new-enough versions be kept (e.g., only keep values that were 
written in the last seven days). Bigtable also allows us to specify 
a different versioning policy for each column family.

=== DROP TABLE Syntax ===
DROP TABLE removes one or more tables. 

{{{
DROP TABLE table_name [, table_name] ...
}}}
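
For example, assuming two existing tables with the hypothetical names webtable and crawl_log, a single statement in the form above would remove both:

{{{
DROP TABLE webtable, crawl_log
}}}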

=== ALTER TABLE Syntax ===
ALTER TABLE enables you to change the structure of an existing table. You can 
add, delete, and change column families. 

{{{
ALTER TABLE table_name 
  alter_spec [, alter_spec] ...

alter_spec: 
    ADD column_family_name
  | ADD (column_family_name [, column_family_name] ...)
  | DROP column_family_name
  | CHANGE old_column_family_name new_column_family_name
}}}
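
The following sketch combines several alter_spec clauses in one statement. The table and column family names are hypothetical, and reading CHANGE as a rename of the column family is an assumption based on its old-name/new-name form.

{{{
ALTER TABLE webtable
  ADD (language, checksum),
  DROP anchor,
  CHANGE contents page
}}}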

== Data Manipulation Statements ==
=== START TRANSACTION, COMMIT, and ROLLBACK Syntax ===
You can group a sequence of data manipulation statements into a single-row transaction.

{{{
START TRANSACTION table_name ON 'row-key' | BEGIN table_name ON 'row-key'
COMMIT 
ROLLBACK 
}}}

The START TRANSACTION and BEGIN statements begin a new single-row transaction. 
COMMIT commits the current transaction, making its changes permanent. 
ROLLBACK rolls back the current transaction, canceling its changes. 

By default, as soon as a statement that updates a table is executed, 
Hbase stores the update on disk.
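
As a sketch, a single-row transaction might look like the following. The table webtable, the row key com.cnn.www, and the column names are hypothetical. Whether each statement inside the transaction must repeat the row condition is not specified on this page; it is repeated here as an assumption.

{{{
START TRANSACTION webtable ON 'com.cnn.www'
INSERT INTO webtable (contents:html) VALUES ('new page body') WHERE row = 'com.cnn.www'
DELETE anchor:cnnsi.com FROM webtable WHERE row = 'com.cnn.www'
COMMIT
}}}

Replacing COMMIT with ROLLBACK would cancel both the insert and the delete.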

=== SELECT Syntax ===
SELECT enables you to retrieve a subset of data in a table.

{{{
SELECT { column_name [, column_name] ... | * }
  FROM table_name
  [WHERE cond_expr [AND cond_expr] ...]
  [VERSION_LIMIT {n | all}] [LIMIT n]

column_name: 
    column_family_name:column_label_name
  | column_family_name:

cond_expr:
    { row | column_name | timestamp } op 'value'

op:
  = | < | > | <= | >= 
}}}

Enclose column_name in single quotes if it contains spaces.

If only the column_family_name part is specified for a column, the values of all column_label_names in that column family are retrieved.

VERSION_LIMIT retrieves only the last n versions of the value in each cell. If neither VERSION_LIMIT nor any condition on timestamp is specified, only the latest version of the value in each cell is returned. 

LIMIT returns only the last n rows in row-key order.
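
The statements below are sketches of the SELECT forms described above, using a hypothetical table webtable with column families contents and anchor. The first retrieves every label in the anchor family plus one specific contents column for a single row, with up to three versions per cell. The second restricts the result to a range of row keys and to at most ten rows. The third quotes a column name that contains a space.

{{{
SELECT anchor:, contents:html
  FROM webtable
  WHERE row = 'com.cnn.www'
  VERSION_LIMIT 3

SELECT *
  FROM webtable
  WHERE row >= 'com.a' AND row < 'com.b'
  LIMIT 10

SELECT 'anchor:my site'
  FROM webtable
  WHERE row = 'com.cnn.www'
}}}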

=== INSERT Syntax ===
INSERT inserts a set of values into a table. 

{{{
INSERT INTO table_name (column_name [, column_name] ...)
  VALUES ('value' [, 'value'] ...)
  [WHERE cond_expr [AND cond_expr] ...]
  [WITH TIMESTAMP 'value']
}}}

If a specified column already exists, the specified value for the column is stored as a new version. 

If WITH TIMESTAMP is not specified, the current time is used as the value of the timestamp key.
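
For example, the following sketch stores two values in one row of a hypothetical table webtable. The format of the timestamp value is not defined on this page; the number shown is only a placeholder and an assumption.

{{{
INSERT INTO webtable (contents:html, anchor:cnnsi.com)
  VALUES ('page body', 'CNN')
  WHERE row = 'com.cnn.www'
  WITH TIMESTAMP '1186387651'
}}}

Without the WITH TIMESTAMP clause, the same statement would record the values under the current time.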

=== DELETE Syntax ===
DELETE removes a subset of data from a table. 

{{{
DELETE { column_name [, column_name] ... | * }
  FROM table_name
  [WHERE cond_expr [AND cond_expr] ...]
}}}