Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2007/08/08 00:26:20 UTC

[Lucene-hadoop Wiki] Update of "Hbase/HbaseShell/HQL" by udanax

The following page has been changed by udanax:
http://wiki.apache.org/lucene-hadoop/Hbase/HbaseShell/HQL

New page:
[[TableOfContents(4)]]

----
== HBase Query Language (HQL) Introduction ==
HQL is an SQL-like query language for Hbase. You can use it to query and modify tables in Hbase.
HQL is not intended to fully support SQL syntax and semantics.
Instead, it is designed to make it easy to manipulate Hbase tables from the Hbase Shell command line,
without using the programming APIs.

We borrowed the syntax definition style from MySQL.

== Data Definition Statements ==

=== CREATE TABLE Syntax ===
CREATE TABLE enables you to create a new table and set how many recent versions of values are kept in the table. 

{{{
CREATE TABLE table_name
  (column_family_name [, column_family_name] ...)
  [NUM_VERSIONS n]
}}}

NUM_VERSIONS controls how versioned data is managed.
It makes a table keep only the last n versions of the value in each cell.
Its default value is 1, i.e., if NUM_VERSIONS is not specified, Hbase keeps
only the latest version of the value in each cell.
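
As an illustration, the following statements are written against the grammar above. The table name {{{users}}} and the column family names {{{info}}} and {{{prefs}}} are hypothetical. The first statement relies on the default of one version per cell; the second asks for the last three versions to be kept. Any statement terminator the shell may require is omitted because the grammar above does not show one.

{{{
CREATE TABLE users (info, prefs)

CREATE TABLE users (info, prefs) NUM_VERSIONS 3
}}}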

[http://labs.google.com/papers/bigtable.html Google's Bigtable] allows us 
to specify that only new-enough versions be kept (e.g., only keep values that were 
written in the last seven days). Bigtable also allows us to specify 
a different versioning policy for each column family.

=== DROP TABLE Syntax ===
DROP TABLE removes one or more tables. 

{{{
DROP TABLE table_name [, table_name] ...
}}}
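
For example, assuming two hypothetical tables named {{{users}}} and {{{sessions}}}, the first statement below drops one table and the second drops both in a single statement.

{{{
DROP TABLE users

DROP TABLE users, sessions
}}}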

=== ALTER TABLE Syntax ===
ALTER TABLE enables you to change the structure of an existing table. You can 
add, delete, and change column families. 

{{{
ALTER TABLE table_name 
  alter_spec [, alter_spec] ...

alter_spec: 
    ADD column_family_name
  | ADD (column_family_name [, column_family_name] ...)
  | DROP column_family_name
  | CHANGE old_column_family_name new_column_family_name
}}}
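
The statements below sketch each alter_spec form in turn. The table name {{{users}}} and all column family names are hypothetical. The second statement combines two alter_specs, which the grammar allows by separating them with commas.

{{{
ALTER TABLE users ADD history

ALTER TABLE users ADD (stats, audit), DROP prefs

ALTER TABLE users CHANGE info profile
}}}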

== Data Manipulation Statements ==
=== SELECT Syntax ===
SELECT enables you to retrieve a subset of data in a table.

{{{
SELECT { column_name [, column_name] ... | * }
  FROM table_name
  [WHERE row = 'row-key']
  [VERSION_LIMIT {n | all}] [LIMIT n]

column_name: 
    column_family_name:column_label_name
  | column_family_name:
}}}

Enclose column_name in single quotes if it contains spaces.

If you specify only the column_family_name part of a column, you get the values from all column_label_names in that column family.

VERSION_LIMIT retrieves only the last n versions of the value in each cell.
If you do not specify VERSION_LIMIT, you get only the latest version of the value in each cell.

LIMIT returns only the last n rows in row-key order.
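
The following queries are a sketch of how these clauses combine. The table {{{users}}}, the row key {{{alice}}}, and the column names are hypothetical. The first query reads every column of every row. The second reads one labelled column plus a whole column family for a single row, with up to three versions per cell. The third quotes a column name that contains a space and asks for all versions. The fourth restricts the result with LIMIT.

{{{
SELECT * FROM users

SELECT info:email, prefs: FROM users WHERE row = 'alice' VERSION_LIMIT 3

SELECT 'info:home page' FROM users WHERE row = 'alice' VERSION_LIMIT all

SELECT info: FROM users LIMIT 10
}}}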

=== INSERT Syntax ===
INSERT inserts a set of values into a table. 

{{{
INSERT INTO table_name (column_name [, column_name] ...)
  VALUES ('value' [, 'value'] ...)
  [WHERE row = 'row-key']
  [WITH TIMESTAMP 'value']
}}}

If a specified column already exists, the specified value for the column is stored as a new version. 

If WITH TIMESTAMP is not specified, the current time is used as the value of the timestamp key.
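
As a sketch, the two statements below write to the same hypothetical row {{{alice}}} of a hypothetical table {{{users}}}. Neither uses WITH TIMESTAMP, so both are stamped with the current time. The second statement targets a column that the first has already written, so its value is stored as a new version of that column rather than as a separate column.

{{{
INSERT INTO users (info:email, info:name)
  VALUES ('alice@example.com', 'Alice')
  WHERE row = 'alice'

INSERT INTO users (info:email)
  VALUES ('alice@example.org')
  WHERE row = 'alice'
}}}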

=== DELETE Syntax ===
DELETE removes a subset of data from a table. 

{{{
DELETE { column_name [, column_name] ... | * }
  FROM table_name
  [WHERE row = 'row-key']
}}}
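
For instance, with a hypothetical table {{{users}}} and hypothetical row keys {{{alice}}} and {{{bob}}}, the first statement below removes two columns from one row and the second removes every column of another row.

{{{
DELETE info:email, prefs:theme FROM users WHERE row = 'alice'

DELETE * FROM users WHERE row = 'bob'
}}}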

=== START TRANSACTION, COMMIT, and ROLLBACK Syntax ===
You can group together a sequence of data manipulation statements into a single-row transaction.

{{{
START TRANSACTION ON 'row-key' OF table_name | BEGIN ON 'row-key' OF table_name
COMMIT 
ROLLBACK 
}}}

The START TRANSACTION and BEGIN statements begin a new single-row transaction. 
COMMIT commits the current transaction, making its changes permanent. 
ROLLBACK rolls back the current transaction, canceling its changes. 
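
The two sequences below sketch how these statements might be used. The table {{{users}}}, the row key {{{alice}}}, and the column names are hypothetical. In the first sequence, both changes to the row become permanent at COMMIT. In the second, ROLLBACK cancels the delete. Every statement inside a transaction targets the row named in START TRANSACTION or BEGIN, since the transaction covers a single row.

{{{
START TRANSACTION ON 'alice' OF users
INSERT INTO users (info:email) VALUES ('alice@example.org') WHERE row = 'alice'
DELETE prefs:theme FROM users WHERE row = 'alice'
COMMIT

BEGIN ON 'alice' OF users
DELETE * FROM users WHERE row = 'alice'
ROLLBACK
}}}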

By default, whenever a statement that updates a table is executed,
Hbase stores the update on disk.