Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2008/05/23 01:57:11 UTC

[Hadoop Wiki] Trivial Update of "Hbase/Shell/Replacement" by stack

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by stack:
http://wiki.apache.org/hadoop/Hbase/Shell/Replacement

The comment on the change is:
Start on some shell replacement notes

New page:
Notes on the HQL replacement

== Required ==

At least the admin (definitional, DDL) functionality currently in HQL: SHOW (tables), DROP, CREATE, ALTER.  We don't need JAR (running an MR job jar from the HQL command line), FS (hadoop fs operations from the HQL command line), or CLEAR (clear the terminal).

At least the manipulative (DML) functionality currently in HQL: SELECT, INSERT, UPDATE, DELETE.

Output formatters.  At least ASCII (table) and XHTML.  JSON would be a nice-to-have.
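
A minimal sketch of what pluggable output formatters could look like, written in Python purely for illustration.  The function names and the FORMATTERS registry are assumptions made for this sketch, not existing HQL or HBase code; each formatter takes the same column names and rows, so adding a format would not touch the commands.

```python
import json
from html import escape


def format_ascii(columns, rows):
    """Render rows as a fixed-width ASCII table."""
    cells = [[str(value) for value in row] for row in rows]
    # Each column is as wide as its header or its widest cell.
    widths = [max([len(col)] + [len(row[i]) for row in cells])
              for i, col in enumerate(columns)]
    rule = "+" + "+".join("-" * (w + 2) for w in widths) + "+"

    def line(values):
        padded = (v.ljust(w) for v, w in zip(values, widths))
        return "| " + " | ".join(padded) + " |"

    return "\n".join([rule, line(columns), rule]
                     + [line(row) for row in cells] + [rule])


def format_xhtml(columns, rows):
    """Render rows as an XHTML table fragment, escaping cell content."""
    head = "".join(f"<th>{escape(col)}</th>" for col in columns)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(v))}</td>" for v in row) + "</tr>"
        for row in rows)
    return f"<table><tr>{head}</tr>{body}</table>"


def format_json(columns, rows):
    """Render rows as a JSON array of objects keyed by column name."""
    return json.dumps([dict(zip(columns, row)) for row in rows])


# Hypothetical registry: the shell would look the formatter up by name.
FORMATTERS = {"ascii": format_ascii, "xhtml": format_xhtml, "json": format_json}
```

Usage would be along the lines of print(FORMATTERS["ascii"](["row", "column", "value"], rows)).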

User-friendly: 'obvious', 'natural', and lots of help.  (It is hard to have 'fit' criteria for 'user-friendly', but HQL being SQL-like is an example of this requirement's intent.)

Read commands from STDIN, dump on STDOUT.
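
As a rough illustration of the STDIN/STDOUT requirement, the loop below reads one command per line and writes one result per command.  The run_shell name, the execute callback, and the exit/quit keywords are assumptions made for this sketch only.

```python
import sys


def run_shell(execute, lines, write):
    """Feed each non-blank input line to `execute` and write its result.

    `lines` is any iterable of text lines (for example sys.stdin) and
    `write` is any callable taking a string (for example sys.stdout.write).
    """
    for raw in lines:
        command = raw.strip()
        if not command:
            continue  # skip blank lines
        if command.lower() in ("exit", "quit"):
            break  # hypothetical way to leave the shell
        write(execute(command) + "\n")


def main():
    # Placeholder executor that just echoes the command back.
    run_shell(lambda command: "echo: " + command, sys.stdin, sys.stdout.write)
```

Because input and output are plain parameters, the same loop works interactively, with a piped script, or in a test harness.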

Dynamic language -- Python, Ruby, etc. -- access to the full HBase API as a tool for debugging horked HBase clusters.


== Nice to Haves ==

HBase-specific operators: ONLINE/OFFLINE/MERGE.

Our replacement should map closely to the current client API.

Easy to maintain/extend (hard to have 'fit' criteria for the notion 'easy').

== Some Discussion ==

We might take on SQL's DDL/DML distinction.  (This was raised when it was suggested that DELETE could operate on a cell, column, column family, row, or table depending on context.)

Create table needs to take a table name, table attributes -- e.g. table regionsize -- and column families and their definitions, which will include maximum versions, etc.  Attributes on tables and column families are many and will likely evolve over time.  We shouldn't have to rev the shell parser for every attribute change.  Building these lengthy DDL statements can be involved and error-prone, so parse failures need to be non-cryptic.  The same table and column family descriptors will be used when altering tables and column families.
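
One way to avoid revving the parser for each new attribute is to parse generic NAME=value pairs and keep the set of known attributes in data.  The sketch below assumes that approach; the attribute names (REGIONSIZE, MAX_VERSIONS, COMPRESSION), the NAME=value syntax, and the ShellParseError class are all hypothetical and chosen only to illustrate the idea.

```python
import re

# Hypothetical attribute tables.  In a real shell these would more likely be
# derived from the table and column family descriptors than hard-coded.
TABLE_ATTRIBUTES = {"REGIONSIZE": int}
FAMILY_ATTRIBUTES = {"MAX_VERSIONS": int, "COMPRESSION": str}


class ShellParseError(ValueError):
    """Parse failure carrying a message meant to be read by a human."""


_PAIR = re.compile(r"\s*([A-Za-z_]+)\s*=\s*([^,\s]+)\s*")


def parse_attributes(text, known):
    """Parse 'NAME=value, NAME=value' into a dict, validated against `known`."""
    attrs = {}
    pos = 0
    while pos < len(text):
        match = _PAIR.match(text, pos)
        if match is None:
            raise ShellParseError(
                f"expected NAME=value at column {pos + 1}: {text[pos:]!r}")
        name, raw = match.group(1).upper(), match.group(2)
        if name not in known:
            raise ShellParseError(
                f"unknown attribute {name!r}; expected one of {sorted(known)}")
        try:
            attrs[name] = known[name](raw)  # convert using the declared type
        except ValueError:
            raise ShellParseError(
                f"attribute {name} expects {known[name].__name__}, "
                f"got {raw!r}") from None
        pos = match.end()
        if pos < len(text):
            if text[pos] != ",":
                raise ShellParseError(
                    f"expected ',' at column {pos + 1}: {text[pos:]!r}")
            pos += 1  # step over the separator
    return attrs
```

With this shape, supporting a new attribute means adding one entry to a table, and the same function serves both create and alter.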

Typing 'help', you should get a dump of all that's possible in the hbase shell.  You should also be able to get help per command and, depending on how we implement it, do a help or describe of an object to learn what the object exposes.
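
A small sketch of how 'help' and per-command help could both fall out of a single command registry, so the help text cannot drift from the command list.  The decorator, the COMMANDS table, and the two placeholder commands are assumptions made for illustration, not a proposed final design.

```python
COMMANDS = {}


def command(name, summary):
    """Register a shell command under `name` with a one-line summary."""
    def register(func):
        COMMANDS[name] = (summary, func)
        return func
    return register


@command("show", "list tables")
def show(args):
    """Usage: SHOW TABLES"""
    return "(not implemented in this sketch)"


@command("drop", "remove a table")
def drop(args):
    """Usage: DROP <table>"""
    return "(not implemented in this sketch)"


def help_text(topic=None):
    """Return the full command listing, or detailed help for one command."""
    if topic is None:
        return "\n".join(f"{name:<8}{summary}"
                         for name, (summary, _) in sorted(COMMANDS.items()))
    key = topic.lower()
    if key not in COMMANDS:
        return f"No such command {topic!r}.  Type 'help' for the full list."
    summary, func = COMMANDS[key]
    # The docstring doubles as the detailed, per-command help.
    return f"{summary}\n{(func.__doc__ or '').strip()}"
```

Here help_text() gives the dump of everything available and help_text("show") gives the per-command detail.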

It's OK that a user might mistakenly run 'select * from TABLE_WITH_1B_ROWS'.  They won't do it a second time.  A simple search should turn up pointers out of the shell to tools of our manufacture -- MR tools -- or to PIG/JAQL/Cascading.