Posted to commits@hbase.apache.org by st...@apache.org on 2011/03/31 20:12:04 UTC
svn commit: r1087395 - in /hbase/trunk: CHANGES.txt src/docbkx/book.xml
Author: stack
Date: Thu Mar 31 18:12:04 2011
New Revision: 1087395
URL: http://svn.apache.org/viewvc?rev=1087395&view=rev
Log:
HBASE-3720 Book.xml - porting conceptual-view / physical-view sections of HBaseArchitecture wiki
Modified:
hbase/trunk/CHANGES.txt
hbase/trunk/src/docbkx/book.xml
Modified: hbase/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/hbase/trunk/CHANGES.txt?rev=1087395&r1=1087394&r2=1087395&view=diff
==============================================================================
--- hbase/trunk/CHANGES.txt (original)
+++ hbase/trunk/CHANGES.txt Thu Mar 31 18:12:04 2011
@@ -110,6 +110,8 @@ Release 0.91.0 - Unreleased
calls from HBase (Liyin Tang via Stack)
HBASE-3717 deprecate HTable isTableEnabled() methods in favor of
HBaseAdmin methods (David Butler via Stack)
+ HBASE-3720 Book.xml - porting conceptual-view / physical-view sections of
+ HBaseArchitecture wiki (Doug Meil via Stack)
TASK
HBASE-3559 Move report of split to master OFF the heartbeat channel
Modified: hbase/trunk/src/docbkx/book.xml
URL: http://svn.apache.org/viewvc/hbase/trunk/src/docbkx/book.xml?rev=1087395&r1=1087394&r2=1087395&view=diff
==============================================================================
--- hbase/trunk/src/docbkx/book.xml (original)
+++ hbase/trunk/src/docbkx/book.xml Thu Mar 31 18:12:04 2011
@@ -302,8 +302,8 @@ public static byte[][] getHexSplits(Stri
<title>Data Model</title>
<para>In short, applications store data into HBase <link linkend="table">tables</link>.
Tables are made of <link linkend="row">rows</link> and <emphasis>columns</emphasis>.
- All colums in HBase belong to a particular
- <link linkend="columnfamily">Column Family</link>.
+ All columns in HBase belong to a particular
+ <link linkend="columnfamily">column family</link>.
Table <link linkend="cell">cells</link> -- the intersection of row and column
coordinates -- are versioned.
A cell's content is an uninterpreted array of bytes.
@@ -315,6 +315,99 @@ public static byte[][] getHexSplits(Stri
via the table row key -- its primary key.
</para>
+ <section xml:id="conceptual.view"><title>Conceptual View</title>
+ <para>
+ The following example is a slightly modified form of the one on page
+ 2 of the <link xlink:href="http://labs.google.com/papers/bigtable.html">BigTable</link> paper.
+ There is a table called <varname>webtable</varname> that contains two column families named
+ <varname>contents</varname> and <varname>anchor</varname>.
+ In this example, <varname>anchor</varname> contains two
+ columns (<varname>anchor:cnnsi.com</varname>, <varname>anchor:my.look.ca</varname>)
+ and <varname>contents</varname> contains one column (<varname>contents:html</varname>).
+ <note>
+ <title>Column Names</title>
+ <para>
+ By convention, a column name is made of its column family prefix and a
+ <emphasis>qualifier</emphasis>. For example, the
+ column
+ <emphasis>contents:html</emphasis> is of the column family <varname>contents</varname>.
+ The colon character (<literal
+ moreinfo="none">:</literal>) delimits the column family from the
+ column family <emphasis>qualifier</emphasis>.
+ </para>
+ </note>
+ <table frame='all'><title>Table <varname>webtable</varname></title>
+ <tgroup cols='4' align='left' colsep='1' rowsep='1'>
+ <colspec colname='c1'/>
+ <colspec colname='c2'/>
+ <colspec colname='c3'/>
+ <colspec colname='c4'/>
+ <thead>
+ <row><entry>Row Key</entry><entry>Time Stamp</entry><entry>ColumnFamily <varname>contents</varname></entry><entry>ColumnFamily <varname>anchor</varname></entry></row>
+ </thead>
+ <tbody>
+ <row><entry>"com.cnn.www"</entry><entry>t9</entry><entry></entry><entry><varname>anchor:cnnsi.com</varname> = "CNN"</entry></row>
+ <row><entry>"com.cnn.www"</entry><entry>t8</entry><entry></entry><entry><varname>anchor:my.look.ca</varname> = "CNN.com"</entry></row>
+ <row><entry>"com.cnn.www"</entry><entry>t6</entry><entry><varname>contents:html</varname> = "&lt;html&gt;..."</entry><entry></entry></row>
+ <row><entry>"com.cnn.www"</entry><entry>t5</entry><entry><varname>contents:html</varname> = "&lt;html&gt;..."</entry><entry></entry></row>
+ <row><entry>"com.cnn.www"</entry><entry>t3</entry><entry><varname>contents:html</varname> = "&lt;html&gt;..."</entry><entry></entry></row>
+ </tbody>
+ </tgroup>
+ </table>
+ </para>
+ </section>
+ <section xml:id="physical.view"><title>Physical View</title>
+ <para>
+ Although at a conceptual level tables may be viewed as a sparse set of rows,
+ physically they are stored on a per-column family basis. New columns
+ (i.e., <varname>columnfamily:column</varname>) can be added to any
+ column family without pre-announcing them.
+ <table frame='all'><title>ColumnFamily <varname>anchor</varname></title>
+ <tgroup cols='3' align='left' colsep='1' rowsep='1'>
+ <colspec colname='c1'/>
+ <colspec colname='c2'/>
+ <colspec colname='c3'/>
+ <thead>
+ <row><entry>Row Key</entry><entry>Time Stamp</entry><entry>Column Family <varname>anchor</varname></entry></row>
+ </thead>
+ <tbody>
+ <row><entry>"com.cnn.www"</entry><entry>t9</entry><entry><varname>anchor:cnnsi.com</varname> = "CNN"</entry></row>
+ <row><entry>"com.cnn.www"</entry><entry>t8</entry><entry><varname>anchor:my.look.ca</varname> = "CNN.com"</entry></row>
+ </tbody>
+ </tgroup>
+ </table>
+ <table frame='all'><title>ColumnFamily <varname>contents</varname></title>
+ <tgroup cols='3' align='left' colsep='1' rowsep='1'>
+ <colspec colname='c1'/>
+ <colspec colname='c2'/>
+ <colspec colname='c3'/>
+ <thead>
+ <row><entry>Row Key</entry><entry>Time Stamp</entry><entry>ColumnFamily <varname>contents</varname></entry></row>
+ </thead>
+ <tbody>
+ <row><entry>"com.cnn.www"</entry><entry>t6</entry><entry><varname>contents:html</varname> = "&lt;html&gt;..."</entry></row>
+ <row><entry>"com.cnn.www"</entry><entry>t5</entry><entry><varname>contents:html</varname> = "&lt;html&gt;..."</entry></row>
+ <row><entry>"com.cnn.www"</entry><entry>t3</entry><entry><varname>contents:html</varname> = "&lt;html&gt;..."</entry></row>
+ </tbody>
+ </tgroup>
+ </table>
+ Note that the empty cells shown in the conceptual view are not stored in the
+ tables above, since a column-oriented storage format has no need to store
+ them. Thus a request for the value of the <varname>contents:html</varname>
+ column at time stamp <literal>t8</literal> would return no value. Similarly, a
+ request for an <varname>anchor:my.look.ca</varname> value at time stamp
+ <literal>t9</literal> would return no value. However, if no time stamp is
+ supplied, the most recent value for a particular column is returned;
+ it is also the first one found, since time stamps are stored in
+ descending order. Thus a request for the values of all columns in the row
+ <varname>com.cnn.www</varname> with no time stamp specified would return
+ the value of <varname>contents:html</varname> from time stamp
+ <literal>t6</literal>, the value of <varname>anchor:cnnsi.com</varname>
+ from time stamp <literal>t9</literal>, and the value of
+ <varname>anchor:my.look.ca</varname> from time stamp <literal>t8</literal>.
+ </para>
+ </section>
+
<section xml:id="table">
<title>Table</title>
<para>
@@ -334,7 +427,7 @@ public static byte[][] getHexSplits(Stri
<title>Column Family<indexterm><primary>Column Family</primary></indexterm></title>
<para>
Columns in HBase are grouped into <emphasis>column families</emphasis>.
- All column members of a column family have a common prefix. For example, the
+ All column members of a column family have the same prefix. For example, the
columns <emphasis>courses:history</emphasis> and
<emphasis>courses:math</emphasis> are both members of the
<emphasis>courses</emphasis> column family.