Posted to commits@accumulo.apache.org by bi...@apache.org on 2014/03/17 22:10:12 UTC
svn commit: r1578577 [2/3] - in /accumulo/site/trunk/content: 1.4/
1.4/examples/ 1.4/user_manual/ 1.5/ 1.5/examples/
Modified: accumulo/site/trunk/content/1.5/accumulo_user_manual.html
URL: http://svn.apache.org/viewvc/accumulo/site/trunk/content/1.5/accumulo_user_manual.html?rev=1578577&r1=1578576&r2=1578577&view=diff
==============================================================================
--- accumulo/site/trunk/content/1.5/accumulo_user_manual.html (original)
+++ accumulo/site/trunk/content/1.5/accumulo_user_manual.html Mon Mar 17 21:10:11 2014
@@ -1,6665 +1,3528 @@
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
-
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
-
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
-
<head>
-
<meta http-equiv="Content-Type" content="application/xhtml+xml; charset=UTF-8" />
-
<meta name="generator" content="AsciiDoc 8.6.8" />
-
<title>Apache Accumulo User Manual Version 1.5</title>
-
<style type="text/css">
-
/* Shared CSS for AsciiDoc xhtml11 and html5 backends */
-
-
/* Default font. */
-
body {
-
font-family: Georgia,serif;
-
}
-
-
/* Title font. */
-
h1, h2, h3, h4, h5, h6,
-
div.title, caption.title,
-
thead, p.table.header,
-
#toctitle,
-
#author, #revnumber, #revdate, #revremark,
-
#footer {
-
font-family: Arial,Helvetica,sans-serif;
-
}
-
-
body {
-
margin: 1em 5% 1em 5%;
-
}
-
-
a {
-
color: blue;
-
text-decoration: underline;
-
}
-
a:visited {
-
color: fuchsia;
-
}
-
-
em {
-
font-style: italic;
-
color: navy;
-
}
-
-
strong {
-
font-weight: bold;
-
color: #083194;
-
}
-
-
h1, h2, h3, h4, h5, h6 {
-
color: #527bbd;
-
margin-top: 1.2em;
-
margin-bottom: 0.5em;
-
line-height: 1.3;
-
}
-
-
h1, h2, h3 {
-
border-bottom: 2px solid silver;
-
}
-
h2 {
-
padding-top: 0.5em;
-
}
-
h3 {
-
float: left;
-
}
-
h3 + * {
-
clear: left;
-
}
-
h5 {
-
font-size: 1.0em;
-
}
-
-
div.sectionbody {
-
margin-left: 0;
-
}
-
-
hr {
-
border: 1px solid silver;
-
}
-
-
p {
-
margin-top: 0.5em;
-
margin-bottom: 0.5em;
-
}
-
-
ul, ol, li > p {
-
margin-top: 0;
-
}
-
ul > li { color: #aaa; }
-
ul > li > * { color: black; }
-
-
pre {
-
padding: 0;
-
margin: 0;
-
}
-
-
#author {
-
color: #527bbd;
-
font-weight: bold;
-
font-size: 1.1em;
-
}
-
#email {
-
}
-
#revnumber, #revdate, #revremark {
-
}
-
-
#footer {
-
font-size: small;
-
border-top: 2px solid silver;
-
padding-top: 0.5em;
-
margin-top: 4.0em;
-
}
-
#footer-text {
-
float: left;
-
padding-bottom: 0.5em;
-
}
-
#footer-badges {
-
float: right;
-
padding-bottom: 0.5em;
-
}
-
-
#preamble {
-
margin-top: 1.5em;
-
margin-bottom: 1.5em;
-
}
-
div.imageblock, div.exampleblock, div.verseblock,
-
div.quoteblock, div.literalblock, div.listingblock, div.sidebarblock,
-
div.admonitionblock {
-
margin-top: 1.0em;
-
margin-bottom: 1.5em;
-
}
-
div.admonitionblock {
-
margin-top: 2.0em;
-
margin-bottom: 2.0em;
-
margin-right: 10%;
-
color: #606060;
-
}
-
-
div.content { /* Block element content. */
-
padding: 0;
-
}
-
-
/* Block element titles. */
-
div.title, caption.title {
-
color: #527bbd;
-
font-weight: bold;
-
text-align: left;
-
margin-top: 1.0em;
-
margin-bottom: 0.5em;
-
}
-
div.title + * {
-
margin-top: 0;
-
}
-
-
td div.title:first-child {
-
margin-top: 0.0em;
-
}
-
div.content div.title:first-child {
-
margin-top: 0.0em;
-
}
-
div.content + div.title {
-
margin-top: 0.0em;
-
}
-
-
div.sidebarblock > div.content {
-
background: #ffffee;
-
border: 1px solid #dddddd;
-
border-left: 4px solid #f0f0f0;
-
padding: 0.5em;
-
}
-
-
div.listingblock > div.content {
-
border: 1px solid #dddddd;
-
border-left: 5px solid #f0f0f0;
-
background: #f8f8f8;
-
padding: 0.5em;
-
}
-
-
div.quoteblock, div.verseblock {
-
padding-left: 1.0em;
-
margin-left: 1.0em;
-
margin-right: 10%;
-
border-left: 5px solid #f0f0f0;
-
color: #777777;
-
}
-
-
div.quoteblock > div.attribution {
-
padding-top: 0.5em;
-
text-align: right;
-
}
-
-
div.verseblock > pre.content {
-
font-family: inherit;
-
font-size: inherit;
-
}
-
div.verseblock > div.attribution {
-
padding-top: 0.75em;
-
text-align: left;
-
}
-
/* DEPRECATED: Pre version 8.2.7 verse style literal block. */
-
div.verseblock + div.attribution {
-
text-align: left;
-
}
-
-
div.admonitionblock .icon {
-
vertical-align: top;
-
font-size: 1.1em;
-
font-weight: bold;
-
text-decoration: underline;
-
color: #527bbd;
-
padding-right: 0.5em;
-
}
-
div.admonitionblock td.content {
-
padding-left: 0.5em;
-
border-left: 3px solid #dddddd;
-
}
-
-
div.exampleblock > div.content {
-
border-left: 3px solid #dddddd;
-
padding-left: 0.5em;
-
}
-
-
div.imageblock div.content { padding-left: 0; }
-
span.image img { border-style: none; }
-
a.image:visited { color: white; }
-
-
dl {
-
margin-top: 0.8em;
-
margin-bottom: 0.8em;
-
}
-
dt {
-
margin-top: 0.5em;
-
margin-bottom: 0;
-
font-style: normal;
-
color: navy;
-
}
-
dd > *:first-child {
-
margin-top: 0.1em;
-
}
-
-
ul, ol {
-
list-style-position: outside;
-
}
-
ol.arabic {
-
list-style-type: decimal;
-
}
-
ol.loweralpha {
-
list-style-type: lower-alpha;
-
}
-
ol.upperalpha {
-
list-style-type: upper-alpha;
-
}
-
ol.lowerroman {
-
list-style-type: lower-roman;
-
}
-
ol.upperroman {
-
list-style-type: upper-roman;
-
}
-
-
div.compact ul, div.compact ol,
-
div.compact p, div.compact p,
-
div.compact div, div.compact div {
-
margin-top: 0.1em;
-
margin-bottom: 0.1em;
-
}
-
-
tfoot {
-
font-weight: bold;
-
}
-
td > div.verse {
-
white-space: pre;
-
}
-
-
div.hdlist {
-
margin-top: 0.8em;
-
margin-bottom: 0.8em;
-
}
-
div.hdlist tr {
-
padding-bottom: 15px;
-
}
-
dt.hdlist1.strong, td.hdlist1.strong {
-
font-weight: bold;
-
}
-
td.hdlist1 {
-
vertical-align: top;
-
font-style: normal;
-
padding-right: 0.8em;
-
color: navy;
-
}
-
td.hdlist2 {
-
vertical-align: top;
-
}
-
div.hdlist.compact tr {
-
margin: 0;
-
padding-bottom: 0;
-
}
-
-
.comment {
-
background: yellow;
-
}
-
-
.footnote, .footnoteref {
-
font-size: 0.8em;
-
}
-
-
span.footnote, span.footnoteref {
-
vertical-align: super;
-
}
-
-
#footnotes {
-
margin: 20px 0 20px 0;
-
padding: 7px 0 0 0;
-
}
-
-
#footnotes div.footnote {
-
margin: 0 0 5px 0;
-
}
-
-
#footnotes hr {
-
border: none;
-
border-top: 1px solid silver;
-
height: 1px;
-
text-align: left;
-
margin-left: 0;
-
width: 20%;
-
min-width: 100px;
-
}
-
-
div.colist td {
-
padding-right: 0.5em;
-
padding-bottom: 0.3em;
-
vertical-align: top;
-
}
-
div.colist td img {
-
margin-top: 0.3em;
-
}
-
-
@media print {
-
#footer-badges { display: none; }
-
}
-
-
#toc {
-
margin-bottom: 2.5em;
-
}
-
-
#toctitle {
-
color: #527bbd;
-
font-size: 1.1em;
-
font-weight: bold;
-
margin-top: 1.0em;
-
margin-bottom: 0.1em;
-
}
-
-
div.toclevel1, div.toclevel2, div.toclevel3, div.toclevel4 {
-
margin-top: 0;
-
margin-bottom: 0;
-
}
-
div.toclevel2 {
-
margin-left: 2em;
-
font-size: 0.9em;
-
}
-
div.toclevel3 {
-
margin-left: 4em;
-
font-size: 0.9em;
-
}
-
div.toclevel4 {
-
margin-left: 6em;
-
font-size: 0.9em;
-
}
-
-
span.aqua { color: aqua; }
-
span.black { color: black; }
-
span.blue { color: blue; }
-
span.fuchsia { color: fuchsia; }
-
span.gray { color: gray; }
-
span.green { color: green; }
-
span.lime { color: lime; }
-
span.maroon { color: maroon; }
-
span.navy { color: navy; }
-
span.olive { color: olive; }
-
span.purple { color: purple; }
-
span.red { color: red; }
-
span.silver { color: silver; }
-
span.teal { color: teal; }
-
span.white { color: white; }
-
span.yellow { color: yellow; }
-
-
span.aqua-background { background: aqua; }
-
span.black-background { background: black; }
-
span.blue-background { background: blue; }
-
span.fuchsia-background { background: fuchsia; }
-
span.gray-background { background: gray; }
-
span.green-background { background: green; }
-
span.lime-background { background: lime; }
-
span.maroon-background { background: maroon; }
-
span.navy-background { background: navy; }
-
span.olive-background { background: olive; }
-
span.purple-background { background: purple; }
-
span.red-background { background: red; }
-
span.silver-background { background: silver; }
-
span.teal-background { background: teal; }
-
span.white-background { background: white; }
-
span.yellow-background { background: yellow; }
-
-
span.big { font-size: 2em; }
-
span.small { font-size: 0.6em; }
-
-
span.underline { text-decoration: underline; }
-
span.overline { text-decoration: overline; }
-
span.line-through { text-decoration: line-through; }
-
-
-
/*
-
* xhtml11 specific
-
*
-
* */
-
-
tt {
-
font-family: monospace;
-
font-size: inherit;
-
color: navy;
-
}
-
-
div.tableblock {
-
margin-top: 1.0em;
-
margin-bottom: 1.5em;
-
}
-
div.tableblock > table {
-
border: 3px solid #527bbd;
-
}
-
thead, p.table.header {
-
font-weight: bold;
-
color: #527bbd;
-
}
-
p.table {
-
margin-top: 0;
-
}
-
/* Because the table frame attribute is overridden by CSS in most browsers. */
-
div.tableblock > table[frame="void"] {
-
border-style: none;
-
}
-
div.tableblock > table[frame="hsides"] {
-
border-left-style: none;
-
border-right-style: none;
-
}
-
div.tableblock > table[frame="vsides"] {
-
border-top-style: none;
-
border-bottom-style: none;
-
}
-
-
-
/*
-
* html5 specific
-
*
-
* */
-
-
.monospaced {
-
font-family: monospace;
-
font-size: inherit;
-
color: navy;
-
}
-
-
table.tableblock {
-
margin-top: 1.0em;
-
margin-bottom: 1.5em;
-
}
-
thead, p.tableblock.header {
-
font-weight: bold;
-
color: #527bbd;
-
}
-
p.tableblock {
-
margin-top: 0;
-
}
-
table.tableblock {
-
border-width: 3px;
-
border-spacing: 0px;
-
border-style: solid;
-
border-color: #527bbd;
-
border-collapse: collapse;
-
}
-
th.tableblock, td.tableblock {
-
border-width: 1px;
-
padding: 4px;
-
border-style: solid;
-
border-color: #527bbd;
-
}
-
-
table.tableblock.frame-topbot {
-
border-left-style: hidden;
-
border-right-style: hidden;
-
}
-
table.tableblock.frame-sides {
-
border-top-style: hidden;
-
border-bottom-style: hidden;
-
}
-
table.tableblock.frame-none {
-
border-style: hidden;
-
}
-
-
th.tableblock.halign-left, td.tableblock.halign-left {
-
text-align: left;
-
}
-
th.tableblock.halign-center, td.tableblock.halign-center {
-
text-align: center;
-
}
-
th.tableblock.halign-right, td.tableblock.halign-right {
-
text-align: right;
-
}
-
-
th.tableblock.valign-top, td.tableblock.valign-top {
-
vertical-align: top;
-
}
-
th.tableblock.valign-middle, td.tableblock.valign-middle {
-
vertical-align: middle;
-
}
-
th.tableblock.valign-bottom, td.tableblock.valign-bottom {
-
vertical-align: bottom;
-
}
-
-
-
/*
-
* manpage specific
-
*
-
* */
-
-
body.manpage h1 {
-
padding-top: 0.5em;
-
padding-bottom: 0.5em;
-
border-top: 2px solid silver;
-
border-bottom: 2px solid silver;
-
}
-
body.manpage h2 {
-
border-style: none;
-
}
-
body.manpage div.sectionbody {
-
margin-left: 3em;
-
}
-
-
@media print {
-
body.manpage div#toc { display: none; }
-
}
-
-
-
/*
-
* Theme specific overrides of the preceding (asciidoc.css) CSS.
-
*
-
*/
-
body {
-
font-family: Garamond, Georgia, serif;
-
font-size: 17px;
-
color: #3E4349;
-
line-height: 1.3em;
-
}
-
h1, h2, h3, h4, h5, h6,
-
div.title, caption.title,
-
thead, p.table.header,
-
#toctitle,
-
#author, #revnumber, #revdate, #revremark,
-
#footer {
-
font-family: Garamond, Georgia, serif;
-
font-weight: normal;
-
border-bottom-width: 0;
-
color: #3E4349;
-
}
-
div.title, caption.title { color: #596673; font-weight: bold; }
-
h1 { font-size: 240%; }
-
h2 { font-size: 180%; }
-
h3 { font-size: 150%; }
-
h4 { font-size: 130%; }
-
h5 { font-size: 115%; }
-
h6 { font-size: 100%; }
-
#header h1 { margin-top: 0; }
-
#toc {
-
color: #444444;
-
line-height: 1.5;
-
padding-top: 1.5em;
-
}
-
#toctitle {
-
font-size: 20px;
-
}
-
#toc a {
-
border-bottom: 1px dotted #999999;
-
color: #444444 !important;
-
text-decoration: none !important;
-
}
-
#toc a:hover {
-
border-bottom: 1px solid #6D4100;
-
color: #6D4100 !important;
-
text-decoration: none !important;
-
}
-
div.toclevel1 { margin-top: 0.2em; font-size: 16px; }
-
div.toclevel2 { margin-top: 0.15em; font-size: 14px; }
-
em, dt, td.hdlist1 { color: black; }
-
strong { color: #3E4349; }
-
a { color: #004B6B; text-decoration: none; border-bottom: 1px dotted #004B6B; }
-
a:visited { color: #615FA0; border-bottom: 1px dotted #615FA0; }
-
a:hover { color: #6D4100; border-bottom: 1px solid #6D4100; }
-
div.tableblock > table, table.tableblock { border: 3px solid #E8E8E8; }
-
th.tableblock, td.tableblock { border: 1px solid #E8E8E8; }
-
ul > li > * { color: #3E4349; }
-
pre, tt, .monospaced { font-family: Consolas,Menlo,'Deja Vu Sans Mono','Bitstream Vera Sans Mono',monospace; }
-
tt, .monospaced { font-size: 0.9em; color: black;
-
}
-
div.exampleblock > div.content, div.sidebarblock > div.content, div.listingblock > div.content { border-width: 0 0 0 3px; border-color: #E8E8E8; }
-
div.verseblock { border-left-width: 0; margin-left: 3em; }
-
div.quoteblock { border-left-width: 3px; margin-left: 0; margin-right: 0;}
-
div.admonitionblock td.content { border-left: 3px solid #E8E8E8; }
-
-
-
@media screen {
-
body {
-
max-width: 50em; /* approximately 80 characters wide */
-
margin-left: 16em;
-
}
-
-
#toc {
-
position: fixed;
-
top: 0;
-
left: 0;
-
bottom: 0;
-
width: 13em;
-
padding: 0.5em;
-
padding-bottom: 1.5em;
-
margin: 0;
-
overflow: auto;
-
border-right: 3px solid #f8f8f8;
-
background-color: white;
-
}
-
-
#toc .toclevel1 {
-
margin-top: 0.5em;
-
}
-
-
#toc .toclevel2 {
-
margin-top: 0.25em;
-
display: list-item;
-
color: #aaaaaa;
-
}
-
-
#toctitle {
-
margin-top: 0.5em;
-
}
-
}
-
</style>
-
<script type="text/javascript">
-
/*<![CDATA[*/
-
var asciidoc = { // Namespace.
-
-
/////////////////////////////////////////////////////////////////////
-
// Table Of Contents generator
-
/////////////////////////////////////////////////////////////////////
-
-
/* Author: Mihai Bazon, September 2002
-
* http://students.infoiasi.ro/~mishoo
-
*
-
* Table Of Content generator
-
* Version: 0.4
-
*
-
* Feel free to use this script under the terms of the GNU General Public
-
* License, as long as you do not remove or alter this notice.
-
*/
-
-
/* modified by Troy D. Hanson, September 2006. License: GPL */
-
/* modified by Stuart Rackham, 2006, 2009. License: GPL */
-
-
// toclevels = 1..4.
-
toc: function (toclevels) {
-
-
function getText(el) {
-
var text = "";
-
for (var i = el.firstChild; i != null; i = i.nextSibling) {
-
if (i.nodeType == 3 /* Node.TEXT_NODE */) // IE doesn't speak constants.
-
text += i.data;
-
else if (i.firstChild != null)
-
text += getText(i);
-
}
-
return text;
-
}
-
-
function TocEntry(el, text, toclevel) {
-
this.element = el;
-
this.text = text;
-
this.toclevel = toclevel;
-
}
-
-
function tocEntries(el, toclevels) {
-
var result = new Array;
-
var re = new RegExp('[hH]([1-'+(toclevels+1)+'])');
-
// Function that scans the DOM tree for header elements (the DOM2
-
// nodeIterator API would be a better technique but not supported by all
-
// browsers).
-
var iterate = function (el) {
-
for (var i = el.firstChild; i != null; i = i.nextSibling) {
-
if (i.nodeType == 1 /* Node.ELEMENT_NODE */) {
-
var mo = re.exec(i.tagName);
-
if (mo && (i.getAttribute("class") || i.getAttribute("className")) != "float") {
-
result[result.length] = new TocEntry(i, getText(i), mo[1]-1);
-
}
-
iterate(i);
-
}
-
}
-
}
-
iterate(el);
-
return result;
-
}
-
-
var toc = document.getElementById("toc");
-
if (!toc) {
-
return;
-
}
-
-
// Delete existing TOC entries in case we're reloading the TOC.
-
var tocEntriesToRemove = [];
-
var i;
-
for (i = 0; i < toc.childNodes.length; i++) {
-
var entry = toc.childNodes[i];
-
if (entry.nodeName.toLowerCase() == 'div'
-
&& entry.getAttribute("class")
-
&& entry.getAttribute("class").match(/^toclevel/))
-
tocEntriesToRemove.push(entry);
-
}
-
for (i = 0; i < tocEntriesToRemove.length; i++) {
-
toc.removeChild(tocEntriesToRemove[i]);
-
}
-
-
// Rebuild TOC entries.
-
var entries = tocEntries(document.getElementById("content"), toclevels);
-
for (var i = 0; i < entries.length; ++i) {
-
var entry = entries[i];
-
if (entry.element.id == "")
-
entry.element.id = "_toc_" + i;
-
var a = document.createElement("a");
-
a.href = "#" + entry.element.id;
-
a.appendChild(document.createTextNode(entry.text));
-
var div = document.createElement("div");
-
div.appendChild(a);
-
div.className = "toclevel" + entry.toclevel;
-
toc.appendChild(div);
-
}
-
if (entries.length == 0)
-
toc.parentNode.removeChild(toc);
-
},
-
-
-
/////////////////////////////////////////////////////////////////////
-
// Footnotes generator
-
/////////////////////////////////////////////////////////////////////
-
-
/* Based on footnote generation code from:
-
* http://www.brandspankingnew.net/archive/2005/07/format_footnote.html
-
*/
-
-
footnotes: function () {
-
// Delete existing footnote entries in case we're reloading the footnotes.
-
var i;
-
var noteholder = document.getElementById("footnotes");
-
if (!noteholder) {
-
return;
-
}
-
var entriesToRemove = [];
-
for (i = 0; i < noteholder.childNodes.length; i++) {
-
var entry = noteholder.childNodes[i];
-
if (entry.nodeName.toLowerCase() == 'div' && entry.getAttribute("class") == "footnote")
-
entriesToRemove.push(entry);
-
}
-
for (i = 0; i < entriesToRemove.length; i++) {
-
noteholder.removeChild(entriesToRemove[i]);
-
}
-
-
// Rebuild footnote entries.
-
var cont = document.getElementById("content");
-
var spans = cont.getElementsByTagName("span");
-
var refs = {};
-
var n = 0;
-
for (i=0; i<spans.length; i++) {
-
if (spans[i].className == "footnote") {
-
n++;
-
var note = spans[i].getAttribute("data-note");
-
if (!note) {
-
// Use [\s\S] in place of . so multi-line matches work.
-
// Because JavaScript has no s (dotall) regex flag.
-
note = spans[i].innerHTML.match(/\s*\[([\s\S]*)]\s*/)[1];
-
spans[i].innerHTML =
-
"[<a id='_footnoteref_" + n + "' href='#_footnote_" + n +
-
"' title='View footnote' class='footnote'>" + n + "</a>]";
-
spans[i].setAttribute("data-note", note);
-
}
-
noteholder.innerHTML +=
-
"<div class='footnote' id='_footnote_" + n + "'>" +
-
"<a href='#_footnoteref_" + n + "' title='Return to text'>" +
-
n + "</a>. " + note + "</div>";
-
var id =spans[i].getAttribute("id");
-
if (id != null) refs["#"+id] = n;
-
}
-
}
-
if (n == 0)
-
noteholder.parentNode.removeChild(noteholder);
-
else {
-
// Process footnoterefs.
-
for (i=0; i<spans.length; i++) {
-
if (spans[i].className == "footnoteref") {
-
var href = spans[i].getElementsByTagName("a")[0].getAttribute("href");
-
href = href.match(/#.*/)[0]; // Because IE return full URL.
-
n = refs[href];
-
spans[i].innerHTML =
-
"[<a href='#_footnote_" + n +
-
"' title='View footnote' class='footnote'>" + n + "</a>]";
-
}
-
}
-
}
-
},
-
-
install: function(toclevels) {
-
var timerId;
-
-
function reinstall() {
-
asciidoc.footnotes();
-
if (toclevels) {
-
asciidoc.toc(toclevels);
-
}
-
}
-
-
function reinstallAndRemoveTimer() {
-
clearInterval(timerId);
-
reinstall();
-
}
-
-
timerId = setInterval(reinstall, 500);
-
if (document.addEventListener)
-
document.addEventListener("DOMContentLoaded", reinstallAndRemoveTimer, false);
-
else
-
window.onload = reinstallAndRemoveTimer;
-
}
-
-
}
-
asciidoc.install(4);
-
/*]]>*/
-
</script>
-
</head>
-
<body class="book" style="max-width:50em">
-
<div id="header">
-
<h1>Apache Accumulo User Manual Version 1.5</h1>
-
<span id="author">Apache Accumulo Project</span><br />
-
<span id="email"><code>&lt;<a href="mailto:dev@accumulo.apache.org">dev@accumulo.apache.org</a>&gt;</code></span><br />
-
<div id="toc">
<div id="toctitle">Apache Accumulo 1.5</div>
<noscript><p><b>JavaScript must be enabled in your browser to display the table of contents.</b></p></noscript>
</div>
-
</div>
-
<div id="content">
-
<div id="preamble">
-
<div class="sectionbody">
-
<div class="imageblock">
-
<div class="content">
-
<img src="images/accumulo-logo.png" alt="images/accumulo-logo.png" />
-
</div>
-
</div>
-
<div class="paragraph"><p>Copyright © 2011-2013 The Apache Software Foundation, Licensed under the Apache
-
License, Version 2.0. Apache Accumulo, Accumulo, Apache, and the Apache
-
Accumulo project logo are trademarks of the Apache Software Foundation.</p></div>
-
</div>
-
</div>
-
<div class="sect1">
-
<h2 id="_introduction">1. Introduction</h2>
-
<div class="sectionbody">
-
<div class="paragraph"><p>Apache Accumulo is a highly scalable structured store based on Google’s BigTable.
-
Accumulo is written in Java and operates over the Hadoop Distributed File System
-
(HDFS), which is part of the popular Apache Hadoop project. Accumulo supports
-
efficient storage and retrieval of structured data, including queries for ranges, and
-
provides support for using Accumulo tables as input and output for MapReduce
-
jobs.</p></div>
-
<div class="paragraph"><p>Accumulo features automatic load-balancing and partitioning, data compression
-
and fine-grained security labels.</p></div>
-
</div>
-
</div>
-
<div class="sect1">
-
<h2 id="_accumulo_design">2. Accumulo Design</h2>
-
<div class="sectionbody">
-
<div class="sect2">
-
<h3 id="_data_model">2.1. Data Model</h3>
-
<div class="paragraph"><p>Accumulo provides a richer data model than simple key-value stores, but is not a
-
fully relational database. Data is represented as key-value pairs, where the key and
-
value are composed of the following elements:</p></div>
-
<div class="tableblock">
-
<table rules="all"
-
width="75%"
-
frame="border"
-
cellspacing="0" cellpadding="4">
-
<col width="16%" />
-
<col width="16%" />
-
<col width="16%" />
-
<col width="16%" />
-
<col width="16%" />
-
<col width="16%" />
-
<tbody>
-
<tr>
-
<td colspan="5" align="center" valign="top"><p class="table">Key</p></td>
-
<td rowspan="3" align="center" valign="middle"><p class="table">Value</p></td>
-
</tr>
-
<tr>
-
<td rowspan="2" align="center" valign="middle"><p class="table">Row ID</p></td>
-
<td colspan="3" align="center" valign="top"><p class="table">Column</p></td>
-
<td rowspan="2" align="center" valign="middle"><p class="table">Timestamp</p></td>
-
</tr>
-
<tr>
-
<td align="center" valign="top"><p class="table">Family</p></td>
-
<td align="center" valign="top"><p class="table">Qualifier</p></td>
-
<td align="center" valign="top"><p class="table">Visibility</p></td>
-
</tr>
-
</tbody>
-
</table>
-
</div>
-
<div class="paragraph"><p>All elements of the Key and the Value are represented as byte arrays except for
-
Timestamp, which is a Long. Accumulo sorts keys by element and lexicographically
-
in ascending order. Timestamps are sorted in descending order so that later
-
versions of the same Key appear first in a sequential scan. Tables consist of a set of
-
sorted key-value pairs.</p></div>
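<div class="paragraph"><p>The sort order described above can be illustrated with a minimal Python sketch. It is not Accumulo code: the <code>Key</code> tuple and <code>sort_key</code> helper are hypothetical names, and the comparison is simplified to the five key elements listed in the table, with byte fields ascending and the timestamp descending.</p></div>

```python
from typing import NamedTuple


class Key(NamedTuple):
    """Hypothetical stand-in for the key elements shown in the table above."""
    row: bytes
    family: bytes
    qualifier: bytes
    visibility: bytes
    timestamp: int


def sort_key(k: Key):
    # Byte fields compare lexicographically in ascending order;
    # negating the timestamp makes newer versions sort first.
    return (k.row, k.family, k.qualifier, k.visibility, -k.timestamp)


entries = [
    (Key(b"row2", b"cf", b"cq", b"", 5), b"c"),
    (Key(b"row1", b"cf", b"cq", b"", 1), b"old"),
    (Key(b"row1", b"cf", b"cq", b"", 9), b"new"),
]

# A table is modeled here as the sorted list of key-value pairs.
table = sorted(entries, key=lambda e: sort_key(e[0]))
values_in_scan_order = [value for _, value in table]
```

<div class="paragraph"><p>In this sketch a sequential scan meets the newest version of <code>row1</code> before the older one, and both before <code>row2</code>.</p></div>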
-
</div>
-
<div class="sect2">
-
<h3 id="_architecture">2.2. Architecture</h3>
-
<div class="paragraph"><p>Accumulo is a distributed data storage and retrieval system and as such consists of
-
several architectural components, some of which run on many individual servers.
-
Much of the work Accumulo does involves maintaining certain properties of the
-
data, such as organization, availability, and integrity, across many commodity-class
-
machines.</p></div>
-
</div>
-
<div class="sect2">
-
<h3 id="_components">2.3. Components</h3>
-
<div class="paragraph"><p>An instance of Accumulo includes many TabletServers, one Garbage Collector process,
-
one Master server and many Clients.</p></div>
-
<div class="sect3">
-
<h4 id="_tablet_server">2.3.1. Tablet Server</h4>
-
-<div class="paragraph"><p>The TabletServer manages some subset of all the tablets (partitions of tables).
-
-This includes receiving writes from clients, persisting writes to a
-
+<div class="paragraph"><p>The TabletServer manages some subset of all the tablets (partitions of tables). This includes receiving writes from clients, persisting writes to a
write-ahead log, sorting new key-value pairs in memory, periodically
-
flushing sorted key-value pairs to new files in HDFS, and responding
-
to reads from clients, forming a merge-sorted view of all keys and
-
values from all the files it has created and the sorted in-memory
-
store.</p></div>
-
<div class="paragraph"><p>TabletServers also perform recovery of a tablet
-
that was previously on a server that failed, reapplying any writes
-
found in the write-ahead log to the tablet.</p></div>
-
</div>
-
<div class="sect3">
-
<h4 id="_garbage_collector">2.3.2. Garbage Collector</h4>
-
<div class="paragraph"><p>Accumulo processes will share files stored in HDFS. Periodically, the Garbage
-
Collector will identify files that are no longer needed by any process, and
-
delete them.</p></div>
-
</div>
-
<div class="sect3">
-
<h4 id="_master">2.3.3. Master</h4>
-
<div class="paragraph"><p>The Accumulo Master is responsible for detecting and responding to TabletServer
-
failure. It tries to balance the load across TabletServers by assigning tablets carefully
-
and instructing TabletServers to unload tablets when necessary. The Master ensures all
-
tablets are assigned to one TabletServer each, and handles table creation, alteration,
-
and deletion requests from clients. The Master also coordinates startup, graceful
-
shutdown and recovery of changes in write-ahead logs when Tablet servers fail.</p></div>
-
<div class="paragraph"><p>Multiple masters may be run. The masters choose a single active master among themselves,
-
and the others act as backups that take over if the active master fails.</p></div>
-
</div>
-
<div class="sect3">
-
<h4 id="_client">2.3.4. Client</h4>
-
<div class="paragraph"><p>Accumulo includes a client library that is linked to every application. The client
-
library contains logic for finding servers managing a particular tablet, and
-
communicating with TabletServers to write and retrieve key-value pairs.</p></div>
-
</div>
-
</div>
-
<div class="sect2">
-
<h3 id="_data_management">2.4. Data Management</h3>
-
<div class="paragraph"><p>Accumulo stores data in tables, which are partitioned into tablets. Tablets are
-
partitioned on row boundaries so that all of the columns and values for a particular
-
row are found together within the same tablet. The Master assigns Tablets to one
-
TabletServer at a time. This enables row-level transactions to take place without
-
using distributed locking or some other complicated synchronization mechanism. As
-
clients insert and query data, and as machines are added and removed from the
-
cluster, the Master migrates tablets to ensure they remain available and that the
-
ingest and query load is balanced across the cluster.</p></div>
-
<div class="imageblock">
-
<div class="content">
-
<img src="images/data_distribution.png" alt="images/data_distribution.png" width="500" />
-
</div>
-
</div>
-
</div>
-
<div class="sect2">
-
<h3 id="_tablet_service">2.5. Tablet Service</h3>
-
<div class="paragraph"><p>When a write arrives at a TabletServer it is written to a Write-Ahead Log and
-
then inserted into a sorted data structure in memory called a MemTable. When the
-
MemTable reaches a certain size the TabletServer writes out the sorted key-value
-
pairs to a file in HDFS called an Indexed Sequential Access Method (ISAM)
-
file. This process is called a minor compaction. A new MemTable is then created
-
and the fact of the compaction is recorded in the Write-Ahead Log.</p></div>
-
<div class="paragraph"><p>When a request to read data arrives at a TabletServer, the TabletServer does a
-
binary search across the MemTable as well as the in-memory indexes associated
-
with each ISAM file to find the relevant values. If clients are performing a
-
scan, several key-value pairs are returned to the client in order from the
-
MemTable and the set of ISAM files by performing a merge-sort as they are read.</p></div>
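<div class="paragraph"><p>The write and scan paths described above can be sketched roughly as follows. This is a toy model under stated assumptions, not the TabletServer implementation: <code>TabletSketch</code>, <code>minor_compact</code> and <code>flush_threshold</code> are hypothetical names, the threshold counts entries rather than bytes, files are plain in-memory lists instead of HDFS files, and the scan re-sorts the MemTable instead of using binary search over indexes.</p></div>

```python
import heapq


class TabletSketch:
    """Toy model of a tablet: write-ahead log, MemTable, and sorted files."""

    def __init__(self, flush_threshold=3):
        self.wal = []        # append-only record of writes and compaction events
        self.memtable = {}   # in-memory store, sorted when flushed or scanned
        self.files = []      # each "file" is an immutable sorted list of pairs
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        # Log first, then apply to memory, as the text describes.
        self.wal.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.minor_compact()

    def minor_compact(self):
        # Write the sorted MemTable out as a new file and start a fresh MemTable.
        self.files.append(sorted(self.memtable.items()))
        self.memtable = {}
        self.wal.append(("minor_compaction", len(self.files)))

    def scan(self):
        # Merge the sorted files and the sorted MemTable into one ordered stream.
        sources = self.files + [sorted(self.memtable.items())]
        return list(heapq.merge(*sources))


tablet = TabletSketch(flush_threshold=3)
for k, v in [("b", "1"), ("a", "2"), ("c", "3"), ("aa", "4")]:
    tablet.write(k, v)
scanned_keys = [k for k, _ in tablet.scan()]
```

<div class="paragraph"><p>The third write triggers a minor compaction, so the scan merges one flushed file with the entry still held in memory. Versioning of repeated keys is deliberately left out of this sketch.</p></div>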
-
</div>
-
<div class="sect2">
-
<h3 id="_compactions">2.6. Compactions</h3>
-
<div class="paragraph"><p>In order to manage the number of files per tablet, periodically the TabletServer
-
performs Major Compactions of files within a tablet, in which some set of ISAM
-
files are combined into one file. The previous files will eventually be removed
-
by the Garbage Collector. This also provides an opportunity to permanently
-
remove deleted key-value pairs by omitting key-value pairs suppressed by a
-
delete entry when the new file is created.</p></div>
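<div class="paragraph"><p>As an illustration of the idea, the following Python sketch combines several sorted files into one and drops entries suppressed by a delete. It assumes files are given oldest to newest and models a delete entry with a hypothetical <code>DELETE</code> marker; <code>major_compact</code> is an illustrative name, not an Accumulo function.</p></div>

```python
DELETE = object()  # hypothetical marker standing in for a delete entry


def major_compact(files):
    """Combine sorted files (oldest first) into a single sorted file.

    Later files override earlier ones; keys whose most recent entry is a
    delete marker are omitted from the output entirely.
    """
    latest = {}
    for f in files:
        for key, value in f:
            latest[key] = value
    return sorted((k, v) for k, v in latest.items() if v is not DELETE)


old_file = [("a", "1"), ("b", "2")]
new_file = [("b", DELETE), ("c", "3")]
compacted = major_compact([old_file, new_file])
```

<div class="paragraph"><p>Here the entry for <code>b</code> is removed for good, because the newer file carries a delete for it. The input files themselves are left untouched, mirroring the text's point that old files are removed later by the Garbage Collector.</p></div>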
-
</div>
-
<div class="sect2">
-
<h3 id="_splitting">2.7. Splitting</h3>
-
<div class="paragraph"><p>When a table is created it has one tablet. As the table grows its initial
-
tablet eventually splits into two tablets. It’s likely that one of these
-
tablets will migrate to another tablet server. As the table continues to grow,
-
its tablets will continue to split and be migrated. The decision to
-
automatically split a tablet is based on the size of a tablet’s files. The
-
size threshold at which a tablet splits is configurable per table. In addition
-
to automatic splitting, a user can manually add split points to a table to
-
create new tablets. Manually splitting a new table can parallelize reads and
-
writes giving better initial performance without waiting for automatic
-
splitting.</p></div>
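<div class="paragraph"><p>A size-based split decision might look roughly like the sketch below. It assumes a tablet is a sorted list of <code>(row, size_in_bytes)</code> pairs and cuts near the midpoint by size, always on a row boundary; how Accumulo actually chooses the split row is not covered by this section, so <code>split_if_needed</code> and <code>threshold_bytes</code> are purely illustrative.</p></div>

```python
def split_if_needed(tablet, threshold_bytes):
    """Return [tablet] unchanged, or two tablets cut on a row boundary.

    tablet is a sorted list of (row, size_in_bytes) pairs.
    """
    total = sum(size for _, size in tablet)
    if total <= threshold_bytes or len(tablet) < 2:
        return [tablet]
    running = 0
    for i, (_, size) in enumerate(tablet):
        running += size
        if running >= total / 2:
            # Keep at least one row on each side of the cut.
            cut = min(i + 1, len(tablet) - 1)
            return [tablet[:cut], tablet[cut:]]
    return [tablet]


rows = [("a", 10), ("b", 10), ("c", 10), ("d", 10)]
split_result = split_if_needed(rows, threshold_bytes=25)
unsplit_result = split_if_needed(rows, threshold_bytes=100)
```

<div class="paragraph"><p>Because the cut always falls between rows, all entries of a row stay in the same tablet.</p></div>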
-
<div class="paragraph"><p>As data is deleted from a table, tablets may shrink. Over time this can lead
-
to small or empty tablets. To deal with this, merging of tablets was
-
introduced in Accumulo 1.4. This is discussed in more detail later.</p></div>
-
</div>
-
<div class="sect2">
-
<h3 id="_fault_tolerance">2.8. Fault-Tolerance</h3>
-
<div class="paragraph"><p>If a TabletServer fails, the Master detects it and automatically reassigns the tablets
-
previously hosted by the failed server to other servers. Any key-value pairs that were in
-
memory at the time the TabletServer failed are automatically reapplied from the Write-Ahead
-
-Log to prevent any loss of data.</p></div>
-
-<div class="paragraph"><p>The Master will coordinate the copying of write-ahead logs to HDFS so the logs
-
-are available to all tablet servers. To make recovery efficient, the updates
-
-within a log are grouped by tablet. TabletServers can quickly apply the
-
-mutations from the sorted logs that are destined for the tablets they have now
-
-been assigned.</p></div>
-
+Log (WAL) to prevent any loss of data.</p></div>
+<div class="paragraph"><p>Tablet servers write their WALs directly to HDFS so the logs are available to all tablet
+servers for recovery. To make the recovery process efficient, the updates within a log are
+grouped by tablet. TabletServers can quickly apply the mutations from the sorted logs
+that are destined for the tablets they have now been assigned.</p></div>
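<div class="paragraph"><p>The recovery step can be pictured with a short Python sketch. It assumes each log entry is a <code>(tablet_id, key, value)</code> tuple, which is a simplification of the real log format, and the function names <code>group_log_by_tablet</code> and <code>recover_tablet</code> are hypothetical.</p></div>

```python
from collections import defaultdict


def group_log_by_tablet(log):
    """Group write-ahead log entries by tablet, preserving write order."""
    grouped = defaultdict(list)
    for tablet_id, key, value in log:
        grouped[tablet_id].append((key, value))
    return grouped


def recover_tablet(tablet_id, grouped):
    """Replay only the mutations destined for one reassigned tablet."""
    memtable = {}
    for key, value in grouped.get(tablet_id, []):
        memtable[key] = value  # later writes win, as in the original order
    return memtable


log = [
    ("t1", "a", "1"),
    ("t2", "x", "9"),
    ("t1", "a", "2"),
    ("t1", "b", "3"),
]
grouped = group_log_by_tablet(log)
recovered_t1 = recover_tablet("t1", grouped)
```

<div class="paragraph"><p>Grouping first means a server that takes over <code>t1</code> reads only the entries for <code>t1</code> and never has to walk through writes for other tablets.</p></div>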
<div class="paragraph"><p>TabletServer failures are noted on the Master’s monitor page, accessible via
-
<code>http://master-address:50095/monitor</code>.</p></div>
-
<div class="imageblock">
-
<div class="content">
-
<img src="images/failure_handling.png" alt="images/failure_handling.png" width="500" />
-
</div>
-
</div>
-
</div>
-
</div>
-
</div>
-
<div class="sect1">
-
<h2 id="_accumulo_shell">3. Accumulo Shell</h2>
-
<div class="sectionbody">
-
<div class="paragraph"><p>Accumulo provides a simple shell that can be used to examine the contents and
-
configuration settings of tables, insert/update/delete values, and change
-
configuration settings.</p></div>
-
<div class="paragraph"><p>The shell can be started by the following command:</p></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>$ACCUMULO_HOME/bin/accumulo shell -u [username]</code></pre>
-
</div></div>
-
<div class="paragraph"><p>The shell will prompt for the corresponding password to the username specified
-
and then display the following prompt:</p></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>Shell - Apache Accumulo Interactive Shell
-
-
-
- version 1.5
-
- instance name: myinstance
-
- instance id: 00000000-0000-0000-0000-000000000000
-
-
-
- type 'help' for a list of available commands
-
-</code></pre>
-
</div></div>
-
<div class="sect2">
-
<h3 id="_basic_administration">3.1. Basic Administration</h3>
-
<div class="paragraph"><p>The Accumulo shell can be used to create and delete tables, as well as to configure
-
table and instance specific options.</p></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>root@myinstance> tables
-
!METADATA</code></pre>
-
</div></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>root@myinstance> createtable mytable</code></pre>
-
</div></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>root@myinstance mytable></code></pre>
-
</div></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>root@myinstance mytable> tables
-
!METADATA
-
mytable</code></pre>
-
</div></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>root@myinstance mytable> createtable testtable</code></pre>
-
</div></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>root@myinstance testtable></code></pre>
-
</div></div>
-
<div class="literalblock">
-
<div class="content">
-
-<pre><code>root@myinstance testtable> deletetable testtable</code></pre>
-
+<pre><code>root@myinstance testtable> deletetable testtable
+deletetable { testtable } (yes|no)? yes
+Table: [testtable] has been deleted.</code></pre>
</div></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>root@myinstance></code></pre>
-
</div></div>
-
<div class="paragraph"><p>The Shell can also be used to insert updates and scan tables. This is useful for
-
inspecting tables.</p></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>root@myinstance mytable> scan</code></pre>
-
</div></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>root@myinstance mytable> insert row1 colf colq value1
-
insert successful</code></pre>
-
</div></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>root@myinstance mytable> scan
-
row1 colf:colq [] value1</code></pre>
-
</div></div>
-
<div class="paragraph"><p>The value in brackets “[]” would be the visibility labels. Since none were used, this is empty for this row.
-
You can use the “-st” option to scan to see the timestamp for the cell, too.</p></div>
-
</div>
-
<div class="sect2">
-
<h3 id="_table_maintenance">3.2. Table Maintenance</h3>
-
<div class="paragraph"><p>The <strong>compact</strong> command instructs Accumulo to schedule a compaction of the table during which
-
files are consolidated and deleted entries are removed.</p></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>root@myinstance mytable> compact -t mytable
-
-07 16:13:53,201 [shell.Shell] INFO : Compaction of table mytable
-
-scheduled for 20100707161353EDT</code></pre>
-
+07 16:13:53,201 [shell.Shell] INFO : Compaction of table mytable started for given range</code></pre>
</div></div>
-
<div class="paragraph"><p>The <strong>flush</strong> command instructs Accumulo to write all entries currently in memory for a given table
-
to disk.</p></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>root@myinstance mytable> flush -t mytable
-
07 16:14:19,351 [shell.Shell] INFO : Flush of table mytable
-
initiated...</code></pre>
-
</div></div>
-
</div>
-
<div class="sect2">
-
<h3 id="_user_administration">3.3. User Administration</h3>
-
<div class="paragraph"><p>The Shell can be used to add, remove, and grant privileges to users.</p></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>root@myinstance mytable> createuser bob
-
Enter new password for 'bob': *********
-
Please confirm new password for 'bob': *********</code></pre>
-
</div></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>root@myinstance mytable> authenticate bob
-
Enter current password for 'bob': *********
-
Valid</code></pre>
-
</div></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>root@myinstance mytable> grant System.CREATE_TABLE -s -u bob</code></pre>
-
</div></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>root@myinstance mytable> user bob
-
Enter current password for 'bob': *********</code></pre>
-
</div></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>bob@myinstance mytable> userpermissions
-
System permissions: System.CREATE_TABLE
-
Table permissions (!METADATA): Table.READ
-
Table permissions (mytable): NONE</code></pre>
-
</div></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>bob@myinstance mytable> createtable bobstable
-
bob@myinstance bobstable></code></pre>
-
</div></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>bob@myinstance bobstable> user root
-
Enter current password for 'root': *********</code></pre>
-
</div></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>root@myinstance bobstable> revoke System.CREATE_TABLE -s -u bob</code></pre>
-
</div></div>
-
</div>
-
</div>
-
</div>
-
<div class="sect1">
-
<h2 id="_writing_accumulo_clients">4. Writing Accumulo Clients</h2>
-
<div class="sectionbody">
-
<div class="sect2">
-
<h3 id="_running_client_code">4.1. Running Client Code</h3>
-
<div class="paragraph"><p>There are multiple ways to run Java code that uses Accumulo. Below is a list
-
of the different ways to execute client code.</p></div>
-
<div class="ulist"><ul>
-
<li>
-
<p>
-
using java executable
-
</p>
-
</li>
-
<li>
-
<p>
-
using the accumulo script
-
</p>
-
</li>
-
<li>
-
<p>
-
using the tool script
-
</p>
-
</li>
-
</ul></div>
-
<div class="paragraph"><p>In order to run client code written to run against Accumulo, you will need to
-
include the jars that Accumulo depends on in your classpath. Accumulo client
-
-code depends on Hadoop and Zookeeper. For Hadoop add the hadoop core jar, all
-
+code depends on Hadoop and Zookeeper. For Hadoop add the hadoop client jar, all
of the jars in the Hadoop lib directory, and the conf directory to the
-
classpath. For Zookeeper 3.3 you only need to add the Zookeeper jar, and not
-
what is in the Zookeeper lib directory. You can run the following command on a
-
configured Accumulo system to see what it is using for its classpath.</p></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>$ACCUMULO_HOME/bin/accumulo classpath</code></pre>
-
</div></div>
-
<div class="paragraph"><p>Another option for running your code is to put a jar file in
-
-<code>$ACCUMULO_HOME/lib/ext</code>. After doing this you can use the accumulo
-
+<code>$ACCUMULO_HOME/lib/ext</code>. After doing this you can use the accumulo
script to execute your code. For example if you create a jar containing the
-
class com.foo.Client and placed that in <code>lib/ext</code>, then you could use the command
-
<code>$ACCUMULO_HOME/bin/accumulo com.foo.Client</code> to execute your code.</p></div>
-
<div class="paragraph"><p>If you are writing a map reduce job that accesses Accumulo, then you can use the
-
bin/tool.sh script to run those jobs. See the map reduce example.</p></div>
-
</div>
-
<div class="sect2">
-
<h3 id="_connecting">4.2. Connecting</h3>
-
<div class="paragraph"><p>All clients must first identify the Accumulo instance to which they will be
-
communicating. Code to do this is as follows:</p></div>
-
<div class="listingblock">
-
-<div class="content"><!-- Generator: GNU source-highlight 3.1.4
-
+<div class="content"><!-- Generator: GNU source-highlight 3.1.7
by Lorenzo Bettini
-
http://www.lorenzobettini.it
-
http://www.gnu.org/software/src-highlite -->
-
<pre><tt><span style="color: #000000">String</span> instanceName = <span style="color: #4c73a6">"myinstance"</span>;
-
<span style="color: #000000">String</span> zooServers = <span style="color: #4c73a6">"zooserver-one,zooserver-two"</span>;
-
<span style="color: #000000">Instance</span> inst = <span style="color: #0000b3">new</span> <span style="font-weight: bold"><span style="color: #000000">ZooKeeperInstance</span></span>(instanceName, zooServers);
-
-<span style="color: #000000">Connector</span> conn = inst.<span style="font-weight: bold"><span style="color: #000000">getConnector</span></span>(<span style="color: #4c73a6">"user"</span>, <span style="color: #4c73a6">"passwd"</span>);</tt></pre></div></div>
-
+<span style="color: #000000">Connector</span> conn = inst.<span style="font-weight: bold"><span style="color: #000000">getConnector</span></span>(<span style="color: #4c73a6">"user"</span>, <span style="color: #0000b3">new</span> <span style="font-weight: bold"><span style="color: #000000">PasswordToken</span></span>(<span style="color: #4c73a6">"passwd"</span>));</tt></pre></div></div>
</div>
-
<div class="sect2">
-
<h3 id="_writing_data">4.3. Writing Data</h3>
-
<div class="paragraph"><p>Data are written to Accumulo by creating Mutation objects that represent all the
-
changes to the columns of a single row. The changes are made atomically in the
-
TabletServer. Clients then add Mutations to a BatchWriter which submits them to
-
the appropriate TabletServers.</p></div>
-
<div class="paragraph"><p>Mutations can be created thus:</p></div>
-
<div class="listingblock">
-
-<div class="content"><!-- Generator: GNU source-highlight 3.1.4
-
+<div class="content"><!-- Generator: GNU source-highlight 3.1.7
by Lorenzo Bettini
-
http://www.lorenzobettini.it
-
http://www.gnu.org/software/src-highlite -->
-
<pre><tt><span style="color: #000000">Text</span> rowID = <span style="color: #0000b3">new</span> <span style="font-weight: bold"><span style="color: #000000">Text</span></span>(<span style="color: #4c73a6">"row1"</span>);
-
<span style="color: #000000">Text</span> colFam = <span style="color: #0000b3">new</span> <span style="font-weight: bold"><span style="color: #000000">Text</span></span>(<span style="color: #4c73a6">"myColFam"</span>);
-
<span style="color: #000000">Text</span> colQual = <span style="color: #0000b3">new</span> <span style="font-weight: bold"><span style="color: #000000">Text</span></span>(<span style="color: #4c73a6">"myColQual"</span>);
-
<span style="color: #000000">ColumnVisibility</span> colVis = <span style="color: #0000b3">new</span> <span style="font-weight: bold"><span style="color: #000000">ColumnVisibility</span></span>(<span style="color: #4c73a6">"public"</span>);
-
<span style="color: #000000">long</span> timestamp = System.<span style="font-weight: bold"><span style="color: #000000">currentTimeMillis</span></span>();
-
-
<span style="color: #000000">Value</span> value = <span style="color: #0000b3">new</span> <span style="font-weight: bold"><span style="color: #000000">Value</span></span>(<span style="color: #4c73a6">"myValue"</span>.<span style="font-weight: bold"><span style="color: #000000">getBytes</span></span>());
-
-
<span style="color: #000000">Mutation</span> mutation = <span style="color: #0000b3">new</span> <span style="font-weight: bold"><span style="color: #000000">Mutation</span></span>(rowID);
-
mutation.<span style="font-weight: bold"><span style="color: #000000">put</span></span>(colFam, colQual, colVis, timestamp, value);</tt></pre></div></div>
-
<div class="sect3">
-
<h4 id="_batchwriter">4.3.1. BatchWriter</h4>
-
<div class="paragraph"><p>The BatchWriter is highly optimized to send Mutations to multiple TabletServers
-
and automatically batches Mutations destined for the same TabletServer to
-
amortize network overhead. Care must be taken to avoid changing the contents of
-
any Object passed to the BatchWriter since it keeps objects in memory while
-
batching.</p></div>
-
<div class="paragraph"><p>Mutations are added to a BatchWriter thus:</p></div>
-
<div class="listingblock">
-
-<div class="content"><!-- Generator: GNU source-highlight 3.1.4
-
+<div class="content"><!-- Generator: GNU source-highlight 3.1.7
by Lorenzo Bettini
-
http://www.lorenzobettini.it
-
http://www.gnu.org/software/src-highlite -->
+<pre><tt><span style="font-style: italic"><span style="color: #b30000">//BatchWriterConfig has reasonable defaults</span></span>
+<span style="color: #000000">BatchWriterConfig</span> config = <span style="color: #0000b3">new</span> <span style="font-weight: bold"><span style="color: #000000">BatchWriterConfig</span></span>();
+config.<span style="font-weight: bold"><span style="color: #000000">setMaxMemory</span></span>(<span style="color: #000000">10000000L</span>); <span style="font-style: italic"><span style="color: #b30000">// bytes available to batchwriter for buffering mutations</span></span>
-<pre><tt><span style="color: #000000">long</span> memBuf = <span style="color: #000000">1000000L</span>; <span style="font-style: italic"><span style="color: #b30000">// bytes to store before sending a batch</span></span>
-
-<span style="color: #000000">long</span> timeout = <span style="color: #000000">1000L</span>; <span style="font-style: italic"><span style="color: #b30000">// milliseconds to wait before sending</span></span>
-
-<span style="color: #000000">int</span> numThreads = <span style="color: #000000">10</span>;
-
-
-
-<span style="color: #000000">BatchWriter</span> writer =
-
- conn.<span style="font-weight: bold"><span style="color: #000000">createBatchWriter</span></span>(<span style="color: #4c73a6">"table"</span>, memBuf, timeout, numThreads)
-
-
-
-writer.<span style="font-weight: bold"><span style="color: #000000">add</span></span>(mutation);
-
+<span style="color: #000000">BatchWriter</span> writer = conn.<span style="font-weight: bold"><span style="color: #000000">createBatchWriter</span></span>(<span style="color: #4c73a6">"table"</span>, config);
+writer.<span style="font-weight: bold"><span style="color: #000000">add</span></span>(mutation);
writer.<span style="font-weight: bold"><span style="color: #000000">close</span></span>();</tt></pre></div></div>
-
<div class="paragraph"><p>An example of using the batch writer can be found at
-
<code>accumulo/docs/examples/README.batch</code></p></div>
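<div class="paragraph"><p>The warning above about not changing objects passed to the BatchWriter can be
illustrated without any Accumulo classes. The following is a minimal sketch, not the BatchWriter
implementation: <code>ToyBuffer</code> is a hypothetical stand-in for any writer that keeps references
in memory until it sends them, and <code>StringBuilder</code> stands in for a mutable object such as a
reused Mutation or Text.</p></div>

```java
import java.util.ArrayList;
import java.util.List;

public class BufferedReferenceDemo {
    // Hypothetical stand-in for a writer that queues objects and sends them later.
    static class ToyBuffer {
        private final List<StringBuilder> pending = new ArrayList<>();

        // Stores the reference itself, not a copy of its contents.
        void add(StringBuilder item) {
            pending.add(item);
        }

        // "Sends" everything queued so far by reading the current contents.
        List<String> flush() {
            List<String> sent = new ArrayList<>();
            for (StringBuilder item : pending) {
                sent.add(item.toString());
            }
            pending.clear();
            return sent;
        }
    }

    // Reuses one mutable object for two adds: the first queued value is overwritten.
    static List<String> reuseAfterAdd() {
        ToyBuffer buffer = new ToyBuffer();
        StringBuilder row = new StringBuilder("row1");
        buffer.add(row);
        row.setLength(0);
        row.append("row2");
        buffer.add(row);
        return buffer.flush();
    }

    // Creates a fresh object for each add: both values survive until the flush.
    static List<String> freshObjectPerAdd() {
        ToyBuffer buffer = new ToyBuffer();
        buffer.add(new StringBuilder("row1"));
        buffer.add(new StringBuilder("row2"));
        return buffer.flush();
    }

    public static void main(String[] args) {
        System.out.println(reuseAfterAdd());     // prints [row2, row2]
        System.out.println(freshObjectPerAdd()); // prints [row1, row2]
    }
}
```

<div class="paragraph"><p>The same reasoning suggests creating a new object for each add rather than
recycling one across calls, at least until the writer has been flushed or closed.</p></div>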
-
</div>
-
</div>
-
<div class="sect2">
-
<h3 id="_reading_data">4.4. Reading Data</h3>
-
<div class="paragraph"><p>Accumulo is optimized to quickly retrieve the value associated with a given key, and
-
to efficiently return ranges of consecutive keys and their associated values.</p></div>
-
<div class="sect3">
-
<h4 id="_scanner">4.4.1. Scanner</h4>
-
<div class="paragraph"><p>To retrieve data, Clients use a Scanner, which acts like an Iterator over
-
keys and values. Scanners can be configured to start and stop at particular keys, and
-
to return a subset of the columns available.</p></div>
-
<div class="listingblock">
-
-<div class="content"><!-- Generator: GNU source-highlight 3.1.4
-
+<div class="content"><!-- Generator: GNU source-highlight 3.1.7
by Lorenzo Bettini
-
http://www.lorenzobettini.it
-
http://www.gnu.org/software/src-highlite -->
-
<pre><tt><span style="font-style: italic"><span style="color: #b30000">// specify which visibilities we are allowed to see</span></span>
-
<span style="color: #000000">Authorizations</span> auths = <span style="color: #0000b3">new</span> <span style="font-weight: bold"><span style="color: #000000">Authorizations</span></span>(<span style="color: #4c73a6">"public"</span>);
-
-
<span style="color: #000000">Scanner</span> scan =
-
conn.<span style="font-weight: bold"><span style="color: #000000">createScanner</span></span>(<span style="color: #4c73a6">"table"</span>, auths);
-
-
scan.<span style="font-weight: bold"><span style="color: #000000">setRange</span></span>(<span style="color: #0000b3">new</span> <span style="font-weight: bold"><span style="color: #000000">Range</span></span>(<span style="color: #4c73a6">"harry"</span>,<span style="color: #4c73a6">"john"</span>));
-
scan.<span style="font-weight: bold"><span style="color: #000000">fetchFamily</span></span>(<span style="color: #4c73a6">"attributes"</span>);
-
-
<span style="color: #0000b3">for</span>(<span style="color: #000000">Entry<Key,Value></span> entry : scan) {
-
<span style="color: #000000">Text</span> row = entry.<span style="font-weight: bold"><span style="color: #000000">getKey</span></span>().<span style="font-weight: bold"><span style="color: #000000">getRow</span></span>();
-
<span style="color: #000000">Value</span> value = entry.<span style="font-weight: bold"><span style="color: #000000">getValue</span></span>();
-
}</tt></pre></div></div>
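<div class="paragraph"><p>As a rough analogy for what the range in the example above selects, the sketch
below uses a sorted in-memory map instead of an Accumulo table. It assumes that a Range built from two
row IDs includes both end rows; <code>RowRangeSketch</code> and <code>rowsBetween</code> are hypothetical
names, and Java string ordering only matches Accumulo's byte ordering for plain ASCII row IDs.</p></div>

```java
import java.util.NavigableMap;
import java.util.TreeMap;

public class RowRangeSketch {
    // Returns the rows from startRow to endRow, both inclusive (an assumption
    // about how the two-argument Range of row IDs behaves).
    static NavigableMap<String, String> rowsBetween(
            NavigableMap<String, String> table, String startRow, String endRow) {
        return table.subMap(startRow, true, endRow, true);
    }

    public static void main(String[] args) {
        // A sorted map standing in for a table keyed by row ID.
        NavigableMap<String, String> table = new TreeMap<>();
        table.put("george", "v0");
        table.put("harry", "v1");
        table.put("ian", "v2");
        table.put("john", "v3");
        table.put("johnny", "v4");

        // "johnny" sorts after "john", so it falls outside the range.
        System.out.println(rowsBetween(table, "harry", "john").keySet()); // prints [harry, ian, john]
    }
}
```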
-
</div>
-
<div class="sect3">
-
<h4 id="_isolated_scanner">4.4.2. Isolated Scanner</h4>
-
<div class="paragraph"><p>Accumulo supports the ability to present an isolated view of rows when
-
-scanning. There are three possible ways that a row could change in accumulo:</p></div>
-
+scanning. There are three possible ways that a row could change in Accumulo:</p></div>
<div class="ulist"><ul>
-
<li>
-
<p>
-
a mutation applied to a table
-
</p>
-
</li>
-
<li>
-
<p>
-
iterators executed as part of a minor or major compaction
-
</p>
-
</li>
-
<li>
-
<p>
-
bulk import of new files
-
</p>
-
</li>
-
</ul></div>
-
<div class="paragraph"><p>Isolation guarantees that either all or none of the changes made by these
-
operations on a row are seen. Use the IsolatedScanner to obtain an isolated
-
-view of an accumulo table. When using the regular scanner it is possible to see
-
+view of an Accumulo table. When using the regular scanner it is possible to see
a non-isolated view of a row. For example, if a mutation modifies three
-
columns, it is possible that you will only see two of those modifications.
-
With the isolated scanner either all three of the changes are seen or none.</p></div>
-
<div class="paragraph"><p>The IsolatedScanner buffers rows on the client side so a large row will not
-
crash a tablet server. By default rows are buffered in memory, but the user
-
can easily supply their own buffer if they wish to buffer to disk when rows are
-
large.</p></div>
-
<div class="paragraph"><p>For an example, look at the following
-
<code>examples/simple/src/main/java/org/apache/accumulo/examples/simple/isolation/InterferenceTest.java</code></p></div>
-
</div>
-
<div class="sect3">
-
<h4 id="_batchscanner">4.4.3. BatchScanner</h4>
-
<div class="paragraph"><p>For some types of access, it is more efficient to retrieve several ranges
-
simultaneously. This arises when accessing a set of rows that are not consecutive
-
whose IDs have been retrieved from a secondary index, for example.</p></div>
-
<div class="paragraph"><p>The BatchScanner is configured similarly to the Scanner; it can be configured to
-
retrieve a subset of the columns available, but rather than passing a single Range,
-
BatchScanners accept a set of Ranges. It is important to note that the keys returned
-
by a BatchScanner are not in sorted order since the keys streamed are from multiple
-
TabletServers in parallel.</p></div>
-
<div class="listingblock">
-
-<div class="content"><!-- Generator: GNU source-highlight 3.1.4
-
+<div class="content"><!-- Generator: GNU source-highlight 3.1.7
by Lorenzo Bettini
-
http://www.lorenzobettini.it
-
http://www.gnu.org/software/src-highlite -->
-
<pre><tt><span style="color: #000000">ArrayList<Range></span> ranges = <span style="color: #0000b3">new</span> ArrayList<Range>();
-
<span style="font-style: italic"><span style="color: #b30000">// populate list of ranges ...</span></span>
-
-
<span style="color: #000000">BatchScanner</span> bscan =
-
conn.<span style="font-weight: bold"><span style="color: #000000">createBatchScanner</span></span>(<span style="color: #4c73a6">"table"</span>, auths, <span style="color: #000000">10</span>);
-
-
bscan.<span style="font-weight: bold"><span style="color: #000000">setRanges</span></span>(ranges);
-
bscan.<span style="font-weight: bold"><span style="color: #000000">fetchFamily</span></span>(<span style="color: #4c73a6">"attributes"</span>);
-
-
<span style="color: #0000b3">for</span>(<span style="color: #000000">Entry<Key,Value></span> entry : bscan)
-
System.out.<span style="font-weight: bold"><span style="color: #000000">println</span></span>(entry.<span style="font-weight: bold"><span style="color: #000000">getValue</span></span>());</tt></pre></div></div>
-
<div class="paragraph"><p>An example of the BatchScanner can be found at
-
<code>accumulo/docs/examples/README.batch</code></p></div>
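<div class="paragraph"><p>Because a BatchScanner returns keys in no particular order, a client that needs
sorted output has to sort after the fact. The following is a minimal sketch of one way to do that,
using plain strings in place of Key and Value; <code>SortBatchResults</code> and <code>sortByKey</code>
are hypothetical names. It buffers every entry in memory, so it is only reasonable for result sets
that fit on the client.</p></div>

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SortBatchResults {
    // Collects entries that arrive in arbitrary order into a map sorted by key.
    static TreeMap<String, String> sortByKey(Iterable<Map.Entry<String, String>> unordered) {
        TreeMap<String, String> sorted = new TreeMap<>();
        for (Map.Entry<String, String> entry : unordered) {
            sorted.put(entry.getKey(), entry.getValue());
        }
        return sorted;
    }

    public static void main(String[] args) {
        // Simulated arrival order from several servers answering in parallel.
        List<Map.Entry<String, String>> arrived = new ArrayList<>();
        arrived.add(new SimpleEntry<String, String>("row3", "c"));
        arrived.add(new SimpleEntry<String, String>("row1", "a"));
        arrived.add(new SimpleEntry<String, String>("row2", "b"));

        System.out.println(sortByKey(arrived).keySet()); // prints [row1, row2, row3]
    }
}
```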
-
</div>
-
</div>
-
<div class="sect2">
-
<h3 id="_proxy">4.5. Proxy</h3>
-
<div class="paragraph"><p>The proxy API allows interaction with Accumulo from languages other than Java.
-
A proxy server is provided in the codebase and a client can further be generated.</p></div>
-
<div class="sect3">
-
<h4 id="_prequisites">4.5.1. Prerequisites</h4>
-
<div class="paragraph"><p>The proxy server can live on any node in which the basic client API would work. That
-
means it must be able to communicate with the Master, ZooKeepers, NameNode, and the
-
-Data nodes. A proxy client only needs the ability to communicate with the proxy server.</p></div>
-
+DataNodes. A proxy client only needs the ability to communicate with the proxy server.</p></div>
</div>
-
<div class="sect3">
-
<h4 id="_configuration">4.5.2. Configuration</h4>
-
<div class="paragraph"><p>The configuration options for the proxy server live inside of a properties file. At
-
the very least, you need to supply the following properties:</p></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>protocolFactory=org.apache.thrift.protocol.TCompactProtocol$Factory
-
tokenClass=org.apache.accumulo.core.client.security.tokens.PasswordToken
-
port=42424
-
instance=test
-
zookeepers=localhost:2181</code></pre>
-
</div></div>
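<div class="paragraph"><p>Before starting the proxy, it can be handy to confirm that a properties file
contains the five keys listed above. The following is a minimal sketch using only
<code>java.util.Properties</code>; <code>ProxyConfigCheck</code>, <code>parse</code> and
<code>missingKeys</code> are hypothetical helpers and not part of Accumulo, and the check is limited
to the presence of each key, not the validity of its value.</p></div>

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class ProxyConfigCheck {
    // The keys the manual lists as the minimum proxy configuration.
    static final String[] REQUIRED = {
        "protocolFactory", "tokenClass", "port", "instance", "zookeepers"
    };

    // Parses properties text; a real check would read the file given to the proxy.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Returns the required keys that are absent or blank, in the order listed above.
    static List<String> missingKeys(Properties props) {
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String sample =
            "protocolFactory=org.apache.thrift.protocol.TCompactProtocol$Factory\n"
          + "tokenClass=org.apache.accumulo.core.client.security.tokens.PasswordToken\n"
          + "port=42424\n"
          + "instance=test\n"
          + "zookeepers=localhost:2181\n";
        System.out.println("missing: " + missingKeys(parse(sample))); // prints missing: []
    }
}
```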
-
<div class="paragraph"><p>You can find a sample configuration file in your distribution:</p></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>$ACCUMULO_HOME/proxy/proxy.properties</code></pre>
-
</div></div>
-
-<div class="paragraph"><p>This sample configuration file further demonstrates an abilty to back the proxy server
-
+<div class="paragraph"><p>This sample configuration file further demonstrates an ability to back the proxy server
by MockAccumulo or the MiniAccumuloCluster.</p></div>
-
</div>
-
<div class="sect3">
-
<h4 id="_running_the_proxy_server">4.5.3. Running the Proxy Server</h4>
-
<div class="paragraph"><p>After the properties file holding the configuration is created, the proxy server
-
can be started using the following command in the Accumulo distribution (assuming
-
-you your properties file is named config.properties):</p></div>
-
+your properties file is named <code>config.properties</code>):</p></div>
<div class="literalblock">
-
<div class="content">
-
<pre><code>$ACCUMULO_HOME/bin/accumulo proxy -p config.properties</code></pre>
-
</div></div>
-
</div>
-
<div class="sect3">
-
<h4 id="_creating_a_proxy_client">4.5.4. Creating a Proxy Client</h4>
-
<div class="paragraph"><p>Aside from installing the Thrift compiler, you will also need the language-specific library
-
for Thrift installed to generate client code in that language. Typically, your operating
-
system’s package manager will be able to automatically install these for you in an expected
-
-location such as /usr/lib/python/site-packages/thrift.</p></div>
-
+location such as <code>/usr/lib/python/site-packages/thrift</code>.</p></div>
<div class="paragraph"><p>You can find the thrift file for generating the client:</p></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>$ACCUMULO_HOME/proxy/proxy.thrift</code></pre>
-
</div></div>
-
<div class="paragraph"><p>After a client is generated, the port specified in the configuration properties above will be
-
used to connect to the server.</p></div>
-
</div>
-
<div class="sect3">
-
<h4 id="_using_a_proxy_client">4.5.5. Using a Proxy Client</h4>
-
<div class="paragraph"><p>The following examples have been written in Java and the method signatures may be
-
slightly different depending on the language specified when generating a client with
-
the Thrift compiler. After initiating a connection to the Proxy (see Apache Thrift’s
-
documentation for examples of connecting to a Thrift service), the methods on the
-
proxy client will be available. The first thing to do is log in:</p></div>
-
<div class="listingblock">
-
-<div class="content"><!-- Generator: GNU source-highlight 3.1.4
-
+<div class="content"><!-- Generator: GNU source-highlight 3.1.7
by Lorenzo Bettini
-
http://www.lorenzobettini.it
-
http://www.gnu.org/software/src-highlite -->
-
<pre><tt><span style="color: #000000">Map<String,String></span> password = <span style="color: #0000b3">new</span> HashMap<String,String>();
-
password.<span style="font-weight: bold"><span style="color: #000000">put</span></span>(<span style="color: #4c73a6">"password"</span>, <span style="color: #4c73a6">"secret"</span>);
-
<span style="color: #000000">ByteBuffer</span> token = client.<span style="font-weight: bold"><span style="color: #000000">login</span></span>(<span style="color: #4c73a6">"root"</span>, password);</tt></pre></div></div>
-
<div class="paragraph"><p>Once logged in, the token returned will be used for most subsequent calls to the client.
-
Let’s create a table, add some data, scan the table, and delete it.</p></div>
-
<div class="paragraph"><p>First, create a table.</p></div>
-
<div class="listingblock">
-
-<div class="content"><!-- Generator: GNU source-highlight 3.1.4
-
+<div class="content"><!-- Generator: GNU source-highlight 3.1.7
by Lorenzo Bettini
-
http://www.lorenzobettini.it
-
http://www.gnu.org/software/src-highlite -->
-
<pre><tt>client.<span style="font-weight: bold"><span style="color: #000000">createTable</span></span>(token, <span style="color: #4c73a6">"myTable"</span>, <span style="color: #0000b3">true</span>, TimeType.MILLIS);</tt></pre></div></div>
-
<div class="paragraph"><p>Next, add some data:</p></div>
-
<div class="listingblock">
-
-<div class="content"><!-- Generator: GNU source-highlight 3.1.4
-
+<div class="content"><!-- Generator: GNU source-highlight 3.1.7
by Lorenzo Bettini
-
http://www.lorenzobettini.it
-
http://www.gnu.org/software/src-highlite -->
-
<pre><tt><span style="font-style: italic"><span style="color: #b30000">// first, create a writer on the server</span></span>
-
<span style="color: #000000">String</span> writer = client.<span style="font-weight: bold"><span style="color: #000000">createWriter</span></span>(token, <span style="color: #4c73a6">"myTable"</span>, <span style="color: #0000b3">new</span> <span style="font-weight: bold"><span style="color: #000000">WriterOptions</span></span>());
-
-
<span style="font-style: italic"><span style="color: #b30000">// build column updates</span></span>
-
<span style="color: #000000">Map<ByteBuffer, List<ColumnUpdate>></span> cellsToUpdate = <span style="font-style: italic"><span style="color: #b30000">//...</span></span>
-
-
<span style="font-style: italic"><span style="color: #b30000">// send updates to the server</span></span>
-
client.<span style="font-weight: bold"><span style="color: #000000">updateAndFlush</span></span>(writer, <span style="color: #4c73a6">"myTable"</span>, cellsToUpdate);
-
-
client.<span style="font-weight: bold"><span style="color: #000000">closeWriter</span></span>(writer);</tt></pre></div></div>
-
<div class="paragraph"><p>Scan for the data and batch the return of the results on the server:</p></div>
-
<div class="listingblock">
-
-<div class="content"><!-- Generator: GNU source-highlight 3.1.4
-
+<div class="content"><!-- Generator: GNU source-highlight 3.1.7
by Lorenzo Bettini
-
http://www.lorenzobettini.it
-
http://www.gnu.org/software/src-highlite -->
-
<pre><tt><span style="color: #000000">String</span> scanner = client.<span style="font-weight: bold"><span style="color: #000000">createScanner</span></span>(token, <span style="color: #4c73a6">"myTable"</span>, <span style="color: #0000b3">new</span> <span style="font-weight: bold"><span style="color: #000000">ScanOptions</span></span>());
-
<span style="color: #000000">ScanResult</span> results = client.<span style="font-weight: bold"><span style="color: #000000">nextK</span></span>(scanner, <span style="color: #000000">100</span>);
-
-
<span style="color: #0000b3">for</span>(<span style="color: #000000">KeyValue</span> keyValue : results.<span style="font-weight: bold"><span style="color: #000000">getResults</span></span>()) {
-
<span style="font-style: italic"><span style="color: #b30000">// do something with results</span></span>
-
}
-
-
client.<span style="font-weight: bold"><span style="color: #000000">closeScanner</span></span>(scanner);</tt></pre></div></div>
-
</div>
-
</div>
-
</div>
-
</div>
-
<div class="sect1">
-
<h2 id="_development_clients">5. Development Clients</h2>
-
<div class="sectionbody">
-
<div class="paragraph"><p>Normally, Accumulo consists of lots of moving parts. Even a stand-alone version of
-
Accumulo requires Hadoop, Zookeeper, the Accumulo master, a tablet server, etc. If
-
you want to write a unit test that uses Accumulo, you need a lot of infrastructure
-
in place before your test can run.</p></div>
-
<div class="sect2">
-
<h3 id="_mock_accumulo">5.1. Mock Accumulo</h3>
-
<div class="paragraph"><p>Mock Accumulo supplies mock implementations for much of the client API. It presently
-
does not enforce users, logins, permissions, etc. It does support Iterators and Combiners.
-
Note that MockAccumulo holds all data in memory, and will not retain any data or
-
settings between runs.</p></div>
-
<div class="paragraph"><p>While normal interaction with the Accumulo client looks like this:</p></div>
-
<div class="listingblock">
-
-<div class="content"><!-- Generator: GNU source-highlight 3.1.4
-
+<div class="content"><!-- Generator: GNU source-highlight 3.1.7
by Lorenzo Bettini
-
http://www.lorenzobettini.it
-
http://www.gnu.org/software/src-highlite -->
-
<pre><tt><span style="color: #000000">Instance</span> instance = <span style="color: #0000b3">new</span> <span style="font-weight: bold"><span style="color: #000000">ZooKeeperInstance</span></span>(...);
-
<span style="color: #000000">Connector</span> conn = instance.<span style="font-weight: bold"><span style="color: #000000">getConnector</span></span>(user, passwordToken);</tt></pre></div></div>
-
<div class="paragraph"><p>To interact with the MockAccumulo, just replace the ZooKeeperInstance with MockInstance:</p></div>
-
<div class="listingblock">
-
-<div class="content"><!-- Generator: GNU source-highlight 3.1.4
-
+<div class="content"><!-- Generator: GNU source-highlight 3.1.7
by Lorenzo Bettini
-
http://www.lorenzobettini.it
-
http://www.gnu.org/software/src-highlite -->
-
<pre><tt><span style="color: #000000">Instance</span> instance = <span style="color: #0000b3">new</span> <span style="font-weight: bold"><span style="color: #000000">MockInstance</span></span>();</tt></pre></div></div>
-
<div class="paragraph"><p>In fact, you can use the <code>--fake</code> option to the Accumulo shell and interact with
-
MockAccumulo:</p></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>$ ./bin/accumulo shell --fake -u root -p ''
-
Shell - Apache Accumulo Interactive Shell
-
-
-
- version: 1.5
-
- instance name: fake
-
- instance id: mock-instance-id
-
-
-
- type 'help' for a list of available commands
-
-
-
root@fake> createtable test
-
root@fake test> insert row1 cf cq value
-
root@fake test> insert row2 cf cq value2
-
root@fake test> insert row3 cf cq value3
-
root@fake test> scan
-
row1 cf:cq [] value
-
row2 cf:cq [] value2
-
row3 cf:cq [] value3
-
root@fake test> scan -b row2 -e row2
-
row2 cf:cq [] value2
-
root@fake test></code></pre>
-
</div></div>
-
<div class="paragraph"><p>When testing Map Reduce jobs, you can also set the Mock Accumulo on the AccumuloInputFormat
-
and AccumuloOutputFormat classes:</p></div>
-
<div class="listingblock">
-
-<div class="content"><!-- Generator: GNU source-highlight 3.1.4
-
+<div class="content"><!-- Generator: GNU source-highlight 3.1.7
by Lorenzo Bettini
-
http://www.lorenzobettini.it
-
http://www.gnu.org/software/src-highlite -->
-
<pre><tt>AccumuloInputFormat.<span style="font-weight: bold"><span style="color: #000000">setMockInstance</span></span>(job, <span style="color: #4c73a6">"mockInstance"</span>);
-
AccumuloOutputFormat.<span style="font-weight: bold"><span style="color: #000000">setMockInstance</span></span>(job, <span style="color: #4c73a6">"mockInstance"</span>);</tt></pre></div></div>
-
</div>
-
<div class="sect2">
-
<h3 id="_mini_accumulo_cluster">5.2. Mini Accumulo Cluster</h3>
-
<div class="paragraph"><p>While the Mock Accumulo provides a lightweight implementation of the client API for unit
-
testing, it is often necessary to write more realistic end-to-end integration tests that
-
take advantage of the entire ecosystem. The Mini Accumulo Cluster makes this possible by
-
configuring and starting Zookeeper, initializing Accumulo, and starting the Master as well
-
as some Tablet Servers. It runs against the local filesystem instead of having to start
-
up HDFS.</p></div>
-
<div class="paragraph"><p>To start it up, you will need to supply an empty directory and a root password as arguments:</p></div>
-
<div class="listingblock">
-
-<div class="content"><!-- Generator: GNU source-highlight 3.1.4
-
+<div class="content"><!-- Generator: GNU source-highlight 3.1.7
by Lorenzo Bettini
-
http://www.lorenzobettini.it
-
http://www.gnu.org/software/src-highlite -->
-
<pre><tt><span style="color: #000000">File</span> tempDirectory = <span style="font-style: italic"><span style="color: #b30000">// JUnit and Guava supply mechanisms for creating temp directories</span></span>
-
<span style="color: #000000">MiniAccumuloCluster</span> accumulo = <span style="color: #0000b3">new</span> <span style="font-weight: bold"><span style="color: #000000">MiniAccumuloCluster</span></span>(tempDirectory, <span style="color: #4c73a6">"password"</span>);
-
accumulo.<span style="font-weight: bold"><span style="color: #000000">start</span></span>();</tt></pre></div></div>
-
<div class="paragraph"><p>Once we have our mini cluster running, we will want to interact with the Accumulo client API:</p></div>
-
<div class="listingblock">
-
-<div class="content"><!-- Generator: GNU source-highlight 3.1.4
-
+<div class="content"><!-- Generator: GNU source-highlight 3.1.7
by Lorenzo Bettini
-
http://www.lorenzobettini.it
-
http://www.gnu.org/software/src-highlite -->
-
<pre><tt><span style="color: #000000">Instance</span> instance = <span style="color: #0000b3">new</span> <span style="font-weight: bold"><span style="color: #000000">ZooKeeperInstance</span></span>(accumulo.<span style="font-weight: bold"><span style="color: #000000">getInstanceName</span></span>(), accumulo.<span style="font-weight: bold"><span style="color: #000000">getZooKeepers</span></span>());
-
<span style="color: #000000">Connector</span> conn = instance.<span style="font-weight: bold"><span style="color: #000000">getConnector</span></span>(<span style="color: #4c73a6">"root"</span>, <span style="color: #0000b3">new</span> <span style="font-weight: bold"><span style="color: #000000">PasswordToken</span></span>(<span style="color: #4c73a6">"password"</span>));</tt></pre></div></div>
-
<div class="paragraph"><p>Upon completion of our development code, we will want to shut down our MiniAccumuloCluster:</p></div>
-
<div class="listingblock">
-
-<div class="content"><!-- Generator: GNU source-highlight 3.1.4
-
+<div class="content"><!-- Generator: GNU source-highlight 3.1.7
by Lorenzo Bettini
-
http://www.lorenzobettini.it
-
http://www.gnu.org/software/src-highlite -->
-
<pre><tt>accumulo.<span style="font-weight: bold"><span style="color: #000000">stop</span></span>();
-
<span style="font-style: italic"><span style="color: #b30000">// delete your temporary folder</span></span></tt></pre></div></div>
-
</div>
-
</div>
-
</div>
-
<div class="sect1">
-
<h2 id="_table_configuration">6. Table Configuration</h2>
-
<div class="sectionbody">
-
<div class="paragraph"><p>Accumulo tables have a few options that can be configured to alter the default
-
behavior of Accumulo as well as improve performance based on the data stored.
-
These include locality groups, constraints, bloom filters, iterators, and block cache.</p></div>
-
<div class="sect2">
-
<h3 id="_locality_groups">6.1. Locality Groups</h3>
-
<div class="paragraph"><p>Accumulo supports storing sets of column families separately on disk to allow
-
clients to efficiently scan over columns that are frequently used together and to avoid
-
scanning over column families that are not requested. After locality groups are set,
-
Scanner and BatchScanner operations will automatically take advantage of them
-
whenever the fetchColumnFamilies() method is used.</p></div>
-
<div class="paragraph"><p>By default, tables place all column families into the same “default” locality group.
-
-Additional locality groups can be configured anytime via the shell or
-
+Additional locality groups can be configured at any time via the shell or
programmatically as follows:</p></div>
-
<div class="sect3">
-
<h4 id="_managing_locality_groups_via_the_shell">6.1.1. Managing Locality Groups via the Shell</h4>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>usage: setgroups <group>=<col fam>{,<col fam>}{ <group>=<col fam>{,<col fam>}}
-
[-?] -t <table></code></pre>
-
</div></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>user@myinstance mytable> setgroups group_one=colf1,colf2 -t mytable</code></pre>
-
</div></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>user@myinstance mytable> getgroups -t mytable</code></pre>
-
</div></div>
-
</div>
-
<div class="sect3">
-
<h4 id="_managing_locality_groups_via_the_client_api">6.1.2. Managing Locality Groups via the Client API</h4>
-
<div class="listingblock">
-
-<div class="content"><!-- Generator: GNU source-highlight 3.1.4
-
+<div class="content"><!-- Generator: GNU source-highlight 3.1.7
by Lorenzo Bettini
-
http://www.lorenzobettini.it
-
http://www.gnu.org/software/src-highlite -->
-
<pre><tt><span style="color: #000000">Connector</span> conn;
-
-
<span style="color: #000000">HashMap<String,Set<Text>></span> localityGroups = <span style="color: #0000b3">new</span> HashMap<String, Set<Text>>();
-
-
<span style="color: #000000">HashSet<Text></span> metadataColumns = <span style="color: #0000b3">new</span> HashSet<Text>();
-
metadataColumns.<span style="font-weight: bold"><span style="color: #000000">add</span></span>(<span style="color: #0000b3">new</span> <span style="font-weight: bold"><span style="color: #000000">Text</span></span>(<span style="color: #4c73a6">"domain"</span>));
-
metadataColumns.<span style="font-weight: bold"><span style="color: #000000">add</span></span>(<span style="color: #0000b3">new</span> <span style="font-weight: bold"><span style="color: #000000">Text</span></span>(<span style="color: #4c73a6">"link"</span>));
-
-
<span style="color: #000000">HashSet<Text></span> contentColumns = <span style="color: #0000b3">new</span> HashSet<Text>();
-
contentColumns.<span style="font-weight: bold"><span style="color: #000000">add</span></span>(<span style="color: #0000b3">new</span> <span style="font-weight: bold"><span style="color: #000000">Text</span></span>(<span style="color: #4c73a6">"body"</span>));
-
contentColumns.<span style="font-weight: bold"><span style="color: #000000">add</span></span>(<span style="color: #0000b3">new</span> <span style="font-weight: bold"><span style="color: #000000">Text</span></span>(<span style="color: #4c73a6">"images"</span>));
-
-
localityGroups.<span style="font-weight: bold"><span style="color: #000000">put</span></span>(<span style="color: #4c73a6">"metadata"</span>, metadataColumns);
-
localityGroups.<span style="font-weight: bold"><span style="color: #000000">put</span></span>(<span style="color: #4c73a6">"content"</span>, contentColumns);
-
-
conn.<span style="font-weight: bold"><span style="color: #000000">tableOperations</span></span>().<span style="font-weight: bold"><span style="color: #000000">setLocalityGroups</span></span>(<span style="color: #4c73a6">"mytable"</span>, localityGroups);
-
-
<span style="font-style: italic"><span style="color: #b30000">// existing locality groups can be obtained as follows</span></span>
-
<span style="color: #000000">Map<String, Set<Text>></span> groups =
-
conn.<span style="font-weight: bold"><span style="color: #000000">tableOperations</span></span>().<span style="font-weight: bold"><span style="color: #000000">getLocalityGroups</span></span>(<span style="color: #4c73a6">"mytable"</span>);</tt></pre></div></div>
-
<div class="paragraph"><p>The assignment of Column Families to Locality Groups can be changed at any time. The
-
physical movement of column families into their new locality groups takes place via
-
the periodic Major Compaction process that takes place continuously in the
-
background. Major Compaction can also be scheduled to take place immediately
-
through the shell:</p></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>user@myinstance mytable> compact -t mytable</code></pre>
-
</div></div>
-
</div>
-
</div>
-
<div class="sect2">
-
<h3 id="_constraints">6.2. Constraints</h3>
-
<div class="paragraph"><p>Accumulo supports constraints applied on mutations at insert time. This can be
-
used to disallow certain inserts according to a user-defined policy. Any mutation
-
that fails to meet the requirements of the constraint is rejected and sent back to the
-
client.</p></div>
-
<div class="paragraph"><p>Constraints can be enabled by setting a table property as follows:</p></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>user@myinstance mytable> constraint -t mytable -a com.test.ExampleConstraint com.test.AnotherConstraint
-
user@myinstance mytable> constraint -l
-
com.test.ExampleConstraint=1
-
com.test.AnotherConstraint=2</code></pre>
-
</div></div>
-
<div class="paragraph"><p>Currently there are no general-purpose constraints provided with the Accumulo
-
distribution. New constraints can be created by writing a Java class that implements
-
the org.apache.accumulo.core.constraints.Constraint interface.</p></div>
-
<div class="paragraph"><p>To deploy a new constraint, create a jar file containing the class implementing the
-
new constraint and place it in the lib directory of the Accumulo installation. New
-
constraint jars can be added to Accumulo and enabled without restarting but any
-
change to an existing constraint class requires Accumulo to be restarted.</p></div>
-
<div class="paragraph"><p>An example of constraints can be found in
-
<code>accumulo/docs/examples/README.constraints</code> with corresponding code under
-
<code>accumulo/examples/simple/main/java/accumulo/examples/simple/constraints</code>.</p></div>
-
</div>
-
<div class="sect2">
-
<h3 id="_bloom_filters">6.3. Bloom Filters</h3>
-
<div class="paragraph"><p>As mutations are applied to an Accumulo table, several files are created per tablet. If
-
bloom filters are enabled, Accumulo will create and load a small data structure into
-
memory to determine whether a file contains a given key before opening the file.
-
This can speed up lookups considerably.</p></div>
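<div class="paragraph"><p>The idea behind a bloom filter can be illustrated with a small, self-contained sketch.
This is a toy model and not Accumulo's implementation: the class name <code>TinyBloomFilter</code>,
the bit array size, and the double-hashing scheme are all assumptions made for illustration.</p></div>

```java
import java.util.BitSet;

// Illustrative only: a toy Bloom filter, not the one Accumulo uses internally.
class TinyBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    TinyBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Derive the i-th bit position from two base hashes (double hashing).
    private int position(String key, int i) {
        int h1 = key.hashCode();
        int h2 = (h1 >>> 16) | 1; // force odd so the step is never zero
        return Math.floorMod(h1 + i * h2, size);
    }

    // Record a key by setting each of its bit positions.
    void add(String key) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(position(key, i));
        }
    }

    // false means the key was definitely never added;
    // true means it may have been added (false positives are possible).
    boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(position(key, i))) {
                return false;
            }
        }
        return true;
    }
}
```

<div class="paragraph"><p>A <code>false</code> answer from <code>mightContain</code> lets a reader skip a file without opening it,
which is where the lookup savings described above come from.</p></div>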
-
<div class="paragraph"><p>To enable bloom filters, enter the following command in the Shell:</p></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>user@myinstance> config -t mytable -s table.bloom.enabled=true</code></pre>
-
</div></div>
-
<div class="paragraph"><p>An extensive example of using Bloom Filters can be found at
-
<code>accumulo/docs/examples/README.bloom</code>.</p></div>
-
</div>
-
<div class="sect2">
-
<h3 id="_iterators">6.4. Iterators</h3>
-
<div class="paragraph"><p>Iterators provide a modular mechanism for adding functionality to be executed by
-
TabletServers when scanning or compacting data. This allows users to efficiently
-
summarize, filter, and aggregate data. In fact, the built-in features of cell-level
-
security and column fetching are implemented using Iterators.
-
Some useful Iterators are provided with Accumulo and can be found in the
-
<strong><code>org.apache.accumulo.core.iterators.user</code></strong> package.
-
In each case, any custom Iterators must be included in Accumulo’s classpath,
-
typically by including a jar in <code>$ACCUMULO_HOME/lib</code> or
-
<code>$ACCUMULO_HOME/lib/ext</code>, although the VFS classloader allows for
-
classpath manipulation using a variety of schemes including URLs and HDFS URIs.</p></div>
-
<div class="sect3">
-
<h4 id="_setting_iterators_via_the_shell">6.4.1. Setting Iterators via the Shell</h4>
-
<div class="paragraph"><p>Iterators can be configured on a table at scan, minor compaction and/or major
-
compaction scopes. If the Iterator implements the OptionDescriber interface, the
-
setiter command can be used, which will interactively prompt the user to provide
-
values for the given necessary options.</p></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>usage: setiter [-?] -ageoff | -agg | -class <name> | -regex |
-
-reqvis | -vers [-majc] [-minc] [-n <itername>] -p <pri>
-
[-scan] [-t <table>]</code></pre>
-
</div></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>user@myinstance mytable> setiter -t mytable -scan -p 15 -n myiter -class com.company.MyIterator</code></pre>
-
</div></div>
-
<div class="paragraph"><p>The config command can always be used to manually configure iterators, which is useful
-
in cases where the Iterator does not implement the OptionDescriber interface.</p></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>config -t mytable -s table.iterator.{scan|minc|majc}.myiter=15,com.company.MyIterator
-
config -t mytable -s table.iterator.{scan|minc|majc}.myiter.opt.myoptionname=myoptionvalue</code></pre>
-
</div></div>
-
</div>
-
<div class="sect3">
-
<h4 id="_setting_iterators_programmatically">6.4.2. Setting Iterators Programmatically</h4>
-
<div class="listingblock">
-
-<div class="content"><!-- Generator: GNU source-highlight 3.1.4
-
+<div class="content"><!-- Generator: GNU source-highlight 3.1.7
by Lorenzo Bettini
-
http://www.lorenzobettini.it
-
http://www.gnu.org/software/src-highlite -->
-
<pre><tt>scanner.<span style="font-weight: bold"><span style="color: #000000">addIterator</span></span>(<span style="color: #0000b3">new</span> <span style="font-weight: bold"><span style="color: #000000">IteratorSetting</span></span>(
-
<span style="color: #000000">15</span>, <span style="font-style: italic"><span style="color: #b30000">// priority</span></span>
-
<span style="color: #4c73a6">"myiter"</span>, <span style="font-style: italic"><span style="color: #b30000">// name this iterator</span></span>
-
<span style="color: #4c73a6">"com.company.MyIterator"</span> <span style="font-style: italic"><span style="color: #b30000">// class name</span></span>
-
));</tt></pre></div></div>
-
<div class="paragraph"><p>Some iterators take additional parameters from client code, as in the following
-
example:</p></div>
-
<div class="listingblock">
-
-<div class="content"><!-- Generator: GNU source-highlight 3.1.4
-
+<div class="content"><!-- Generator: GNU source-highlight 3.1.7
by Lorenzo Bettini
-
http://www.lorenzobettini.it
-
http://www.gnu.org/software/src-highlite -->
-
<pre><tt><span style="color: #000000">IteratorSetting</span> iter = <span style="color: #0000b3">new</span> <span style="font-weight: bold"><span style="color: #000000">IteratorSetting</span></span>(...);
-
iter.<span style="font-weight: bold"><span style="color: #000000">addOption</span></span>(<span style="color: #4c73a6">"myoptionname"</span>, <span style="color: #4c73a6">"myoptionvalue"</span>);
-
scanner.<span style="font-weight: bold"><span style="color: #000000">addIterator</span></span>(iter);</tt></pre></div></div>
-
<div class="paragraph"><p>Tables support separate Iterator settings to be applied at scan time, upon minor
-
compaction and upon major compaction. For most uses, tables will have identical
-
iterator settings for all three to avoid inconsistent results.</p></div>
-
</div>
-
<div class="sect3">
-
<h4 id="_versioning_iterators_and_timestamps">6.4.3. Versioning Iterators and Timestamps</h4>
-
<div class="paragraph"><p>Accumulo provides the capability to manage versioned data through the use of
-
timestamps within the Key. If a timestamp is not specified in the key created by the
-
client then the system will set the timestamp to the current time. Two keys with
-
identical rowIDs and columns but different timestamps are considered two versions
-
-of the same key. If two inserts are made into accumulo with the same rowID,
-
+of the same key. If two inserts are made into Accumulo with the same rowID,
column, and timestamp, then the behavior is non-deterministic.</p></div>
-
<div class="paragraph"><p>Timestamps are sorted in descending order, so the most recent data comes first.
-
Accumulo can be configured to return the top k versions, or versions later than a
-
given date. The default is to return the one most recent version.</p></div>
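<div class="paragraph"><p>The "top k versions" behavior can be modeled in a few lines of plain Java. This is only a
sketch of the ordering rule described above, not the VersioningIterator itself; the class
<code>VersionSketch</code> and its method are hypothetical names, and the input is assumed to be the
timestamps of a single rowID and column.</p></div>

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only: models "newest first, keep k" for one rowID and column.
class VersionSketch {
    // Returns up to maxVersions timestamps, most recent first.
    static List<Long> newestVersions(List<Long> timestamps, int maxVersions) {
        List<Long> sorted = new ArrayList<>(timestamps);
        sorted.sort(Collections.reverseOrder()); // descending: newest first
        int keep = Math.min(maxVersions, sorted.size());
        return new ArrayList<>(sorted.subList(0, keep));
    }
}
```

<div class="paragraph"><p>With <code>maxVersions</code> set to 1, the sketch returns only the single newest timestamp, which
corresponds to the default described above.</p></div>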
-
<div class="paragraph"><p>The version policy can be changed by changing the VersioningIterator options for a
-
table as follows:</p></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>user@myinstance mytable> config -t mytable -s table.iterator.scan.vers.opt.maxVersions=3
-
user@myinstance mytable> config -t mytable -s table.iterator.minc.vers.opt.maxVersions=3
-
user@myinstance mytable> config -t mytable -s table.iterator.majc.vers.opt.maxVersions=3</code></pre>
-
</div></div>
-
<div class="paragraph"><p>When a table is created, by default it is configured to use the
-
VersioningIterator and keep one version. A table can be created without the
-
VersioningIterator with the -ndi option in the shell. The Java API also
-
provides the following method:</p></div>
-
<div class="listingblock">
-
-<div class="content"><!-- Generator: GNU source-highlight 3.1.4
-
+<div class="content"><!-- Generator: GNU source-highlight 3.1.7
by Lorenzo Bettini
-
http://www.lorenzobettini.it
-
http://www.gnu.org/software/src-highlite -->
-
<pre><tt>connector.tableOperations().<span style="font-weight: bold"><span style="color: #000000">create</span></span>(<span style="color: #000000">String</span> tableName, <span style="color: #000000">boolean</span> limitVersion)</tt></pre></div></div>
-
<div class="sect4">
-
<h5 id="_logical_time">Logical Time</h5>
-
<div class="paragraph"><p>Accumulo 1.2 introduces the concept of logical time. This ensures that timestamps
-
-set by accumulo always move forward. This helps avoid problems caused by
-
+set by Accumulo always move forward. This helps avoid problems caused by
TabletServers that have different time settings. The per-tablet counter gives unique,
-
one-up timestamps on a per-mutation basis. When using time in milliseconds, if
-
two things arrive within the same millisecond then both receive the same
-
-timestamp. When using time in milliseconds, accumulo set times will still
-
+timestamp. When using time in milliseconds, Accumulo set times will still
always move forward and never backwards.</p></div>
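<div class="paragraph"><p>The difference between the two time modes can be sketched as follows. This is a simplified
model under stated assumptions, not Accumulo's code: <code>LogicalClock</code> and <code>MillisClock</code> are
hypothetical names, and the millisecond mode is assumed to take the larger of the last
assigned time and the supplied wall-clock reading.</p></div>

```java
// Illustrative only: a per-tablet counter that hands out one-up timestamps.
class LogicalClock {
    private long last = 0;

    // Every call returns a value strictly greater than the previous one.
    long next() {
        return ++last;
    }
}

// Illustrative only: millisecond timestamps that never move backwards.
class MillisClock {
    private long last = 0;

    // Two calls in the same millisecond share a timestamp; a wall clock
    // that jumps backwards is ignored in favor of the last assigned time.
    long next(long wallClockMillis) {
        last = Math.max(last, wallClockMillis);
        return last;
    }
}
```

<div class="paragraph"><p>In this model the logical clock yields unique values, while the millisecond clock may repeat a
value but never returns a smaller one.</p></div>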
-
<div class="paragraph"><p>A table can be configured to use logical timestamps at creation time as follows:</p></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>user@myinstance> createtable -tl logical</code></pre>
-
</div></div>
-
</div>
-
<div class="sect4">
-
<h5 id="_deletes">Deletes</h5>
-
-<div class="paragraph"><p>Deletes are special keys in accumulo that get sorted along will all the other data.
-
-When a delete key is inserted, accumulo will not show anything that has a
-
+<div class="paragraph"><p>Deletes are special keys in Accumulo that get sorted along with all the other data.
+When a delete key is inserted, Accumulo will not show anything that has a
timestamp less than or equal to the delete key. During major compaction, any keys
-
older than a delete key are omitted from the new file created, and the omitted keys
-
are removed from disk as part of the regular garbage collection process.</p></div>
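<div class="paragraph"><p>The visibility rule for deletes can be expressed as a short sketch. It assumes all timestamps
belong to the same rowID and column as the delete key, and <code>DeleteSketch</code> is a hypothetical
name used only for illustration.</p></div>

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: which versions remain visible after a delete key.
class DeleteSketch {
    // Entries with a timestamp less than or equal to the delete key are hidden.
    static List<Long> visible(List<Long> timestamps, long deleteTimestamp) {
        List<Long> result = new ArrayList<>();
        for (long ts : timestamps) {
            if (ts > deleteTimestamp) {
                result.add(ts);
            }
        }
        return result;
    }
}
```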
-
</div>
-
</div>
-
<div class="sect3">
-
<h4 id="_filters">6.4.4. Filters</h4>
-
<div class="paragraph"><p>When scanning over a set of key-value pairs it is possible to apply an arbitrary
-
filtering policy through the use of a Filter. Filters are types of iterators that return
-
only key-value pairs that satisfy the filter logic. Accumulo has a few built-in filters
-
that can be configured on any table: AgeOff, ColumnAgeOff, Timestamp, NoVis, and RegEx. More can be added
-
by writing a Java class that extends the
-
<code>org.apache.accumulo.core.iterators.Filter</code> class.</p></div>
-
<div class="paragraph"><p>The AgeOff filter can be configured to remove data older than a certain date or a fixed
-
amount of time from the present. The following example sets a table to delete
-
everything inserted over 3 seconds ago:</p></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>user@myinstance> createtable filtertest
-
user@myinstance filtertest> setiter -t filtertest -scan -minc -majc -p 10 -n myfilter -ageoff
-
AgeOffFilter removes entries with timestamps more than <ttl> milliseconds old
-
----------> set org.apache.accumulo.core.iterators.user.AgeOffFilter parameter
-
negate, default false keeps k/v that pass accept method, true rejects k/v
-
that pass accept method:
-
----------> set org.apache.accumulo.core.iterators.user.AgeOffFilter parameter
-
ttl, time to live (milliseconds): 3000
-
----------> set org.apache.accumulo.core.iterators.user.AgeOffFilter parameter
-
currentTime, if set, use the given value as the absolute time in milliseconds
-
as the current time of day:
-
user@myinstance filtertest>
-
user@myinstance filtertest> scan
-
user@myinstance filtertest> insert foo a b c
-
user@myinstance filtertest> scan
-
foo a:b [] c
-
user@myinstance filtertest> sleep 4
-
user@myinstance filtertest> scan
-
user@myinstance filtertest></code></pre>
-
</div></div>
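<div class="paragraph"><p>The age-off decision in the session above can be approximated by a single comparison. This is
a sketch of the rule stated in the prompt text ("removes entries with timestamps more than
ttl milliseconds old"), not the AgeOffFilter source; <code>AgeOffSketch</code> and its parameter names
are assumptions.</p></div>

```java
// Illustrative only: the keep-or-drop rule for a time-to-live filter.
class AgeOffSketch {
    // Keeps an entry unless it is more than ttlMillis older than currentTime.
    static boolean accept(long entryTimestamp, long currentTime, long ttlMillis) {
        return currentTime - entryTimestamp <= ttlMillis;
    }
}
```

<div class="paragraph"><p>With a ttl of 3000 milliseconds, an entry scanned 4 seconds after insertion has an age of 4000
milliseconds and is dropped, which matches the empty final scan in the shell session.</p></div>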
-
<div class="paragraph"><p>To see the iterator settings for a table, use:</p></div>
-
<div class="literalblock">
-
<div class="content">
-
<pre><code>user@example filtertest> config -t filtertest -f iterator
-
---------+---------------------------------------------+------------------
-
SCOPE | NAME | VALUE
-
---------+---------------------------------------------+------------------
-
table | table.iterator.majc.myfilter .............. | 10,org.apache.accumulo.core.iterators.user.AgeOffFilter
-
table | table.iterator.majc.myfilter.opt.ttl ...... | 3000
-
[... 3200 lines stripped ...]