Posted to commits@accumulo.apache.org by ct...@apache.org on 2014/03/27 20:07:56 UTC
[2/4] ACCUMULO-1517 Generate LaTeX appendix for config
http://git-wip-us.apache.org/repos/asf/accumulo/blob/40299f89/docs/src/main/latex/accumulo_user_manual/accumulo_user_manual.tex
----------------------------------------------------------------------
diff --git a/docs/src/main/latex/accumulo_user_manual/accumulo_user_manual.tex b/docs/src/main/latex/accumulo_user_manual/accumulo_user_manual.tex
index 8c8fec3..ae2a440 100644
--- a/docs/src/main/latex/accumulo_user_manual/accumulo_user_manual.tex
+++ b/docs/src/main/latex/accumulo_user_manual/accumulo_user_manual.tex
@@ -16,15 +16,13 @@
\documentclass[11pt]{report}
\title{Apache Accumulo User Manual\\
-Version 1.5}
+Version 1.6}
\usepackage{alltt}
\usepackage{multirow}
\usepackage{graphicx}
\usepackage[T1]{fontenc}
+\usepackage{ulem}
\renewcommand{\ttdefault}{txtt}
-%\def\verbatim{%
-% \def\verbatim@font{\small\ttfamily}%
-% \verbatim}
\setlength{\textwidth}{6.25in}
\evensidemargin=0in
@@ -52,4 +50,6 @@ Version 1.5}
\include{chapters/administration}
\include{chapters/multivolume}
\include{chapters/troubleshooting}
+\appendix
+\include{appendices/config}
\end{document}
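
Note on the preamble change above: the hunk adds ulem and the rest of this commit repeats the wrapper \begingroup\fontsize{8pt}{8pt}\selectfont ... \endgroup around each verbatim block. The following is only a sketch of an alternative preamble, not what this commit does. It assumes the fancyvrb package is available (the commit does not use it), and the environment name smallverbatim is hypothetical. It also assumes italic \emph is wanted: by default ulem redefines \emph to underline, and the normalem option turns that off.

```latex
\documentclass[11pt]{report}
\usepackage[T1]{fontenc}
% Assumption: italic \emph is desired. Without normalem,
% ulem redefines \emph to produce underlined text.
\usepackage[normalem]{ulem}
% Assumption: fancyvrb is installed; it is not part of this commit.
\usepackage{fancyvrb}
% Hypothetical environment name. It applies the same 8pt size that
% the diff sets by hand around each verbatim block.
\DefineVerbatimEnvironment{smallverbatim}{Verbatim}%
  {fontsize=\fontsize{8pt}{8pt}\selectfont}

\begin{document}
\begin{smallverbatim}
$ACCUMULO_HOME/bin/accumulo classpath
\end{smallverbatim}
\end{document}
```

With a definition like this, each listing would need one environment name rather than a begingroup/endgroup pair, and the font size would be set in a single place.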
http://git-wip-us.apache.org/repos/asf/accumulo/blob/40299f89/docs/src/main/latex/accumulo_user_manual/chapters/administration.tex
----------------------------------------------------------------------
diff --git a/docs/src/main/latex/accumulo_user_manual/chapters/administration.tex b/docs/src/main/latex/accumulo_user_manual/chapters/administration.tex
index 7056de5..57c8760 100644
--- a/docs/src/main/latex/accumulo_user_manual/chapters/administration.tex
+++ b/docs/src/main/latex/accumulo_user_manual/chapters/administration.tex
@@ -43,12 +43,10 @@ network bandwidth must be available between any two machines.
Choose a directory for the Accumulo installation. This directory will be referenced
by the environment variable \texttt{\$ACCUMULO\_HOME}. Run the following:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ tar xzf accumulo-1.5.0-bin.tar.gz # unpack to subdirectory
$ mv accumulo-1.5.0 $ACCUMULO_HOME # move to desired location
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Repeat this step at each machine within the cluster. Usually all machines have the
same \texttt{\$ACCUMULO\_HOME}.
@@ -139,35 +137,31 @@ of errors.
Specify appropriate values for the following settings in\\
\texttt{\$ACCUMULO\_HOME/conf/accumulo-site.xml} :
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
<property>
<name>instance.zookeeper.host</name>
<value>zooserver-one:2181,zooserver-two:2181</value>
<description>list of zookeeper servers</description>
</property>
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
This enables Accumulo to find ZooKeeper. Accumulo uses ZooKeeper to coordinate
settings between processes and to help finalize TabletServer failure.
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
<property>
<name>instance.secret</name>
<value>DEFAULT</value>
</property>
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
The instance needs a secret to enable secure communication between servers. Configure your
secret and make sure that the \texttt{accumulo-site.xml} file is not readable to other users.
Some settings can be modified via the Accumulo shell and take effect immediately, but
some settings require a process restart to take effect. See the configuration documentation
-(available on the monitor web pages) for details.
+(available on the monitor web pages and in Appendix~\ref{app:config}) for details.
\subsection{Deploy Configuration}
@@ -214,12 +208,15 @@ take some time for particular configurations.
Update your \texttt{\$ACCUMULO\_HOME/conf/slaves} (or \texttt{\$ACCUMULO\_CONF\_DIR/slaves}) file to account for the addition.
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ACCUMULO_HOME/bin/accumulo admin start <host(s)> {<host> ...}
-\end{verbatim}
+\end{verbatim}\endgroup
-Alternatively, you can ssh to each of the hosts you want to add and run
-\texttt{\$ACCUMULO\_HOME/bin/start-here.sh}.
+Alternatively, you can ssh to each of the hosts you want to add and run:
+
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
+$ACCUMULO_HOME/bin/start-here.sh
+\end{verbatim}\endgroup
Make sure the host in question has the new configuration, or else the tablet
server won't start; at a minimum this needs to be on the host(s) being added,
@@ -230,12 +227,15 @@ but in practice it's good to ensure consistent configuration across all nodes.
If you need to take a node out of operation, you can trigger a graceful shutdown of a tablet
server. Accumulo will automatically rebalance the tablets across the available tablet servers.
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ACCUMULO_HOME/bin/accumulo admin stop <host(s)> {<host> ...}
-\end{verbatim}
+\end{verbatim}\endgroup
+
+Alternatively, you can ssh to each of the hosts you want to remove and run:
-Alternatively, you can ssh to each of the hosts you want to remove and run
-\texttt{\$ACCUMULO\_HOME/bin/stop-here.sh}.
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
+$ACCUMULO_HOME/bin/stop-here.sh
+\end{verbatim}\endgroup
Be sure to update your \texttt{\$ACCUMULO\_HOME/conf/slaves} (or \texttt{\$ACCUMULO\_CONF\_DIR/slaves}) file to
account for the removal of these hosts. Bear in mind that the monitor will not re-read the
@@ -276,33 +276,33 @@ from clients and writes them to the \texttt{trace} table. The Accumulo
user that the tracer connects to Accumulo with can be configured with
the following properties
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
trace.user
trace.token.property.password
-\end{verbatim}
+\end{verbatim}\endgroup
\subsection{Instrumenting a Client}
Tracing can be used to measure a client operation, such as a scan, as
the operation traverses the distributed system. To enable tracing for
your application call
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
DistributedTrace.enable(instance, new ZooReader(instance), hostname, "myApplication");
-\end{verbatim}
+\end{verbatim}\endgroup
Once tracing has been enabled, a client can wrap an operation in a trace.
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
Trace.on("Client Scan");
BatchScanner scanner = conn.createBatchScanner(...);
// Configure your scanner
for (Entry entry : scanner) {
}
Trace.off();
-\end{verbatim}
+\end{verbatim}\endgroup
Additionally, the user can create additional Spans within a Trace.
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
Trace.on("Client Update");
...
Span readSpan = Trace.start("Read");
@@ -313,23 +313,23 @@ Span writeSpan = Trace.start("Write");
...
writeSpan.stop();
Trace.off();
-\end{verbatim}
+\end{verbatim}\endgroup
Like Dapper, Accumulo tracing supports user defined annotations to associate additional data with a Trace.
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
...
int numberOfEntriesRead = 0;
Span readSpan = Trace.start("Read");
// Do the read, update the counter
...
readSpan.data("Number of Entries Read", String.valueOf(numberOfEntriesRead));
-\end{verbatim}
+\end{verbatim}\endgroup
Some client operations may have a high volume within your
application. As such, you may wish to only sample a percentage of
operations for tracing. As seen below, the CountSampler can be used to
help enable tracing for 1-in-1000 operations
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
Sampler sampler = new CountSampler(1000);
...
if (sampler.next()) {
@@ -337,7 +337,7 @@ if (sampler.next()) {
}
...
Trace.offNoFlush();
-\end{verbatim}
+\end{verbatim}\endgroup
It should be noted that it is safe to turn off tracing even if it
isn't currently active. The Trace.offNoFlush() should be used if the
@@ -353,7 +353,7 @@ UI. You can also programmatically access and print traces using the
You can enable tracing for operations run from the shell by using the
\texttt{trace on} and \texttt{trace off} commands.
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
root@test test> trace on
root@test test> scan
a b:c [] d
@@ -367,7 +367,7 @@ Time Start Service@Location Name
7+1691 shell@localhost scan:location
6+1692 tserver@localhost startScan
5+1692 tserver@localhost tablet read ahead 6
-\end{verbatim}
+\end{verbatim}\endgroup
\section{Logging}
Accumulo processes each write to a set of log files. By default these are found under\\
http://git-wip-us.apache.org/repos/asf/accumulo/blob/40299f89/docs/src/main/latex/accumulo_user_manual/chapters/analytics.tex
----------------------------------------------------------------------
diff --git a/docs/src/main/latex/accumulo_user_manual/chapters/analytics.tex b/docs/src/main/latex/accumulo_user_manual/chapters/analytics.tex
index fc50d4a..301120b 100644
--- a/docs/src/main/latex/accumulo_user_manual/chapters/analytics.tex
+++ b/docs/src/main/latex/accumulo_user_manual/chapters/analytics.tex
@@ -38,15 +38,13 @@ two format classes to do the following:
To read from an Accumulo table create a Mapper with the following class
parameterization and be sure to configure the AccumuloInputFormat.
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
class MyMapper extends Mapper<Key,Value,WritableComparable,Writable> {
public void map(Key k, Value v, Context c) {
// transform key and value data here
}
}
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
To write to an Accumulo table, create a Reducer with the following class
parameterization and be sure to configure the AccumuloOutputFormat. The key
@@ -55,21 +53,15 @@ allows a single Reducer to write to more than one table if desired. A default ta
can be configured using the AccumuloOutputFormat, in which case the output table
name does not have to be passed to the Context object within the Reducer.
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
class MyReducer extends Reducer<WritableComparable, Writable, Text, Mutation> {
-
public void reduce(WritableComparable key, Iterable<Writable> values, Context c) {
-
Mutation m;
-
// create the mutation based on input key and value
-
c.write(new Text("output-table"), m);
}
}
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
The Text object passed as the output should contain the name of the table to which
this mutation should be applied. The Text can be null in which case the mutation
@@ -78,8 +70,7 @@ options.
\subsection{AccumuloInputFormat options}
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
Job job = new Job(getConf());
AccumuloInputFormat.setInputInfo(job,
"user",
@@ -89,41 +80,35 @@ AccumuloInputFormat.setInputInfo(job,
AccumuloInputFormat.setZooKeeperInstance(job, "myinstance",
"zooserver-one,zooserver-two");
-\end{verbatim}
+\end{verbatim}\endgroup
\Large
-\textbf{Optional settings:}
+\textbf{Optional Settings:}
\normalsize
To restrict Accumulo to a set of row ranges:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
ArrayList<Range> ranges = new ArrayList<Range>();
// populate array list of row ranges ...
AccumuloInputFormat.setRanges(job, ranges);
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
To restrict Accumulo to a list of columns:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
ArrayList<Pair<Text,Text>> columns = new ArrayList<Pair<Text,Text>>();
// populate list of columns
AccumuloInputFormat.fetchColumns(job, columns);
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
To use a regular expression to match row IDs:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
IteratorSetting is = new IteratorSetting(30, RegExFilter.class);
RegExFilter.setRegexs(is, ".*suffix", null, null, null, true);
AccumuloInputFormat.addIterator(job, is);
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
\subsection{AccumuloMultiTableInputFormat options}
@@ -131,23 +116,19 @@ The AccumuloMultiTableInputFormat allows the scanning over multiple tables
in a single MapReduce job. Separate ranges, columns, and iterators can be
used for each table.
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
InputTableConfig tableOneConfig = new InputTableConfig();
InputTableConfig tableTwoConfig = new InputTableConfig();
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
To set the configuration objects on the job:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
Map<String, InputTableConfig> configs = new HashMap<String,InputTableConfig>();
configs.put("table1", tableOneConfig);
configs.put("table2", tableTwoConfig);
AccumuloMultiTableInputFormat.setInputTableConfigs(job, configs);
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
\Large
\textbf{Optional settings:}
@@ -155,45 +136,38 @@ AccumuloMultiTableInputFormat.setInputTableConfigs(job, configs);
To restrict to a set of ranges:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
ArrayList<Range> tableOneRanges = new ArrayList<Range>();
ArrayList<Range> tableTwoRanges = new ArrayList<Range>();
// populate array lists of row ranges for tables...
tableOneConfig.setRanges(tableOneRanges);
tableTwoConfig.setRanges(tableTwoRanges);
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
To restrict Accumulo to a list of columns:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
ArrayList<Pair<Text,Text>> tableOneColumns = new ArrayList<Pair<Text,Text>>();
ArrayList<Pair<Text,Text>> tableTwoColumns = new ArrayList<Pair<Text,Text>>();
// populate lists of columns for each of the tables ...
tableOneConfig.fetchColumns(tableOneColumns);
tableTwoConfig.fetchColumns(tableTwoColumns);
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
To set scan iterators:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
List<IteratorSetting> tableOneIterators = new ArrayList<IteratorSetting>();
List<IteratorSetting> tableTwoIterators = new ArrayList<IteratorSetting>();
// populate the lists of iterator settings for each of the tables ...
tableOneConfig.setIterators(tableOneIterators);
tableTwoConfig.setIterators(tableTwoIterators);
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
The name of the table can be retrieved from the input split:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
class MyMapper extends Mapper<Key,Value,WritableComparable,Writable> {
public void map(Key k, Value v, Context c) {
RangeInputSplit split = (RangeInputSplit)c.getInputSplit();
@@ -201,14 +175,12 @@ class MyMapper extends Mapper<Key,Value,WritableComparable,Writable> {
// do something with table name
}
}
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
\subsection{AccumuloOutputFormat options}
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
boolean createTables = true;
String defaultTable = "mytable";
@@ -220,18 +192,16 @@ AccumuloOutputFormat.setOutputInfo(job,
AccumuloOutputFormat.setZooKeeperInstance(job, "myinstance",
"zooserver-one,zooserver-two");
-\end{verbatim}
+\end{verbatim}\endgroup
\Large
\textbf{Optional Settings:}
\normalsize
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
AccumuloOutputFormat.setMaxLatency(job, 300000); // milliseconds
AccumuloOutputFormat.setMaxMutationBufferSize(job, 50000000); // bytes
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
An example of using MapReduce with Accumulo can be found at\\
accumulo/docs/examples/README.mapred
http://git-wip-us.apache.org/repos/asf/accumulo/blob/40299f89/docs/src/main/latex/accumulo_user_manual/chapters/clients.tex
----------------------------------------------------------------------
diff --git a/docs/src/main/latex/accumulo_user_manual/chapters/clients.tex b/docs/src/main/latex/accumulo_user_manual/chapters/clients.tex
index 9b35d37..925c352 100644
--- a/docs/src/main/latex/accumulo_user_manual/chapters/clients.tex
+++ b/docs/src/main/latex/accumulo_user_manual/chapters/clients.tex
@@ -35,11 +35,9 @@ classpath. For Zookeeper 3.3 you only need to add the Zookeeper jar, and not
what is in the Zookeeper lib directory. You can run the following command on a
configured Accumulo system to see what it is using for its classpath.
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ACCUMULO_HOME/bin/accumulo classpath
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Another option for running your code is to put a jar file in
\texttt{\$ACCUMULO\_HOME/lib/ext}. After doing this you can use the accumulo
@@ -55,15 +53,13 @@ bin/tool.sh script to run those jobs. See the map reduce example.
All clients must first identify the Accumulo instance to which they will be
communicating. Code to do this is as follows:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
String instanceName = "myinstance";
String zooServers = "zooserver-one,zooserver-two";
Instance inst = new ZooKeeperInstance(instanceName, zooServers);
Connector conn = inst.getConnector("user", new PasswordToken("passwd"));
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
\section{Writing Data}
@@ -74,8 +70,7 @@ the appropriate TabletServers.
Mutations can be created thus:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
Text rowID = new Text("row1");
Text colFam = new Text("myColFam");
Text colQual = new Text("myColQual");
@@ -86,8 +81,7 @@ Value value = new Value("myValue".getBytes());
Mutation mutation = new Mutation(rowID);
mutation.put(colFam, colQual, colVis, timestamp, value);
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
\subsection{BatchWriter}
The BatchWriter is highly optimized to send Mutations to multiple TabletServers
@@ -98,9 +92,7 @@ batching.
Mutations are added to a BatchWriter thus:
-\small
-\begin{verbatim}
-
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
//BatchWriterConfig has reasonable defaults
BatchWriterConfig config = new BatchWriterConfig();
config.setMaxMemory(10000000L); // bytes available to batchwriter for buffering mutations
@@ -110,8 +102,7 @@ BatchWriter writer = conn.createBatchWriter("table", config)
writer.add(mutation);
writer.close();
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
An example of using the batch writer can be found at\\
accumulo/docs/examples/README.batch
@@ -152,8 +143,7 @@ To retrieve data, Clients use a Scanner, which acts like an Iterator over
keys and values. Scanners can be configured to start and stop at particular keys, and
to return a subset of the columns available.
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
// specify which visibilities we are allowed to see
Authorizations auths = new Authorizations("public");
@@ -167,8 +157,7 @@ for(Entry<Key,Value> entry : scan) {
Text row = entry.getKey().getRow();
Value value = entry.getValue();
}
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
\subsection{Isolated Scanner}
@@ -193,8 +182,11 @@ crash a tablet server. By default rows are buffered in memory, but the user
can easily supply their own buffer if they wish to buffer to disk when rows are
large.
-For an example, look at the following\\
-\texttt{examples/simple/src/main/java/org/apache/accumulo/examples/simple/isolation/InterferenceTest.java}
+For an example, look at the following
+
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
+examples/simple/src/main/java/org/apache/accumulo/examples/simple/isolation/InterferenceTest.java
+\end{verbatim}\endgroup
\subsection{BatchScanner}
@@ -208,8 +200,7 @@ BatchScanners accept a set of Ranges. It is important to note that the keys retu
by a BatchScanner are not in sorted order since the keys streamed are from multiple
TabletServers in parallel.
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
ArrayList<Range> ranges = new ArrayList<Range>();
// populate list of ranges ...
@@ -220,8 +211,7 @@ bscan.setRanges(ranges);
bscan.fetchFamily("attributes");
for(Entry<Key,Value> entry : bscan)
System.out.println(entry.getValue());
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
An example of the BatchScanner can be found at\\
accumulo/docs/examples/README.batch
@@ -243,23 +233,19 @@ Data nodes. A proxy client only needs the ability to communicate with the proxy
The configuration options for the proxy server live inside of a properties file. At
the very least, you need to supply the following properties:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
protocolFactory=org.apache.thrift.protocol.TCompactProtocol$Factory
tokenClass=org.apache.accumulo.core.client.security.tokens.PasswordToken
port=42424
instance=test
zookeepers=localhost:2181
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
You can find a sample configuration file in your distribution:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ACCUMULO_HOME/proxy/proxy.properties
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
This sample configuration file further demonstrates an ability to back the proxy server
by MockAccumulo or the MiniAccumuloCluster.
@@ -270,11 +256,9 @@ After the properties file holding the configuration is created, the proxy server
can be started using the following command in the Accumulo distribution (assuming
your properties file is named config.properties):
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ACCUMULO_HOME/bin/accumulo proxy -p config.properties
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
\subsection{Creating a Proxy Client}
@@ -285,11 +269,9 @@ location such as /usr/lib/python/site-packages/thrift.
You can find the thrift file for generating the client:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ACCUMULO_HOME/proxy/proxy.thrift
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
After a client is generated, the port specified in the configuration properties above will be
used to connect to the server.
@@ -302,13 +284,11 @@ the Thrift compiler. After initiating a connection to the Proxy (see Apache Thri
documentation for examples of connecting to a Thrift service), the methods on the
proxy client will be available. The first thing to do is log in:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
Map<String,String> password = new HashMap<String,String>();
password.put("password", "secret");
ByteBuffer token = client.login("root", password);
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Once logged in, the token returned will be used for most subsequent calls to the client.
Let's create a table, add some data, scan the table, and delete it.
@@ -316,17 +296,14 @@ Let's create a table, add some data, scan the table, and delete it.
First, create a table.
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
client.createTable(token, "myTable", true, TimeType.MILLIS);
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Next, add some data:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
// first, create a writer on the server
String writer = client.createWriter(token, "myTable", new WriterOptions());
@@ -337,14 +314,12 @@ Map<ByteBuffer, List<ColumnUpdate> cells> cellsToUpdate = //...
client.updateAndFlush(writer, "myTable", cellsToUpdate);
client.closeWriter(writer);
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Scan for the data and batch the return of the results on the server:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
String scanner = client.createScanner(token, "myTable", new ScanOptions());
ScanResult results = client.nextK(scanner, 100);
@@ -353,5 +328,4 @@ for(KeyValue keyValue : results.getResultsIterator()) {
}
client.closeScanner(scanner);
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
http://git-wip-us.apache.org/repos/asf/accumulo/blob/40299f89/docs/src/main/latex/accumulo_user_manual/chapters/development_clients.tex
----------------------------------------------------------------------
diff --git a/docs/src/main/latex/accumulo_user_manual/chapters/development_clients.tex b/docs/src/main/latex/accumulo_user_manual/chapters/development_clients.tex
index 3e8f523..fb7195d 100644
--- a/docs/src/main/latex/accumulo_user_manual/chapters/development_clients.tex
+++ b/docs/src/main/latex/accumulo_user_manual/chapters/development_clients.tex
@@ -30,26 +30,21 @@ settings between runs.
While normal interaction with the Accumulo client looks like this:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
Instance instance = new ZooKeeperInstance(...);
Connector conn = instance.getConnector(user, passwordToken);
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
To interact with the MockAccumulo, just replace the ZooKeeperInstance with MockInstance:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
Instance instance = new MockInstance();
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
-In fact, you can use the "--fake" option to the Accumulo shell and interact with
+In fact, you can use the "fake" option to the Accumulo shell and interact with
MockAccumulo:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ ./bin/accumulo shell --fake -u root -p ''
Shell - Apache Accumulo Interactive Shell
@@ -71,19 +66,16 @@ row3 cf:cq [] value3
root@fake test> scan -b row2 -e row2
row2 cf:cq [] value2
root@fake test>
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
When testing Map Reduce jobs, you can also set the Mock Accumulo on the AccumuloInputFormat
and AccumuloOutputFormat classes:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
// ... set up job configuration
AccumuloInputFormat.setMockInstance(job, "mockInstance");
AccumuloOutputFormat.setMockInstance(job, "mockInstance");
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
\section{Mini Accumulo Cluster}
@@ -96,28 +88,22 @@ up HDFS.
To start it up, you will need to supply an empty directory and a root password as arguments:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
File tempDirectory = // JUnit and Guava supply mechanisms for creating temp directories
MiniAccumuloCluster accumulo = new MiniAccumuloCluster(tempDirectory, "password");
accumulo.start();
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Once we have our mini cluster running, we will want to interact with the Accumulo client API:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
Instance instance = new ZooKeeperInstance(accumulo.getInstanceName(), accumulo.getZooKeepers());
Connector conn = instance.getConnector("root", new PasswordToken("password"));
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Upon completion of our development code, we will want to shutdown our MiniAccumuloCluster:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
accumulo.stop();
// delete your temporary folder
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
http://git-wip-us.apache.org/repos/asf/accumulo/blob/40299f89/docs/src/main/latex/accumulo_user_manual/chapters/high_speed_ingest.tex
----------------------------------------------------------------------
diff --git a/docs/src/main/latex/accumulo_user_manual/chapters/high_speed_ingest.tex b/docs/src/main/latex/accumulo_user_manual/chapters/high_speed_ingest.tex
index ea6237b..ab766d0 100644
--- a/docs/src/main/latex/accumulo_user_manual/chapters/high_speed_ingest.tex
+++ b/docs/src/main/latex/accumulo_user_manual/chapters/high_speed_ingest.tex
@@ -35,11 +35,9 @@ Pre-splitting a table ensures that there are as many tablets as desired availabl
before ingest begins to take advantage of all the parallelism possible with the cluster
hardware. Tables can be split anytime by using the shell:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
user@myinstance mytable> addsplits -sf /local_splitfile -t mytable
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
For the purposes of providing parallelism to ingest it is not necessary to create more
tablets than there are physical machines within the cluster as the aggregate ingest
@@ -78,8 +76,7 @@ data. The split points can be obtained from the shell and used by the MapReduce
RangePartitioner. Note that this is only useful if the existing table is already split
into multiple tablets.
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
user@myinstance mytable> getsplits
aa
ab
@@ -88,18 +85,15 @@ ac
zx
zy
zz
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Run the MapReduce job, using the AccumuloFileOutputFormat to create the files to
be introduced to Accumulo. Once this is complete, the files can be added to
Accumulo via the shell:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
user@myinstance mytable> importdirectory /files_dir /failures
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Note that the paths referenced are directories within the same HDFS instance over
which Accumulo is running. Accumulo places any files that failed to be added to the
http://git-wip-us.apache.org/repos/asf/accumulo/blob/40299f89/docs/src/main/latex/accumulo_user_manual/chapters/multivolume.tex
----------------------------------------------------------------------
diff --git a/docs/src/main/latex/accumulo_user_manual/chapters/multivolume.tex b/docs/src/main/latex/accumulo_user_manual/chapters/multivolume.tex
index 07e7a1f..0a0e6fe 100644
--- a/docs/src/main/latex/accumulo_user_manual/chapters/multivolume.tex
+++ b/docs/src/main/latex/accumulo_user_manual/chapters/multivolume.tex
@@ -40,14 +40,12 @@ servers. The configuration ``instance.volumes'' should be set to a
comma-separated list, using full URI references to different NameNode
servers:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
<property>
<name>instance.volumes</name>
<value>hdfs://ns1:9001,hdfs://ns2:9001</value>
</property>
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
The introduction of multiple volume support in 1.6 changed the way Accumulo
stores pointers to files. It now stores fully qualified URI references to
@@ -65,14 +63,12 @@ example configuration below will replace ns1 with nsA and ns2 with nsB in
Accumulo metadata. For this property to take effect, Accumulo will need to be
restarted.
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
<property>
<name>instance.volumes.replacements</name>
<value>hdfs://ns1:9001 hdfs://nsA:9001, hdfs://ns2:9001 hdfs://nsB:9001</value>
</property>
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Using viewfs or HA namenode, introduced in Hadoop 2, offers another option for
managing the fully qualified URIs stored in Accumulo. Viewfs and HA namenode
http://git-wip-us.apache.org/repos/asf/accumulo/blob/40299f89/docs/src/main/latex/accumulo_user_manual/chapters/security.tex
----------------------------------------------------------------------
diff --git a/docs/src/main/latex/accumulo_user_manual/chapters/security.tex b/docs/src/main/latex/accumulo_user_manual/chapters/security.tex
index a5c4db3..83cfb21 100644
--- a/docs/src/main/latex/accumulo_user_manual/chapters/security.tex
+++ b/docs/src/main/latex/accumulo_user_manual/chapters/security.tex
@@ -29,8 +29,7 @@ When mutations are applied, users can specify a security label for each value. T
done as the Mutation is created by passing a ColumnVisibility object to the put()
method:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
Text rowID = new Text("row1");
Text colFam = new Text("myColFam");
Text colQual = new Text("myColQual");
@@ -41,8 +40,7 @@ Value value = new Value("myValue");
Mutation mutation = new Mutation(rowID);
mutation.put(colFam, colQual, colVis, timestamp, value);
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
\section{Security Label Expression Syntax}
@@ -54,18 +52,15 @@ groups of tokens together.
For example, suppose within our organization we want to label our data values with
security labels defined in terms of user roles. We might have tokens such as:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
admin
audit
system
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
These can be specified alone or combined using logical operators:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
// Users must have admin privileges:
admin
@@ -77,8 +72,7 @@ admin|audit
// Users must have audit and one or both of admin or system
(admin|system)&audit
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
When both \verb^|^ and \verb^&^ operators are used, parentheses must be used to specify
precedence of the operators.
@@ -93,14 +87,12 @@ results sent back to the client.
Authorizations are specified as a comma-separated list of tokens the user possesses:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
// user possesses both admin and system level access
Authorizations auths = new Authorizations("admin","system");
Scanner s = connector.createScanner("table", auths);
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
\section{User Authorizations}
@@ -118,11 +110,9 @@ enable this constraint. For existing tables use the following shell command to
enable the visibility constraint. Ensure the constraint number does not
conflict with any existing constraints.
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
config -t table -s table.constraint.1=org.apache.accumulo.core.security.VisibilityConstraint
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Any user with the alter table permission can add or remove this constraint.
This constraint is not applied to bulk imported data; if this is a concern, then
http://git-wip-us.apache.org/repos/asf/accumulo/blob/40299f89/docs/src/main/latex/accumulo_user_manual/chapters/shell.tex
----------------------------------------------------------------------
diff --git a/docs/src/main/latex/accumulo_user_manual/chapters/shell.tex b/docs/src/main/latex/accumulo_user_manual/chapters/shell.tex
index 25ac8a7..f3c11ff 100644
--- a/docs/src/main/latex/accumulo_user_manual/chapters/shell.tex
+++ b/docs/src/main/latex/accumulo_user_manual/chapters/shell.tex
@@ -21,18 +21,14 @@ configuration settings.
The shell can be started by the following command:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ACCUMULO_HOME/bin/accumulo shell -u [username]
-\end{verbatim}
-
-\normalsize
+\end{verbatim}\endgroup
The shell will prompt for the corresponding password to the username specified
and then display the following prompt:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
Shell - Apache Accumulo Interactive Shell
-
- version 1.5
@@ -41,16 +37,14 @@ Shell - Apache Accumulo Interactive Shell
-
- type 'help' for a list of available commands
-
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
\section{Basic Administration}
The Accumulo shell can be used to create and delete tables, as well as to configure
table and instance specific options.
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
root@myinstance> tables
accumulo.metadata
accumulo.root
@@ -73,14 +67,12 @@ deletetable { testtable } (yes|no)? yes
Table: [testtable] has been deleted.
root@myinstance>
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
The Shell can also be used to insert updates and scan tables. This is useful for
inspecting tables.
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
root@myinstance mytable> scan
root@myinstance mytable> insert row1 colf colq value1
@@ -88,8 +80,7 @@ insert successful
root@myinstance mytable> scan
row1 colf:colq [] value1
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
The value in brackets "[]" would be the visibility labels. Since none were used, this is empty for this row.
You can use the "-st" option to scan to see the timestamp for the cell, too.
@@ -99,30 +90,25 @@ You can use the "-st" option to scan to see the timestamp for the cell, too.
The \textbf{compact} command instructs Accumulo to schedule a compaction of the table during which
files are consolidated and deleted entries are removed.
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
root@myinstance mytable> compact -t mytable
07 16:13:53,201 [shell.Shell] INFO : Compaction of table mytable started for given range
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
The \textbf{flush} command instructs Accumulo to write all entries currently in memory for a given table
to disk.
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
root@myinstance mytable> flush -t mytable
07 16:14:19,351 [shell.Shell] INFO : Flush of table mytable
initiated...
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
\section{User Administration}
The Shell can be used to add, remove, and grant privileges to users.
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
root@myinstance mytable> createuser bob
Enter new password for 'bob': *********
Please confirm new password for 'bob': *********
@@ -148,6 +134,5 @@ bob@myinstance bobstable> user root
Enter current password for 'root': *********
root@myinstance bobstable> revoke System.CREATE_TABLE -s -u bob
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
http://git-wip-us.apache.org/repos/asf/accumulo/blob/40299f89/docs/src/main/latex/accumulo_user_manual/chapters/table_configuration.tex
----------------------------------------------------------------------
diff --git a/docs/src/main/latex/accumulo_user_manual/chapters/table_configuration.tex b/docs/src/main/latex/accumulo_user_manual/chapters/table_configuration.tex
index d601fae..0e0dad4 100644
--- a/docs/src/main/latex/accumulo_user_manual/chapters/table_configuration.tex
+++ b/docs/src/main/latex/accumulo_user_manual/chapters/table_configuration.tex
@@ -18,7 +18,9 @@
Accumulo tables have a few options that can be configured to alter the default
behavior of Accumulo as well as improve performance based on the data stored.
-These include locality groups, constraints, bloom filters, iterators, and block cache.
+These include locality groups, constraints, bloom filters, iterators, and block
+cache. For a complete list of available configuration options, see
+Appendix~\ref{app:config}.
\section{Locality Groups}
Accumulo supports storing sets of column families separately on disk to allow
@@ -33,21 +35,18 @@ programmatically as follows:
\subsection{Managing Locality Groups via the Shell}
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
usage: setgroups <group>=<col fam>{,<col fam>}{ <group>=<col fam>{,<col
fam>}} [-?] -t <table>
user@myinstance mytable> setgroups group_one=colf1,colf2 -t mytable
user@myinstance mytable> getgroups -t mytable
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
\subsection{Managing Locality Groups via the Client API}
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
Connector conn;
HashMap<String,Set<Text>> localityGroups = new HashMap<String, Set<Text>>();
@@ -68,8 +67,7 @@ conn.tableOperations().setLocalityGroups("mytable", localityGroups);
// existing locality groups can be obtained as follows
Map<String, Set<Text>> groups =
conn.tableOperations().getLocalityGroups("mytable");
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
The assignment of Column Families to Locality Groups can be changed anytime. The
physical movement of column families into their new locality groups takes place via
@@ -77,11 +75,9 @@ the periodic Major Compaction process that takes place continuously in the
background. Major Compaction can also be scheduled to take place immediately
through the shell:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
user@myinstance mytable> compact -t mytable
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
\section{Constraints}
@@ -92,18 +88,20 @@ client.
Constraints can be enabled by setting a table property as follows:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
user@myinstance mytable> constraint -t mytable -a com.test.ExampleConstraint com.test.AnotherConstraint
user@myinstance mytable> constraint -l
com.test.ExampleConstraint=1
com.test.AnotherConstraint=2
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Currently there are no general-purpose constraints provided with the Accumulo
distribution. New constraints can be created by writing a Java class that implements
-the org.apache.accumulo.core.constraints.Constraint interface.
+the following interface:
+
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
+org.apache.accumulo.core.constraints.Constraint
+\end{verbatim}\endgroup
To deploy a new constraint, create a jar file containing the class implementing the
new constraint and place it in the lib directory of the Accumulo installation. New
@@ -122,11 +120,9 @@ This can speed up lookups considerably.
To enable bloom filters, enter the following command in the Shell:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
user@myinstance> config -t mytable -s table.bloom.enabled=true
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
An extensive example of using Bloom Filters can be found at\\
\texttt{accumulo/docs/examples/README.bloom} .
@@ -149,52 +145,44 @@ compaction scopes. If the Iterator implements the OptionDescriber interface, the
setiter command can be used which will interactively prompt the user to provide
values for the given necessary options.
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
usage: setiter [-?] -ageoff | -agg | -class <name> | -regex |
-reqvis | -vers [-majc] [-minc] [-n <itername>] -p <pri>
[-scan] [-t <table>]
user@myinstance mytable> setiter -t mytable -scan -p 15 -n myiter -class com.company.MyIterator
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
The config command can always be used to manually configure iterators which is useful
in cases where the Iterator does not implement the OptionDescriber interface.
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
config -t mytable -s table.iterator.scan.myiter=15,com.company.MyIterator
config -t mytable -s table.iterator.minc.myiter=15,com.company.MyIterator
config -t mytable -s table.iterator.majc.myiter=15,com.company.MyIterator
config -t mytable -s table.iterator.scan.myiter.opt.myoptionname=myoptionvalue
config -t mytable -s table.iterator.minc.myiter.opt.myoptionname=myoptionvalue
config -t mytable -s table.iterator.majc.myiter.opt.myoptionname=myoptionvalue
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
\subsection{Setting Iterators Programmatically}
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
scanner.addScanIterator(new IteratorSetting(
15, // priority
"myiter", // name this iterator
"com.company.MyIterator" // class name
));
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Some iterators take additional parameters from client code, as in the following
example:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
IteratorSetting iter = new IteratorSetting(...);
iter.addOption("myoptionname", "myoptionvalue");
scanner.addScanIterator(iter);
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Tables support separate Iterator settings to be applied at scan time, upon minor
compaction and upon major compaction. For most uses, tables will have identical
@@ -216,26 +204,22 @@ given date. The default is to return the one most recent version.
The version policy can be changed by changing the VersioningIterator options for a
table as follows:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
user@myinstance mytable> config -t mytable -s table.iterator.scan.vers.opt.maxVersions=3
user@myinstance mytable> config -t mytable -s table.iterator.minc.vers.opt.maxVersions=3
user@myinstance mytable> config -t mytable -s table.iterator.majc.vers.opt.maxVersions=3
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
When a table is created, by default it is configured to use the
VersioningIterator and keep one version. A table can be created without the
VersioningIterator with the -ndi option in the shell. Also the Java API
has the following method
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
connector.tableOperations.create(String tableName, boolean limitVersion).
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
\subsubsection{Logical Time}
@@ -250,11 +234,9 @@ always move forward and never backwards.
A table can be configured to use logical timestamps at creation time as follows:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
user@myinstance> createtable -tl logical
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
\subsubsection{Deletes}
Deletes are special keys in Accumulo that get sorted along with all the other data.
@@ -275,14 +257,16 @@ The AgeOff filter can be configured to remove data older than a certain date or
amount of time from the present. The following example sets a table to delete
everything inserted over 3 seconds ago:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
user@myinstance> createtable filtertest
user@myinstance filtertest> setiter -t filtertest -scan -minc -majc -p 10 -n myfilter -ageoff
AgeOffFilter removes entries with timestamps more than <ttl> milliseconds old
-----------> set org.apache.accumulo.core.iterators.user.AgeOffFilter parameter negate, default false keeps k/v that pass accept method, true rejects k/v that pass accept method:
-----------> set org.apache.accumulo.core.iterators.user.AgeOffFilter parameter ttl, time to live (milliseconds): 3000
-----------> set org.apache.accumulo.core.iterators.user.AgeOffFilter parameter currentTime, if set, use the given value as the absolute time in milliseconds as the current time of day:
+----------> set org.apache.accumulo.core.iterators.user.AgeOffFilter parameter negate, default false
+ keeps k/v that pass accept method, true rejects k/v that pass accept method:
+----------> set org.apache.accumulo.core.iterators.user.AgeOffFilter parameter ttl, time to
+ live (milliseconds): 3000
+----------> set org.apache.accumulo.core.iterators.user.AgeOffFilter parameter currentTime, if set,
+ use the given value as the absolute time in milliseconds as the current time of day:
user@myinstance filtertest>
user@myinstance filtertest> scan
user@myinstance filtertest> insert foo a b c
@@ -291,13 +275,11 @@ foo a:b [] c
user@myinstance filtertest> sleep 4
user@myinstance filtertest> scan
user@myinstance filtertest>
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
To see the iterator settings for a table, use:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
user@example filtertest> config -t filtertest -f iterator
---------+---------------------------------------------+------------------
SCOPE | NAME | VALUE
@@ -314,9 +296,8 @@ table | table.iterator.scan.myfilter .............. | 10,org.apache.accumulo.
table | table.iterator.scan.myfilter.opt.ttl ...... | 3000
table | table.iterator.scan.vers .................. | 20,org.apache.accumulo.core.iterators.VersioningIterator
table | table.iterator.scan.vers.opt.maxVersions .. | 1
----------+------------------------------------------+------------------
-\end{verbatim}
-\normalsize
+---------+---------------------------------------------+------------------
+\end{verbatim}\endgroup
\subsection{Combiners}
@@ -329,26 +310,21 @@ the values associated with a particular key.
For example, if a summing combiner were configured on a table and the following
mutations were inserted:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
Row Family Qualifier Timestamp Value
rowID1 colfA colqA 20100101 1
rowID1 colfA colqA 20100102 1
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
The table would reflect only one aggregate value:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
rowID1 colfA colqA - 2
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Combiners can be enabled for a table using the setiter command in the shell. Below is an example.
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
root@a14 perDayCounts> setiter -t perDayCounts -p 10 -scan -minc -majc -n daycount
-class org.apache.accumulo.core.iterators.user.SummingCombiner
TypedValueCombiner can interpret Values as a variety of number encodings
@@ -367,8 +343,7 @@ root@a14 perDayCounts> scan
bar day:20080101 [] 2
foo day:20080101 [] 2
foo day:20080103 [] 1
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Accumulo includes some useful Combiners out of the box. To find these look in
the\\ \texttt{org.apache.accumulo.core.iterators.user} package.
@@ -377,8 +352,11 @@ Additional Combiners can be added by creating a Java class that extends\\
\texttt{org.apache.accumulo.core.iterators.Combiner} and adding a jar containing that
class to Accumulo's lib/ext directory.
-An example of a Combiner can be found under\\
+An example of a Combiner can be found under
+
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
accumulo/examples/simple/main/java/org/apache/accumulo/examples/simple/combiner/StatsCombiner.java
+\end{verbatim}\endgroup
\section{Block Cache}
@@ -390,15 +368,18 @@ Typical queries to Accumulo result in a binary search over several index blocks
The block cache can be configured on a per-table basis, and all tablets hosted on a tablet server share a single resource pool.
To configure the size of the tablet server's block cache, set the following properties:
-\begin{verbatim}
+
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
tserver.cache.data.size: Specifies the size of the cache for file data blocks.
tserver.cache.index.size: Specifies the size of the cache for file indices.
-\end{verbatim}
+\end{verbatim}\endgroup
+
To enable the block cache for your table, set the following properties:
-\begin{verbatim}
+
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
table.cache.block.enable: Determines whether file (data) block cache is enabled.
table.cache.index.enable: Determines whether index cache is enabled.
-\end{verbatim}
+\end{verbatim}\endgroup
The block cache can have a significant effect on alleviating hot spots, as well as reducing query latency.
It is enabled by default for the metadata tables.
@@ -413,9 +394,9 @@ decide which tablets to compact and which files within a tablet to compact.
This decision is made using the compaction ratio, which is configurable on a
per table basis. To configure this ratio modify the following property:
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
table.compaction.major.ratio
-\end{verbatim}
+\end{verbatim}\endgroup
Increasing this ratio will result in more files per tablet and less compaction
work. More files per tablet means higher query latency. So adjusting
@@ -432,16 +413,16 @@ compaction is triggered or there are no files left to consider.
The number of background threads tablet servers use to run major compactions is
configurable. To configure this modify the following property:
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
tserver.compaction.major.concurrent.max
-\end{verbatim}
+\end{verbatim}\endgroup
Also, the number of threads tablet servers use for minor compactions is
configurable. To configure this modify the following property:
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
tserver.compaction.minor.concurrent.max
-\end{verbatim}
+\end{verbatim}\endgroup
The numbers of minor and major compactions running and queued are visible on the
Accumulo monitor page. This allows you to see if compactions are backing up
@@ -465,9 +446,9 @@ be done.
Another option to deal with the files per tablet growing too large is to adjust
the following property:
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
table.file.max
-\end{verbatim}
+\end{verbatim}\endgroup
When a tablet reaches this number of files and needs to flush its in-memory
data to disk, it will choose to do a merging minor compaction. A merging minor
@@ -503,12 +484,10 @@ is new, or small, you can add split points and generate new tablets.
In the shell:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
root@myinstance> createtable newTable
root@myinstance> addsplits -t newTable g n t
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
This will create a new table with 4 tablets. The table will be split
on the letters ``g'', ``n'', and ``t'' which will work nicely if the
@@ -532,54 +511,44 @@ Accumulo supports tablet merging, which can be used to reduce
the number of split points. The following command will merge all rows
from ``A'' to ``Z'' into a single tablet:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
root@myinstance> merge -t myTable -s A -e Z
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
If the result of a merge produces a tablet that is larger than the
configured split size, the tablet may be split by the tablet server.
Be sure to increase your tablet size prior to any merges if the goal
is to have larger tablets:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
root@myinstance> config -t myTable -s table.split.threshold=2G
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
In order to merge small tablets, you can ask Accumulo to merge
sections of a table smaller than a given size.
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
root@myinstance> merge -t myTable -s 100M
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
By default, small tablets will not be merged into tablets that are
already larger than the given size. This can leave isolated small
tablets. To force small tablets to be merged into larger tablets use
-the ``--{}--force'' option:
+the ``-{}-force'' option:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
root@myinstance> merge -t myTable -s 100M --force
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Merging away small tablets works on one section at a time. If your
table contains many sections of small split points, or you are
attempting to change the split size of the entire table, it will be
faster to set the split point and merge the entire table:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
root@myinstance> config -t myTable -s table.split.threshold=256M
root@myinstance> merge -t myTable
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
\section{Delete Range}
@@ -590,24 +559,20 @@ this date, say to remove all the data older than the current year.
Accumulo supports a delete range operation which efficiently
removes data between two rows. For example:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
root@myinstance> deleterange -t myTable -s 2010 -e 2011
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
This will delete all rows starting with ``2010'' and it will stop at
any row starting with ``2011''. You can delete any data prior to 2011
with:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
root@myinstance> deleterange -t myTable -e 2011 --force
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
The shell will not allow you to delete an unbounded range (no start)
-unless you provide the ``--{}--force'' option.
+unless you provide the ``-{}-force'' option.
Range deletion is implemented using splits at the given start/end
positions, and will affect the number of splits in the table.
@@ -636,8 +601,7 @@ created, only the user that created the clone can read and write to it.
In the following example we see that data inserted after the clone operation is
not visible in the clone.
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
root@a14> createtable people
root@a14 people> insert 890435 name last Doe
root@a14 people> insert 890435 name first John
@@ -654,8 +618,7 @@ root@a14 test> scan
890435 name:first [] John
890435 name:last [] Doe
root@a14 test>
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
The du command in the shell shows how much space a table is using in HDFS.
This command can also show how much overlapping space two cloned tables have in
inserted into cic and it is flushed, du shows the two tables still share 428M but
cic has 226 bytes to itself. Finally, table cic is compacted and then du shows
that each table uses 428M.
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
root@a14> du ci
428,482,573 [ci]
root@a14> clonetable ci cic
@@ -688,8 +650,7 @@ root@a14 cic> du ci cic
428,482,573 [ci]
428,482,612 [cic]
root@a14 cic>
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
\section{Exporting Tables}
http://git-wip-us.apache.org/repos/asf/accumulo/blob/40299f89/docs/src/main/latex/accumulo_user_manual/chapters/table_design.tex
----------------------------------------------------------------------
diff --git a/docs/src/main/latex/accumulo_user_manual/chapters/table_design.tex b/docs/src/main/latex/accumulo_user_manual/chapters/table_design.tex
index a833115..ff1cebd 100644
--- a/docs/src/main/latex/accumulo_user_manual/chapters/table_design.tex
+++ b/docs/src/main/latex/accumulo_user_manual/chapters/table_design.tex
@@ -26,17 +26,14 @@ is to select a unique identifier as the row ID for each entity to be stored and
all the other attributes to be tracked to be columns under this row ID. For example,
if we have the following data in a comma-separated file:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
userid,age,address,account-balance
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
We might choose to store this data using the userid as the rowID, the column
name in the column family, and a blank column qualifier:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
Mutation m = new Mutation(userid);
final String column_qualifier = "";
m.put("age", column_qualifier, age);
@@ -44,14 +41,12 @@ m.put("address", column_qualifier, address);
m.put("balance", column_qualifier, account_balance);
writer.add(m);
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
We could then retrieve any of the columns for a specific userid by specifying the
userid as the range of a scanner and fetching specific columns:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
Range r = new Range(userid, userid); // single row
Scanner s = conn.createScanner("userdata", auths);
s.setRange(r);
@@ -59,8 +54,7 @@ s.fetchColumnFamily(new Text("age"));
for(Entry<Key,Value> entry : s)
System.out.println(entry.getValue().toString());
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
\section{RowID Design}
@@ -69,45 +63,39 @@ that is optimal for anticipated access patterns. A good example of this is rever
the order of components of internet domain names in order to group rows of the
same parent domain together:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
com.google.code
com.google.labs
com.google.mail
com.yahoo.mail
com.yahoo.research
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Some data may result in the creation of very large rows - rows with many columns.
In this case the table designer may wish to split up these rows for better load
balancing while keeping them sorted together for scanning purposes. This can be
done by appending a random substring at the end of the row:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
com.google.code_00
com.google.code_01
com.google.code_02
com.google.labs_00
com.google.mail_00
com.google.mail_01
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
It could also be done by adding a string representation of some period of time such as date to the week
or month:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
com.google.code_201003
com.google.code_201004
com.google.code_201005
com.google.labs_201003
com.google.mail_201003
com.google.mail_201004
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Appending dates provides the additional capability of restricting a scan to a given
date range.
@@ -121,8 +109,7 @@ be better seeked or searched in ranges.
The lexicoders are a standard and extensible way of encoding Java types. Here's an example
of a lexicoder that encodes a Java Date object so that it sorts lexicographically:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
// create new date lexicoder
DateLexicoder dateEncoder = new DateLexicoder();
@@ -133,14 +120,12 @@ Date hour = new Date(epoch - (epoch % 3600000));
// encode the rowId so that it is sorted lexicographically
Mutation mutation = new Mutation(dateEncoder.encode(hour));
mutation.put(new Text("colf"), new Text("colq"), new Value(new byte[]{}));
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
If we want to return the most recent date first, we can reverse the sort order
with the reverse lexicoder:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
// create new date lexicoder and reverse lexicoder
DateLexicoder dateEncoder = new DateLexicoder();
ReverseLexicoder reverseEncoder = new ReverseLexicoder(dateEncoder);
@@ -152,8 +137,7 @@ Date hour = new Date(epoch - (epoch % 3600000));
// encode the rowId so that it sorts in reverse lexicographic order
Mutation mutation = new Mutation(reverseEncoder.encode(hour));
mutation.put(new Text("colf"), new Text("colq"), new Value(new byte[]{}));
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
\section{Indexing}
@@ -190,8 +174,7 @@ BatchScanner, which performs the lookups in multiple threads to multiple servers
and returns an Iterator over all the rows retrieved. The rows returned are NOT in
sorted order, as is the case with the basic Scanner interface.
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
// first we scan the index for IDs of rows matching our query
Text term = new Text("mySearchTerm");
@@ -213,8 +196,7 @@ bscan.fetchColumnFamily(new Text("attributes"));
for(Entry<Key,Value> entry : bscan)
System.out.println(entry.getValue());
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
One advantage of the dynamic schema capabilities of Accumulo is that different
fields may be indexed into the same physical table. However, it may be necessary to
@@ -333,8 +315,7 @@ bins, a search of all documents must search every bin. We can use the BatchScann
to scan all bins in parallel. The Intersecting Iterator should be enabled on a
BatchScanner within user query code as follows:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
Text[] terms = {new Text("the"), new Text("white"), new Text("house")};
BatchScanner bs = conn.createBatchScanner(table, auths, 20);
@@ -346,8 +327,7 @@ bs.setRanges(Collections.singleton(new Range()));
for(Entry<Key,Value> entry : bs) {
System.out.println(" " + entry.getKey().getColumnQualifier());
}
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
This code effectively has the BatchScanner scan all tablets of a table, looking for
documents that match all the given terms. Because all tablets are being scanned for
http://git-wip-us.apache.org/repos/asf/accumulo/blob/40299f89/docs/src/main/latex/accumulo_user_manual/chapters/troubleshooting.tex
----------------------------------------------------------------------
diff --git a/docs/src/main/latex/accumulo_user_manual/chapters/troubleshooting.tex b/docs/src/main/latex/accumulo_user_manual/chapters/troubleshooting.tex
index a6a86dc..0628e24 100644
--- a/docs/src/main/latex/accumulo_user_manual/chapters/troubleshooting.tex
+++ b/docs/src/main/latex/accumulo_user_manual/chapters/troubleshooting.tex
@@ -1,3 +1,4 @@
+
% Licensed to the Apache Software Foundation (ASF) under one or more
% contributor license agreements. See the NOTICE file distributed with
% this work for additional information regarding copyright ownership.
@@ -65,11 +66,9 @@ from your browser.
It is sometimes helpful to use a text-only browser to sanity-check the
monitor while on the machine running the monitor:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ links http://localhost:50095
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
A. Verify that you are not firewalled from the monitor if it is running on a remote host.
@@ -92,22 +91,18 @@ This troubleshooting guide does not cover HDFS, but in general, you
want to make sure that all the datanodes are running and an fsck check
finds the file system clean:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ hadoop fsck /accumulo
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
On a larger cluster, you may need to increase the number of Xceivers:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
<property>
<name>dfs.datanode.max.xcievers</name>
<value>4096</value>
</property>
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
A. Verify HDFS is healthy, check the datanode logs.
@@ -119,13 +114,11 @@ Zookeeper is also a distributed service. You will need to ensure that
it is up. You can run the zookeeper command line tool to connect to
any one of the zookeeper servers:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ zkCli.sh -server zoohost
...
[zk: zoohost:2181(CONNECTED) 0]
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
It is important to see the word \texttt{CONNECTED}! If you only see
\texttt{CONNECTING} you will need to diagnose zookeeper errors.
@@ -147,8 +140,7 @@ You can check the election status and connection status of clients by
asking the zookeeper nodes for their status. You connect to zookeeper
and ask it with the four-letter ``stat'' command:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ nc zoohost 2181
stat
Zookeeper version: 3.4.5-1392090, built on 09/30/2012 17:52 GMT
@@ -165,8 +157,7 @@ Zxid: 0x621a3b
Mode: standalone
Node count: 22524
$
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
A. Check zookeeper status, verify that it has a quorum, and has not exceeded maxClientCnxns.
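If the ``stat'' output is collected by a script, the fields of interest can be pulled out programmatically. The following is a rough sketch under the assumption that the text returned by the command has already been captured into a string; the class \texttt{ZkStatSketch} is a hypothetical helper, not a ZooKeeper or Accumulo API, and it only understands the ``name: value'' lines that appear in the session above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: ZkStatSketch is a hypothetical helper,
// not part of ZooKeeper or Accumulo.
final class ZkStatSketch {

    // Split each line on the first ": " and keep the pairs in order.
    // Lines without that separator are ignored.
    static Map<String, String> parse(String statOutput) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String line : statOutput.split("\\r?\\n")) {
            int idx = line.indexOf(": ");
            if (idx > 0) {
                String name = line.substring(0, idx).trim();
                String value = line.substring(idx + 2).trim();
                fields.put(name, value);
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // Sample lines copied from the session shown above.
        String sample = String.join("\n",
            "Zookeeper version: 3.4.5-1392090, built on 09/30/2012 17:52 GMT",
            "Zxid: 0x621a3b",
            "Mode: standalone",
            "Node count: 22524");
        Map<String, String> fields = parse(sample);
        System.out.println("Mode = " + fields.get("Mode"));
        System.out.println("Node count = " + fields.get("Node count"));
    }
}
```

A monitoring script could then inspect the \texttt{Mode} field to see what role the server reports.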
@@ -231,11 +222,10 @@ machine, this collection can take a long time. This happens more
frequently when the JVM is getting low on free memory. Check the logs
of the tablet server. You will see lines like this:
-\small
-\begin{verbatim}
-2013-06-20 13:43:20,607 [tabletserver.TabletServer] DEBUG: gc ParNew=0.00(+0.00) secs ConcurrentMarkSweep=0.00(+0.00) secs freemem=1,868,325,952(+1,868,325,952) totalmem=2,040,135,680
-\end{verbatim}
-\normalsize
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
+2013-06-20 13:43:20,607 [tabletserver.TabletServer] DEBUG: gc ParNew=0.00(+0.00) secs
+ ConcurrentMarkSweep=0.00(+0.00) secs freemem=1,868,325,952(+1,868,325,952) totalmem=2,040,135,680
+\end{verbatim}\endgroup
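The ``freemem'' and ``totalmem'' figures in a log line like the one above can be extracted to track how much headroom the JVM has. This is a small sketch using only \texttt{java.util.regex}; the class \texttt{GcLogLineSketch} is a hypothetical name, and the patterns assume the field layout shown in the sample line (comma-grouped digits after ``freemem='' and ``totalmem='').

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: GcLogLineSketch is a hypothetical helper,
// not part of Accumulo.
final class GcLogLineSketch {

    private static final Pattern FREEMEM = Pattern.compile("freemem=([0-9,]+)");
    private static final Pattern TOTALMEM = Pattern.compile("totalmem=([0-9,]+)");

    // Find the first match of the pattern and parse its digits,
    // dropping the thousands separators.
    private static long extract(Pattern pattern, String line) {
        Matcher m = pattern.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("field not found: " + pattern.pattern());
        }
        return Long.parseLong(m.group(1).replace(",", ""));
    }

    static long freeMem(String line) {
        return extract(FREEMEM, line);
    }

    static long totalMem(String line) {
        return extract(TOTALMEM, line);
    }

    // Fraction of the reported total memory that is currently free.
    static double freeFraction(String line) {
        return (double) freeMem(line) / (double) totalMem(line);
    }

    public static void main(String[] args) {
        // Sample line copied from the tablet server log excerpt above.
        String line = "2013-06-20 13:43:20,607 [tabletserver.TabletServer] DEBUG: "
            + "gc ParNew=0.00(+0.00) secs ConcurrentMarkSweep=0.00(+0.00) secs "
            + "freemem=1,868,325,952(+1,868,325,952) totalmem=2,040,135,680";
        System.out.println("free fraction = " + freeFraction(line));
    }
}
```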
When ``freemem'' becomes small relative to the amount of memory
needed, the JVM will spend more time finding free memory than
@@ -253,8 +243,7 @@ more.
There's a class that will examine an accumulo storage file and print
out basic metadata.
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ ./bin/accumulo org.apache.accumulo.core.file.rfile.PrintInfo /accumulo/tables/1/default_tablet/A000000n.rf
2013-07-16 08:17:14,778 [util.NativeCodeLoader] INFO : Loaded the native-hadoop library
Locality group : <DEFAULT>
@@ -264,7 +253,11 @@ Locality group : <DEFAULT>
First key : 288be9ab4052fe9e span:34078a86a723e5d3:3da450f02108ced5 [] 1373373521623 false
Last key : start:13fc375709e id:615f5ee2dd822d7a [] 1373373821660 false
Num entries : 466
- Column families : [waitForCommits, start, md major compactor 1, md major compactor 2, md major compactor 3, bringOnline, prep, md major compactor 4, md major compactor 5, md root major compactor 3, minorCompaction, wal, compactFiles, md root major compactor 4, md root major compactor 1, md root major compactor 2, compact, id, client:update, span, update, commit, write, majorCompaction]
+ Column families : [waitForCommits, start, md major compactor 1, md major compactor 2, md major compactor 3,
+ bringOnline, prep, md major compactor 4, md major compactor 5, md root major compactor 3,
+ minorCompaction, wal, compactFiles, md root major compactor 4, md root major compactor 1,
+ md root major compactor 2, compact, id, client:update, span, update, commit, write,
+ majorCompaction]
Meta block : BCFile.index
Raw size : 4 bytes
@@ -275,13 +268,11 @@ Meta block : RFile.index
Raw size : 780 bytes
Compressed size : 344 bytes
Compression type : gz
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
When trying to diagnose problems related to key size, the PrintInfo tool can provide a histogram of the individual key sizes:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ ./bin/accumulo org.apache.accumulo.core.file.rfile.PrintInfo --histogram /accumulo/tables/1/default_tablet/A000000n.rf
...
Up to size count %-age
@@ -295,18 +286,15 @@ Up to size count %-age
100000000 : 0 0.00%
1000000000 : 0 0.00%
10000000000 : 0 0.00%
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Likewise, PrintInfo will dump the key-value pairs and show you the contents of the RFile:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ ./bin/accumulo org.apache.accumulo.core.file.rfile.PrintInfo --dump /accumulo/tables/1/default_tablet/A000000n.rf
row columnFamily:columnQualifier [visibility] timestamp deleteFlag -> Value
...
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Q. Accumulo is not showing me any data!
@@ -322,34 +310,28 @@ does not provide the normal access controls in Accumulo.
If you would like to back up, or otherwise examine, the contents of Zookeeper, there are commands to dump and load to/from XML.
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ ./bin/accumulo org.apache.accumulo.server.util.DumpZookeeper --root /accumulo >dump.xml
$ ./bin/accumulo org.apache.accumulo.server.util.RestoreZookeeper --overwrite < dump.xml
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Q. How can I get the information in the monitor page for my cluster monitoring system?
A. Use GetMasterStats:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ ./bin/accumulo org.apache.accumulo.test.GetMasterStats | grep Load
OS Load Average: 0.27
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Q. The monitor page is showing an offline tablet. How can I find out which tablet it is?
A. Use FindOfflineTablets:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ ./bin/accumulo org.apache.accumulo.server.util.FindOfflineTablets
2<<@(null,null,localhost:9997) is UNASSIGNED #walogs:2
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Here's what the output means:
@@ -372,14 +354,12 @@ Q. How can I be sure that the metadata tables are up and consistent?
A. \texttt{CheckForMetadataProblems} will verify the start/end of
every tablet matches, and the start and stop for the table is empty:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ ./bin/accumulo org.apache.accumulo.server.util.CheckForMetadataProblems -u root --password
Enter the connection password:
All is well for table !0
All is well for table 1
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Q. My hadoop cluster has lost a file due to a NameNode failure. How can I remove the file?
@@ -387,24 +367,21 @@ A. There's a utility that will check every file reference and ensure
that the file exists in HDFS. Optionally, it will remove the
reference:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ ./bin/accumulo org.apache.accumulo.server.util.RemoveEntriesForMissingFiles -u root --password
Enter the connection password:
-2013-07-16 13:10:57,293 [util.RemoveEntriesForMissingFiles] INFO : File /accumulo/tables/2/default_tablet/F0000005.rf is missing
+2013-07-16 13:10:57,293 [util.RemoveEntriesForMissingFiles] INFO : File /accumulo/tables/2/default_tablet/F0000005.rf
+ is missing
2013-07-16 13:10:57,296 [util.RemoveEntriesForMissingFiles] INFO : 1 files of 3 missing
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Q. I have many entries in zookeeper for old instances I no longer need. How can I remove them?
A. Use CleanZookeeper:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ ./bin/accumulo org.apache.accumulo.server.util.CleanZookeeper
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
This command will not delete the instance pointed to by the local \texttt{conf/accumulo-site.xml} file.
@@ -412,39 +389,33 @@ Q. I need to decommission a node. How do I stop the tablet server on it?
A. Use the admin command:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ ./bin/accumulo admin stop hostname:9997
2013-07-16 13:15:38,403 [util.Admin] INFO : Stopping server 12.34.56.78:9997
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Q. I cannot log in to a tablet server host, and the tablet server will not shut down. How can I kill the server?
A. Sometimes you can kill a ``stuck'' tablet server by deleting its lock in zookeeper:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ ./bin/accumulo org.apache.accumulo.server.util.TabletServerLocks --list
127.0.0.1:9997 TSERV_CLIENT=127.0.0.1:9997
$ ./bin/accumulo org.apache.accumulo.server.util.TabletServerLocks -delete 127.0.0.1:9997
$ ./bin/accumulo org.apache.accumulo.server.util.TabletServerLocks -list
127.0.0.1:9997 null
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
You can find the master and instance id for any accumulo instances using the same zookeeper instance:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ ./bin/accumulo org.apache.accumulo.server.util.ListInstances
INFO : Using ZooKeepers localhost:2181
Instance Name | Instance ID | Master
---------------------+--------------------------------------+-------------------------------
"test" | 6140b72e-edd8-4126-b2f5-e74a8bbe323b | 127.0.0.1:9999
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
\section{System Metadata Tables}
\label{sec:metadata}
@@ -458,8 +429,7 @@ table, such as its location and write-ahead logs, are stored in ZooKeeper.
Let's create a table and put some data into it:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
shell> createtable test
shell> tables -l
accumulo.metadata => !0
@@ -468,13 +438,11 @@ test => 2
trace => 1
shell> insert a b c d
shell> flush -w
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Now let's take a look at the metadata for this table:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
shell> table accumulo.metadata
shell> scan -b 3; -e 3<
3< file:/default_tablet/F000009y.rf [] 186,1
@@ -486,30 +454,53 @@ shell> scan -b 3; -e 3<
3< srv:lock [] tservers/127.0.0.1:9997/zlock-0000000001$13fe86cd27101e5
3< srv:time [] M1373998392323
3< ~tab:~pr [] \x00
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Let's decode this little session:
\begin{enumerate}
-\item{\texttt{scan -b 3; -e 3<} Every tablet gets its own row. Every row starts with the table id followed by ``;'' or ``<'', and followed by the end row split point for that tablet.}
-\item{\texttt{file:/default\_tablet/F000009y.rf [] 186,1} File entry for this tablet. This tablet contains a single file reference. The file is ``/accumulo/tables/3/default\_tablet/F000009y.rf''. It contains 1 key/value pair, and is 186 bytes long. }
-\item{\texttt{last:13fe86cd27101e5 [] 127.0.0.1:9997} Last location for this tablet. It was last held on 127.0.0.1:9997, and the unique tablet server lock data was ``13fe86cd27101e5''. The default balancer will tend to put tablets back on their last location. }
-\item{\texttt{loc:13fe86cd27101e5 [] 127.0.0.1:9997} The current location of this tablet.}
-\item{\texttt{log:127.0.0.1+9997/0cb7ce52-ac46-4bf7-ae1d-acdcfaa97995 [] 127.0.0.1+9997/0cb7ce52-ac46-4bf7-ae1d-acdcfaa97995|6} This tablet has a reference to a single write-ahead log. This file can be found in /accumulo/wal/127.0.0.1+9997/0cb7ce52-ac46-4bf7-ae1d-acdcfaa97995. The value of this entry could refer to multiple files. This tablet's data is encoded as ``6'' within the log.}
-\item{\texttt{srv:dir [] /default\_tablet} Files written for this tablet will be placed into /accumulo/tables/3/default\_tablet.}
-\item{\texttt{srv:flush [] 1} Flush id. This table has successfully completed the flush with the id of ``1''. }
-\item{\texttt{srv:lock [] tservers/127.0.0.1:9997/zlock-0000000001\$13fe86cd27101e5} This is the lock information for the tablet holding the present lock. This information is checked against zookeeper whenever this is updated, which prevents a metadata update from a tablet server that no longer holds its lock.}
+\item{\texttt{scan -b 3; -e 3<}\\
+ Every tablet gets its own row. Every row starts with the table id followed by
+ ``;'' or ``<'', and followed by the end row split point for that tablet.}
+\item{\texttt{file:/default\_tablet/F000009y.rf [] 186,1}\\
+ File entry for this tablet. This tablet contains a single file reference. The
+ file is ``/accumulo/tables/3/default\_tablet/F000009y.rf''. It contains 1
+ key/value pair, and is 186 bytes long.}
+\item{\texttt{last:13fe86cd27101e5 [] 127.0.0.1:9997}\\
+ Last location for this tablet. It was last held on 127.0.0.1:9997, and the
+ unique tablet server lock data was ``13fe86cd27101e5''. The default balancer
+ will tend to put tablets back on their last location.}
+\item{\texttt{loc:13fe86cd27101e5 [] 127.0.0.1:9997}\\
+ The current location of this tablet.}
+\item{\texttt{log:127.0.0.1+9997/0cb7ce52-ac46-4bf7-ae1d-acdcfaa97995 [] 127.0. ...}\\
+ This tablet has a reference to a single write-ahead log. This file can be found in\\
+ /accumulo/wal/127.0.0.1+9997/0cb7ce52-ac46-4bf7-ae1d-acdcfaa97995. The value
+ of this entry could refer to multiple files. This tablet's data is encoded as
+ ``6'' within the log.}
+\item{\texttt{srv:dir [] /default\_tablet}\\
+ Files written for this tablet will be placed into
+ /accumulo/tables/3/default\_tablet.}
+\item{\texttt{srv:flush [] 1}\\
+ Flush id. This table has successfully completed the flush with the id of
+ ``1''.}
+\item{\texttt{srv:lock [] tservers/127.0.0.1:9997/zlock-0000000001\$13fe86cd27101e5}\\
+ This is the lock information for the tablet server holding the present lock. This
+ information is checked against zookeeper whenever this is updated, which
+ prevents a metadata update from a tablet server that no longer holds its
+ lock.}
\item{\texttt{srv:time [] M1373998392323} }
-\item{\texttt{~tab:~pr [] \\x00} The end-row marker for the previous tablet (prev-row). The first byte indicates the presence of a prev-row. This tablet has the range (-inf, +inf), so it has no prev-row (or end row). }
+\item{\texttt{\textasciitilde{}tab:\textasciitilde{}pr [] \textbackslash{}x00}\\
+ The end-row marker for the previous tablet (prev-row). The first byte
+ indicates the presence of a prev-row. This tablet has the range (-inf, +inf),
+ so it has no prev-row (or end row).}
\end{enumerate}
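The row and file-value formats described in the list above can be decoded with a few lines of code. The following is a sketch only: \texttt{MetadataEntrySketch} is a hypothetical class that works on plain strings, not an Accumulo API. It assumes that the table id itself contains neither ``;'' nor ``<'', that a row using ``<'' carries no end row (as in the \texttt{3<} row of the session above), and that the file value is ``size,entries'' as in \texttt{186,1}.

```java
// Illustrative only: MetadataEntrySketch is a hypothetical helper,
// not part of the Accumulo API.
final class MetadataEntrySketch {

    // Index of the first ';' or '<' in the row, which ends the table id.
    private static int separatorIndex(String row) {
        for (int i = 0; i < row.length(); i++) {
            char c = row.charAt(i);
            if (c == ';' || c == '<') {
                return i;
            }
        }
        throw new IllegalArgumentException("no ';' or '<' in row: " + row);
    }

    // The table id is everything before the separator.
    static String tableId(String row) {
        return row.substring(0, separatorIndex(row));
    }

    // The end row follows ';'. A row using '<' is assumed to have no end row,
    // in which case null is returned.
    static String endRow(String row) {
        int i = separatorIndex(row);
        return row.charAt(i) == '<' ? null : row.substring(i + 1);
    }

    // Parse a file entry value of the form "size,entries" into
    // {sizeInBytes, numberOfEntries}.
    static long[] parseFileValue(String value) {
        String[] parts = value.split(",");
        if (parts.length < 2) {
            throw new IllegalArgumentException("expected size,entries: " + value);
        }
        return new long[] {Long.parseLong(parts[0]), Long.parseLong(parts[1])};
    }

    public static void main(String[] args) {
        long[] file = parseFileValue("186,1");
        System.out.println("table id = " + tableId("3<"));
        System.out.println("end row  = " + endRow("3<"));
        System.out.println("bytes    = " + file[0] + ", entries = " + file[1]);
    }
}
```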
Besides these columns, you may see:
\begin{enumerate}
\item{\texttt{rowId future:zooKeeperID location} Tablet has been assigned to a tablet server, but not yet loaded.}
-\item{\texttt{~del:filename} When a tablet server is done use a file, it will create a delete marker in the appropriate metadata table, unassociated with any tablet. The garbage collector will remove the marker, and the file, when no other reference to the file exists.}
-\item{\texttt{~blip:txid} Bulk-Load In Progress marker}
+\item{\texttt{\textasciitilde{}del:filename} When a tablet server is done using a file, it will create a delete marker in the appropriate metadata table, unassociated with any tablet. The garbage collector will remove the marker, and the file, when no other reference to the file exists.}
+\item{\texttt{\textasciitilde{}blip:txid} Bulk-Load In Progress marker}
\item{\texttt{rowId loaded:filename} A file has been bulk-loaded into this tablet; however, the bulk load has not yet completed on other tablets, so this marker prevents the file from being loaded multiple times.}
\item{\texttt{rowId !cloned} A marker that indicates that this tablet has been successfully cloned.}
\item{\texttt{rowId splitRatio:ratio} A marker that indicates a split is in progress, and the files are being split at the given ratio.}
@@ -524,22 +515,18 @@ Q. One of my Accumulo processes died. How do I bring it back?
The easiest way to bring all services online for an Accumulo instance is to run the ``start-all.sh'' script.
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ bin/start-all.sh
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
This process will check the process listing, using ``jps'' on each host before attempting to restart a service on the given host.
Typically, this check is sufficient except in the face of a hung/zombie process. For large clusters, it may be
undesirable to ssh to every node in the cluster to ensure that all hosts are running the appropriate processes and ``start-here.sh'' may be of use.
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ ssh host_with_dead_process
$ bin/start-here.sh
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
``start-here.sh'' should be invoked on the host which is missing a given process. Like start-all.sh, it will start all
necessary processes that are not currently running, but only on the current host and not cluster-wide. Tools such as ``pssh'' or
@@ -579,11 +566,9 @@ Q. How do find out which tablets are offline?
A. Use ``accumulo admin checkTablets''
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ bin/accumulo admin checkTablets
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Q. I lost three data nodes, and I'm missing blocks in a WAL. I don't care about data loss, how
can I get those tablets online?
@@ -592,21 +577,17 @@ See the discussion in section~\ref{sec:metadata}, which shows a typical metadata
The entries with a column family of ``log'' are references to the WAL for that tablet.
If you know which WAL is bad, you can find all the references with a grep in the shell:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
shell> grep 0cb7ce52-ac46-4bf7-ae1d-acdcfaa97995
3< log:127.0.0.1+9997/0cb7ce52-ac46-4bf7-ae1d-acdcfaa97995 [] 127.0.0.1+9997/0cb7ce52-ac46-4bf7-ae1d-acdcfaa97995|6
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
A. You can remove the WAL references in the metadata table.
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
shell> grant -u root Table.WRITE -t accumulo.metadata
shell> delete 3< log 127.0.0.1+9997/0cb7ce52-ac46-4bf7-ae1d-acdcfaa97995
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
Note: the colon (``:'') is omitted when specifying the ``row cf cq'' for the delete command.
@@ -632,11 +613,9 @@ but the basic approach is:
\item Use ``tables -l'' in the shell to discover the table name to table id mapping
\item Stop all accumulo processes on all nodes
\item Move the accumulo directory in HDFS out of the way:
-\small
-\begin{verbatim}
+\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
$ hadoop fs -mv /accumulo /corrupt
-\end{verbatim}
-\normalsize
+\end{verbatim}\endgroup
\item Re-initialize accumulo
\item Recreate tables, users and permissions
\item Import the directories under \texttt{/corrupt/tables/<id>} into the new instance