Posted to commits@accumulo.apache.org by ec...@apache.org on 2013/12/11 15:16:16 UTC

[01/11] git commit: ACCUMULO-1956 expand docs, addition/decommission of cluster nodes

Updated Branches:
  refs/heads/1.4.5-SNAPSHOT 19a48da09 -> ff29f08a7
  refs/heads/1.5.1-SNAPSHOT 7655de68f -> e9423ae35
  refs/heads/1.6.0-SNAPSHOT 8f9258500 -> 0d874d05a
  refs/heads/master 71dc0527f -> f84e9a1f3


ACCUMULO-1956 expand docs, addition/decommission of cluster nodes

Signed-off-by: Eric Newton <er...@gmail.com>


Project: http://git-wip-us.apache.org/repos/asf/accumulo/repo
Commit: http://git-wip-us.apache.org/repos/asf/accumulo/commit/ff29f08a
Tree: http://git-wip-us.apache.org/repos/asf/accumulo/tree/ff29f08a
Diff: http://git-wip-us.apache.org/repos/asf/accumulo/diff/ff29f08a

Branch: refs/heads/1.4.5-SNAPSHOT
Commit: ff29f08a7d79be3baecb356a05444f342b74b620
Parents: 19a48da
Author: Alex Moundalexis <al...@clouderagovt.com>
Authored: Mon Dec 9 16:33:57 2013 -0500
Committer: Eric Newton <er...@gmail.com>
Committed: Wed Dec 11 09:15:04 2013 -0500

----------------------------------------------------------------------
 docs/administration.html                        | 23 ++++++++++++++
 .../src/user_manual/chapters/administration.tex | 32 ++++++++++++++++++++
 2 files changed, 55 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/accumulo/blob/ff29f08a/docs/administration.html
----------------------------------------------------------------------
diff --git a/docs/administration.html b/docs/administration.html
index b0c8e88..b8712dc 100644
--- a/docs/administration.html
+++ b/docs/administration.html
@@ -53,6 +53,29 @@ ask the master to shut down the tablet servers gracefully. If the tablet servers
 at the password prompt, and waiting 15 seconds for the script to force a shutdown. Normally, once the shutdown happens gracefully, unresponsive tablet servers are
 forcibly shut down after 5 seconds.
 
+<h3>Adding a Node</h3>
+
+<p>Update your <code>$ACCUMULO_HOME/conf/slaves</code> (or <code>$ACCUMULO_CONF_DIR/slaves</code>) file to account for the addition; at a minimum this needs to be on the host(s) being added, but in practice it's good to ensure consistent configuration across all nodes.</p>
+
+<pre>
+$ACCUMULO_HOME/bin/accumulo admin start &lt;host(s)&gt; {&lt;host&gt; ...}
+</pre>
+
+<p>Alternatively, you can ssh to each of the hosts you want to add and run <code>$ACCUMULO_HOME/bin/start-here.sh</code>.</p>
+
+<p>Make sure the host in question has the new configuration, or else the tablet server won't start.</p>
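For example (hostnames and paths here are placeholders, assuming ssh access from the master and the same install location on the new host), the whole addition might look like:

    scp -r $ACCUMULO_HOME/conf tserver9.example.com:$ACCUMULO_HOME/   # push the current configuration
    echo "tserver9.example.com" >> $ACCUMULO_HOME/conf/slaves         # record the new tablet server
    $ACCUMULO_HOME/bin/accumulo admin start tserver9.example.com      # start the tablet server on that host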
+
+<h3>Decommissioning a Node</h3>
+
+<p>If you need to take a node out of operation, you can trigger a graceful shutdown of a tablet server. Accumulo will automatically rebalance the tablets across the available tablet servers.</p>
+
+<pre>
+$ACCUMULO_HOME/bin/accumulo admin stop &lt;host(s)&gt; {&lt;host&gt; ...}
+</pre>
+
+<p>Alternatively, you can ssh to each of the hosts you want to remove and run <code>$ACCUMULO_HOME/bin/stop-here.sh</code>.</p>
+
+<p>Be sure to update your <code>$ACCUMULO_HOME/conf/slaves</code> (or <code>$ACCUMULO_CONF_DIR/slaves</code>) file to account for the removal of these hosts. Bear in mind that the monitor will not re-read the slaves file automatically, so it will report the decommissioned servers as down; it's recommended that you restart the monitor so that the node list is up to date.</p>
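As an illustrative sequence, assuming the host being retired is the placeholder tserver3.example.com:

    $ACCUMULO_HOME/bin/accumulo admin stop tserver3.example.com        # graceful shutdown; tablets migrate off
    grep -v '^tserver3.example.com$' $ACCUMULO_HOME/conf/slaves > /tmp/slaves.new \
      && mv /tmp/slaves.new $ACCUMULO_HOME/conf/slaves                 # drop the host from the slaves file

After the slaves file is updated, restart the monitor so the retired server stops showing up as down.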
 
 <h3>Configuration</h3>
 <p>Accumulo configuration information is stored in a xml file and ZooKeeper.  System wide

http://git-wip-us.apache.org/repos/asf/accumulo/blob/ff29f08a/docs/src/user_manual/chapters/administration.tex
----------------------------------------------------------------------
diff --git a/docs/src/user_manual/chapters/administration.tex b/docs/src/user_manual/chapters/administration.tex
index f3feca5..d0533d9 100644
--- a/docs/src/user_manual/chapters/administration.tex
+++ b/docs/src/user_manual/chapters/administration.tex
@@ -184,6 +184,38 @@ To shutdown cleanly, run \texttt{bin/stop-all.sh} and the master will orchestrat
 shutdown of all the tablet servers. Shutdown waits for all minor compactions to finish, so it may
 take some time for particular configurations.
 
+\subsection{Adding a Node}
+
+Update your \texttt{\$ACCUMULO\_HOME/conf/slaves} (or \texttt{\$ACCUMULO\_CONF\_DIR/slaves}) file to account for the addition.
+
+\begin{verbatim}
+$ACCUMULO_HOME/bin/accumulo admin start <host(s)> {<host> ...}
+\end{verbatim}
+
+Alternatively, you can ssh to each of the hosts you want to add and run 
+\texttt{\$ACCUMULO\_HOME/bin/start-here.sh}.
+
+Make sure the host in question has the new configuration, or else the tablet 
+server won't start; at a minimum this needs to be on the host(s) being added, 
+but in practice it's good to ensure consistent configuration across all nodes.
+
+\subsection{Decommissioning a Node}
+
+If you need to take a node out of operation, you can trigger a graceful shutdown of a tablet 
+server. Accumulo will automatically rebalance the tablets across the available tablet servers.
+
+\begin{verbatim}
+$ACCUMULO_HOME/bin/accumulo admin stop <host(s)> {<host> ...}
+\end{verbatim}
+
+Alternatively, you can ssh to each of the hosts you want to remove and run 
+\texttt{\$ACCUMULO\_HOME/bin/stop-here.sh}.
+
+Be sure to update your \texttt{\$ACCUMULO\_HOME/conf/slaves} (or \texttt{\$ACCUMULO\_CONF\_DIR/slaves}) file to 
+account for the removal of these hosts. Bear in mind that the monitor will not re-read the 
+slaves file automatically, so it will report the decommissioned servers as down; it's 
+recommended that you restart the monitor so that the node list is up to date.
+
 \section{Monitoring}
 
 The Accumulo Master provides an interface for monitoring the status and health of


[06/11] git commit: Merge branch '1.4.5-SNAPSHOT' into 1.5.1-SNAPSHOT

Posted by ec...@apache.org.
Merge branch '1.4.5-SNAPSHOT' into 1.5.1-SNAPSHOT


Project: http://git-wip-us.apache.org/repos/asf/accumulo/repo
Commit: http://git-wip-us.apache.org/repos/asf/accumulo/commit/e9423ae3
Tree: http://git-wip-us.apache.org/repos/asf/accumulo/tree/e9423ae3
Diff: http://git-wip-us.apache.org/repos/asf/accumulo/diff/e9423ae3

Branch: refs/heads/master
Commit: e9423ae350ecd2e8f7bbafc9380713e8fd7fd644
Parents: 7655de6 ff29f08
Author: Eric Newton <er...@gmail.com>
Authored: Wed Dec 11 09:15:40 2013 -0500
Committer: Eric Newton <er...@gmail.com>
Committed: Wed Dec 11 09:15:40 2013 -0500

----------------------------------------------------------------------
 docs/administration.html                        | 23 ++++++++++++++
 .../chapters/administration.tex                 | 32 ++++++++++++++++++++
 2 files changed, 55 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/accumulo/blob/e9423ae3/docs/administration.html
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/accumulo/blob/e9423ae3/docs/src/main/latex/accumulo_user_manual/chapters/administration.tex
----------------------------------------------------------------------
diff --cc docs/src/main/latex/accumulo_user_manual/chapters/administration.tex
index cc7697c,0000000..086f41c
mode 100644,000000..100644
--- a/docs/src/main/latex/accumulo_user_manual/chapters/administration.tex
+++ b/docs/src/main/latex/accumulo_user_manual/chapters/administration.tex
@@@ -1,362 -1,0 +1,394 @@@
 +
 +% Licensed to the Apache Software Foundation (ASF) under one or more
 +% contributor license agreements. See the NOTICE file distributed with
 +% this work for additional information regarding copyright ownership.
 +% The ASF licenses this file to You under the Apache License, Version 2.0
 +% (the "License"); you may not use this file except in compliance with
 +% the License. You may obtain a copy of the License at
 +%
 +%     http://www.apache.org/licenses/LICENSE-2.0
 +%
 +% Unless required by applicable law or agreed to in writing, software
 +% distributed under the License is distributed on an "AS IS" BASIS,
 +% WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 +% See the License for the specific language governing permissions and
 +% limitations under the License.
 +
 +\chapter{Administration}
 +
 +\section{Hardware}
 +
 +Because we are running essentially two or three systems simultaneously layered
 +across the cluster: HDFS, Accumulo and MapReduce, it is typical for hardware to
 +consist of 4 to 8 cores, and 8 to 32 GB RAM. This is so each running process can have
 +at least one core and 2 - 4 GB each.
 +
 +One core running HDFS can typically keep 2 to 4 disks busy, so each machine may
 +typically have as little as 2 x 300GB disks and as much as 4 x 1TB or 2TB disks.
 +
 +It is possible to do with less than this, such as with 1u servers with 2 cores and 4GB
 +each, but in this case it is recommended to only run up to two processes per
 +machine - i.e. DataNode and TabletServer or DataNode and MapReduce worker but
 +not all three. The constraint here is having enough available heap space for all the
 +processes on a machine.
 +
 +\section{Network}
 +
 +Accumulo communicates via remote procedure calls over TCP/IP for both passing
 +data and control messages. In addition, Accumulo uses HDFS clients to
 +communicate with HDFS. To achieve good ingest and query performance, sufficient
 +network bandwidth must be available between any two machines.
 +
 +\section{Installation}
 +Choose a directory for the Accumulo installation. This directory will be referenced
 +by the environment variable \texttt{\$ACCUMULO\_HOME}. Run the following:
 +
 +\small
 +\begin{verbatim}
 +$ tar xzf accumulo-1.5.0-bin.tar.gz    # unpack to subdirectory
 +$ mv accumulo-1.5.0 $ACCUMULO_HOME # move to desired location
 +\end{verbatim}
 +\normalsize
 +
 +Repeat this step at each machine within the cluster. Usually all machines have the
 +same \texttt{\$ACCUMULO\_HOME}.
 +
 +\section{Dependencies}
 +Accumulo requires HDFS and ZooKeeper to be configured and running
 +before starting. Password-less SSH should be configured between at least the
 +Accumulo master and TabletServer machines. It is also a good idea to run Network
 +Time Protocol (NTP) within the cluster to ensure nodes' clocks don't get too out of
 +sync, which can cause problems with automatically timestamped data. 
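One common way to set up password-less SSH from the master to a worker (the hostname below is a placeholder) is:

    ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa    # only needed if no key pair exists yet
    ssh-copy-id tserver1.example.com            # appends the public key to the remote authorized_keys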
 +
 +\section{Configuration}
 +
 +Accumulo is configured by editing several Shell and XML files found in
 +\texttt{\$ACCUMULO\_HOME/conf}. The structure closely resembles Hadoop's configuration
 +files.
 +
 +\subsection{Edit conf/accumulo-env.sh}
 +
 +Accumulo needs to know where to find the software it depends on. Edit accumulo-env.sh 
 +and specify the following:
 +
 +\begin{enumerate}
 +\item{Enter the location of the installation directory of Accumulo for \texttt{\$ACCUMULO\_HOME}}
 +\item{Enter your system's Java home for \texttt{\$JAVA\_HOME}}
 +\item{Enter the location of Hadoop for \texttt{\$HADOOP\_PREFIX}}
 +\item{Choose a location for Accumulo logs and enter it for \texttt{\$ACCUMULO\_LOG\_DIR}}
 +\item{Enter the location of ZooKeeper for \texttt{\$ZOOKEEPER\_HOME}}
 +\end{enumerate}
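Taken together, the corresponding lines in accumulo-env.sh might look roughly like the following (all paths are placeholders for your environment):

    export ACCUMULO_HOME=/opt/accumulo
    export JAVA_HOME=/usr/lib/jvm/java
    export HADOOP_PREFIX=/opt/hadoop
    export ACCUMULO_LOG_DIR=$ACCUMULO_HOME/logs
    export ZOOKEEPER_HOME=/opt/zookeeper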
 +
 +By default Accumulo TabletServers are set to use 1GB of memory. You may change
 +this by altering the value of \texttt{\$ACCUMULO\_TSERVER\_OPTS}. Note the syntax is that of
 +the Java JVM command line options. This value should be less than the physical
 +memory of the machines running TabletServers.
 +
 +There are similar options for the master's memory usage and the garbage collector
 +process. Reduce these if they exceed the physical RAM of your hardware and
 +increase them, within the bounds of the physical RAM, if a process fails because of
 +insufficient memory.
 +
 +Note that you will be specifying the Java heap space in accumulo-env.sh. You should
 +make sure that the total heap space used for the Accumulo tserver and the Hadoop
 +DataNode and TaskTracker is less than the available memory on each slave node in
 +the cluster. On large clusters, it is recommended that the Accumulo master, Hadoop
 +NameNode, secondary NameNode, and Hadoop JobTracker all be run on separate
 +machines to allow them to use more heap space. If you are running these on the
 +same machine on a small cluster, likewise make sure their heap space settings fit
 +within the available memory.
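As a sketch only, the tablet server heap is raised with ordinary JVM flags; the master and garbage collector variable names below are assumptions modeled on the tablet server one, and the sizes are placeholders, not recommendations:

    export ACCUMULO_TSERVER_OPTS="-Xmx4g -Xms4g"    # tablet server heap
    export ACCUMULO_MASTER_OPTS="-Xmx1g -Xms1g"     # master heap (check your accumulo-env.sh for the exact name)
    export ACCUMULO_GC_OPTS="-Xmx256m -Xms256m"     # garbage collector heap (likewise an assumed name)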
 +
 +\subsection{Native Map}
 +
 +The tablet server uses a data structure called a MemTable to store sorted key/value
 +pairs in memory when they are first received from the client. When a minor compaction
 +occurs, this data structure is written to HDFS. The MemTable will default to using
 +memory in the JVM but a JNI version, called the native map, can be used to significantly
 +speed up performance by utilizing the memory space of the native operating system. The
 +native map also avoids the performance implications brought on by garbage collection
 +in the JVM by causing it to pause much less frequently.
 +
 +32-bit and 64-bit Linux versions of the native map ship with the Accumulo dist package.
 +For other operating systems, the native map can be built from the codebase in two ways:
 +from maven or from the Makefile.
 +
 +\begin{enumerate}
 +\item{Build from maven using the following command: \texttt{mvn clean package -Pnative.}}
 +\item{Build from the c++ source by running \texttt{make} in the \texttt{\$ACCUMULO\_HOME/server/src/main/c++} directory.}
 +\end{enumerate}
 +
 +After building the native map from the source, you will find the artifact in
 +\texttt{\$ACCUMULO\_HOME/lib/native.} Upon starting up, the tablet server will look
 +in this directory for the map library. If the file is renamed or moved from its
 +target directory, the tablet server may not be able to find it.
 +
 +\subsection{Cluster Specification}
 +
 +On the machine that will serve as the Accumulo master:
 +
 +\begin{enumerate}
 +\item{Write the IP address or domain name of the Accumulo Master to the\\\texttt{\$ACCUMULO\_HOME/conf/masters} file.}
 +\item{Write the IP addresses or domain name of the machines that will be TabletServers in\\\texttt{\$ACCUMULO\_HOME/conf/slaves}, one per line.}
 +\end{enumerate}
 +
 +Note that if using domain names rather than IP addresses, DNS must be configured
 +properly for all machines participating in the cluster. DNS can be a confusing source
 +of errors.
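As a small illustration (hostnames are placeholders), the two files for a one-master, three-worker cluster could be written like so:

    echo "master1.example.com" > $ACCUMULO_HOME/conf/masters
    printf "tserver1.example.com\ntserver2.example.com\ntserver3.example.com\n" \
      > $ACCUMULO_HOME/conf/slaves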
 +
 +\subsection{Accumulo Settings}
 +Specify appropriate values for the following settings in\\
 +\texttt{\$ACCUMULO\_HOME/conf/accumulo-site.xml} :
 +
 +\small
 +\begin{verbatim}
 +<property>
 +    <name>zookeeper</name>
 +    <value>zooserver-one:2181,zooserver-two:2181</value>
 +    <description>list of zookeeper servers</description>
 +</property>
 +\end{verbatim}
 +\normalsize
 +
 +This enables Accumulo to find ZooKeeper. Accumulo uses ZooKeeper to coordinate
 +settings between processes and helps finalize TabletServer failure.
 +
 +
 +\small
 +\begin{verbatim}
 +<property>
 +    <name>instance.secret</name>
 +    <value>DEFAULT</value>
 +</property>
 +\end{verbatim}
 +\normalsize
 +
 +The instance needs a secret to enable secure communication between servers. Configure your
 +secret and make sure that the \texttt{accumulo-site.xml} file is not readable to other users.
 +
 +Some settings can be modified via the Accumulo shell and take effect immediately, but
 +some settings require a process restart to take effect. See the configuration documentation
 +(available on the monitor web pages) for details.
 +
 +\subsection{Deploy Configuration}
 +
 +Copy the masters, slaves, accumulo-env.sh, and if necessary, accumulo-site.xml
 +from the\\\texttt{\$ACCUMULO\_HOME/conf/} directory on the master to all the machines
 +specified in the slaves file.
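One simple way to do this, assuming password-less ssh and identical install paths on every host (rsync is just an example; scp works equally well):

    for host in $(cat $ACCUMULO_HOME/conf/slaves); do
      rsync -a $ACCUMULO_HOME/conf/ $host:$ACCUMULO_HOME/conf/
    done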
 +
 +\section{Initialization}
 +
 +Accumulo must be initialized to create the structures it uses internally to locate
 +data across the cluster. HDFS is required to be configured and running before
 +Accumulo can be initialized.
 +
 +Once HDFS is started, initialization can be performed by executing\\
 +\texttt{\$ACCUMULO\_HOME/bin/accumulo init} . This script will prompt for a name
 +for this instance of Accumulo. The instance name is used to identify a set of tables
 +and instance-specific settings. The script will then write some information into
 +HDFS so Accumulo can start properly.
 +
 +The initialization script will prompt you to set a root password. Once Accumulo is
 +initialized it can be started.
 +
 +\section{Running}
 +
 +\subsection{Starting Accumulo}
 +
 +Make sure Hadoop is configured on all of the machines in the cluster, including
 +access to a shared HDFS instance. Make sure HDFS and ZooKeeper are running.
 +Make sure ZooKeeper is configured and running on at least one machine in the
 +cluster.
 +Start Accumulo using the \texttt{bin/start-all.sh} script.
 +
 +To verify that Accumulo is running, check the Status page as described under
 +\emph{Monitoring}. In addition, the Shell can provide some information about the status of
 +tables via reading the !METADATA table.
 +
 +\subsection{Stopping Accumulo}
 +
 +To shutdown cleanly, run \texttt{bin/stop-all.sh} and the master will orchestrate the
 +shutdown of all the tablet servers. Shutdown waits for all minor compactions to finish, so it may
 +take some time for particular configurations.
 +
++\subsection{Adding a Node}
++
++Update your \texttt{\$ACCUMULO\_HOME/conf/slaves} (or \texttt{\$ACCUMULO\_CONF\_DIR/slaves}) file to account for the addition.
++
++\begin{verbatim}
++$ACCUMULO_HOME/bin/accumulo admin start <host(s)> {<host> ...}
++\end{verbatim}
++
++Alternatively, you can ssh to each of the hosts you want to add and run 
++\texttt{\$ACCUMULO\_HOME/bin/start-here.sh}.
++
++Make sure the host in question has the new configuration, or else the tablet 
++server won't start; at a minimum this needs to be on the host(s) being added, 
++but in practice it's good to ensure consistent configuration across all nodes.
++
++\subsection{Decommissioning a Node}
++
++If you need to take a node out of operation, you can trigger a graceful shutdown of a tablet 
++server. Accumulo will automatically rebalance the tablets across the available tablet servers.
++
++\begin{verbatim}
++$ACCUMULO_HOME/bin/accumulo admin stop <host(s)> {<host> ...}
++\end{verbatim}
++
++Alternatively, you can ssh to each of the hosts you want to remove and run 
++\texttt{\$ACCUMULO\_HOME/bin/stop-here.sh}.
++
++Be sure to update your \texttt{\$ACCUMULO\_HOME/conf/slaves} (or \texttt{\$ACCUMULO\_CONF\_DIR/slaves}) file to 
++account for the removal of these hosts. Bear in mind that the monitor will not re-read the 
++slaves file automatically, so it will report the decommissioned servers as down; it's 
++recommended that you restart the monitor so that the node list is up to date.
++
 +\section{Monitoring}
 +
 +The Accumulo Master provides an interface for monitoring the status and health of
 +Accumulo components. This interface can be accessed by pointing a web browser to\\
 +\texttt{http://accumulomaster:50095/status}
 +
 +\section{Tracing}
 +It can be difficult to determine why some operations are taking longer
 +than expected. For example, you may be looking up items with very low
 +latency, but sometimes the lookups take much longer. Determining the
 +cause of the delay is difficult because the system is distributed, and
 +the typical lookup is fast.
 +
 +Accumulo has been instrumented to record the time that various
 +operations take when tracing is turned on. The fact that tracing is
 +enabled follows all the requests made on behalf of the user throughout
 +the distributed infrastructure of accumulo, and across all threads of
 +execution.
 +
 +These time spans will be inserted into the \texttt{trace} table in
 +Accumulo. You can browse recent traces from the Accumulo monitor
 +page. You can also read the \texttt{trace} table directly like any
 +other table.
 +
 +The design of Accumulo's distributed tracing follows that of
 +\href{http://research.google.com/pubs/pub36356.html}{Google's Dapper}.
 +
 +\subsection{Tracers}
 +To collect traces, Accumulo needs at least one server listed in
 +\\\texttt{\$ACCUMULO\_HOME/conf/tracers}. The server collects traces
 +from clients and writes them to the \texttt{trace} table. The Accumulo
 +user that the tracer connects to Accumulo with can be configured with
 +the following properties
 +
 +\begin{verbatim}
 +trace.user
 +trace.token.property.password
 +\end{verbatim}
 +
 +\subsection{Instrumenting a Client}
 +Tracing can be used to measure a client operation, such as a scan, as
 +the operation traverses the distributed system. To enable tracing for
 +your application call
 +
 +\begin{verbatim}
 +DistributedTrace.enable(instance, new ZooReader(instance), hostname, "myApplication");
 +\end{verbatim}
 +
 +Once tracing has been enabled, a client can wrap an operation in a trace.
 +
 +\begin{verbatim}
 +Trace.on("Client Scan");
 +BatchScanner scanner = conn.createBatchScanner(...);
 +// Configure your scanner
 +for (Entry entry : scanner) {
 +}
 +Trace.off();
 +\end{verbatim}
 +
 +Additionally, the user can create additional Spans within a Trace.
 +\begin{verbatim}
 +Trace.on("Client Update");
 +...
 +Span readSpan = Trace.start("Read");
 +...
 +readSpan.stop();
 +...
 +Span writeSpan = Trace.start("Write");
 +...
 +writeSpan.stop();
 +Trace.off();
 +\end{verbatim}
 +
 +Like Dapper, Accumulo tracing supports user defined annotations to associate additional data with a Trace.
 +\begin{verbatim}
 +...
 +int numberOfEntriesRead = 0;
 +Span readSpan = Trace.start("Read");
 +// Do the read, update the counter
 +...
 +readSpan.data("Number of Entries Read", String.valueOf(numberOfEntriesRead));
 +\end{verbatim}
 +
 +Some client operations may have a high volume within your
 +application. As such, you may wish to only sample a percentage of
 +operations for tracing. As seen below, the CountSampler can be used to
 +help enable tracing for 1-in-1000 operations
 +\begin{verbatim}
 +Sampler sampler = new CountSampler(1000);
 +...
 +if (sampler.next()) {
 +  Trace.on("Read");
 +}
 +...
 +Trace.offNoFlush();
 +\end{verbatim}
 +
 +It should be noted that it is safe to turn off tracing even if it
 +isn't currently active. The Trace.offNoFlush() should be used if the
 +user does not wish to have Trace.off() block while flushing trace
 +data.
 +
 +\subsection{Viewing Collected Traces}
 +To view collected traces, use the "Recent Traces" link on the Monitor
 +UI. You can also programmatically access and print traces using the
 +\texttt{TraceDump} class.
 +
 +\subsection{Tracing from the Shell}
 +You can enable tracing for operations run from the shell by using the
 +\texttt{trace on} and \texttt{trace off} commands.
 +
 +\begin{verbatim}
 +root@test test> trace on
 +root@test test> scan
 +a b:c []    d
 +root@test test> trace off
 +Waiting for trace information
 +Waiting for trace information
 +Trace started at 2013/08/26 13:24:08.332
 +Time  Start  Service@Location       Name
 + 3628+0      shell@localhost shell:root
 +    8+1690     shell@localhost scan
 +    7+1691       shell@localhost scan:location
 +    6+1692         tserver@localhost startScan
 +    5+1692           tserver@localhost tablet read ahead 6
 +\end{verbatim}
 +
 +\section{Logging}
 +Accumulo processes each write to a set of log files. By default these are found under\\
 +\texttt{\$ACCUMULO/logs/}.
 +
 +\section{Recovery}
 +
 +In the event of TabletServer failure or error on shutting Accumulo down, some
 +mutations may not have been minor compacted to HDFS properly. In this case,
 +Accumulo will automatically reapply such mutations from the write-ahead log
 +either when the tablets from the failed server are reassigned by the Master, in the
 +case of a single TabletServer failure or the next time Accumulo starts, in the event of
 +failure during shutdown.
 +
 +Recovery is performed by asking a tablet server to sort the logs so that tablets can easily find their missing
 +updates. The sort status of each file is displayed on the
 +Accumulo monitor status page. Once the recovery is complete any
 +tablets involved should return to an ``online'' state. Until then those tablets will be
 +unavailable to clients.
 +
 +The Accumulo client library is configured to retry failed mutations and in many
 +cases clients will be able to continue processing after the recovery process without
 +throwing an exception.
 +


[08/11] git commit: Merge branch '1.5.1-SNAPSHOT' into 1.6.0-SNAPSHOT

Posted by ec...@apache.org.
Merge branch '1.5.1-SNAPSHOT' into 1.6.0-SNAPSHOT


Project: http://git-wip-us.apache.org/repos/asf/accumulo/repo
Commit: http://git-wip-us.apache.org/repos/asf/accumulo/commit/0d874d05
Tree: http://git-wip-us.apache.org/repos/asf/accumulo/tree/0d874d05
Diff: http://git-wip-us.apache.org/repos/asf/accumulo/diff/0d874d05

Branch: refs/heads/master
Commit: 0d874d05a67950d9d5d9bec96de22e71f946b62c
Parents: 8f92585 e9423ae
Author: Eric Newton <er...@gmail.com>
Authored: Wed Dec 11 09:15:55 2013 -0500
Committer: Eric Newton <er...@gmail.com>
Committed: Wed Dec 11 09:15:55 2013 -0500

----------------------------------------------------------------------
 .../chapters/administration.tex                 | 32 ++++++++++++++++++++
 .../src/main/resources/docs/administration.html | 23 ++++++++++++++
 2 files changed, 55 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/accumulo/blob/0d874d05/docs/src/main/latex/accumulo_user_manual/chapters/administration.tex
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/accumulo/blob/0d874d05/server/monitor/src/main/resources/docs/administration.html
----------------------------------------------------------------------
diff --cc server/monitor/src/main/resources/docs/administration.html
index 96f4a8b,0000000..51b1c31
mode 100644,000000..100644
--- a/server/monitor/src/main/resources/docs/administration.html
+++ b/server/monitor/src/main/resources/docs/administration.html
@@@ -1,148 -1,0 +1,171 @@@
 +<!--
 +  Licensed to the Apache Software Foundation (ASF) under one or more
 +  contributor license agreements.  See the NOTICE file distributed with
 +  this work for additional information regarding copyright ownership.
 +  The ASF licenses this file to You under the Apache License, Version 2.0
 +  (the "License"); you may not use this file except in compliance with
 +  the License.  You may obtain a copy of the License at
 +
 +      http://www.apache.org/licenses/LICENSE-2.0
 +
 +  Unless required by applicable law or agreed to in writing, software
 +  distributed under the License is distributed on an "AS IS" BASIS,
 +  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 +  See the License for the specific language governing permissions and
 +  limitations under the License.
 +-->
 +<html>
 +<head>
 +<title>Accumulo Administration</title>
 +<link rel='stylesheet' type='text/css' href='documentation.css' media='screen'/>
 +</head>
 +<body>
 +
 +<h1>Apache Accumulo Documentation : Administration</h1>
 +
 +<h3>Starting accumulo for the first time</h3>
 +
 +<p>For the most part, accumulo is ready to go out of the box. To start it, first you must distribute and install
 +the accumulo software to each machine in the cloud that you wish to run on. The software should be installed
 +in the same directory on each machine and configured identically (or at least similarly... see the configuration
 +sections for more details). Select one machine to be your bootstrap machine, the one that you will start accumulo
 +with. Note that you must have passphrase-less ssh access to each machine from your bootstrap machine. On this machine,
 +create a conf/masters and conf/slaves file. In the masters file, type the hostname of the machine you wish to run the master on (probably localhost).
 +In the slaves file, type the hostnames, separated by newlines, of each machine you wish to participate in accumulo as a tablet server. If you neglect
 +to create these files, the startup scripts will assume you are trying to run on localhost only, and will instantiate a single-node instance only.
 +It is probably a good idea to back up these files, or distribute them to the other nodes as well, so that you can easily boot up accumulo
 +from another machine, if necessary. You can also create a <code>conf/accumulo-env.sh</code> file if you want to configure any custom environment variables.
 +
 +<p>Once properly configured, you can initialize or prepare an instance of accumulo by running: <code>bin/accumulo&nbsp;init</code><br />
 +Follow the prompts and you are ready to go. This step only prepares accumulo to run, it does not start up accumulo.
 +
 +<h3>Starting accumulo</h3>
 +
 +<p>Once you have configured accumulo to your liking, and distributed the appropriate configuration to each machine, you can start accumulo with
 +bin/start-all.sh. If at any time, you wish to bring accumulo servers online after one or more have been shutdown, you can run bin/start-all.sh again.
 +This step will only start services that are not already running. Be aware that if you run this command on more than one machine, you may unintentionally
 +start an extra copy of the garbage collector service and the monitoring service, since each of these will run on the server on which you run this script.
 +
 +<h3>Stopping accumulo</h3>
 +
 +<p>Similar to the start-all.sh script, we provide a bin/stop-all.sh script to shut down accumulo. This will prompt for the root password so that it can
 +ask the master to shut down the tablet servers gracefully. If the tablet servers do not respond, or the master takes too long, you can force a shutdown by hitting Ctrl-C
 +at the password prompt, and waiting 15 seconds for the script to force a shutdown. Normally, once the shutdown happens gracefully, unresponsive tablet servers are
 +forcibly shut down after 5 seconds.
 +
++<h3>Adding a Node</h3>
++
++<p>Update your <code>$ACCUMULO_HOME/conf/slaves</code> (or <code>$ACCUMULO_CONF_DIR/slaves</code>) file to account for the addition; at a minimum this needs to be on the host(s) being added, but in practice it's good to ensure consistent configuration across all nodes.</p>
++
++<pre>
++$ACCUMULO_HOME/bin/accumulo admin start &lt;host(s)&gt; {&lt;host&gt; ...}
++</pre>
++
++<p>Alternatively, you can ssh to each of the hosts you want to add and run <code>$ACCUMULO_HOME/bin/start-here.sh</code>.</p>
++
++<p>Make sure the host in question has the new configuration, or else the tablet server won't start.</p>
++
++<h3>Decommissioning a Node</h3>
++
++<p>If you need to take a node out of operation, you can trigger a graceful shutdown of a tablet server. Accumulo will automatically rebalance the tablets across the available tablet servers.</p>
++
++<pre>
++$ACCUMULO_HOME/bin/accumulo admin stop &lt;host(s)&gt; {&lt;host&gt; ...}
++</pre>
++
++<p>Alternatively, you can ssh to each of the hosts you want to remove and run <code>$ACCUMULO_HOME/bin/stop-here.sh</code>.</p>
++
++<p>Be sure to update your <code>$ACCUMULO_HOME/conf/slaves</code> (or <code>$ACCUMULO_CONF_DIR/slaves</code>) file to account for the removal of these hosts. Bear in mind that the monitor will not re-read the slaves file automatically, so it will report the decommissioned servers as down; it's recommended that you restart the monitor so that the node list is up to date.</p>
 +
 +<h3>Configuration</h3>
 +<p>Accumulo configuration information is stored in a xml file and ZooKeeper.  System wide
 +configuration information is stored in accumulo-site.xml. In order for accumulo to
 +find this file its directory must be on the classpath.  Accumulo will log a warning if it can not find 
 +it, and will use built-in default values. The accumulo scripts try to put the config directory on the classpath.  
 +
 +<p>Starting with version 1.0, per-table configuration was
 +introduced. This information is stored in ZooKeeper. This information
 +can be manipulated using the config command in the accumulo
 +shell. ZooKeeper will notify all tablet servers when config properties
 +are modified. This makes it possible to change major compaction
 +settings, for example, for a table while accumulo is running.
 +
 +<p>Per-table configuration settings override system settings. 
 +
 +<p>See the possible configuration options and their default values <a href='config.html'>here</a>
 +
 +<h3>Managing system resources</h3>
 +
 +<p>It is very important how disk and memory usage are allocated across the cluster and how servers processes are allocated across the cluster. 
 +
 +<ul>
 + <li> On larger clusters, run the namenode, secondary namenode, jobtracker, accumulo master, and zookeepers on dedicated nodes.  On a smaller cluster you may want to run all master processes on one node.  When doing this ensure that the max total memory that could be used by all master processes does not exceed system memory.  Swapping on your single master node would not be good.
 + <li> Accumulo 1.2 and earlier rely on zookeeper but do not use it heavily.  On a large cluster setting up 3 or 5 zookeepers should be plenty.  Since there is no performance gain when running more zookeepers, fault tolerance is the only benefit.
 + <li> On slave nodes ensure the memory used by all slave processes is less than system memory.  For example, the following slave node config could use up to 38G of RAM: tablet server 3G, logger 1G, data node 2G, up to 10 mappers each using 2G, and up to 6 reducers each using 2G.  If the slave nodes only have 32G, then using 38G will result in swapping which could cause tablet servers to lose their locks in zookeeper and die.  Even if swapping does not cause tablet servers to die, it will kill performance.
 + <li>Accumulo and map reduce will work with less memory, but it has an impact.  Accumulo will minor compact more frequently when it has less map memory, resulting in more major compactions.  The minor and major compactions both use CPU and HDFS I/O.   The same goes for map reduce, the less memory you give it, the more it has to sort and spill.  Try to minimize spilling and compactions as much as possible without causing swapping.
 + <li>Accumulo writes data to disk before it sorts it in memory.  This allows data that was in memory when a tablet server crashes to be recovered.  Each slave node needs a local directory to write this data to.  Ensure the file system holding this directory has at least 100G free on all nodes.  Also, if this directory is in a filesystem used by map reduce or hdfs, they may affect each other's performance.
 +</ul>
 +
 +<p>There are a few settings that determine how much memory accumulo tablet
 +servers use.  In accumulo-env.sh there is a setting called
 +ACCUMULO_TSERVER_OPTS.  By default this is set to something like "-Xmx512m
 +-Xms512m".  These are Java jvm options asking Java to use 512 megabytes of
 +memory.  By default accumulo stores data written to it outside of the Java
 +memory space in order to avoid pauses caused by the Java garbage collector.  The
 +amount of memory it uses for this data is determined by the accumulo setting
 +"tserver.memory.maps.max".  Since this memory is outside of the Java managed
 +memory, the process can grow larger than the -Xmx setting.  So if -Xmx is set
 +to 512M and tserver.memory.maps.max is set to 1G, a tablet server process can
 +be expected to use 1.5G.  If tserver.memory.maps.native.enabled is set to
 +false, then accumulo will only use memory managed by Java and the process will
 +not use more than what -Xmx is set to.  In this case the
 +tserver.memory.maps.max setting should be 75% of the -Xmx setting. 
 +
 +<h3>Swappiness</h3>
 +
 +<p>The linux kernel will swap out memory of running programs to increase
 +the size of the disk buffers.  This tendency to swap out is controlled by
 +a kernel setting called "swappiness."  This behavior does not work well for
 +large java servers.  When a java process runs a garbage collection, it touches
 +lots of pages forcing all swapped out pages back into memory.  It is suggested
 +that swappiness be set to zero.
 +
 +<pre>
 + # sysctl -w vm.swappiness=0
 + # echo "vm.swappiness = 0" &gt;&gt; /etc/sysctl.conf
 +</pre>
 +
 +<h3>Hadoop timeouts</h3>
 +
 +<p>In order to detect failed datanodes, use shorter timeouts.  Add the following to your
 +hdfs-site.xml file:
 +
 +<pre>
 +
 +  &lt;property&gt;
 +    &lt;name&gt;dfs.socket.timeout&lt;/name&gt;
 +    &lt;value&gt;3000&lt;/value&gt;
 +  &lt;/property&gt;
 +
 +  &lt;property&gt;
 +    &lt;name&gt;dfs.socket.write.timeout&lt;/name&gt;
 +    &lt;value&gt;5000&lt;/value&gt;
 +  &lt;/property&gt;
 +
 +  &lt;property&gt;
 +    &lt;name&gt;ipc.client.connect.timeout&lt;/name&gt;
 +    &lt;value&gt;1000&lt;/value&gt;
 +  &lt;/property&gt;
 +
 +  &lt;property&gt;
 +    &lt;name&gt;ipc.client.connect.max.retries.on.timeouts&lt;/name&gt;
 +    &lt;value&gt;2&lt;/value&gt;
 +  &lt;/property&gt;
 +
 +
 +
 +</pre>
 +
 +
 +</body>
 +</html>


[04/11] git commit: ACCUMULO-1956 expand docs, addition/decommission of cluster nodes

Posted by ec...@apache.org.
ACCUMULO-1956 expand docs, addition/decommission of cluster nodes

Signed-off-by: Eric Newton <er...@gmail.com>


Project: http://git-wip-us.apache.org/repos/asf/accumulo/repo
Commit: http://git-wip-us.apache.org/repos/asf/accumulo/commit/ff29f08a
Tree: http://git-wip-us.apache.org/repos/asf/accumulo/tree/ff29f08a
Diff: http://git-wip-us.apache.org/repos/asf/accumulo/diff/ff29f08a

Branch: refs/heads/master
Commit: ff29f08a7d79be3baecb356a05444f342b74b620
Parents: 19a48da
Author: Alex Moundalexis <al...@clouderagovt.com>
Authored: Mon Dec 9 16:33:57 2013 -0500
Committer: Eric Newton <er...@gmail.com>
Committed: Wed Dec 11 09:15:04 2013 -0500

----------------------------------------------------------------------
 docs/administration.html                        | 23 ++++++++++++++
 .../src/user_manual/chapters/administration.tex | 32 ++++++++++++++++++++
 2 files changed, 55 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/accumulo/blob/ff29f08a/docs/administration.html
----------------------------------------------------------------------
diff --git a/docs/administration.html b/docs/administration.html
index b0c8e88..b8712dc 100644
--- a/docs/administration.html
+++ b/docs/administration.html
@@ -53,6 +53,29 @@ ask the master to shut down the tablet servers gracefully. If the tablet servers
 at the password prompt, and waiting 15 seconds for the script to force a shutdown. Normally, once the shutdown happens gracefully, unresponsive tablet servers are
 forcibly shut down after 5 seconds.
 
+<h3>Adding a Node</h3>
+
+<p>Update your <code>$ACCUMULO_HOME/conf/slaves</code> (or <code>$ACCUMULO_CONF_DIR/slaves</code>) file to account for the addition; at a minimum this needs to be on the host(s) being added, but in practice it's good to ensure consistent configuration across all nodes.</p>
+
+<pre>
+$ACCUMULO_HOME/bin/accumulo admin start &lt;host(s)&gt; {&lt;host&gt; ...}
+</pre>
+
+<p>Alternatively, you can ssh to each of the hosts you want to add and run <code>$ACCUMULO_HOME/bin/start-here.sh</code>.</p>
+
+<p>Make sure the host in question has the new configuration, or else the tablet server won't start.</p>
+
+<h3>Decommissioning a Node</h3>
+
+<p>If you need to take a node out of operation, you can trigger a graceful shutdown of a tablet server. Accumulo will automatically rebalance the tablets across the available tablet servers.</p>
+
+<pre>
+$ACCUMULO_HOME/bin/accumulo admin stop &lt;host(s)&gt; {&lt;host&gt; ...}
+</pre>
+
+<p>Alternatively, you can ssh to each of the hosts you want to remove and run <code>$ACCUMULO_HOME/bin/stop-here.sh</code>.</p>
+
+<p>Be sure to update your <code>$ACCUMULO_HOME/conf/slaves</code> (or <code>$ACCUMULO_CONF_DIR/slaves</code>) file to account for the removal of these hosts. Bear in mind that the monitor will not re-read the slaves file automatically, so it will report the decommissioned servers as down; it's recommended that you restart the monitor so that the node list is up to date.</p>
 
 <h3>Configuration</h3>
 <p>Accumulo configuration information is stored in a xml file and ZooKeeper.  System wide

http://git-wip-us.apache.org/repos/asf/accumulo/blob/ff29f08a/docs/src/user_manual/chapters/administration.tex
----------------------------------------------------------------------
diff --git a/docs/src/user_manual/chapters/administration.tex b/docs/src/user_manual/chapters/administration.tex
index f3feca5..d0533d9 100644
--- a/docs/src/user_manual/chapters/administration.tex
+++ b/docs/src/user_manual/chapters/administration.tex
@@ -184,6 +184,38 @@ To shutdown cleanly, run \texttt{bin/stop-all.sh} and the master will orchestrat
 shutdown of all the tablet servers. Shutdown waits for all minor compactions to finish, so it may
 take some time for particular configurations.
 
+\subsection{Adding a Node}
+
+Update your \texttt{\$ACCUMULO\_HOME/conf/slaves} (or \texttt{\$ACCUMULO\_CONF\_DIR/slaves}) file to account for the addition.
+
+\begin{verbatim}
+$ACCUMULO_HOME/bin/accumulo admin start <host(s)> {<host> ...}
+\end{verbatim}
+
+Alternatively, you can ssh to each of the hosts you want to add and run 
+\texttt{\$ACCUMULO\_HOME/bin/start-here.sh}.
+
+Make sure the host in question has the new configuration, or else the tablet 
+server won't start; at a minimum this needs to be on the host(s) being added, 
+but in practice it's good to ensure consistent configuration across all nodes.
+
+\subsection{Decommissioning a Node}
+
+If you need to take a node out of operation, you can trigger a graceful shutdown of a tablet 
+server. Accumulo will automatically rebalance the tablets across the available tablet servers.
+
+\begin{verbatim}
+$ACCUMULO_HOME/bin/accumulo admin stop <host(s)> {<host> ...}
+\end{verbatim}
+
+Alternatively, you can ssh to each of the hosts you want to remove and run 
+\texttt{\$ACCUMULO\_HOME/bin/stop-here.sh}.
+
+Be sure to update your \texttt{\$ACCUMULO\_HOME/conf/slaves} (or \texttt{\$ACCUMULO\_CONF\_DIR/slaves}) file to 
+account for the removal of these hosts. Bear in mind that the monitor will not re-read the 
+slaves file automatically, so it will report the decommissioned servers as down; it's 
+recommended that you restart the monitor so that the node list is up to date.
+
 \section{Monitoring}
 
 The Accumulo Master provides an interface for monitoring the status and health of


[03/11] git commit: ACCUMULO-1956 expand docs, addition/decommission of cluster nodes

Posted by ec...@apache.org.
ACCUMULO-1956 expand docs, addition/decommission of cluster nodes

Signed-off-by: Eric Newton <er...@gmail.com>


Project: http://git-wip-us.apache.org/repos/asf/accumulo/repo
Commit: http://git-wip-us.apache.org/repos/asf/accumulo/commit/ff29f08a
Tree: http://git-wip-us.apache.org/repos/asf/accumulo/tree/ff29f08a
Diff: http://git-wip-us.apache.org/repos/asf/accumulo/diff/ff29f08a

Branch: refs/heads/1.6.0-SNAPSHOT
Commit: ff29f08a7d79be3baecb356a05444f342b74b620
Parents: 19a48da
Author: Alex Moundalexis <al...@clouderagovt.com>
Authored: Mon Dec 9 16:33:57 2013 -0500
Committer: Eric Newton <er...@gmail.com>
Committed: Wed Dec 11 09:15:04 2013 -0500

----------------------------------------------------------------------
 docs/administration.html                        | 23 ++++++++++++++
 .../src/user_manual/chapters/administration.tex | 32 ++++++++++++++++++++
 2 files changed, 55 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/accumulo/blob/ff29f08a/docs/administration.html
----------------------------------------------------------------------
diff --git a/docs/administration.html b/docs/administration.html
index b0c8e88..b8712dc 100644
--- a/docs/administration.html
+++ b/docs/administration.html
@@ -53,6 +53,29 @@ ask the master to shut down the tablet servers gracefully. If the tablet servers
 at the password prompt, and waiting 15 seconds for the script to force a shutdown. Normally, once the shutdown happens gracefully, unresponsive tablet servers are
 forcibly shut down after 5 seconds.
 
+<h3>Adding a Node</h3>
+
+<p>Update your <code>$ACCUMULO_HOME/conf/slaves</code> (or <code>$ACCUMULO_CONF_DIR/slaves</code>) file to account for the addition; at a minimum this needs to be on the host(s) being added, but in practice it's good to ensure consistent configuration across all nodes.</p>
+
+<pre>
+$ACCUMULO_HOME/bin/accumulo admin start &lt;host(s)&gt; {&lt;host&gt; ...}
+</pre>
+
+<p>Alternatively, you can ssh to each of the hosts you want to add and run <code>$ACCUMULO_HOME/bin/start-here.sh</code>.</p>
+
+<p>Make sure the host in question has the new configuration, or else the tablet server won't start.</p>
+
+<h3>Decommissioning a Node</h3>
+
+<p>If you need to take a node out of operation, you can trigger a graceful shutdown of a tablet server. Accumulo will automatically rebalance the tablets across the available tablet servers.</p>
+
+<pre>
+$ACCUMULO_HOME/bin/accumulo admin stop &lt;host(s)&gt; {&lt;host&gt; ...}
+</pre>
+
+<p>Alternatively, you can ssh to each of the hosts you want to remove and run <code>$ACCUMULO_HOME/bin/stop-here.sh</code>.</p>
+
+<p>Be sure to update your <code>$ACCUMULO_HOME/conf/slaves</code> (or <code>$ACCUMULO_CONF_DIR/slaves</code>) file to account for the removal of these hosts. Bear in mind that the monitor will not re-read the slaves file automatically, so it will report the decommissioned servers as down; it's recommended that you restart the monitor so that the node list is up to date.</p>
 
 <h3>Configuration</h3>
 <p>Accumulo configuration information is stored in a xml file and ZooKeeper.  System wide

http://git-wip-us.apache.org/repos/asf/accumulo/blob/ff29f08a/docs/src/user_manual/chapters/administration.tex
----------------------------------------------------------------------
diff --git a/docs/src/user_manual/chapters/administration.tex b/docs/src/user_manual/chapters/administration.tex
index f3feca5..d0533d9 100644
--- a/docs/src/user_manual/chapters/administration.tex
+++ b/docs/src/user_manual/chapters/administration.tex
@@ -184,6 +184,38 @@ To shutdown cleanly, run \texttt{bin/stop-all.sh} and the master will orchestrat
 shutdown of all the tablet servers. Shutdown waits for all minor compactions to finish, so it may
 take some time for particular configurations.
 
+\subsection{Adding a Node}
+
+Update your \texttt{\$ACCUMULO\_HOME/conf/slaves} (or \texttt{\$ACCUMULO\_CONF\_DIR/slaves}) file to account for the addition.
+
+\begin{verbatim}
+$ACCUMULO_HOME/bin/accumulo admin start <host(s)> {<host> ...}
+\end{verbatim}
+
+Alternatively, you can ssh to each of the hosts you want to add and run 
+\texttt{\$ACCUMULO\_HOME/bin/start-here.sh}.
+
+Make sure the host in question has the new configuration, or else the tablet 
+server won't start; at a minimum this needs to be on the host(s) being added, 
+but in practice it's good to ensure consistent configuration across all nodes.
+
+\subsection{Decommissioning a Node}
+
+If you need to take a node out of operation, you can trigger a graceful shutdown of a tablet 
+server. Accumulo will automatically rebalance the tablets across the available tablet servers.
+
+\begin{verbatim}
+$ACCUMULO_HOME/bin/accumulo admin stop <host(s)> {<host> ...}
+\end{verbatim}
+
+Alternatively, you can ssh to each of the hosts you want to remove and run 
+\texttt{\$ACCUMULO\_HOME/bin/stop-here.sh}.
+
+Be sure to update your \texttt{\$ACCUMULO\_HOME/conf/slaves} (or \texttt{\$ACCUMULO\_CONF\_DIR/slaves}) file to 
+account for the removal of these hosts. Bear in mind that the monitor will not re-read the 
+slaves file automatically, so it will report the decommissioned servers as down; it's 
+recommended that you restart the monitor so that the node list is up to date.
+
 \section{Monitoring}
 
 The Accumulo Master provides an interface for monitoring the status and health of


[09/11] git commit: Merge branch '1.5.1-SNAPSHOT' into 1.6.0-SNAPSHOT

Posted by ec...@apache.org.
Merge branch '1.5.1-SNAPSHOT' into 1.6.0-SNAPSHOT


Project: http://git-wip-us.apache.org/repos/asf/accumulo/repo
Commit: http://git-wip-us.apache.org/repos/asf/accumulo/commit/0d874d05
Tree: http://git-wip-us.apache.org/repos/asf/accumulo/tree/0d874d05
Diff: http://git-wip-us.apache.org/repos/asf/accumulo/diff/0d874d05

Branch: refs/heads/1.6.0-SNAPSHOT
Commit: 0d874d05a67950d9d5d9bec96de22e71f946b62c
Parents: 8f92585 e9423ae
Author: Eric Newton <er...@gmail.com>
Authored: Wed Dec 11 09:15:55 2013 -0500
Committer: Eric Newton <er...@gmail.com>
Committed: Wed Dec 11 09:15:55 2013 -0500

----------------------------------------------------------------------
 .../chapters/administration.tex                 | 32 ++++++++++++++++++++
 .../src/main/resources/docs/administration.html | 23 ++++++++++++++
 2 files changed, 55 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/accumulo/blob/0d874d05/docs/src/main/latex/accumulo_user_manual/chapters/administration.tex
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/accumulo/blob/0d874d05/server/monitor/src/main/resources/docs/administration.html
----------------------------------------------------------------------
diff --cc server/monitor/src/main/resources/docs/administration.html
index 96f4a8b,0000000..51b1c31
mode 100644,000000..100644
--- a/server/monitor/src/main/resources/docs/administration.html
+++ b/server/monitor/src/main/resources/docs/administration.html
@@@ -1,148 -1,0 +1,171 @@@
 +<!--
 +  Licensed to the Apache Software Foundation (ASF) under one or more
 +  contributor license agreements.  See the NOTICE file distributed with
 +  this work for additional information regarding copyright ownership.
 +  The ASF licenses this file to You under the Apache License, Version 2.0
 +  (the "License"); you may not use this file except in compliance with
 +  the License.  You may obtain a copy of the License at
 +
 +      http://www.apache.org/licenses/LICENSE-2.0
 +
 +  Unless required by applicable law or agreed to in writing, software
 +  distributed under the License is distributed on an "AS IS" BASIS,
 +  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 +  See the License for the specific language governing permissions and
 +  limitations under the License.
 +-->
 +<html>
 +<head>
 +<title>Accumulo Administration</title>
 +<link rel='stylesheet' type='text/css' href='documentation.css' media='screen'/>
 +</head>
 +<body>
 +
 +<h1>Apache Accumulo Documentation : Administration</h1>
 +
 +<h3>Starting accumulo for the first time</h3>
 +
 +<p>For the most part, accumulo is ready to go out of the box. To start it, first you must distribute and install
 +the accumulo software to each machine in the cloud that you wish to run on. The software should be installed
 +in the same directory on each machine and configured identically (or at least similarly... see the configuration
 +sections for more details). Select one machine to be your bootstrap machine, the one that you will start accumulo
 +with. Note that you must have passphrase-less ssh access to each machine from your bootstrap machine. On this machine,
 +create a conf/masters and conf/slaves file. In the masters file, type the hostname of the machine you wish to run the master on (probably localhost).
 +In the slaves file, type the hostnames, separated by newlines, of each machine you wish to participate in accumulo as a tablet server. If you neglect
 +to create these files, the startup scripts will assume you are trying to run on localhost only, and will instantiate a single-node instance only.
 +It is probably a good idea to back up these files, or distribute them to the other nodes as well, so that you can easily boot up accumulo
 +from another machine, if necessary. You can also create a <code>conf/accumulo-env.sh</code> file if you want to configure any custom environment variables.
 +
 +<p>Once properly configured, you can initialize or prepare an instance of accumulo by running: <code>bin/accumulo&nbsp;init</code><br />
 +Follow the prompts and you are ready to go. This step only prepares accumulo to run, it does not start up accumulo.
 +
 +<h3>Starting accumulo</h3>
 +
 +<p>Once you have configured accumulo to your liking, and distributed the appropriate configuration to each machine, you can start accumulo with
 +bin/start-all.sh. If at any time, you wish to bring accumulo servers online after one or more have been shutdown, you can run bin/start-all.sh again.
 +This step will only start services that are not already running. Be aware that if you run this command on more than one machine, you may unintentionally
 +start an extra copy of the garbage collector service and the monitoring service, since each of these will run on the server on which you run this script.
 +
 +<h3>Stopping accumulo</h3>
 +
 +<p>Similar to the start-all.sh script, we provide a bin/stop-all.sh script to shut down accumulo. This will prompt for the root password so that it can
 +ask the master to shut down the tablet servers gracefully. If the tablet servers do not respond, or the master takes too long, you can force a shutdown by hitting Ctrl-C
 +at the password prompt, and waiting 15 seconds for the script to force a shutdown. Normally, once the shutdown happens gracefully, unresponsive tablet servers are
 +forcibly shut down after 5 seconds.
 +
++<h3>Adding a Node</h3>
++
++<p>Update your <code>$ACCUMULO_HOME/conf/slaves</code> (or <code>$ACCUMULO_CONF_DIR/slaves</code>) file to account for the addition; at a minimum this needs to be on the host(s) being added, but in practice it's good to ensure consistent configuration across all nodes.</p>
++
++<pre>
++$ACCUMULO_HOME/bin/accumulo admin start &lt;host(s)&gt; {&lt;host&gt; ...}
++</pre>
++
++<p>Alternatively, you can ssh to each of the hosts you want to add and run <code>$ACCUMULO_HOME/bin/start-here.sh</code>.</p>
++
++<p>Make sure the host in question has the new configuration, or else the tablet server won't start.</p>
++
++<h3>Decommissioning a Node</h3>
++
++<p>If you need to take a node out of operation, you can trigger a graceful shutdown of a tablet server. Accumulo will automatically rebalance the tablets across the available tablet servers.</p>
++
++<pre>
++$ACCUMULO_HOME/bin/accumulo admin stop &lt;host(s)&gt; {&lt;host&gt; ...}
++</pre>
++
++<p>Alternatively, you can ssh to each of the hosts you want to remove and run <code>$ACCUMULO_HOME/bin/stop-here.sh</code>.</p>
++
++<p>Be sure to update your <code>$ACCUMULO_HOME/conf/slaves</code> (or <code>$ACCUMULO_CONF_DIR/slaves</code>) file to account for the removal of these hosts. Bear in mind that the monitor will not re-read the slaves file automatically, so it will report the decommissioned servers as down; it's recommended that you restart the monitor so that the node list is up to date.</p>
 +
 +<h3>Configuration</h3>
 +<p>Accumulo configuration information is stored in a xml file and ZooKeeper.  System wide
 +configuration information is stored in accumulo-site.xml. In order for accumulo to
 +find this file its directory must be on the classpath.  Accumulo will log a warning if it can not find 
 +it, and will use built-in default values. The accumulo scripts try to put the config directory on the classpath.  
 +
 +<p>Starting with version 1.0, per-table configuration was
 +introduced. This information is stored in ZooKeeper. This information
 +can be manipulated using the config command in the accumulo
 +shell. ZooKeeper will notify all tablet servers when config properties
 +are modified. This makes it possible to change major compaction
 +settings, for example, for a table while accumulo is running.
 +
 +<p>Per-table configuration settings override system settings. 
 +
 +<p>See the possible configuration options and their default values <a href='config.html'>here</a>
 +
 +<h3>Managing system resources</h3>
 +
 +<p>It is very important how disk and memory usage are allocated across the cluster and how servers processes are allocated across the cluster. 
 +
 +<ul>
 + <li> On larger clusters, run the namenode, secondary namenode, jobtracker, Accumulo master, and zookeepers on dedicated nodes.  On a smaller cluster you may want to run all master processes on one node.  When doing this, ensure that the maximum total memory that could be used by all master processes does not exceed system memory; swapping on your single master node can cause serious problems.
 + <li> Accumulo 1.2 and earlier rely on ZooKeeper but do not use it heavily.  On a large cluster, setting up 3 or 5 ZooKeeper servers should be plenty.  Since there is no performance gain when running more ZooKeeper servers, fault tolerance is the only benefit.
 + <li> On slave nodes, ensure the memory used by all slave processes is less than system memory.  For example, the following slave node config could use up to 38G of RAM (the arithmetic is spelled out after this list): tablet server 3G, logger 1G, data node 2G, up to 10 mappers each using 2G, and up to 6 reducers each using 2G.  If the slave nodes only have 32G, then using 38G will result in swapping, which could cause tablet servers to lose their locks in ZooKeeper and die.  Even if swapping does not cause tablet servers to die, it will kill performance.
 + <li>Accumulo and MapReduce will work with less memory, but it has an impact.  Accumulo will minor compact more frequently when it has less map memory, resulting in more major compactions.  The minor and major compactions both use CPU and HDFS I/O.  The same goes for MapReduce: the less memory you give it, the more it has to sort and spill.  Try to minimize spilling and compactions as much as possible without causing swapping.
 + <li>Accumulo writes data to disk before it sorts it in memory.  This allows data that was in memory when a tablet server crashes to be recovered.  Each slave node needs a local directory to write this data to.  Ensure the file system holding this directory has at least 100G free on all nodes.  Also, if this directory is in a filesystem used by MapReduce or HDFS, they may affect each other's performance.
 +</ul>
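 +
 +<p>Spelling out the arithmetic in the slave node example above: 3G (tablet server) + 1G (logger) + 2G (data node) + 10 x 2G (mappers) + 6 x 2G (reducers) = 38G, which is 6G more than the 32G of physical RAM.</p>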
 +
 +<p>There are a few settings that determine how much memory Accumulo tablet
 +servers use.  In accumulo-env.sh there is a setting called
 +ACCUMULO_TSERVER_OPTS.  By default this is set to something like "-Xmx512m
 +-Xms512m".  These are Java JVM options asking Java to use 512 megabytes of
 +memory.  By default Accumulo stores data written to it outside of the Java
 +memory space in order to avoid pauses caused by the Java garbage collector.  The
 +amount of memory it uses for this data is determined by the Accumulo setting
 +"tserver.memory.maps.max".  Since this memory is outside of the Java managed
 +memory, the process can grow larger than the -Xmx setting.  So if -Xmx is set
 +to 512M and tserver.memory.maps.max is set to 1G, a tablet server process can
 +be expected to use 1.5G.  If tserver.memory.maps.native.enabled is set to
 +false, then Accumulo will only use memory managed by Java and the process will
 +not use more than what -Xmx is set to.  In this case the
 +tserver.memory.maps.max setting should be 75% of the -Xmx setting.
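 +
 +<p>As an illustration only (the sizes are arbitrary and should be tuned to your hardware), the following fragments would give a tablet server footprint of roughly 3G: 1G of Java heap plus 2G of native in-memory map.</p>
 +
 +<pre>
 +# accumulo-env.sh
 +export ACCUMULO_TSERVER_OPTS="-Xmx1g -Xms1g"
 +
 +&lt;!-- accumulo-site.xml --&gt;
 +&lt;property&gt;
 +  &lt;name&gt;tserver.memory.maps.max&lt;/name&gt;
 +  &lt;value&gt;2G&lt;/value&gt;
 +&lt;/property&gt;
 +</pre>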
 +
 +<h3>Swappiness</h3>
 +
 +<p>The Linux kernel will swap out memory of running programs to increase
 +the size of the disk buffers.  This tendency to swap out is controlled by
 +a kernel setting called "swappiness."  This behavior does not work well for
 +large Java servers.  When a Java process runs a garbage collection, it touches
 +lots of pages, forcing all swapped-out pages back into memory.  It is suggested
 +that swappiness be set to zero.
 +
 +<pre>
 + # sysctl -w vm.swappiness=0
 + # echo "vm.swappiness = 0" &gt;&gt; /etc/sysctl.conf
 +</pre>
 +
 +<h3>Hadoop timeouts</h3>
 +
 +<p>In order to detect failed datanodes, use shorter timeouts.  Add the following to your
 +hdfs-site.xml file:
 +
 +<pre>
 +
 +  &lt;property&gt;
 +    &lt;name&gt;dfs.socket.timeout&lt;/name&gt;
 +    &lt;value&gt;3000&lt;/value&gt;
 +  &lt;/property&gt;
 +
 +  &lt;property&gt;
 +    &lt;name&gt;dfs.socket.write.timeout&lt;/name&gt;
 +    &lt;value&gt;5000&lt;/value&gt;
 +  &lt;/property&gt;
 +
 +  &lt;property&gt;
 +    &lt;name&gt;ipc.client.connect.timeout&lt;/name&gt;
 +    &lt;value&gt;1000&lt;/value&gt;
 +  &lt;/property&gt;
 +
 +  &lt;property&gt;
 +    &lt;name&gt;ipc.client.connect.max.retries.on.timeouts&lt;/name&gt;
 +    &lt;value&gt;2&lt;/value&gt;
 +  &lt;/property&gt;
 +
 +</pre>
 +
 +
 +</body>
 +</html>


[11/11] git commit: Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/accumulo

Posted by ec...@apache.org.
Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/accumulo


Project: http://git-wip-us.apache.org/repos/asf/accumulo/repo
Commit: http://git-wip-us.apache.org/repos/asf/accumulo/commit/f84e9a1f
Tree: http://git-wip-us.apache.org/repos/asf/accumulo/tree/f84e9a1f
Diff: http://git-wip-us.apache.org/repos/asf/accumulo/diff/f84e9a1f

Branch: refs/heads/master
Commit: f84e9a1f30165881ab7e48e7ba120398131969dc
Parents: 6d7288d 71dc052
Author: Eric Newton <er...@gmail.com>
Authored: Wed Dec 11 09:16:19 2013 -0500
Committer: Eric Newton <er...@gmail.com>
Committed: Wed Dec 11 09:16:19 2013 -0500

----------------------------------------------------------------------

----------------------------------------------------------------------



[07/11] git commit: Merge branch '1.4.5-SNAPSHOT' into 1.5.1-SNAPSHOT

Posted by ec...@apache.org.
Merge branch '1.4.5-SNAPSHOT' into 1.5.1-SNAPSHOT


Project: http://git-wip-us.apache.org/repos/asf/accumulo/repo
Commit: http://git-wip-us.apache.org/repos/asf/accumulo/commit/e9423ae3
Tree: http://git-wip-us.apache.org/repos/asf/accumulo/tree/e9423ae3
Diff: http://git-wip-us.apache.org/repos/asf/accumulo/diff/e9423ae3

Branch: refs/heads/1.5.1-SNAPSHOT
Commit: e9423ae350ecd2e8f7bbafc9380713e8fd7fd644
Parents: 7655de6 ff29f08
Author: Eric Newton <er...@gmail.com>
Authored: Wed Dec 11 09:15:40 2013 -0500
Committer: Eric Newton <er...@gmail.com>
Committed: Wed Dec 11 09:15:40 2013 -0500

----------------------------------------------------------------------
 docs/administration.html                        | 23 ++++++++++++++
 .../chapters/administration.tex                 | 32 ++++++++++++++++++++
 2 files changed, 55 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/accumulo/blob/e9423ae3/docs/administration.html
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/accumulo/blob/e9423ae3/docs/src/main/latex/accumulo_user_manual/chapters/administration.tex
----------------------------------------------------------------------
diff --cc docs/src/main/latex/accumulo_user_manual/chapters/administration.tex
index cc7697c,0000000..086f41c
mode 100644,000000..100644
--- a/docs/src/main/latex/accumulo_user_manual/chapters/administration.tex
+++ b/docs/src/main/latex/accumulo_user_manual/chapters/administration.tex
@@@ -1,362 -1,0 +1,394 @@@
 +
 +% Licensed to the Apache Software Foundation (ASF) under one or more
 +% contributor license agreements. See the NOTICE file distributed with
 +% this work for additional information regarding copyright ownership.
 +% The ASF licenses this file to You under the Apache License, Version 2.0
 +% (the "License"); you may not use this file except in compliance with
 +% the License. You may obtain a copy of the License at
 +%
 +%     http://www.apache.org/licenses/LICENSE-2.0
 +%
 +% Unless required by applicable law or agreed to in writing, software
 +% distributed under the License is distributed on an "AS IS" BASIS,
 +% WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 +% See the License for the specific language governing permissions and
 +% limitations under the License.
 +
 +\chapter{Administration}
 +
 +\section{Hardware}
 +
 +Because we are running essentially two or three systems simultaneously layered
 +across the cluster: HDFS, Accumulo and MapReduce, it is typical for hardware to
 +consist of 4 to 8 cores, and 8 to 32 GB RAM. This is so each running process can have
 +at least one core and 2 - 4 GB each.
 +
 +One core running HDFS can typically keep 2 to 4 disks busy, so each machine may
 +typically have as little as 2 x 300GB disks and as much as 4 x 1TB or 2TB disks.
 +
 +It is possible to do with less than this, such as with 1u servers with 2 cores and 4GB
 +each, but in this case it is recommended to only run up to two processes per
 +machine - i.e. DataNode and TabletServer or DataNode and MapReduce worker but
 +not all three. The constraint here is having enough available heap space for all the
 +processes on a machine.
 +
 +\section{Network}
 +
 +Accumulo communicates via remote procedure calls over TCP/IP for both passing
 +data and control messages. In addition, Accumulo uses HDFS clients to
 +communicate with HDFS. To achieve good ingest and query performance, sufficient
 +network bandwidth must be available between any two machines.
 +
 +\section{Installation}
 +Choose a directory for the Accumulo installation. This directory will be referenced
 +by the environment variable \texttt{\$ACCUMULO\_HOME}. Run the following:
 +
 +\small
 +\begin{verbatim}
 +$ tar xzf accumulo-1.5.0-bin.tar.gz    # unpack to subdirectory
 +$ mv accumulo-1.5.0 $ACCUMULO_HOME # move to desired location
 +\end{verbatim}
 +\normalsize
 +
 +Repeat this step at each machine within the cluster. Usually all machines have the
 +same \texttt{\$ACCUMULO\_HOME}.
 +
 +\section{Dependencies}
 +Accumulo requires HDFS and ZooKeeper to be configured and running
 +before starting. Password-less SSH should be configured between at least the
 +Accumulo master and TabletServer machines. It is also a good idea to run Network
 +Time Protocol (NTP) within the cluster to ensure nodes' clocks don't get too out of
 +sync, which can cause problems with automatically timestamped data. 
 +
 +\section{Configuration}
 +
 +Accumulo is configured by editing several Shell and XML files found in
 +\texttt{\$ACCUMULO\_HOME/conf}. The structure closely resembles Hadoop's configuration
 +files.
 +
 +\subsection{Edit conf/accumulo-env.sh}
 +
 +Accumulo needs to know where to find the software it depends on. Edit accumulo-env.sh 
 +and specify the following:
 +
 +\begin{enumerate}
 +\item{Enter the location of the installation directory of Accumulo for \texttt{\$ACCUMULO\_HOME}}
 +\item{Enter your system's Java home for \texttt{\$JAVA\_HOME}}
 +\item{Enter the location of Hadoop for \texttt{\$HADOOP\_PREFIX}}
 +\item{Choose a location for Accumulo logs and enter it for \texttt{\$ACCUMULO\_LOG\_DIR}}
 +\item{Enter the location of ZooKeeper for \texttt{\$ZOOKEEPER\_HOME}}
 +\end{enumerate}
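 +
 +As an illustration, the resulting entries in accumulo-env.sh might look like the
 +following (all paths are placeholders for your environment):
 +
 +\small
 +\begin{verbatim}
 +export ACCUMULO_HOME=/opt/accumulo
 +export JAVA_HOME=/usr/lib/jvm/java
 +export HADOOP_PREFIX=/opt/hadoop
 +export ZOOKEEPER_HOME=/opt/zookeeper
 +export ACCUMULO_LOG_DIR=$ACCUMULO_HOME/logs
 +\end{verbatim}
 +\normalsize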
 +
 +By default Accumulo TabletServers are set to use 1GB of memory. You may change
 +this by altering the value of \texttt{\$ACCUMULO\_TSERVER\_OPTS}. Note the syntax is that of
 +the Java JVM command line options. This value should be less than the physical
 +memory of the machines running TabletServers.
 +
 +There are similar options for the master's memory usage and the garbage collector
 +process. Reduce these if they exceed the physical RAM of your hardware and
 +increase them, within the bounds of the physical RAM, if a process fails because of
 +insufficient memory.
 +
 +Note that you will be specifying the Java heap space in accumulo-env.sh. You should
 +make sure that the total heap space used for the Accumulo tserver and the Hadoop
 +DataNode and TaskTracker is less than the available memory on each slave node in
 +the cluster. On large clusters, it is recommended that the Accumulo master, Hadoop
 +NameNode, secondary NameNode, and Hadoop JobTracker all be run on separate
 +machines to allow them to use more heap space. If you are running these on the
 +same machine on a small cluster, likewise make sure their heap space settings fit
 +within the available memory.
 +
 +\subsection{Native Map}
 +
 +The tablet server uses a data structure called a MemTable to store sorted key/value
 +pairs in memory when they are first received from the client. When a minor compaction
 +occurs, this data structure is written to HDFS. The MemTable will default to using
 +memory in the JVM but a JNI version, called the native map, can be used to significantly
 +speed up performance by utilizing the memory space of the native operating system. The
 +native map also avoids the performance implications brought on by garbage collection
 +in the JVM by causing it to pause much less frequently.
 +
 +32-bit and 64-bit Linux versions of the native map ship with the Accumulo dist package.
 +For other operating systems, the native map can be built from the codebase in two ways:
 +from Maven or from the Makefile.
 +
 +\begin{enumerate}
 +\item{Build from Maven using the following command: \texttt{mvn clean package -Pnative}.}
 +\item{Build from the c++ source by running \texttt{make} in the \texttt{\$ACCUMULO\_HOME/server/src/main/c++} directory.}
 +\end{enumerate}
 +
 +After building the native map from the source, you will find the artifact in
 +\texttt{\$ACCUMULO\_HOME/lib/native}. Upon starting up, the tablet server will look
 +in this directory for the map library. If the file is renamed or moved from its
 +target directory, the tablet server may not be able to find it.
 +
 +\subsection{Cluster Specification}
 +
 +On the machine that will serve as the Accumulo master:
 +
 +\begin{enumerate}
 +\item{Write the IP address or domain name of the Accumulo Master to the\\\texttt{\$ACCUMULO\_HOME/conf/masters} file.}
 +\item{Write the IP addresses or domain name of the machines that will be TabletServers in\\\texttt{\$ACCUMULO\_HOME/conf/slaves}, one per line.}
 +\end{enumerate}
 +
 +Note that if using domain names rather than IP addresses, DNS must be configured
 +properly for all machines participating in the cluster. DNS can be a confusing source
 +of errors.
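 +
 +For example, the two files for a small cluster might contain the following
 +(hostnames are placeholders):
 +
 +\small
 +\begin{verbatim}
 +$ cat $ACCUMULO_HOME/conf/masters
 +master.example.com
 +$ cat $ACCUMULO_HOME/conf/slaves
 +tserver1.example.com
 +tserver2.example.com
 +tserver3.example.com
 +\end{verbatim}
 +\normalsize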
 +
 +\subsection{Accumulo Settings}
 +Specify appropriate values for the following settings in\\
 +\texttt{\$ACCUMULO\_HOME/conf/accumulo-site.xml}:
 +
 +\small
 +\begin{verbatim}
 +<property>
 +    <name>instance.zookeeper.host</name>
 +    <value>zooserver-one:2181,zooserver-two:2181</value>
 +    <description>list of zookeeper servers</description>
 +</property>
 +\end{verbatim}
 +\normalsize
 +
 +This enables Accumulo to find ZooKeeper. Accumulo uses ZooKeeper to coordinate
 +settings between processes and to help finalize TabletServer failures.
 +
 +
 +\small
 +\begin{verbatim}
 +<property>
 +    <name>instance.secret</name>
 +    <value>DEFAULT</value>
 +</property>
 +\end{verbatim}
 +\normalsize
 +
 +The instance needs a secret to enable secure communication between servers. Configure your
 +secret and make sure that the \texttt{accumulo-site.xml} file is not readable to other users.
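 +
 +For example, one simple way to restrict read access on each server is:
 +
 +\small
 +\begin{verbatim}
 +$ chmod 600 $ACCUMULO_HOME/conf/accumulo-site.xml
 +\end{verbatim}
 +\normalsize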
 +
 +Some settings can be modified via the Accumulo shell and take effect immediately, but
 +some settings require a process restart to take effect. See the configuration documentation
 +(available on the monitor web pages) for details.
 +
 +\subsection{Deploy Configuration}
 +
 +Copy the masters, slaves, accumulo-env.sh, and if necessary, accumulo-site.xml
 +from the\\\texttt{\$ACCUMULO\_HOME/conf/} directory on the master to all the machines
 +specified in the slaves file.
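 +
 +One way to do this, assuming password-less ssh and rsync are available, is a
 +simple loop over the slaves file:
 +
 +\small
 +\begin{verbatim}
 +$ for host in $(cat $ACCUMULO_HOME/conf/slaves); do
 +>   rsync -az $ACCUMULO_HOME/conf/ $host:$ACCUMULO_HOME/conf/
 +> done
 +\end{verbatim}
 +\normalsize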
 +
 +\section{Initialization}
 +
 +Accumulo must be initialized to create the structures it uses internally to locate
 +data across the cluster. HDFS is required to be configured and running before
 +Accumulo can be initialized.
 +
 +Once HDFS is started, initialization can be performed by executing\\
 +\texttt{\$ACCUMULO\_HOME/bin/accumulo init}. This script will prompt for a name
 +for this instance of Accumulo. The instance name is used to identify a set of tables
 +and instance-specific settings. The script will then write some information into
 +HDFS so Accumulo can start properly.
 +
 +The initialization script will prompt you to set a root password. Once Accumulo is
 +initialized it can be started.
 +
 +\section{Running}
 +
 +\subsection{Starting Accumulo}
 +
 +Make sure Hadoop is configured on all of the machines in the cluster, including
 +access to a shared HDFS instance, and that ZooKeeper is configured and running on
 +at least one machine in the cluster. Once HDFS and ZooKeeper are running, start
 +Accumulo using the \texttt{bin/start-all.sh} script.
 +
 +To verify that Accumulo is running, check the Status page as described under
 +\emph{Monitoring}. In addition, the Shell can provide some information about the status of
 +tables by reading the !METADATA table.
 +
 +\subsection{Stopping Accumulo}
 +
 +To shutdown cleanly, run \texttt{bin/stop-all.sh} and the master will orchestrate the
 +shutdown of all the tablet servers. Shutdown waits for all minor compactions to finish, so it may
 +take some time for particular configurations.
 +
++\subsection{Adding a Node}
++
++Update your \texttt{\$ACCUMULO\_HOME/conf/slaves} (or \texttt{\$ACCUMULO\_CONF\_DIR/slaves}) file to account for the addition.
++
++\begin{verbatim}
++$ACCUMULO_HOME/bin/accumulo admin start <host(s)> {<host> ...}
++\end{verbatim}
++
++Alternatively, you can ssh to each of the hosts you want to add and run 
++\texttt{\$ACCUMULO\_HOME/bin/start-here.sh}.
++
++Make sure the host in question has the new configuration, or else the tablet
++server won't start; at a minimum the new configuration needs to be present on the
++host(s) being added, but in practice it's good to ensure consistent configuration
++across all nodes.
++
++\subsection{Decommissioning a Node}
++
++If you need to take a node out of operation, you can trigger a graceful shutdown of a tablet 
++server. Accumulo will automatically rebalance the tablets across the available tablet servers.
++
++\begin{verbatim}
++$ACCUMULO_HOME/bin/accumulo admin stop <host(s)> {<host> ...}
++\end{verbatim}
++
++Alternatively, you can ssh to each of the hosts you want to remove and run 
++\texttt{\$ACCUMULO\_HOME/bin/stop-here.sh}.
++
++Be sure to update your \texttt{\$ACCUMULO\_HOME/conf/slaves} (or \texttt{\$ACCUMULO\_CONF\_DIR/slaves}) file to
++account for the removal of these hosts. Bear in mind that the monitor will not re-read the
++slaves file automatically, so it will report the decommissioned servers as down; it's
++recommended that you restart the monitor so that the node list is up to date.
++
 +\section{Monitoring}
 +
 +The Accumulo Master provides an interface for monitoring the status and health of
 +Accumulo components. This interface can be accessed by pointing a web browser to\\
 +\texttt{http://accumulomaster:50095/status}
 +
 +\section{Tracing}
 +It can be difficult to determine why some operations are taking longer
 +than expected. For example, you may be looking up items with very low
 +latency, but sometimes the lookups take much longer. Determining the
 +cause of the delay is difficult because the system is distributed, and
 +the typical lookup is fast.
 +
 +Accumulo has been instrumented to record the time that various
 +operations take when tracing is turned on. The fact that tracing is
 +enabled follows all the requests made on behalf of the user throughout
 +the distributed infrastructure of Accumulo, and across all threads of
 +execution.
 +
 +These time spans will be inserted into the \texttt{trace} table in
 +Accumulo. You can browse recent traces from the Accumulo monitor
 +page. You can also read the \texttt{trace} table directly like any
 +other table.
 +
 +The design of Accumulo's distributed tracing follows that of
 +\href{http://research.google.com/pubs/pub36356.html}{Google's Dapper}.
 +
 +\subsection{Tracers}
 +To collect traces, Accumulo needs at least one server listed in
 +\\\texttt{\$ACCUMULO\_HOME/conf/tracers}. The server collects traces
 +from clients and writes them to the \texttt{trace} table. The Accumulo
 +user that the tracer uses to connect to Accumulo can be configured with
 +the following properties:
 +
 +\begin{verbatim}
 +trace.user
 +trace.token.property.password
 +\end{verbatim}
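 +
 +For example, these can be set in accumulo-site.xml; the user name and password
 +below are placeholders:
 +
 +\begin{verbatim}
 +<property>
 +    <name>trace.user</name>
 +    <value>tracer</value>
 +</property>
 +<property>
 +    <name>trace.token.property.password</name>
 +    <value>secret</value>
 +</property>
 +\end{verbatim}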
 +
 +\subsection{Instrumenting a Client}
 +Tracing can be used to measure a client operation, such as a scan, as
 +the operation traverses the distributed system. To enable tracing for
 +your application, call:
 +
 +\begin{verbatim}
 +DistributedTrace.enable(instance, new ZooReader(instance), hostname, "myApplication");
 +\end{verbatim}
 +
 +Once tracing has been enabled, a client can wrap an operation in a trace.
 +
 +\begin{verbatim}
 +Trace.on("Client Scan");
 +BatchScanner scanner = conn.createBatchScanner(...);
 +// Configure your scanner
 +for (Entry entry : scanner) {
 +}
 +Trace.off();
 +\end{verbatim}
 +
 +The user can also create additional Spans within a Trace.
 +\begin{verbatim}
 +Trace.on("Client Update");
 +...
 +Span readSpan = Trace.start("Read");
 +...
 +readSpan.stop();
 +...
 +Span writeSpan = Trace.start("Write");
 +...
 +writeSpan.stop();
 +Trace.off();
 +\end{verbatim}
 +
 +Like Dapper, Accumulo tracing supports user-defined annotations to associate additional data with a Trace.
 +\begin{verbatim}
 +...
 +int numberOfEntriesRead = 0;
 +Span readSpan = Trace.start("Read");
 +// Do the read, update the counter
 +...
 +readSpan.data("Number of Entries Read", String.valueOf(numberOfEntriesRead));
 +\end{verbatim}
 +
 +Some client operations may have a high volume within your
 +application. As such, you may wish to only sample a percentage of
 +operations for tracing. As seen below, the CountSampler can be used to
 +help enable tracing for 1-in-1000 operations:
 +\begin{verbatim}
 +Sampler sampler = new CountSampler(1000);
 +...
 +if (sampler.next()) {
 +  Trace.on("Read");
 +}
 +...
 +Trace.offNoFlush();
 +\end{verbatim}
 +
 +It should be noted that it is safe to turn off tracing even if it
 +isn't currently active. Trace.offNoFlush() should be used if the
 +user does not wish to have Trace.off() block while flushing trace
 +data.
 +
 +\subsection{Viewing Collected Traces}
 +To view collected traces, use the ``Recent Traces'' link on the Monitor
 +UI. You can also programmatically access and print traces using the
 +\texttt{TraceDump} class.
 +
 +\subsection{Tracing from the Shell}
 +You can enable tracing for operations run from the shell by using the
 +\texttt{trace on} and \texttt{trace off} commands.
 +
 +\begin{verbatim}
 +root@test test> trace on
 +root@test test> scan
 +a b:c []    d
 +root@test test> trace off
 +Waiting for trace information
 +Waiting for trace information
 +Trace started at 2013/08/26 13:24:08.332
 +Time  Start  Service@Location       Name
 + 3628+0      shell@localhost shell:root
 +    8+1690     shell@localhost scan
 +    7+1691       shell@localhost scan:location
 +    6+1692         tserver@localhost startScan
 +    5+1692           tserver@localhost tablet read ahead 6
 +\end{verbatim}
 +
 +\section{Logging}
 +Each Accumulo process writes to a set of log files. By default these are found under\\
 +\texttt{\$ACCUMULO\_HOME/logs/}.
 +
 +\section{Recovery}
 +
 +In the event of TabletServer failure or error on shutting Accumulo down, some
 +mutations may not have been minor compacted to HDFS properly. In this case,
 +Accumulo will automatically reapply such mutations from the write-ahead log,
 +either when the tablets from the failed server are reassigned by the Master (in the
 +case of a single TabletServer failure) or the next time Accumulo starts (in the event of
 +a failure during shutdown).
 +
 +Recovery is performed by asking a tablet server to sort the logs so that tablets can easily find their missing
 +updates. The sort status of each file is displayed on the
 +Accumulo monitor status page. Once the recovery is complete, any
 +tablets involved should return to an ``online'' state. Until then those tablets will be
 +unavailable to clients.
 +
 +The Accumulo client library is configured to retry failed mutations and in many
 +cases clients will be able to continue processing after the recovery process without
 +throwing an exception.
 +


[02/11] git commit: ACCUMULO-1956 expand docs, addition/decommision of cluster nodes

Posted by ec...@apache.org.
ACCUMULO-1956 expand docs, addition/decommision of cluster nodes

Signed-off-by: Eric Newton <er...@gmail.com>


Project: http://git-wip-us.apache.org/repos/asf/accumulo/repo
Commit: http://git-wip-us.apache.org/repos/asf/accumulo/commit/ff29f08a
Tree: http://git-wip-us.apache.org/repos/asf/accumulo/tree/ff29f08a
Diff: http://git-wip-us.apache.org/repos/asf/accumulo/diff/ff29f08a

Branch: refs/heads/1.5.1-SNAPSHOT
Commit: ff29f08a7d79be3baecb356a05444f342b74b620
Parents: 19a48da
Author: Alex Moundalexis <al...@clouderagovt.com>
Authored: Mon Dec 9 16:33:57 2013 -0500
Committer: Eric Newton <er...@gmail.com>
Committed: Wed Dec 11 09:15:04 2013 -0500

----------------------------------------------------------------------
 docs/administration.html                        | 23 ++++++++++++++
 .../src/user_manual/chapters/administration.tex | 32 ++++++++++++++++++++
 2 files changed, 55 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/accumulo/blob/ff29f08a/docs/administration.html
----------------------------------------------------------------------
diff --git a/docs/administration.html b/docs/administration.html
index b0c8e88..b8712dc 100644
--- a/docs/administration.html
+++ b/docs/administration.html
@@ -53,6 +53,29 @@ ask the master to shut down the tablet servers gracefully. If the tablet servers
 at the password prompt, and waiting 15 seconds for the script to force a shutdown. Normally, once the shutdown happens gracefully, unresponsive tablet servers are
 forcibly shut down after 5 seconds.
 
+<h3>Adding a Node</h3>
+
+<p>Update your <code>$ACCUMULO_HOME/conf/slaves</code> (or <code>$ACCUMULO_CONF_DIR/slaves</code>) file to account for the addition; at a minimum the updated file needs to be on the host(s) being added, but in practice it's good to ensure consistent configuration across all nodes.</p>
+
+<pre>
+$ACCUMULO_HOME/bin/accumulo admin start &lt;host(s)&gt; {&lt;host&gt; ...}
+</pre>
+
+<p>Alternatively, you can ssh to each of the hosts you want to add and run <code>$ACCUMULO_HOME/bin/start-here.sh</code>.</p>
+
+<p>Make sure the host in question has the new configuration, or else the tablet server won't start.</p>
+
+<h3>Decommissioning a Node</h3>
+
+<p>If you need to take a node out of operation, you can trigger a graceful shutdown of a tablet server. Accumulo will automatically rebalance the tablets across the available tablet servers.</p>
+
+<pre>
+$ACCUMULO_HOME/bin/accumulo admin stop &lt;host(s)&gt; {&lt;host&gt; ...}
+</pre>
+
+<p>Alternatively, you can ssh to each of the hosts you want to remove and run <code>$ACCUMULO_HOME/bin/stop-here.sh</code>.</p>
+
+<p>Be sure to update your <code>$ACCUMULO_HOME/conf/slaves</code> (or <code>$ACCUMULO_CONF_DIR/slaves</code>) file to account for the removal of these hosts. Bear in mind that the monitor will not re-read the slaves file automatically, so it will report the decommissioned servers as down; it's recommended that you restart the monitor so that the node list is up to date.</p>
 
 <h3>Configuration</h3>
 <p>Accumulo configuration information is stored in a xml file and ZooKeeper.  System wide

http://git-wip-us.apache.org/repos/asf/accumulo/blob/ff29f08a/docs/src/user_manual/chapters/administration.tex
----------------------------------------------------------------------
diff --git a/docs/src/user_manual/chapters/administration.tex b/docs/src/user_manual/chapters/administration.tex
index f3feca5..d0533d9 100644
--- a/docs/src/user_manual/chapters/administration.tex
+++ b/docs/src/user_manual/chapters/administration.tex
@@ -184,6 +184,38 @@ To shutdown cleanly, run \texttt{bin/stop-all.sh} and the master will orchestrat
 shutdown of all the tablet servers. Shutdown waits for all minor compactions to finish, so it may
 take some time for particular configurations.
 
+\subsection{Adding a Node}
+
+Update your \texttt{\$ACCUMULO\_HOME/conf/slaves} (or \texttt{\$ACCUMULO\_CONF\_DIR/slaves}) file to account for the addition.
+
+\begin{verbatim}
+$ACCUMULO_HOME/bin/accumulo admin start <host(s)> {<host> ...}
+\end{verbatim}
+
+Alternatively, you can ssh to each of the hosts you want to add and run 
+\texttt{\$ACCUMULO\_HOME/bin/start-here.sh}.
+
+Make sure the host in question has the new configuration, or else the tablet
+server won't start; at a minimum the new configuration needs to be present on the
+host(s) being added, but in practice it's good to ensure consistent configuration
+across all nodes.
+
+\subsection{Decommissioning a Node}
+
+If you need to take a node out of operation, you can trigger a graceful shutdown of a tablet 
+server. Accumulo will automatically rebalance the tablets across the available tablet servers.
+
+\begin{verbatim}
+$ACCUMULO_HOME/bin/accumulo admin stop <host(s)> {<host> ...}
+\end{verbatim}
+
+Alternatively, you can ssh to each of the hosts you want to remove and run 
+\texttt{\$ACCUMULO\_HOME/bin/stop-here.sh}.
+
+Be sure to update your \texttt{\$ACCUMULO\_HOME/conf/slaves} (or \texttt{\$ACCUMULO\_CONF\_DIR/slaves}) file to
+account for the removal of these hosts. Bear in mind that the monitor will not re-read the
+slaves file automatically, so it will report the decommissioned servers as down; it's
+recommended that you restart the monitor so that the node list is up to date.
+
 \section{Monitoring}
 
 The Accumulo Master provides an interface for monitoring the status and health of


[10/11] git commit: Merge branch '1.6.0-SNAPSHOT'

Posted by ec...@apache.org.
Merge branch '1.6.0-SNAPSHOT'


Project: http://git-wip-us.apache.org/repos/asf/accumulo/repo
Commit: http://git-wip-us.apache.org/repos/asf/accumulo/commit/6d7288dd
Tree: http://git-wip-us.apache.org/repos/asf/accumulo/tree/6d7288dd
Diff: http://git-wip-us.apache.org/repos/asf/accumulo/diff/6d7288dd

Branch: refs/heads/master
Commit: 6d7288dd68bd7221a834f2948d577f5d331d2af0
Parents: f3a8677 0d874d0
Author: Eric Newton <er...@gmail.com>
Authored: Wed Dec 11 09:16:09 2013 -0500
Committer: Eric Newton <er...@gmail.com>
Committed: Wed Dec 11 09:16:09 2013 -0500

----------------------------------------------------------------------
 README                                          |    24 +-
 bin/build_native_library.sh                     |    73 +
 bin/config.sh                                   |    70 +-
 bin/start-here.sh                               |     2 +-
 bin/start-server.sh                             |    10 +-
 bin/stop-here.sh                                |    10 +-
 .../1GB/native-standalone/accumulo-env.sh       |     3 +
 conf/examples/1GB/standalone/accumulo-env.sh    |     3 +
 .../2GB/native-standalone/accumulo-env.sh       |     3 +
 conf/examples/2GB/standalone/accumulo-env.sh    |     3 +
 .../3GB/native-standalone/accumulo-env.sh       |     3 +
 conf/examples/3GB/standalone/accumulo-env.sh    |     3 +
 .../512MB/native-standalone/accumulo-env.sh     |     3 +
 conf/examples/512MB/standalone/accumulo-env.sh  |     3 +
 core/pom.xml                                    |    59 +-
 .../org/apache/accumulo/core/Constants.java     |    62 +-
 .../apache/accumulo/core/client/Connector.java  |    40 +-
 .../core/client/NamespaceExistsException.java   |    67 +
 .../core/client/NamespaceNotEmptyException.java |    74 +
 .../core/client/NamespaceNotFoundException.java |    77 +
 .../core/client/admin/NamespaceOperations.java  |   370 +
 .../client/admin/NamespaceOperationsHelper.java |   203 +
 .../client/admin/NamespaceOperationsImpl.java   |   347 +
 .../core/client/admin/SecurityOperations.java   |    88 +-
 .../client/admin/SecurityOperationsImpl.java    |    83 +-
 .../core/client/admin/TableOperations.java      |   107 +-
 .../core/client/admin/TableOperationsImpl.java  |   450 +-
 .../core/client/impl/ConnectorImpl.java         |    10 +
 .../accumulo/core/client/impl/Namespaces.java   |   107 +
 .../accumulo/core/client/impl/Tables.java       |   112 +-
 .../client/impl/TabletServerBatchWriter.java    |     2 +-
 .../core/client/impl/thrift/ClientService.java  |  9196 +++++++++++--
 .../client/impl/thrift/SecurityErrorCode.java   |     5 +-
 .../core/client/mapred/InputFormatBase.java     |     5 +-
 .../core/client/mapreduce/InputFormatBase.java  |     4 +-
 .../accumulo/core/client/mock/MockAccumulo.java |    53 +-
 .../core/client/mock/MockConnector.java         |     6 +
 .../core/client/mock/MockNamespace.java         |    54 +
 .../client/mock/MockNamespaceOperations.java    |   130 +
 .../client/mock/MockSecurityOperations.java     |    83 +-
 .../accumulo/core/client/mock/MockTable.java    |    36 +-
 .../core/client/mock/MockTableOperations.java   |   133 +-
 .../core/client/security/SecurityErrorCode.java |     3 +-
 .../org/apache/accumulo/core/conf/Property.java |     7 +-
 .../accumulo/core/file/rfile/bcfile/BCFile.java |     2 +-
 .../core/master/thrift/MasterClientService.java | 11700 ++++++++++++++---
 .../accumulo/core/metadata/MetadataTable.java   |     9 +-
 .../accumulo/core/metadata/RootTable.java       |    15 +-
 .../core/security/NamespacePermission.java      |    66 +
 .../core/security/SystemPermission.java         |     5 +-
 .../org/apache/accumulo/core/util/Merge.java    |     2 +-
 .../org/apache/accumulo/core/util/Pair.java     |    57 +-
 .../apache/accumulo/core/util/shell/Shell.java  |   340 +-
 .../core/util/shell/ShellCompletor.java         |    58 +-
 .../accumulo/core/util/shell/ShellOptions.java  |     3 +-
 .../core/util/shell/commands/ConfigCommand.java |   113 +-
 .../util/shell/commands/ConstraintCommand.java  |    84 +-
 .../shell/commands/CreateNamespaceCommand.java  |   108 +
 .../util/shell/commands/CreateTableCommand.java |    69 +-
 .../core/util/shell/commands/DUCommand.java     |    37 +-
 .../util/shell/commands/DeleteIterCommand.java  |    61 +-
 .../shell/commands/DeleteNamespaceCommand.java  |   100 +
 .../util/shell/commands/DeleteTableCommand.java |    12 +-
 .../core/util/shell/commands/GrantCommand.java  |    47 +-
 .../util/shell/commands/ListIterCommand.java    |    59 +-
 .../commands/NamespacePermissionsCommand.java   |    44 +
 .../util/shell/commands/NamespacesCommand.java  |    83 +
 .../core/util/shell/commands/OptUtil.java       |    56 +-
 .../shell/commands/RenameNamespaceCommand.java  |    79 +
 .../util/shell/commands/RenameTableCommand.java |    17 +-
 .../core/util/shell/commands/RevokeCommand.java |    45 +-
 .../util/shell/commands/SetIterCommand.java     |   149 +-
 .../util/shell/commands/TableOperation.java     |    56 +-
 .../core/util/shell/commands/TablesCommand.java |    88 +-
 .../shell/commands/UserPermissionsCommand.java  |    34 +-
 core/src/main/thrift/client.thrift              |    12 +-
 core/src/main/thrift/master.thrift              |    12 +-
 .../core/client/impl/TabletLocatorImplTest.java |    10 +-
 .../mapreduce/AccumuloInputFormatTest.java      |     4 +-
 .../core/client/mock/MockNamespacesTest.java    |   315 +
 .../accumulo/core/util/shell/ShellTest.java     |    54 +-
 .../chapters/administration.tex                 |    34 +-
 .../accumulo_user_manual/chapters/shell.tex     |     8 +-
 .../chapters/table_configuration.tex            |     2 +-
 .../chapters/troubleshooting.tex                |    30 +-
 examples/simple/pom.xml                         |    41 +-
 .../simple/filedata/FileDataIngest.java         |     2 +-
 .../examples/simple/filedata/FileDataQuery.java |     2 +-
 fate/pom.xml                                    |     3 -
 minicluster/pom.xml                             |    25 +-
 .../apache/accumulo/minicluster/MemoryUnit.java |    14 +-
 .../minicluster/MiniAccumuloInstance.java       |     4 +-
 .../minicluster/MiniAccumuloRunner.java         |    48 +-
 .../minicluster/ProcessNotFoundException.java   |     2 +-
 .../accumulo/minicluster/ProcessReference.java  |     6 +-
 proxy/pom.xml                                   |    28 +-
 .../apache/accumulo/proxy/SimpleProxyIT.java    |     2 +-
 server/base/pom.xml                             |     5 +
 .../apache/accumulo/server/ServerConstants.java |     9 +-
 .../server/client/ClientServiceHandler.java     |   170 +-
 .../server/conf/NamespaceConfWatcher.java       |   107 +
 .../server/conf/NamespaceConfiguration.java     |   174 +
 .../server/conf/ServerConfiguration.java        |    75 +-
 .../server/conf/TableConfiguration.java         |    46 +-
 .../server/conf/TableParentConfiguration.java   |    39 +
 .../org/apache/accumulo/server/fs/FileRef.java  |     5 +-
 .../accumulo/server/fs/VolumeManager.java       |     2 +-
 .../accumulo/server/fs/VolumeManagerImpl.java   |    17 +-
 .../apache/accumulo/server/init/Initialize.java |   178 +-
 .../master/balancer/TableLoadBalancer.java      |    30 +-
 .../master/state/MetaDataTableScanner.java      |     2 +-
 .../security/AuditedSecurityOperation.java      |     2 +-
 .../server/security/SecurityOperation.java      |   411 +-
 .../security/handler/InsecurePermHandler.java   |    31 +
 .../security/handler/PermissionHandler.java     |    61 +-
 .../server/security/handler/ZKAuthorizor.java   |    41 +-
 .../server/security/handler/ZKPermHandler.java  |   209 +-
 .../server/security/handler/ZKSecurityTool.java |    21 +
 .../accumulo/server/tables/TableManager.java    |   121 +-
 .../accumulo/server/util/MetadataTableUtil.java |     4 +-
 .../accumulo/server/util/NamespacePropUtil.java |    60 +
 .../accumulo/server/util/TablePropUtil.java     |     2 +-
 .../accumulo/server/init/InitializeTest.java    |   157 +
 server/extras/pom.xml                           |    19 +-
 server/gc/pom.xml                               |    19 +-
 .../accumulo/gc/SimpleGarbageCollector.java     |     8 +-
 server/master/pom.xml                           |    28 +-
 .../java/org/apache/accumulo/master/Master.java |   307 +-
 .../accumulo/master/tableOps/BulkImport.java    |     2 +-
 .../master/tableOps/CancelCompactions.java      |    17 +-
 .../master/tableOps/ChangeTableState.java       |    26 +-
 .../accumulo/master/tableOps/CloneTable.java    |   116 +-
 .../accumulo/master/tableOps/CompactRange.java  |    95 +-
 .../master/tableOps/CreateNamespace.java        |   196 +
 .../accumulo/master/tableOps/CreateTable.java   |   134 +-
 .../master/tableOps/DeleteNamespace.java        |   104 +
 .../accumulo/master/tableOps/DeleteTable.java   |    19 +-
 .../accumulo/master/tableOps/ExportTable.java   |    10 +-
 .../accumulo/master/tableOps/ImportTable.java   |   268 +-
 .../master/tableOps/RenameNamespace.java        |    92 +
 .../accumulo/master/tableOps/RenameTable.java   |    54 +-
 .../accumulo/master/tableOps/TableRangeOp.java  |    56 +-
 .../apache/accumulo/master/tableOps/Utils.java  |    62 +-
 .../apache/accumulo/master/TestMergeState.java  |    52 +-
 server/monitor/pom.xml                          |    29 +-
 .../src/main/resources/docs/administration.html |    23 +
 .../src/main/resources/docs/bulkIngest.html     |     2 +-
 .../main/resources/docs/examples/README.bloom   |     9 +-
 .../main/resources/docs/examples/README.export  |     9 +-
 .../resources/docs/examples/README.filedata     |     4 +-
 .../resources/docs/examples/README.visibility   |     4 +-
 server/native/src/main/resources/Makefile       |     2 +-
 server/tracer/pom.xml                           |    19 +-
 server/tserver/pom.xml                          |    55 +-
 .../org/apache/accumulo/tserver/NativeMap.java  |     8 +-
 .../org/apache/accumulo/tserver/Tablet.java     |    24 +-
 .../apache/accumulo/tserver/TabletServer.java   |     9 +-
 .../accumulo/tserver/log/MultiReaderTest.java   |     8 +-
 start/pom.xml                                   |    12 +-
 test/pom.xml                                    |    54 +-
 test/scale/deleteLargeTable.txt                 |     2 +-
 .../accumulo/test/functional/ZombieTServer.java |     2 +-
 .../metadata/MetadataBatchScanTest.java         |     2 +-
 .../test/performance/thrift/NullTserver.java    |     4 +-
 .../concurrent/ChangePermissions.java           |    37 +-
 .../randomwalk/concurrent/CheckPermission.java  |    27 +-
 .../test/randomwalk/concurrent/CloneTable.java  |     4 +-
 .../test/randomwalk/concurrent/Config.java      |   111 +-
 .../randomwalk/concurrent/CreateNamespace.java  |    48 +
 .../test/randomwalk/concurrent/CreateTable.java |     4 +-
 .../randomwalk/concurrent/DeleteNamespace.java  |    48 +
 .../test/randomwalk/concurrent/Merge.java       |     2 +-
 .../randomwalk/concurrent/RenameNamespace.java  |    52 +
 .../test/randomwalk/concurrent/RenameTable.java |     4 +-
 .../test/randomwalk/concurrent/Setup.java       |    23 +-
 .../randomwalk/security/WalkingSecurity.java    |   183 +-
 .../org/apache/accumulo/test/DumpConfigIT.java  |     6 +-
 .../test/MultiTableBatchWriterTest.java         |     5 +-
 .../org/apache/accumulo/test/NamespacesIT.java  |   555 +
 .../org/apache/accumulo/test/ShellServerIT.java |   111 +-
 .../apache/accumulo/test/SplitRecoveryIT.java   |     2 +-
 .../test/functional/GarbageCollectorIT.java     |     4 +-
 .../accumulo/test/functional/PermissionsIT.java |    93 +-
 .../accumulo/test/functional/RestartIT.java     |    18 +-
 test/system/bench/cloudstone1/cloudstone1.py    |     4 +-
 test/system/continuous/README                   |     8 +-
 test/system/continuous/agitator.pl              |    71 +-
 .../system/continuous/continuous-env.sh.example |    12 +-
 test/system/continuous/hdfs-agitator.pl         |   217 +
 test/system/continuous/magitator.pl             |    10 +-
 test/system/continuous/start-agitator.sh        |    19 +-
 test/system/continuous/stop-agitator.sh         |     8 +-
 .../randomwalk/conf/modules/Concurrent.xml      |    23 +-
 trace/pom.xml                                   |    11 +-
 194 files changed, 26414 insertions(+), 5692 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/accumulo/blob/6d7288dd/core/pom.xml
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/accumulo/blob/6d7288dd/examples/simple/pom.xml
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/accumulo/blob/6d7288dd/fate/pom.xml
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/accumulo/blob/6d7288dd/minicluster/pom.xml
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/accumulo/blob/6d7288dd/proxy/pom.xml
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/accumulo/blob/6d7288dd/server/base/pom.xml
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/accumulo/blob/6d7288dd/server/extras/pom.xml
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/accumulo/blob/6d7288dd/server/gc/pom.xml
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/accumulo/blob/6d7288dd/server/master/pom.xml
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/accumulo/blob/6d7288dd/server/monitor/pom.xml
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/accumulo/blob/6d7288dd/server/tracer/pom.xml
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/accumulo/blob/6d7288dd/server/tserver/pom.xml
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/accumulo/blob/6d7288dd/start/pom.xml
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/accumulo/blob/6d7288dd/test/pom.xml
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/accumulo/blob/6d7288dd/trace/pom.xml
----------------------------------------------------------------------


[05/11] git commit: Merge branch '1.4.5-SNAPSHOT' into 1.5.1-SNAPSHOT

Posted by ec...@apache.org.
Merge branch '1.4.5-SNAPSHOT' into 1.5.1-SNAPSHOT


Project: http://git-wip-us.apache.org/repos/asf/accumulo/repo
Commit: http://git-wip-us.apache.org/repos/asf/accumulo/commit/e9423ae3
Tree: http://git-wip-us.apache.org/repos/asf/accumulo/tree/e9423ae3
Diff: http://git-wip-us.apache.org/repos/asf/accumulo/diff/e9423ae3

Branch: refs/heads/1.6.0-SNAPSHOT
Commit: e9423ae350ecd2e8f7bbafc9380713e8fd7fd644
Parents: 7655de6 ff29f08
Author: Eric Newton <er...@gmail.com>
Authored: Wed Dec 11 09:15:40 2013 -0500
Committer: Eric Newton <er...@gmail.com>
Committed: Wed Dec 11 09:15:40 2013 -0500

----------------------------------------------------------------------
 docs/administration.html                        | 23 ++++++++++++++
 .../chapters/administration.tex                 | 32 ++++++++++++++++++++
 2 files changed, 55 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/accumulo/blob/e9423ae3/docs/administration.html
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/accumulo/blob/e9423ae3/docs/src/main/latex/accumulo_user_manual/chapters/administration.tex
----------------------------------------------------------------------
diff --cc docs/src/main/latex/accumulo_user_manual/chapters/administration.tex
index cc7697c,0000000..086f41c
mode 100644,000000..100644
--- a/docs/src/main/latex/accumulo_user_manual/chapters/administration.tex
+++ b/docs/src/main/latex/accumulo_user_manual/chapters/administration.tex
@@@ -1,362 -1,0 +1,394 @@@
 +
 +% Licensed to the Apache Software Foundation (ASF) under one or more
 +% contributor license agreements. See the NOTICE file distributed with
 +% this work for additional information regarding copyright ownership.
 +% The ASF licenses this file to You under the Apache License, Version 2.0
 +% (the "License"); you may not use this file except in compliance with
 +% the License. You may obtain a copy of the License at
 +%
 +%     http://www.apache.org/licenses/LICENSE-2.0
 +%
 +% Unless required by applicable law or agreed to in writing, software
 +% distributed under the License is distributed on an "AS IS" BASIS,
 +% WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 +% See the License for the specific language governing permissions and
 +% limitations under the License.
 +
 +\chapter{Administration}
 +
 +\section{Hardware}
 +
 +Because we are running essentially two or three systems simultaneously layered
 +across the cluster: HDFS, Accumulo and MapReduce, it is typical for hardware to
 +consist of 4 to 8 cores, and 8 to 32 GB RAM. This is so each running process can have
 +at least one core and 2 - 4 GB each.
 +
 +One core running HDFS can typically keep 2 to 4 disks busy, so each machine may
 +typically have as little as 2 x 300GB disks and as much as 4 x 1TB or 2TB disks.
 +
 +It is possible to do with less than this, such as with 1u servers with 2 cores and 4GB
 +each, but in this case it is recommended to only run up to two processes per
 +machine - i.e. DataNode and TabletServer or DataNode and MapReduce worker but
 +not all three. The constraint here is having enough available heap space for all the
 +processes on a machine.
 +
 +\section{Network}
 +
 +Accumulo communicates via remote procedure calls over TCP/IP for both passing
 +data and control messages. In addition, Accumulo uses HDFS clients to
 +communicate with HDFS. To achieve good ingest and query performance, sufficient
 +network bandwidth must be available between any two machines.
 +
 +\section{Installation}
 +Choose a directory for the Accumulo installation. This directory will be referenced
 +by the environment variable \texttt{\$ACCUMULO\_HOME}. Run the following:
 +
 +\small
 +\begin{verbatim}
 +$ tar xzf accumulo-1.5.0-bin.tar.gz    # unpack to subdirectory
 +$ mv accumulo-1.5.0 $ACCUMULO_HOME # move to desired location
 +\end{verbatim}
 +\normalsize
 +
 +Repeat this step at each machine within the cluster. Usually all machines have the
 +same \texttt{\$ACCUMULO\_HOME}.
 +
 +\section{Dependencies}
 +Accumulo requires HDFS and ZooKeeper to be configured and running
 +before starting. Password-less SSH should be configured between at least the
 +Accumulo master and TabletServer machines. It is also a good idea to run Network
 +Time Protocol (NTP) within the cluster to ensure nodes' clocks don't get too out of
 +sync, which can cause problems with automatically timestamped data. 
 +
 +\section{Configuration}
 +
 +Accumulo is configured by editing several Shell and XML files found in
 +\texttt{\$ACCUMULO\_HOME/conf}. The structure closely resembles Hadoop's configuration
 +files.
 +
 +\subsection{Edit conf/accumulo-env.sh}
 +
 +Accumulo needs to know where to find the software it depends on. Edit accumulo-env.sh 
 +and specify the following:
 +
 +\begin{enumerate}
 +\item{Enter the location of the installation directory of Accumulo for \texttt{\$ACCUMULO\_HOME}}
 +\item{Enter your system's Java home for \texttt{\$JAVA\_HOME}}
 +\item{Enter the location of Hadoop for \texttt{\$HADOOP\_PREFIX}}
 +\item{Choose a location for Accumulo logs and enter it for \texttt{\$ACCUMULO\_LOG\_DIR}}
 +\item{Enter the location of ZooKeeper for \texttt{\$ZOOKEEPER\_HOME}}
 +\end{enumerate}
 +
 +By default Accumulo TabletServers are set to use 1GB of memory. You may change
 +this by altering the value of \texttt{\$ACCUMULO\_TSERVER\_OPTS}. Note the syntax is that of
 +the Java JVM command line options. This value should be less than the physical
 +memory of the machines running TabletServers.
 +
 +There are similar options for the master's memory usage and the garbage collector
 +process. Reduce these if they exceed the physical RAM of your hardware and
 +increase them, within the bounds of the physical RAM, if a process fails because of
 +insufficient memory.
 +
 +Note that you will be specifying the Java heap space in accumulo-env.sh. You should
 +make sure that the total heap space used for the Accumulo tserver and the Hadoop
 +DataNode and TaskTracker is less than the available memory on each slave node in
 +the cluster. On large clusters, it is recommended that the Accumulo master, Hadoop
 +NameNode, secondary NameNode, and Hadoop JobTracker all be run on separate
 +machines to allow them to use more heap space. If you are running these on the
 +same machine on a small cluster, likewise make sure their heap space settings fit
 +within the available memory.
 +
 +\subsection{Native Map}
 +
 +The tablet server uses a data structure called a MemTable to store sorted key/value
 +pairs in memory when they are first received from the client. When a minor compaction
 +occurs, this data structure is written to HDFS. The MemTable will default to using
 +memory in the JVM but a JNI version, called the native map, can be used to significantly
 +speed up performance by utilizing the memory space of the native operating system. The
 +native map also avoids the performance implications brought on by garbage collection
 +in the JVM by causing it to pause much less frequently.
 +
 +32-bit and 64-bit Linux versions of the native map ship with the Accumulo dist package.
 +For other operating systems, the native map can be built from the codebase in two ways-
 +from maven or from the Makefile.
 +
 +\begin{enumerate}
 +\item{Build from maven using the following command: \texttt{mvn clean package -Pnative.}}
 +\item{Build from the c++ source by running \texttt{make} in the \texttt{\$ACCUMULO\_HOME/server/src/main/c++} directory.}
 +\end{enumerate}
 +
 +After building the native map from the source, you will find the artifact in
 +\texttt{\$ACCUMULO\_HOME/lib/native.} Upon starting up, the tablet server will look
 +in this directory for the map library. If the file is renamed or moved from its
 +target directory, the tablet server may not be able to find it.
 +
 +\subsection{Cluster Specification}
 +
 +On the machine that will serve as the Accumulo master:
 +
 +\begin{enumerate}
 +\item{Write the IP address or domain name of the Accumulo Master to the\\\texttt{\$ACCUMULO\_HOME/conf/masters} file.}
 +\item{Write the IP addresses or domain name of the machines that will be TabletServers in\\\texttt{\$ACCUMULO\_HOME/conf/slaves}, one per line.}
 +\end{enumerate}
 +
 +Note that if using domain names rather than IP addresses, DNS must be configured
 +properly for all machines participating in the cluster. DNS can be a confusing source
 +of errors.
 +
 +\subsection{Accumulo Settings}
 +Specify appropriate values for the following settings in\\
 +\texttt{\$ACCUMULO\_HOME/conf/accumulo-site.xml} :
 +
 +\small
 +\begin{verbatim}
 +<property>
 +    <name>zookeeper</name>
 +    <value>zooserver-one:2181,zooserver-two:2181</value>
 +    <description>list of zookeeper servers</description>
 +</property>
 +\end{verbatim}
 +\normalsize
 +
 +This enables Accumulo to find ZooKeeper. Accumulo uses ZooKeeper to coordinate
 +settings between processes and helps finalize TabletServer failure.
 +
 +
 +\small
 +\begin{verbatim}
 +<property>
 +    <name>instance.secret</name>
 +    <value>DEFAULT</value>
 +</property>
 +\end{verbatim}
 +\normalsize
 +
 +The instance needs a secret to enable secure communication between servers. Configure your
 +secret and make sure that the \texttt{accumulo-site.xml} file is not readable to other users.
 +
 +Some settings can be modified via the Accumulo shell and take effect immediately, but
 +some settings require a process restart to take effect. See the configuration documentation
 +(available on the monitor web pages) for details.
 +
 +\subsection{Deploy Configuration}
 +
 +Copy the masters, slaves, accumulo-env.sh, and if necessary, accumulo-site.xml
 +from the\\\texttt{\$ACCUMULO\_HOME/conf/} directory on the master to all the machines
 +specified in the slaves file.
 +
 +\section{Initialization}
 +
 +Accumulo must be initialized to create the structures it uses internally to locate
 +data across the cluster. HDFS is required to be configured and running before
 +Accumulo can be initialized.
 +
 +Once HDFS is started, initialization can be performed by executing\\
 +\texttt{\$ACCUMULO\_HOME/bin/accumulo init} . This script will prompt for a name
 +for this instance of Accumulo. The instance name is used to identify a set of tables
 +and instance-specific settings. The script will then write some information into
 +HDFS so Accumulo can start properly.
 +
 +The initialization script will prompt you to set a root password. Once Accumulo is
 +initialized it can be started.
 +
 +\section{Running}
 +
 +\subsection{Starting Accumulo}
 +
 +Make sure Hadoop is configured on all of the machines in the cluster, including
 +access to a shared HDFS instance. Make sure HDFS and ZooKeeper are running.
 +Make sure ZooKeeper is configured and running on at least one machine in the
 +cluster.
 +Start Accumulo using the \texttt{bin/start-all.sh} script.
 +
 +To verify that Accumulo is running, check the Status page as described under
 +\emph{Monitoring}. In addition, the Shell can provide some information about the status of
 +tables via reading the !METADATA table.
 +
 +\subsection{Stopping Accumulo}
 +
 +To shutdown cleanly, run \texttt{bin/stop-all.sh} and the master will orchestrate the
 +shutdown of all the tablet servers. Shutdown waits for all minor compactions to finish, so it may
 +take some time for particular configurations.
 +
++\subsection{Adding a Node}
++
++Update your \texttt{\$ACCUMULO_HOME/conf/slaves} (or \texttt{\$ACCUMULO_CONF_DIR/slaves}) file to account for the addition.
++
++\begin{verbatim}
++$ACCUMULO_HOME/bin/accumulo admin start <host(s)> {<host> ...}
++\end{verbatim}
++
++Alternatively, you can ssh to each of the hosts you want to add and run 
++\texttt{\$ACCUMULO_HOME/bin/start-here.sh}.
++
++Make sure the host in question has the new configuration, or else the tablet 
++server won't start; at a minimum this needs to be on the host(s) being added, 
++but in practice it's good to ensure consistent configuration across all nodes.
++
++\subsection{Decomissioning a Node}
++
++If you need to take a node out of operation, you can trigger a graceful shutdown of a tablet 
++server. Accumulo will automatically rebalance the tablets across the available tablet servers.
++
++\begin{verbatim}
++$ACCUMULO_HOME/bin/accumulo admin stop <host(s)> {<host> ...}
++\end{verbatim}
++
++Alternatively, you can ssh to each of the hosts you want to remove and run 
++\texttt{\$ACCUMULO_HOME/bin/stop-here.sh}.
++
++Be sure to update your \texttt{\$ACCUMULO_HOME/conf/slaves} (or \texttt{\$ACCUMULO_CONF_DIR/slaves}) file to 
++account for the removal of these hosts. Bear in mind that the monitor will not re-read the 
++slaves file automatically, so it will report the decomissioned servers as down; it's 
++recommended that you restart the monitor so that the node list is up to date.
++
 +\section{Monitoring}
 +
 +The Accumulo Monitor provides a web interface for monitoring the status and health of
 +Accumulo components. This interface can be accessed by pointing a web browser to\\
 +\texttt{http://accumulomaster:50095/status}
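 +
 +The same page can be fetched from the command line, which is a quick way to confirm the
 +monitor is up (substitute the host that runs your monitor):
 +
 +\begin{verbatim}
 +curl http://accumulomaster:50095/status
 +\end{verbatim}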
 +
 +\section{Tracing}
 +It can be difficult to determine why some operations are taking longer
 +than expected. For example, you may be looking up items with very low
 +latency, but sometimes the lookups take much longer. Determining the
 +cause of the delay is difficult because the system is distributed, and
 +the typical lookup is fast.
 +
 +Accumulo has been instrumented to record the time that various
 +operations take when tracing is turned on. Once tracing is enabled, the
 +tracing context follows all requests made on behalf of the user
 +throughout the distributed infrastructure of Accumulo, and across all
 +threads of execution.
 +
 +These time spans will be inserted into the \texttt{trace} table in
 +Accumulo. You can browse recent traces from the Accumulo monitor
 +page. You can also read the \texttt{trace} table directly like any
 +other table.
 +
 +The design of Accumulo's distributed tracing follows that of
 +\href{http://research.google.com/pubs/pub36356.html}{Google's Dapper}.
 +
 +\subsection{Tracers}
 +To collect traces, Accumulo needs at least one server listed in
 +\\\texttt{\$ACCUMULO\_HOME/conf/tracers}. The server collects traces
 +from clients and writes them to the \texttt{trace} table. The Accumulo
 +user that the tracer uses to connect to Accumulo can be configured with
 +the following properties:
 +
 +\begin{verbatim}
 +trace.user
 +trace.token.property.password
 +\end{verbatim}
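 +
 +For example, the following \texttt{accumulo-site.xml} entries configure the tracer's
 +credentials; the user name and password shown here are placeholders:
 +
 +\begin{verbatim}
 +<property>
 +  <name>trace.user</name>
 +  <value>tracer</value>
 +</property>
 +<property>
 +  <name>trace.token.property.password</name>
 +  <value>tracerSecret</value>
 +</property>
 +\end{verbatim}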
 +
 +\subsection{Instrumenting a Client}
 +Tracing can be used to measure a client operation, such as a scan, as
 +the operation traverses the distributed system. To enable tracing for
 +your application call
 +
 +\begin{verbatim}
 +DistributedTrace.enable(instance, new ZooReader(instance), hostname, "myApplication");
 +\end{verbatim}
 +
 +Once tracing has been enabled, a client can wrap an operation in a trace.
 +
 +\begin{verbatim}
 +Trace.on("Client Scan");
 +BatchScanner scanner = conn.createBatchScanner(...);
 +// Configure your scanner
 +for (Entry<Key,Value> entry : scanner) {
 +  // do the work to be measured; this iteration is recorded in the active trace
 +}
 +Trace.off();
 +\end{verbatim}
 +
 +The user can also create additional Spans within a Trace.
 +\begin{verbatim}
 +Trace.on("Client Update");
 +...
 +Span readSpan = Trace.start("Read");
 +...
 +readSpan.stop();
 +...
 +Span writeSpan = Trace.start("Write");
 +...
 +writeSpan.stop();
 +Trace.off();
 +\end{verbatim}
 +
 +Like Dapper, Accumulo tracing supports user defined annotations to associate additional data with a Trace.
 +\begin{verbatim}
 +...
 +int numberOfEntriesRead = 0;
 +Span readSpan = Trace.start("Read");
 +// Do the read, update the counter
 +...
 +readSpan.data("Number of Entries Read", String.valueOf(numberOfEntriesRead));
 +\end{verbatim}
 +
 +Some client operations may have a high volume within your
 +application. As such, you may wish to only sample a percentage of
 +operations for tracing. As seen below, the CountSampler can be used to
 +enable tracing for 1-in-1000 operations:
 +\begin{verbatim}
 +Sampler sampler = new CountSampler(1000);
 +...
 +if (sampler.next()) {
 +  Trace.on("Read");
 +}
 +...
 +Trace.offNoFlush();
 +\end{verbatim}
 +
 +Note that it is safe to turn off tracing even if it isn't currently
 +active. Use Trace.offNoFlush() if you do not wish to have Trace.off()
 +block while flushing trace data.
 +
 +\subsection{Viewing Collected Traces}
 +To view collected traces, use the ``Recent Traces'' link on the Monitor
 +UI. You can also programmatically access and print traces using the
 +\texttt{TraceDump} class.
 +
 +\subsection{Tracing from the Shell}
 +You can enable tracing for operations run from the shell by using the
 +\texttt{trace on} and \texttt{trace off} commands.
 +
 +\begin{verbatim}
 +root@test test> trace on
 +root@test test> scan
 +a b:c []    d
 +root@test test> trace off
 +Waiting for trace information
 +Waiting for trace information
 +Trace started at 2013/08/26 13:24:08.332
 +Time  Start  Service@Location       Name
 + 3628+0      shell@localhost shell:root
 +    8+1690     shell@localhost scan
 +    7+1691       shell@localhost scan:location
 +    6+1692         tserver@localhost startScan
 +    5+1692           tserver@localhost tablet read ahead 6
 +\end{verbatim}
 +
 +\section{Logging}
 +Each Accumulo process writes to a set of log files. By default these are found under\\
 +\texttt{\$ACCUMULO\_HOME/logs/}.
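 +
 +For example, to see which log files exist on a node and follow a tablet server's log
 +(file names vary by version and host name):
 +
 +\begin{verbatim}
 +ls $ACCUMULO_HOME/logs/
 +tail -f $ACCUMULO_HOME/logs/tserver_*.log
 +\end{verbatim}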
 +
 +\section{Recovery}
 +
 +In the event of TabletServer failure, or an error while shutting Accumulo down, some
 +mutations may not have been minor compacted to HDFS properly. In this case,
 +Accumulo will automatically reapply such mutations from the write-ahead log,
 +either when the tablets from the failed server are reassigned by the Master (in the
 +case of a single TabletServer failure) or the next time Accumulo starts (in the event of
 +a failure during shutdown).
 +
 +Recovery is performed by asking a tablet server to sort the logs so that tablets can easily find their missing
 +updates. The sort status of each file is displayed on the
 +Accumulo monitor status page. Once the recovery is complete, any
 +tablets involved should return to an ``online'' state. Until then those tablets will be
 +unavailable to clients.
 +
 +The Accumulo client library is configured to retry failed mutations and in many
 +cases clients will be able to continue processing after the recovery process without
 +throwing an exception.
 +