Posted to commits@accumulo.apache.org by ec...@apache.org on 2014/06/09 16:54:08 UTC
[3/4] git commit: ACCUMULO-2874 update the docs to match the actual API
ACCUMULO-2874 update the docs to match the actual API
Project: http://git-wip-us.apache.org/repos/asf/accumulo/repo
Commit: http://git-wip-us.apache.org/repos/asf/accumulo/commit/63b9685c
Tree: http://git-wip-us.apache.org/repos/asf/accumulo/tree/63b9685c
Diff: http://git-wip-us.apache.org/repos/asf/accumulo/diff/63b9685c
Branch: refs/heads/master
Commit: 63b9685ca7097c7e3fc006f1ae4a5b03c719599b
Parents: 0e097ac 0038df1
Author: Eric C. Newton <er...@gmail.com>
Authored: Mon Jun 9 10:52:06 2014 -0400
Committer: Eric C. Newton <er...@gmail.com>
Committed: Mon Jun 9 10:52:06 2014 -0400
----------------------------------------------------------------------
docs/src/main/asciidoc/chapters/clients.txt | 6 ++--
.../system/continuous/continuous-env.sh.example | 30 ++++++++++++++------
test/system/continuous/start-agitator.sh | 12 ++++----
3 files changed, 30 insertions(+), 18 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/accumulo/blob/63b9685c/docs/src/main/asciidoc/chapters/clients.txt
----------------------------------------------------------------------
diff --cc docs/src/main/asciidoc/chapters/clients.txt
index 3e0b845,0000000..45f1fe9
mode 100644,000000..100644
--- a/docs/src/main/asciidoc/chapters/clients.txt
+++ b/docs/src/main/asciidoc/chapters/clients.txt
@@@ -1,320 -1,0 +1,320 @@@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements. See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+== Writing Accumulo Clients
+
+=== Running Client Code
+
+There are several ways to run Java client code that uses Accumulo:
+
+* using the java executable
+* using the accumulo script
+* using the tool script
+
+To run client code written against Accumulo, you will need to
+include the jars that Accumulo depends on in your classpath. Accumulo client
+code depends on Hadoop and ZooKeeper. For Hadoop, add the hadoop client jar, all
+of the jars in the Hadoop lib directory, and the conf directory to the
+classpath. For ZooKeeper 3.3 you only need to add the ZooKeeper jar, and not
+what is in the ZooKeeper lib directory. You can run the following command on a
+configured Accumulo system to see what it is using for its classpath:
+
+ $ACCUMULO_HOME/bin/accumulo classpath
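
To confirm from inside a client JVM that the expected jars actually reached the classpath, a small check along these lines may help. It is only a sketch and not part of Accumulo: the class name `ClasspathCheck` and the name fragments it searches for ("hadoop", "zookeeper", "accumulo") are assumptions about how the jars are named on a given system.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not an Accumulo class. */
final class ClasspathCheck {

    /** Returns the classpath entries whose file name contains the given fragment. */
    static List<String> matching(String classpath, String fragment) {
        List<String> hits = new ArrayList<>();
        for (String entry : classpath.split(File.pathSeparator)) {
            if (new File(entry).getName().contains(fragment)) {
                hits.add(entry);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Classpath of the JVM running this check.
        String classpath = System.getProperty("java.class.path");
        // The fragments are assumptions about jar naming; adjust as needed.
        for (String fragment : new String[] {"hadoop", "zookeeper", "accumulo"}) {
            System.out.println(fragment + ": " + matching(classpath, fragment));
        }
    }
}
```

An empty list for one of the fragments suggests that the corresponding jar is missing from the classpath the client was started with.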
+
+Another option for running your code is to put a jar file in
++$ACCUMULO_HOME/lib/ext+. After doing this you can use the accumulo
+script to execute your code. For example, if you create a jar containing the
+class +com.foo.Client+ and place it in +lib/ext+, you can use the command
++$ACCUMULO_HOME/bin/accumulo com.foo.Client+ to execute your code.
+
+If you are writing a MapReduce job that accesses Accumulo, you can use the
+bin/tool.sh script to run it. See the MapReduce example.
+
+=== Connecting
+
+All clients must first identify the Accumulo instance to which they will be
+communicating. Code to do this is as follows:
+
+[source,java]
+----
+String instanceName = "myinstance";
+String zooServers = "zooserver-one,zooserver-two";
+Instance inst = new ZooKeeperInstance(instanceName, zooServers);
+
+Connector conn = inst.getConnector("user", new PasswordToken("passwd"));
+----
+
+=== Writing Data
+
+Data are written to Accumulo by creating Mutation objects that represent all the
+changes to the columns of a single row. The changes are made atomically in the
+TabletServer. Clients then add Mutations to a BatchWriter which submits them to
+the appropriate TabletServers.
+
+Mutations can be created thus:
+
+[source,java]
+----
+Text rowID = new Text("row1");
+Text colFam = new Text("myColFam");
+Text colQual = new Text("myColQual");
+ColumnVisibility colVis = new ColumnVisibility("public");
+long timestamp = System.currentTimeMillis();
+
+Value value = new Value("myValue".getBytes());
+
+Mutation mutation = new Mutation(rowID);
+mutation.put(colFam, colQual, colVis, timestamp, value);
+----
+
+==== BatchWriter
+The BatchWriter is highly optimized to send Mutations to multiple TabletServers
+and automatically batches Mutations destined for the same TabletServer to
+amortize network overhead. Care must be taken to avoid changing the contents of
+any Object passed to the BatchWriter since it keeps objects in memory while
+batching.
+
+Mutations are added to a BatchWriter thus:
+
+[source,java]
+----
+// BatchWriterConfig has reasonable defaults
+BatchWriterConfig config = new BatchWriterConfig();
+config.setMaxMemory(10000000L); // bytes available to batchwriter for buffering mutations
+
+BatchWriter writer = conn.createBatchWriter("table", config);
+
+writer.add(mutation);
+
+writer.close();
+----
+
+An example of using the batch writer can be found at
++accumulo/docs/examples/README.batch+.
+
+==== ConditionalWriter
+The ConditionalWriter enables efficient, atomic read-modify-write operations on
+rows. The ConditionalWriter writes special Mutations which have a list of per
+column conditions that must all be met before the mutation is applied. The
+conditions are checked in the tablet server while a row lock is
+held.footnote:[Mutations written by the BatchWriter will not obtain a row
+lock.] The conditions that can be checked for a column are equality and
+absence. For example, a conditional mutation can require that column A is
+absent in order to be applied. Iterators can be applied when checking
+conditions. Using iterators, many other operations besides equality and
+absence can be checked. For example, using an iterator that converts values
+less than 5 to 0 and everything else to 1, it is possible to only apply a
+mutation when a column is less than 5.
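
The condition semantics described in this paragraph can be modelled in a few lines of plain Java. This is an in-memory sketch under stated assumptions, not the Accumulo API: `ConditionSketch`, `Condition`, `applyIf`, and `lessThanFive` are hypothetical names, a `synchronized` method stands in for the row lock, and a `Function` plays the role of the iterator applied while checking conditions.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** In-memory model of per-column conditions; hypothetical, not the Accumulo API. */
final class ConditionSketch {

    /** One per-column condition; expected == null means the column must be absent. */
    static final class Condition {
        final String column;
        final String expected;

        Condition(String column, String expected) {
            this.column = column;
            this.expected = expected;
        }
    }

    private final Map<String, String> row = new HashMap<>();

    /**
     * Applies the updates only if every condition holds. The synchronized
     * keyword stands in for the row lock held while conditions are checked.
     */
    synchronized boolean applyIf(List<Condition> conditions,
                                 Function<String, String> transform,
                                 Map<String, String> updates) {
        for (Condition c : conditions) {
            String stored = row.get(c.column);
            // The transform is only applied to values that exist.
            String seen = stored == null ? null : transform.apply(stored);
            boolean met = c.expected == null ? seen == null : c.expected.equals(seen);
            if (!met) {
                return false;
            }
        }
        row.putAll(updates);
        return true;
    }

    /** Mirrors the iterator from the text: values below 5 become "0", all others "1". */
    static String lessThanFive(String value) {
        return Integer.parseInt(value) < 5 ? "0" : "1";
    }

    public static void main(String[] args) {
        ConditionSketch sketch = new ConditionSketch();
        List<Condition> absentA = Collections.singletonList(new Condition("A", null));
        // The first write requires A to be absent; repeating it fails that check.
        System.out.println(sketch.applyIf(absentA, Function.identity(), Collections.singletonMap("A", "3")));
        System.out.println(sketch.applyIf(absentA, Function.identity(), Collections.singletonMap("A", "3")));
        // "A is less than 5" expressed as equality against the transformed value "0".
        List<Condition> belowFive = Collections.singletonList(new Condition("A", "0"));
        System.out.println(sketch.applyIf(belowFive, ConditionSketch::lessThanFive, Collections.singletonMap("A", "7")));
    }
}
```

The point of the transform is that the check itself stays a plain equality test; the "less than 5" logic lives entirely in the function applied before the comparison.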
+
+If a tablet server dies after a client has sent a conditional
+mutation, it is not known whether the mutation was applied. When this happens
+the ConditionalWriter reports a status of UNKNOWN for the ConditionalMutation.
+In many cases this situation can be dealt with by simply reading the row again
+and possibly sending another conditional mutation. If this is not sufficient,
+then a higher level of abstraction can be built by storing transactional
+information within a row.
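
The "read the row again" strategy from this paragraph might be sketched as follows. Everything here is a hypothetical stand-in rather than Accumulo code: the `Status` enum is a reduced local copy of the outcomes discussed above, `Attempt` represents sending one conditional mutation, and the `Supplier` represents re-reading the value that the mutation was meant to write.

```java
import java.util.function.Supplier;

/** Hypothetical retry helper; the types below are local stand-ins, not Accumulo classes. */
final class UnknownStatusSketch {

    /** Reduced set of outcomes, declared locally for this sketch. */
    enum Status { ACCEPTED, REJECTED, UNKNOWN }

    /** Stand-in for sending one conditional mutation and returning its status. */
    interface Attempt {
        Status send();
    }

    /**
     * Sends the mutation until its outcome is known. After UNKNOWN the row is
     * read again: if the intended value is visible, the mutation is treated as
     * applied; otherwise it is sent once more.
     */
    static boolean writeUntilKnown(Attempt attempt, Supplier<String> readRow,
                                   String intended, int maxTries) {
        for (int i = 0; i < maxTries; i++) {
            Status status = attempt.send();
            if (status == Status.ACCEPTED) {
                return true;
            }
            if (status == Status.REJECTED) {
                return false;
            }
            // UNKNOWN: the mutation may or may not have been applied.
            if (intended.equals(readRow.get())) {
                return true;
            }
        }
        throw new IllegalStateException("outcome still unknown after " + maxTries + " tries");
    }

    public static void main(String[] args) {
        String[] row = {null};
        // Simulates a server that applied the write but died before replying.
        boolean applied = writeUntilKnown(
            () -> { row[0] = "v1"; return Status.UNKNOWN; },
            () -> row[0], "v1", 3);
        System.out.println("applied: " + applied);
    }
}
```

Comparing against the intended value is only safe when that value is unique to the writer, for example because it embeds a client-specific identifier. When several clients may write the same value, this check cannot tell them apart, which is the case where storing transactional information in the row becomes necessary.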
+
+An example of using the conditional writer can be found at
++accumulo/docs/examples/README.reservations+.
+
+=== Reading Data
+
+Accumulo is optimized to quickly retrieve the value associated with a given key, and
+to efficiently return ranges of consecutive keys and their associated values.
+
+==== Scanner
+
+To retrieve data, clients use a Scanner, which acts like an Iterator over
+keys and values. Scanners can be configured to start and stop at particular keys, and
+to return a subset of the columns available.
+
+[source,java]
+----
+// specify which visibilities we are allowed to see
+Authorizations auths = new Authorizations("public");
+
+Scanner scan =
+ conn.createScanner("table", auths);
+
+scan.setRange(new Range("harry","john"));
- scan.fetchFamily("attributes");
++scan.fetchColumnFamily(new Text("attributes"));
+
+for(Entry<Key,Value> entry : scan) {
- String row = entry.getKey().getRow();
++ Text row = entry.getKey().getRow();
+ Value value = entry.getValue();
+}
+----
+
+==== Isolated Scanner
+
+Accumulo supports the ability to present an isolated view of rows when
+scanning. There are three possible ways that a row could change in Accumulo:
+
+* a mutation applied to a table
+* iterators executed as part of a minor or major compaction
+* bulk import of new files
+
+Isolation guarantees that either all or none of the changes made by these
+operations on a row are seen. Use the IsolatedScanner to obtain an isolated
+view of an Accumulo table. When using the regular scanner it is possible to see
+a non-isolated view of a row. For example, if a mutation modifies three
+columns, it is possible that you will only see two of those modifications.
+With the isolated scanner either all three of the changes are seen or none.
+
+The IsolatedScanner buffers rows on the client side so a large row will not
+crash a tablet server. By default rows are buffered in memory, but the user
+can easily supply their own buffer if they wish to buffer to disk when rows are
+large.
+
+For an example, see:
+
+ examples/simple/src/main/java/org/apache/accumulo/examples/simple/isolation/InterferenceTest.java
+
+==== BatchScanner
+
+For some types of access, it is more efficient to retrieve several ranges
+simultaneously. This arises, for example, when accessing a set of non-consecutive rows
+whose IDs have been retrieved from a secondary index.
+
+The BatchScanner is configured similarly to the Scanner; it can be configured to
+retrieve a subset of the columns available, but rather than passing a single Range,
+BatchScanners accept a set of Ranges. It is important to note that the keys returned
+by a BatchScanner are not in sorted order since the keys streamed are from multiple
+TabletServers in parallel.
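
The ordering caveat can be illustrated without Accumulo at all. In the sketch below, which uses only the JDK, each inner list stands in for the keys streamed by one TabletServer and a thread pool stands in for the parallel fetch; `UnorderedResultsSketch` and `fetchAll` are hypothetical names. Results are collected in arrival order, so a client that needs sorted keys has to sort them itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** JDK-only illustration of parallel retrieval; hypothetical, not the BatchScanner. */
final class UnorderedResultsSketch {

    /** Fetches every batch on a thread pool and returns the keys in arrival order. */
    static List<String> fetchAll(List<List<String>> batches, int threads) {
        ConcurrentLinkedQueue<String> arrived = new ConcurrentLinkedQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (List<String> batch : batches) {
            // Each task stands in for one server streaming its keys back.
            pool.execute(() -> arrived.addAll(batch));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for results");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for results", e);
        }
        return new ArrayList<>(arrived);
    }

    public static void main(String[] args) {
        List<List<String>> batches = Arrays.asList(
            Arrays.asList("a", "b"), Arrays.asList("m", "n"), Arrays.asList("x"));
        List<String> keys = fetchAll(batches, 3);
        System.out.println("arrival order: " + keys);
        // Sort on the client only if the application needs ordered keys.
        Collections.sort(keys);
        System.out.println("sorted: " + keys);
    }
}
```

The arrival order can differ from run to run, which is why no particular output is shown for the first line.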
+
+[source,java]
+----
+ArrayList<Range> ranges = new ArrayList<Range>();
+// populate list of ranges ...
+
+BatchScanner bscan =
+ conn.createBatchScanner("table", auths, 10);
+bscan.setRanges(ranges);
- bscan.fetchFamily("attributes");
++bscan.fetchColumnFamily("attributes");
+
+for(Entry<Key,Value> entry : bscan) {
+ System.out.println(entry.getValue());
+}
+----
+
+An example of the BatchScanner can be found at
++accumulo/docs/examples/README.batch+.
+
+=== Proxy
+
+The proxy API allows interaction with Accumulo from languages other than Java.
+A proxy server is provided in the codebase, and a client can be generated for it.
+
+==== Prerequisites
+
+The proxy server can live on any node on which the basic client API would work. That
+means it must be able to communicate with the Master, ZooKeepers, NameNode, and the
+DataNodes. A proxy client only needs the ability to communicate with the proxy server.
+
+
+==== Configuration
+
+The configuration options for the proxy server live inside of a properties file. At
+the very least, you need to supply the following properties:
+
+ protocolFactory=org.apache.thrift.protocol.TCompactProtocol$Factory
+ tokenClass=org.apache.accumulo.core.client.security.tokens.PasswordToken
+ port=42424
+ instance=test
+ zookeepers=localhost:2181
+
+You can find a sample configuration file in your distribution:
+
+ $ACCUMULO_HOME/proxy/proxy.properties
+
+This sample configuration file further demonstrates an ability to back the proxy server
+by MockAccumulo or the MiniAccumuloCluster.
+
+==== Running the Proxy Server
+
+After the properties file holding the configuration is created, the proxy server
+can be started using the following command in the Accumulo distribution (assuming
+your properties file is named +config.properties+):
+
+ $ACCUMULO_HOME/bin/accumulo proxy -p config.properties
+
+==== Creating a Proxy Client
+
+Aside from installing the Thrift compiler, you will also need the language-specific library
+for Thrift installed to generate client code in that language. Typically, your operating
+system's package manager will be able to automatically install these for you in an expected
+location such as +/usr/lib/python/site-packages/thrift+.
+
+You can find the Thrift file for generating the client at:
+
+ $ACCUMULO_HOME/proxy/proxy.thrift
+
+After a client is generated, the port specified in the configuration properties above will be
+used to connect to the server.
+
+==== Using a Proxy Client
+
+The following examples are written in Java; the method signatures may be
+slightly different depending on the language specified when generating the client with
+the Thrift compiler. After initiating a connection to the Proxy (see Apache Thrift's
+documentation for examples of connecting to a Thrift service), the methods on the
+proxy client will be available. The first thing to do is log in:
+
+[source,java]
+Map<String,String> password = new HashMap<String,String>();
+password.put("password", "secret");
+ByteBuffer token = client.login("root", password);
+
+Once logged in, the token returned will be used for most subsequent calls to the client.
+Let's create a table, add some data, scan the table, and delete it.
+
+
+First, create a table.
+
+[source,java]
+client.createTable(token, "myTable", true, TimeType.MILLIS);
+
+
+Next, add some data:
+
+[source,java]
+----
+// first, create a writer on the server
+String writer = client.createWriter(token, "myTable", new WriterOptions());
+
+// build column updates
+Map<ByteBuffer, List<ColumnUpdate>> cellsToUpdate = //...
+
+// send updates to the server
+client.updateAndFlush(writer, "myTable", cellsToUpdate);
+
+client.closeWriter(writer);
+----
+
+
+Scan for the data and batch the return of the results on the server:
+
+[source,java]
+----
+String scanner = client.createScanner(token, "myTable", new ScanOptions());
+ScanResult results = client.nextK(scanner, 100);
+
+for(KeyValue keyValue : results.getResults()) {
+ // do something with results
+}
+
+client.closeScanner(scanner);
+----