Posted to commits@accumulo.apache.org by el...@apache.org on 2013/12/05 18:34:41 UTC
[3/5] git commit: Merge branch '1.5.1-SNAPSHOT' into 1.6.0-SNAPSHOT
Merge branch '1.5.1-SNAPSHOT' into 1.6.0-SNAPSHOT
Project: http://git-wip-us.apache.org/repos/asf/accumulo/repo
Commit: http://git-wip-us.apache.org/repos/asf/accumulo/commit/1bddc574
Tree: http://git-wip-us.apache.org/repos/asf/accumulo/tree/1bddc574
Diff: http://git-wip-us.apache.org/repos/asf/accumulo/diff/1bddc574
Branch: refs/heads/master
Commit: 1bddc574086129aca3484a6070aee257c8622085
Parents: 0d49819 00fb08b
Author: Christopher Tubbs <ct...@apache.org>
Authored: Thu Dec 5 11:55:58 2013 -0500
Committer: Christopher Tubbs <ct...@apache.org>
Committed: Thu Dec 5 11:55:58 2013 -0500
----------------------------------------------------------------------
.../apache/accumulo/examples/simple/filedata/FileDataIngest.java | 2 +-
.../apache/accumulo/examples/simple/filedata/FileDataQuery.java | 2 +-
server/monitor/src/main/resources/docs/examples/README.filedata | 4 ++--
3 files changed, 4 insertions(+), 4 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/accumulo/blob/1bddc574/examples/simple/src/main/java/org/apache/accumulo/examples/simple/filedata/FileDataQuery.java
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/accumulo/blob/1bddc574/server/monitor/src/main/resources/docs/examples/README.filedata
----------------------------------------------------------------------
diff --cc server/monitor/src/main/resources/docs/examples/README.filedata
index 946ca8c,0000000..9f0016e
mode 100644,000000..100644
--- a/server/monitor/src/main/resources/docs/examples/README.filedata
+++ b/server/monitor/src/main/resources/docs/examples/README.filedata
@@@ -1,47 -1,0 +1,47 @@@
+Title: Apache Accumulo File System Archive Example (Data Only)
+Notice: Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+ .
+ http://www.apache.org/licenses/LICENSE-2.0
+ .
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+
+This example archives file data into an Accumulo table. Files with duplicate data are only stored once.
+The example has the following classes:
+
+ * CharacterHistogram - A MapReduce that computes a histogram of byte frequency for each file and stores the histogram alongside the file data. An example use of the ChunkInputFormat.
+ * ChunkCombiner - An Iterator that dedupes file data and sets their visibilities to a combined visibility based on current references to the file data.
+ * ChunkInputFormat - An Accumulo InputFormat that provides keys containing file info (List<Entry<Key,Value>>) and values with an InputStream over the file (ChunkInputStream).
+ * ChunkInputStream - An input stream over file data stored in Accumulo.
- * FileDataIngest - Takes a list of files and archives them into Accumulo keyed on the SHA1 hashes of the files.
- * FileDataQuery - Retrieves file data based on the SHA1 hash of the file. (Used by the dirlist.Viewer.)
++ * FileDataIngest - Takes a list of files and archives them into Accumulo keyed on hashes of the files.
++ * FileDataQuery - Retrieves file data based on the hash of the file. (Used by the dirlist.Viewer.)
+ * KeyUtil - A utility for creating and parsing null-byte separated strings into/from Text objects.
+ * VisibilityCombiner - A utility for merging visibilities into the form (VIS1)|(VIS2)|...
+
+This example is coupled with the dirlist example. See README.dirlist for instructions.
+
+If you haven't already run the README.dirlist example, ingest a file with FileDataIngest.
+
+ $ ./bin/accumulo org.apache.accumulo.examples.simple.filedata.FileDataIngest -i instance -z zookeepers -u username -p password -t dataTable --auths exampleVis --chunk 1000 $ACCUMULO_HOME/README
+
+Open the accumulo shell and look at the data. The row is the MD5 hash of the file, which you can verify by running a command such as 'md5sum' on the file.
+
+ > scan -t dataTable
+
+Run the CharacterHistogram MapReduce to add some information about the file.
+
+ $ bin/tool.sh lib/accumulo-examples-simple.jar org.apache.accumulo.examples.simple.filedata.CharacterHistogram -i instance -z zookeepers -u username -p password -t dataTable --auths exampleVis --vis exampleVis
+
+Scan again to see the histogram stored in the 'info' column family.
+
+ > scan -t dataTable
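
The README above says the row is the MD5 hash of the file and that it can be checked with a tool such as 'md5sum'. Where 'md5sum' is not available, roughly the same check can be sketched in Java. This is not code from the commit: the class name Md5RowCheck and the method md5Hex are hypothetical, and the sketch assumes the row is the lowercase hex form of the digest, which is the form 'md5sum' prints.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of the Accumulo examples.
public class Md5RowCheck {

  // Streams the input through an MD5 digest and returns lowercase hex,
  // the same textual form that 'md5sum' prints.
  static String md5Hex(InputStream in) throws IOException, NoSuchAlgorithmException {
    MessageDigest md = MessageDigest.getInstance("MD5");
    byte[] buf = new byte[8192];
    int n;
    while ((n = in.read(buf)) != -1) {
      md.update(buf, 0, n);
    }
    StringBuilder sb = new StringBuilder();
    for (byte b : md.digest()) {
      sb.append(String.format("%02x", b));
    }
    return sb.toString();
  }

  public static void main(String[] args) throws Exception {
    if (args.length == 0) {
      System.err.println("usage: Md5RowCheck <file>");
      return;
    }
    // Print the digest so it can be compared by eye with the row in the scan output.
    try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
      System.out.println(md5Hex(in));
    }
  }
}
```

Running it against the ingested file (for example $ACCUMULO_HOME/README) and comparing the printed value with the row shown by 'scan -t dataTable' is one way to confirm the two agree, under the assumption stated above.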