You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@accumulo.apache.org by el...@apache.org on 2013/12/05 18:34:41 UTC

[3/5] git commit: Merge branch '1.5.1-SNAPSHOT' into 1.6.0-SNAPSHOT

Merge branch '1.5.1-SNAPSHOT' into 1.6.0-SNAPSHOT


Project: http://git-wip-us.apache.org/repos/asf/accumulo/repo
Commit: http://git-wip-us.apache.org/repos/asf/accumulo/commit/1bddc574
Tree: http://git-wip-us.apache.org/repos/asf/accumulo/tree/1bddc574
Diff: http://git-wip-us.apache.org/repos/asf/accumulo/diff/1bddc574

Branch: refs/heads/master
Commit: 1bddc574086129aca3484a6070aee257c8622085
Parents: 0d49819 00fb08b
Author: Christopher Tubbs <ct...@apache.org>
Authored: Thu Dec 5 11:55:58 2013 -0500
Committer: Christopher Tubbs <ct...@apache.org>
Committed: Thu Dec 5 11:55:58 2013 -0500

----------------------------------------------------------------------
 .../apache/accumulo/examples/simple/filedata/FileDataIngest.java | 2 +-
 .../apache/accumulo/examples/simple/filedata/FileDataQuery.java  | 2 +-
 server/monitor/src/main/resources/docs/examples/README.filedata  | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/accumulo/blob/1bddc574/examples/simple/src/main/java/org/apache/accumulo/examples/simple/filedata/FileDataQuery.java
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/accumulo/blob/1bddc574/server/monitor/src/main/resources/docs/examples/README.filedata
----------------------------------------------------------------------
diff --cc server/monitor/src/main/resources/docs/examples/README.filedata
index 946ca8c,0000000..9f0016e
mode 100644,000000..100644
--- a/server/monitor/src/main/resources/docs/examples/README.filedata
+++ b/server/monitor/src/main/resources/docs/examples/README.filedata
@@@ -1,47 -1,0 +1,47 @@@
 +Title: Apache Accumulo File System Archive Example (Data Only)
 +Notice:    Licensed to the Apache Software Foundation (ASF) under one
 +           or more contributor license agreements.  See the NOTICE file
 +           distributed with this work for additional information
 +           regarding copyright ownership.  The ASF licenses this file
 +           to you under the Apache License, Version 2.0 (the
 +           "License"); you may not use this file except in compliance
 +           with the License.  You may obtain a copy of the License at
 +           .
 +             http://www.apache.org/licenses/LICENSE-2.0
 +           .
 +           Unless required by applicable law or agreed to in writing,
 +           software distributed under the License is distributed on an
 +           "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 +           KIND, either express or implied.  See the License for the
 +           specific language governing permissions and limitations
 +           under the License.
 +
 +This example archives file data into an Accumulo table.  Files with duplicate data are only stored once.
 +The example has the following classes:
 +
 + * CharacterHistogram - A MapReduce that computes a histogram of byte frequency for each file and stores the histogram alongside the file data.  An example use of the ChunkInputFormat.
 + * ChunkCombiner - An Iterator that dedupes file data and sets their visibilities to a combined visibility based on current references to the file data.
 + * ChunkInputFormat - An Accumulo InputFormat that provides keys containing file info (List<Entry<Key,Value>>) and values with an InputStream over the file (ChunkInputStream).
 + * ChunkInputStream - An input stream over file data stored in Accumulo.
-  * FileDataIngest - Takes a list of files and archives them into Accumulo keyed on the SHA1 hashes of the files.
-  * FileDataQuery - Retrieves file data based on the SHA1 hash of the file. (Used by the dirlist.Viewer.)
++ * FileDataIngest - Takes a list of files and archives them into Accumulo keyed on hashes of the files.
++ * FileDataQuery - Retrieves file data based on the hash of the file. (Used by the dirlist.Viewer.)
 + * KeyUtil - A utility for creating and parsing null-byte separated strings into/from Text objects.
 + * VisibilityCombiner - A utility for merging visibilities into the form (VIS1)|(VIS2)|...
 +
 +This example is coupled with the dirlist example.  See README.dirlist for instructions.
 +
 +If you haven't already run the README.dirlist example, ingest a file with FileDataIngest.
 +
 +    $ ./bin/accumulo org.apache.accumulo.examples.simple.filedata.FileDataIngest -i instance -z zookeepers -u username -p password -t dataTable --auths exampleVis --chunk 1000 $ACCUMULO_HOME/README
 +
 +Open the accumulo shell and look at the data.  The row is the MD5 hash of the file, which you can verify by running a command such as 'md5sum' on the file.
 +
 +    > scan -t dataTable
 +
 +Run the CharacterHistogram MapReduce to add some information about the file.
 +
 +    $ bin/tool.sh lib/accumulo-examples-simple.jar org.apache.accumulo.examples.simple.filedata.CharacterHistogram -i instance -z zookeepers -u username -p password -t dataTable --auths exampleVis --vis exampleVis
 +
 +Scan again to see the histogram stored in the 'info' column family.
 +
 +    > scan -t dataTable