Posted to commits@accumulo.apache.org by do...@apache.org on 2022/04/04 13:16:48 UTC

[accumulo-examples] branch main updated: Fix and improve several examples (#94)

This is an automated email from the ASF dual-hosted git repository.

domgarguilo pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/accumulo-examples.git


The following commit(s) were added to refs/heads/main by this push:
     new ac2ec84  Fix and improve several examples (#94)
ac2ec84 is described below

commit ac2ec84b87910fbb656751fb927995396798029c
Author: Dom G <do...@apache.org>
AuthorDate: Mon Apr 4 09:16:43 2022 -0400

    Fix and improve several examples (#94)
---
 docs/bloom.md              |  2 +-
 docs/classpath.md          |  2 +-
 docs/compactionStrategy.md | 10 +++++-----
 docs/shard.md              |  2 +-
 docs/tabletofile.md        |  6 ++----
 docs/terasort.md           |  4 ++--
 docs/wordcount.md          |  6 ++++--
 7 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/docs/bloom.md b/docs/bloom.md
index 8a38df5..da3a974 100644
--- a/docs/bloom.md
+++ b/docs/bloom.md
@@ -24,7 +24,7 @@ do not exist in a table.
 
 Accumulo data is divided into tablets and each tablet has multiple r-files.
 Lookup performance of a tablet with 3 r-files can be 3 times slower than
-a tablet with one r-file. However if the files contain unique sets of data,
+a tablet with one r-file. However, if the files contain unique sets of data,
 then bloom filters can help with performance.
 
 Run the example below to create two identical tables. One table has bloom
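
For background, the behaviour discussed in this hunk is controlled by a per-table property. A minimal sketch, assuming a hypothetical table name (bloom.md creates its own tables further down):

    $ accumulo shell -u <username> -p <password> -e "config -t examples.bloomtest -s table.bloom.enabled=true"
    $ accumulo shell -u <username> -p <password> -e "compact -t examples.bloomtest -w"

Bloom filters are written into r-files as the files are created, so only files flushed or compacted after the property is set will carry one.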
diff --git a/docs/classpath.md b/docs/classpath.md
index efd37bc..e12df09 100644
--- a/docs/classpath.md
+++ b/docs/classpath.md
@@ -66,7 +66,7 @@ use cx1.
     root@uno examples.nofootwo> setiter -n foofilter -p 10 -scan -minc -majc -class org.apache.accumulo.test.FooFilter
         2013-05-03 12:49:35,943 [shell.Shell] ERROR: org.apache.accumulo.shell.ShellCommandException: Command could 
     not be initialized (Unable to load org.apache.accumulo.test.FooFilter; class not found.)
-    root@uno examples.nofootwo> config -t nofootwo -s table.class.loader.context=cx1
+    root@uno examples.nofootwo> config -t examples.nofootwo -s table.class.loader.context=cx1
     root@uno examples.nofootwo> setiter -n foofilter -p 10 -scan -minc -majc -class org.apache.accumulo.test.FooFilter
     Filter accepts or rejects each Key/Value pair
     ----------> set FooFilter parameter negate, default false keeps k/v that pass accept method, true rejects k/v that pass accept method: false
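
Once finished with the demonstration, the per-table override can be removed again; the shell's `config -d` option deletes a property (an optional cleanup step, not in the original walkthrough):

    root@uno examples.nofootwo> config -t examples.nofootwo -d table.class.loader.context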
diff --git a/docs/compactionStrategy.md b/docs/compactionStrategy.md
index 8ae0908..b0be2fa 100644
--- a/docs/compactionStrategy.md
+++ b/docs/compactionStrategy.md
@@ -45,10 +45,10 @@ The commands below will configure the BasicCompactionStrategy to:
  
 ```bash
  $ accumulo shell -u <username> -p <password> -e "config -t examples.test1 -s table.file.compress.type=snappy"
- $ accumulo shell -u <username> -p <password> -e "config -t examples.test1 -s examples.table.majc.compaction.strategy=org.apache.accumulo.tserver.compaction.strategies.BasicCompactionStrategy"
- $ accumulo shell -u <username> -p <password> -e "config -t examples.test1 -s examples.table.majc.compaction.strategy.opts.filter.size=250M"
- $ accumulo shell -u <username> -p <password> -e "config -t examples.test1 -s examples.table.majc.compaction.strategy.opts.large.compress.threshold=100M"
- $ accumulo shell -u <username> -p <password> -e "config -t examples.test1 -s examples.table.majc.compaction.strategy.opts.large.compress.type=gz"
+ $ accumulo shell -u <username> -p <password> -e "config -t examples.test1 -s table.majc.compaction.strategy=org.apache.accumulo.tserver.compaction.strategies.BasicCompactionStrategy"
+ $ accumulo shell -u <username> -p <password> -e "config -t examples.test1 -s table.majc.compaction.strategy.opts.filter.size=250M"
+ $ accumulo shell -u <username> -p <password> -e "config -t examples.test1 -s table.majc.compaction.strategy.opts.large.compress.threshold=100M"
+ $ accumulo shell -u <username> -p <password> -e "config -t examples.test1 -s table.majc.compaction.strategy.opts.large.compress.type=gz"
 ```
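
Before generating test data, it may help to confirm that the overrides above took effect; the `-f` flag filters `config` output by property name (a small optional check, not part of the original walkthrough):

```bash
 $ accumulo shell -u <username> -p <password> -e "config -t examples.test1 -f compaction.strategy"
```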
 
 Generate some data and files in order to test the strategy:
@@ -64,7 +64,7 @@ $ ./bin/runex client.SequentialBatchWriter -t examples.test1 --start 0 --num 130
 $ accumulo shell -u <username> -p <password> -e "flush -t examples.test1"
 ```
 
-View the tserver log in <accumulo_home>/logs for the compaction and find the name of the <rfile> that was compacted for your table. Print info about this file using the PrintInfo tool:
+View the tserver log in <accumulo_home>/logs for the compaction and find the name of the `rfile` that was compacted for your table. Print info about this file using the PrintInfo tool:
 
 ```bash
 $ accumulo rfile-info <rfile>
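 # Not part of the original walkthrough: if the account has read permission on the
 # metadata table, the rfile paths for a table can also be listed directly (they live
 # in the "file" column family) instead of digging through the tserver log:
 $ accumulo shell -u <username> -p <password> -e "scan -t accumulo.metadata -c file"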
diff --git a/docs/shard.md b/docs/shard.md
index f6f6848..97a9d40 100644
--- a/docs/shard.md
+++ b/docs/shard.md
@@ -43,7 +43,7 @@ The following command queries the index to find all files containing 'foo' and '
     /local/username/workspace/accumulo/src/core/src/test/java/accumulo/core/data/KeyExtentTest.java
     /local/username/workspace/accumulo/src/core/src/test/java/accumulo/core/iterators/WholeRowIteratorTest.java
 
-In order to run ContinuousQuery, we need to run Reverse.java to populate doc2term.
+In order to run ContinuousQuery, we need to run Reverse.java to populate the `examples.doc2term` table.
 
     $ ./bin/runex shard.Reverse --shardTable examples.shard --doc2Term examples.doc2term
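
An optional sanity check that Reverse populated the table before moving on (credentials are placeholders):

    $ accumulo shell -u <username> -p <password> -e "scan -t examples.doc2term"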
 
diff --git a/docs/tabletofile.md b/docs/tabletofile.md
index c72d5b8..5968e29 100644
--- a/docs/tabletofile.md
+++ b/docs/tabletofile.md
@@ -30,7 +30,7 @@ put a trivial amount of data into accumulo using the accumulo shell:
     root@instance examples.input> quit
 
 The TableToFile class configures a map-only job to read the specified columns and
-write the key/value pairs to a file in HDFS.
+writes the key/value pairs to a file in HDFS.
 
 The following will extract the rows containing the column "cf:cq":
 
@@ -45,6 +45,4 @@ We can see the output of our little map-reduce job:
 
     $ hadoop fs -text /tmp/output/part-m-00000
     catrow cf:cq []	catvalue
-    dogrow cf:cq []	dogvalue
-    $
-
+    dogrow cf:cq []	dogvalue
\ No newline at end of file
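
If the part file has a different name on a given setup, listing the output directory first shows exactly what the job wrote (an optional step, not in the original doc):

    $ hadoop fs -ls /tmp/output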
diff --git a/docs/terasort.md b/docs/terasort.md
index 16f2ea1..5539883 100644
--- a/docs/terasort.md
+++ b/docs/terasort.md
@@ -25,10 +25,10 @@ ignored.
 
     $ accumulo shell -u root -p secret -e 'createnamespace examples'   
 
-To run this example you run it with arguments describing the amount of data:
+This example is run with arguments describing the amount of data:
 
     $ ./bin/runmr mapreduce.TeraSortIngest --count 10 --minKeySize 10 --maxKeySize 10 \
-        --minValueSize 78 --maxValueSize 78 --table examples.sort --splits 10 \
+        --minValueSize 78 --maxValueSize 78 --table examples.sort --splits 10
 
 After the map reduce job completes, scan the data:
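The exact scan command in terasort.md is not shown in this hunk; one way to do it, reusing the root/secret credentials from the createnamespace step above, is:

    $ accumulo shell -u root -p secret -e "scan -t examples.sort"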
 
diff --git a/docs/wordcount.md b/docs/wordcount.md
index 4c5a27f..fca4af0 100644
--- a/docs/wordcount.md
+++ b/docs/wordcount.md
@@ -55,10 +55,12 @@ information like passwords. A more secure option is store accumulo-client.proper
 in HDFS and run the job with the `-D` options.  This will configure the MapReduce job
 to obtain the client properties from HDFS:
 
-    $ hdfs dfs -copyFromLocal ./conf/accumulo-client.properties /user/myuser/
+    $ hdfs dfs -mkdir /user
+    $ hdfs dfs -mkdir /user/myuser
+    $ hdfs dfs -copyFromLocal /path/to/accumulo/conf/accumulo-client.properties /user/myuser/
     $ ./bin/runmr mapreduce.WordCount -i /wc -t examples.wordcount2 -d /user/myuser/accumulo-client.properties
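
The two mkdir calls can also be collapsed into one with `hdfs dfs -mkdir -p /user/myuser`, and it is worth confirming the copy landed where the `-d` option expects it before running the job (an optional check):

    $ hdfs dfs -ls /user/myuser/accumulo-client.properties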
 
-After the MapReduce job completes, query the `wordcount2` table. The results should
+After the MapReduce job completes, query the `examples.wordcount2` table. The results should
 be the same as before:
 
     $ accumulo shell
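
wordcount.md continues with the interactive session beyond this hunk; a non-interactive equivalent, with placeholder credentials, would be:

    $ accumulo shell -u <username> -p <password> -e "scan -t examples.wordcount2"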