Posted to commits@accumulo.apache.org by ec...@apache.org on 2011/11/30 20:41:51 UTC

svn commit: r1208728 - in /incubator/accumulo/branches/1.4: ./ docs/ docs/examples/ docs/src/user_manual/chapters/ src/core/src/main/java/org/apache/accumulo/core/conf/

Author: ecn
Date: Wed Nov 30 19:41:50 2011
New Revision: 1208728

URL: http://svn.apache.org/viewvc?rev=1208728&view=rev
Log:
ACCUMULO-165: merge to 1.4 branch

Modified:
    incubator/accumulo/branches/1.4/   (props changed)
    incubator/accumulo/branches/1.4/docs/administration.html
    incubator/accumulo/branches/1.4/docs/bulkIngest.html
    incubator/accumulo/branches/1.4/docs/examples/README.dirlist
    incubator/accumulo/branches/1.4/docs/isolation.html
    incubator/accumulo/branches/1.4/docs/metrics.html
    incubator/accumulo/branches/1.4/docs/src/user_manual/chapters/administration.tex
    incubator/accumulo/branches/1.4/docs/src/user_manual/chapters/analytics.tex
    incubator/accumulo/branches/1.4/docs/src/user_manual/chapters/high_speed_ingest.tex
    incubator/accumulo/branches/1.4/docs/src/user_manual/chapters/security.tex
    incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/conf/Property.java

Propchange: incubator/accumulo/branches/1.4/
------------------------------------------------------------------------------
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Nov 30 19:41:50 2011
@@ -1,3 +1,3 @@
 /incubator/accumulo/branches/1.3:1190280,1190413,1190420,1190427,1190500,1195622,1195625,1195629,1195635,1196044,1196054,1196057,1196071-1196072,1196106,1197066,1198935,1199383,1203683,1204625,1205547,1205880,1206169,1208031
 /incubator/accumulo/branches/1.4:1205476
-/incubator/accumulo/trunk:1205476,1205570
+/incubator/accumulo/trunk:1205476,1205570,1208726

Modified: incubator/accumulo/branches/1.4/docs/administration.html
URL: http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/docs/administration.html?rev=1208728&r1=1208727&r2=1208728&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/docs/administration.html (original)
+++ incubator/accumulo/branches/1.4/docs/administration.html Wed Nov 30 19:41:50 2011
@@ -28,8 +28,8 @@
 <p>For the most part, accumulo is ready to go out of the box. To start it, first you must distribute and install
 the accumulo software to each machine in the cloud that you wish to run on. The software should be installed
 in the same directory on each machine and configured identically (or at least similarly... see the configuration
-sections for more details). Select one machine to be your boostrap machine, the one that you will start accumulo
-with. Note that you must have passphraseless ssh access to each machine from your bootstrap machine. On this machine,
+sections for more details). Select one machine to be your bootstrap machine, the one that you will start accumulo
+with. Note that you must have passphrase-less ssh access to each machine from your bootstrap machine. On this machine,
 create a conf/masters and conf/slaves file. In the masters file, type the hostname of the machine you wish to run the master on (probably localhost).
 In the slaves file, type the hostnames, separated by newlines of each machine you wish to participate in accumulo as a tablet server. If you neglect
 to create these files, the startup scripts will assume you are trying to run on localhost only, and will instantiate a single-node instance only.
@@ -88,7 +88,7 @@ servers use.  In accumulo-env.sh there i
 ACCUMULO_TSERVER_OPTS.  By default this is set to something like "-Xmx512m
 -Xms512m".  These are Java jvm options asking Java to use 512 megabytes of
 memory.  By default accumulo stores data written to it outside of the Java
-memory space inorder to avoid pauses caused by the Java garbage collector.  The
+memory space in order to avoid pauses caused by the Java garbage collector.  The
 amount of memory it uses for this data is determined by the accumulo setting
 "tserver.memory.maps.max".  Since this memory is outside of the Java managed
 memory, the process can grow larger than the -Xmx setting.  So if -Xmx is set
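
As a rough illustration of how those two settings combine: with ACCUMULO_TSERVER_OPTS left at -Xmx512m and tserver.memory.maps.max set to, say, 1G, the tablet server process can approach 512MB + 1GB, roughly 1.5GB, plus normal JVM overhead, so each machine should have at least that much memory free per tablet server.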

Modified: incubator/accumulo/branches/1.4/docs/bulkIngest.html
URL: http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/docs/bulkIngest.html?rev=1208728&r1=1208727&r2=1208728&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/docs/bulkIngest.html (original)
+++ incubator/accumulo/branches/1.4/docs/bulkIngest.html Wed Nov 30 19:41:50 2011
@@ -57,7 +57,7 @@ when range partitioning using a tables s
 get one map file.
 
 <P>Any set of cut points for range partitioning can be used in a map
-reduce job, but using Accumulos current splits is probably the most
+reduce job, but using Accumulo's current splits is probably the most
 optimal thing to do.  However in some case there may be too many
 splits.  For example if there are 2000 splits, you would need to run
 2001 reducers.  To overcome this problem use the
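
For example, a minimal client sketch (1.4-style API; the instance name, zookeeper list, credentials, and table name are placeholders) that pulls at most 100 of a table's current splits to use as cut points:

    import java.util.Collection;

    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.client.ZooKeeperInstance;
    import org.apache.hadoop.io.Text;

    public class CutPointsSketch {
      public static void main(String[] args) throws Exception {
        // Placeholder instance name, zookeeper list, and credentials.
        Connector conn = new ZooKeeperInstance("instance", "zoohost:2181")
            .getConnector("username", "password".getBytes());
        // Capping the number of cut points keeps a table with thousands of
        // splits from forcing thousands of reducers.
        Collection<Text> cutPoints = conn.tableOperations().getSplits("table", 100);
        for (Text cut : cutPoints)
          System.out.println(cut);
      }
    }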

Modified: incubator/accumulo/branches/1.4/docs/examples/README.dirlist
URL: http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/docs/examples/README.dirlist?rev=1208728&r1=1208727&r2=1208728&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/docs/examples/README.dirlist (original)
+++ incubator/accumulo/branches/1.4/docs/examples/README.dirlist Wed Nov 30 19:41:50 2011
@@ -54,7 +54,7 @@ To perform searches on file or directory
     $ ./bin/accumulo org.apache.accumulo.examples.dirlist.QueryUtil instance zookeepers username password indexTable exampleVis '*jar' -search
     $ ./bin/accumulo org.apache.accumulo.examples.dirlist.QueryUtil instance zookeepers username password indexTable exampleVis filename*jar -search
 
-To count the number of direct children (directories and files) and descendants (children and children's descendents, directories and files), run the FileCount over the dirTable table.
+To count the number of direct children (directories and files) and descendants (children and children's descendants, directories and files), run the FileCount over the dirTable table.
 The results are written back to the same table.
 
     $ ./bin/accumulo org.apache.accumulo.examples.dirlist.FileCount instance zookeepers username password dirTable exampleVis exampleVis

Modified: incubator/accumulo/branches/1.4/docs/isolation.html
URL: http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/docs/isolation.html?rev=1208728&r1=1208727&r2=1208728&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/docs/isolation.html (original)
+++ incubator/accumulo/branches/1.4/docs/isolation.html Wed Nov 30 19:41:50 2011
@@ -31,9 +31,9 @@
  <li>iterators executed as part of a minor or major compaction 
  <li>bulk import of new files
 </ul>
-Isolation garuantees that either all or none of the changes made by these operations on a row are seen.  Use the <a href='apidocs/org/apache/accumulo/core/client/IsolatedScanner.html'>IsolatedScanner</a> to obtain an isolated view of a accumulo table.  When using the regular scanner it is possible to see a non isolated view of a row.  For example if a mutation modifies three columns, it is possible that you will only see two of those modifications.  With the isolated scanner either all three of the changes are seen or none.  For an example of this try running the <a href='apidocs/org/apache/accumulo/examples/isolation/InterferenceTest.html'>InterferenceTest</a> example.  
+Isolation guarantees that either all or none of the changes made by these operations on a row are seen.  Use the <a href='apidocs/org/apache/accumulo/core/client/IsolatedScanner.html'>IsolatedScanner</a> to obtain an isolated view of a accumulo table.  When using the regular scanner it is possible to see a non isolated view of a row.  For example if a mutation modifies three columns, it is possible that you will only see two of those modifications.  With the isolated scanner either all three of the changes are seen or none.  For an example of this try running the <a href='apidocs/org/apache/accumulo/examples/isolation/InterferenceTest.html'>InterferenceTest</a> example.  
 
 <p>At this time there is no client side isolation support for the <a href='apidocs/org/apache/accumulo/core/client/BatchScanner.html'>BatchScanner</a>.  You may consider using the <a href='apidocs/org/apache/accumulo/core/iterators/WholeRowIterator.html'>WholeRowIterator</a> with the  <a href='apidocs/org/apache/accumulo/core/client/BatchScanner.html'>BatchScanner</a> to achieve isolation though. This drawback of doing this is that entire rows are read into memory on the server side.  If a row is too big, it may crash a tablet server.  The <a href='apidocs/org/apache/accumulo/core/client/IsolatedScanner.html'>IsolatedScanner</a> buffers rows on the client side so a large row will not crash a tablet server.
 
 <h3>Iterators</h3>
-<p>When writing server side iterators for accumulo isolation is something to be aware of.  A scan time iterator in accumulo reads from a set of data sources.  While an iterator is reading data it has an isolated view.  However, after it returns a key/value it is possible that accumulo may switch data sources and re-seek the iterator.  This is done so that resources may be reclaimed.  When the user does not request isolation this can occur after any key is returned.  When a user request isolation this will only occur after a new row is returned, in which case it will reseek to the very beginning of the next possible row.
+<p>When writing server side iterators for accumulo isolation is something to be aware of.  A scan time iterator in accumulo reads from a set of data sources.  While an iterator is reading data it has an isolated view.  However, after it returns a key/value it is possible that accumulo may switch data sources and re-seek the iterator.  This is done so that resources may be reclaimed.  When the user does not request isolation this can occur after any key is returned.  When a user request isolation this will only occur after a new row is returned, in which case it will re-seek to the very beginning of the next possible row.
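
For example, a minimal sketch of wrapping a regular scanner in an IsolatedScanner (the instance, credentials, authorizations, and table name are placeholders):

    import java.util.Map.Entry;

    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.client.IsolatedScanner;
    import org.apache.accumulo.core.client.Scanner;
    import org.apache.accumulo.core.client.ZooKeeperInstance;
    import org.apache.accumulo.core.data.Key;
    import org.apache.accumulo.core.data.Value;
    import org.apache.accumulo.core.security.Authorizations;

    public class IsolationSketch {
      public static void main(String[] args) throws Exception {
        Connector conn = new ZooKeeperInstance("instance", "zoohost:2181")
            .getConnector("username", "password".getBytes());
        Scanner scanner = conn.createScanner("table", new Authorizations());
        // The wrapper buffers each row on the client side, so either all of a
        // row's changes are seen or none of them are.
        Scanner isolated = new IsolatedScanner(scanner);
        for (Entry<Key,Value> entry : isolated)
          System.out.println(entry.getKey() + " -> " + entry.getValue());
      }
    }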

Modified: incubator/accumulo/branches/1.4/docs/metrics.html
URL: http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/docs/metrics.html?rev=1208728&r1=1208727&r2=1208728&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/docs/metrics.html (original)
+++ incubator/accumulo/branches/1.4/docs/metrics.html Wed Nov 30 19:41:50 2011
@@ -32,7 +32,7 @@ Except where specified all time values a
 	</thead>
 	<tbody>
 		<tr class="highlight"><td>public long getPingCount();</td><td>Number of pings to tablet servers</td></tr>
-		<tr><td>public long getPingAvgTime();</td><td>Avergage time for each ping</td></tr>
+		<tr><td>public long getPingAvgTime();</td><td>Average time for each ping</td></tr>
 		<tr class="highlight"><td>public long getPingMinTime();</td><td>Minimum time for each ping</td></tr>
 		<tr><td>public long getPingMaxTime();</td><td>Maximum time for each ping</td></tr>
 		<tr class="highlight"><td>public String getTServerWithHighestPingTime();</td><td>tablet server with highest ping</td></tr>
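
For example, a minimal JMX client sketch for reading one of these values; the port and MBean object name below are assumptions for illustration and depend on how metrics are configured, only the standard JMX API calls are taken as given:

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class PingMetricSketch {
      public static void main(String[] args) throws Exception {
        // Placeholder port; use whatever port the master's JMX agent listens on.
        JMXServiceURL url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi");
        JMXConnector jmxc = JMXConnectorFactory.connect(url);
        MBeanServerConnection mbs = jmxc.getMBeanServerConnection();
        // Assumed MBean name for illustration only; check the registered MBeans for the real one.
        ObjectName name = new ObjectName("accumulo.server.metrics:type=MasterMetricsMBean");
        // getPingAvgTime() is exposed as the "PingAvgTime" attribute under standard JMX naming.
        System.out.println(mbs.getAttribute(name, "PingAvgTime"));
        jmxc.close();
      }
    }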

Modified: incubator/accumulo/branches/1.4/docs/src/user_manual/chapters/administration.tex
URL: http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/docs/src/user_manual/chapters/administration.tex?rev=1208728&r1=1208727&r2=1208728&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/docs/src/user_manual/chapters/administration.tex (original)
+++ incubator/accumulo/branches/1.4/docs/src/user_manual/chapters/administration.tex Wed Nov 30 19:41:50 2011
@@ -69,7 +69,7 @@ files.
 
 \subsection{Edit conf/accumulo-env.sh}
 
-Accumulo needs to know where to find the software it depends on. Edit accumuloenv.
+Accumulo needs to know where to find the software it depends on. Edit accumulo-env.
 sh and specify the following:
 
 \begin{enumerate}

Modified: incubator/accumulo/branches/1.4/docs/src/user_manual/chapters/analytics.tex
URL: http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/docs/src/user_manual/chapters/analytics.tex?rev=1208728&r1=1208727&r2=1208728&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/docs/src/user_manual/chapters/analytics.tex (original)
+++ incubator/accumulo/branches/1.4/docs/src/user_manual/chapters/analytics.tex Wed Nov 30 19:41:50 2011
@@ -164,7 +164,7 @@ MapReduce jobs.
 
 All that is needed to aggregate values of a table is to identify the fields over which
 values will be grouped, insert mutations with those fields as the key, and configure
-the table with a combining iterator that supports the summarization operation
+the table with a combining iterator that supports the summarizing operation
 desired.
 
 The only restriction on an combining iterator is that the combiner developer
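
For example, a minimal sketch of that configuration step (the table name, column family, iterator priority/name, and the choice of SummingCombiner are placeholders for whatever summarizing operation is desired):

    import java.util.Collections;

    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.client.IteratorSetting;
    import org.apache.accumulo.core.client.ZooKeeperInstance;
    import org.apache.accumulo.core.iterators.Combiner;
    import org.apache.accumulo.core.iterators.LongCombiner;
    import org.apache.accumulo.core.iterators.user.SummingCombiner;

    public class CombinerSketch {
      public static void main(String[] args) throws Exception {
        Connector conn = new ZooKeeperInstance("instance", "zoohost:2181")
            .getConnector("username", "password".getBytes());
        IteratorSetting setting = new IteratorSetting(10, "sum", SummingCombiner.class);
        // Sum values in the "count" column family, stored as printable longs.
        Combiner.setColumns(setting, Collections.singletonList(new IteratorSetting.Column("count")));
        SummingCombiner.setEncodingType(setting, LongCombiner.Type.STRING);
        // Attach the combiner so it runs at scan, minor, and major compaction time.
        conn.tableOperations().attachIterator("table", setting);
      }
    }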

Modified: incubator/accumulo/branches/1.4/docs/src/user_manual/chapters/high_speed_ingest.tex
URL: http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/docs/src/user_manual/chapters/high_speed_ingest.tex?rev=1208728&r1=1208727&r2=1208728&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/docs/src/user_manual/chapters/high_speed_ingest.tex (original)
+++ incubator/accumulo/branches/1.4/docs/src/user_manual/chapters/high_speed_ingest.tex Wed Nov 30 19:41:50 2011
@@ -122,7 +122,7 @@ file is imported, but whenever it is rea
 time is obtained and always used by the specialized system iterator to set that
 time.
 
-The timestamp asigned by accumulo will be the same for every key in the file.
+The timestamp assigned by accumulo will be the same for every key in the file.
 This could cause problems if the file contains multiple keys that are identical
 except for the timestamp.  In this case, the sort order of the keys will be
 undefined. This could occur if an insert and an update were in the same bulk

Modified: incubator/accumulo/branches/1.4/docs/src/user_manual/chapters/security.tex
URL: http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/docs/src/user_manual/chapters/security.tex?rev=1208728&r1=1208727&r2=1208728&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/docs/src/user_manual/chapters/security.tex (original)
+++ incubator/accumulo/branches/1.4/docs/src/user_manual/chapters/security.tex Wed Nov 30 19:41:50 2011
@@ -126,7 +126,7 @@ config -t table -s table.constraint.1=or
 
 Any user with the alter table permission can add or remove this constraint.
 This constraint is not applied to bulk imported data, if this a concern then
-disable the bulk import pesmission.
+disable the bulk import permission.
 
 \section{Secure Authorizations Handling}
 
@@ -134,7 +134,7 @@ For applications serving many users, it 
 will be created for each application user.  In this case a accumulo user with
 all authorizations needed by any of the applications users must be created.  To
 service queries, the application should create a scanner with the application
-users authorizations.  These authorizations could be obtined from a trusted 3rd
+users authorizations.  These authorizations could be obtained from a trusted 3rd
 party.
 
 Often production systems will integrate with Public-Key Infrastructure (PKI) and
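
For example, a minimal sketch of scanning with an application user's authorizations (the instance, credentials, table, and authorization strings are placeholders; the accumulo user doing the scan must itself hold these authorizations):

    import java.util.Map.Entry;

    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.client.Scanner;
    import org.apache.accumulo.core.client.ZooKeeperInstance;
    import org.apache.accumulo.core.data.Key;
    import org.apache.accumulo.core.data.Value;
    import org.apache.accumulo.core.security.Authorizations;

    public class AuthsSketch {
      public static void main(String[] args) throws Exception {
        Connector conn = new ZooKeeperInstance("instance", "zoohost:2181")
            .getConnector("appuser", "password".getBytes());
        // Authorizations for this particular end user, e.g. obtained from a
        // trusted third party rather than hard-coded as they are here.
        Authorizations userAuths = new Authorizations("auth1", "auth2");
        Scanner scanner = conn.createScanner("table", userAuths);
        for (Entry<Key,Value> entry : scanner)
          System.out.println(entry.getKey() + " -> " + entry.getValue());
      }
    }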

Modified: incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/conf/Property.java
URL: http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/conf/Property.java?rev=1208728&r1=1208727&r2=1208728&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/conf/Property.java (original)
+++ incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/conf/Property.java Wed Nov 30 19:41:50 2011
@@ -68,7 +68,7 @@ public enum Property {
   MASTER_BULK_SERVERS("master.bulk.server.max", "4", PropertyType.COUNT, "The number of servers to use during a bulk load"),
   MASTER_BULK_RETRIES("master.bulk.retries", "3", PropertyType.COUNT, "The number of attempts to bulk-load a file before giving up."),
   MASTER_BULK_THREADPOOL_SIZE("master.bulk.threadpool.size", "5", PropertyType.COUNT, "The number of threads to use when coordinating a bulk-import."),
-  MASTER_MINTHREADS("master.server.threads.minimum", "2", PropertyType.COUNT, "The miniumum number of threads to use to handle incoming requests."),
+  MASTER_MINTHREADS("master.server.threads.minimum", "2", PropertyType.COUNT, "The minimum number of threads to use to handle incoming requests."),
   MASTER_THREADCHECK("master.server.threadcheck.time", "1s", PropertyType.TIMEDURATION, "The time between adjustments of the server thread pool."),
   
   // properties that are specific to tablet server behavior
@@ -102,7 +102,7 @@ public enum Property {
   TSERV_SESSION_MAXIDLE("tserver.session.idle.max", "1m", PropertyType.TIMEDURATION, "maximum idle time for a session"),
   TSERV_READ_AHEAD_MAXCONCURRENT("tserver.readahead.concurrent.max", "16", PropertyType.COUNT,
       "The maximum number of concurrent read ahead that will execute.  This effectively"
-          + "limits the number of long running scans that can run concurrently per tserver."),
+          + " limits the number of long running scans that can run concurrently per tserver."),
   TSERV_METADATA_READ_AHEAD_MAXCONCURRENT("tserver.metadata.readahead.concurrent.max", "8", PropertyType.COUNT,
       "The maximum number of concurrent metadata read ahead that will execute."),
   TSERV_MIGRATE_MAXCONCURRENT("tserver.migrations.concurrent.max", "1", PropertyType.COUNT,
@@ -142,7 +142,7 @@ public enum Property {
           + " the file to the appropriate tablets on all servers.  This property controls the number of threads used to communicate to the other servers."),
   TSERV_BULK_RETRY("tserver.bulk.retry.max", "3", PropertyType.COUNT,
       "The number of times the tablet server will attempt to assign a file to a tablet as it migrates and splits."),
-  TSERV_MINTHREADS("tserver.server.threads.minimum", "2", PropertyType.COUNT, "The miniumum number of threads to use to handle incoming requests."),
+  TSERV_MINTHREADS("tserver.server.threads.minimum", "2", PropertyType.COUNT, "The minimum number of threads to use to handle incoming requests."),
   TSERV_THREADCHECK("tserver.server.threadcheck.time", "1s", PropertyType.TIMEDURATION, "The time between adjustments of the server thread pool."),
   TSERV_HOLD_TIME_SUICIDE("tserver.hold.time.max", "5m", PropertyType.TIMEDURATION,
       "The maximum time for a tablet server to be in the \"memory full\" state.  If the tablet server cannot write out memory"
@@ -199,13 +199,13 @@ public enum Property {
       "table.compaction.major.ratio",
       "3",
       PropertyType.FRACTION,
-      "minimum ratio of total input size to maximum input file size for running a major compaction.   When adjusting this property you may want to also adjust table.file.max.  Want to avoid the situation where only merging minor compactions occurr."),
+      "minimum ratio of total input size to maximum input file size for running a major compaction.   When adjusting this property you may want to also adjust table.file.max.  Want to avoid the situation where only merging minor compactions occur."),
   TABLE_MAJC_COMPACTALL_IDLETIME("table.compaction.major.everything.idle", "1h", PropertyType.TIMEDURATION,
       "After a tablet has been idle (no mutations) for this time period it may have all "
           + "of its map file compacted into one.  There is no guarantee an idle tablet will be compacted. "
           + "Compactions of idle tablets are only started when regular compactions are not running. Idle "
           + "compactions only take place for tablets that have one or more map files."),
-  TABLE_SPLIT_THRESHOLD("table.split.threshold", "1G", PropertyType.MEMORY, "When combined size of mapfiles exceeds this amount a tablet is split."),
+  TABLE_SPLIT_THRESHOLD("table.split.threshold", "1G", PropertyType.MEMORY, "When combined size of files exceeds this amount a tablet is split."),
   TABLE_MINC_LOGS_MAX("table.compaction.minor.logs.threshold", "3", PropertyType.COUNT,
       "When there are more than this many write-ahead logs against a tablet, it will be minor compacted."),
   TABLE_MINC_COMPACT_IDLETIME("table.compaction.minor.idle", "5m", PropertyType.TIMEDURATION,
@@ -222,7 +222,7 @@ public enum Property {
       "Overrides the hadoop io.seqfile.compress.blocksize setting so that map files have better query performance. " + "The maximum value for this is "
           + Integer.MAX_VALUE),
   TABLE_FILE_COMPRESSED_BLOCK_SIZE_INDEX("table.file.compress.blocksize.index", "128K", PropertyType.MEMORY,
-      "Determines how large index blocks can be in files that support multilevel indexes" + "The maximum value for this is " + Integer.MAX_VALUE),
+      "Determines how large index blocks can be in files that support multilevel indexes. The maximum value for this is " + Integer.MAX_VALUE),
   TABLE_FILE_BLOCK_SIZE("table.file.blocksize", "0B", PropertyType.MEMORY,
       "Overrides the hadoop dfs.block.size setting so that map files have better query performance. " + "The maximum value for this is " + Integer.MAX_VALUE),
   TABLE_FILE_REPLICATION("table.file.replication", "0", PropertyType.COUNT, "Determines how many replicas to keep of a tables map files in HDFS. "