Posted to commits@accumulo.apache.org by el...@apache.org on 2015/05/20 00:43:11 UTC
svn commit: r1680414 -
/accumulo/site/trunk/content/release_notes/1.7.0.mdtext
Author: elserj
Date: Tue May 19 22:43:10 2015
New Revision: 1680414
URL: http://svn.apache.org/r1680414
Log:
CMS commit to accumulo by elserj
Modified:
accumulo/site/trunk/content/release_notes/1.7.0.mdtext
Modified: accumulo/site/trunk/content/release_notes/1.7.0.mdtext
URL: http://svn.apache.org/viewvc/accumulo/site/trunk/content/release_notes/1.7.0.mdtext?rev=1680414&r1=1680413&r2=1680414&view=diff
==============================================================================
--- accumulo/site/trunk/content/release_notes/1.7.0.mdtext (original)
+++ accumulo/site/trunk/content/release_notes/1.7.0.mdtext Tue May 19 22:43:10 2015
@@ -224,18 +224,18 @@ with at least Apache Hadoop 2.6.0.
### Configurable Threadpool Size for Assignments
One of the primary tasks that the Accumulo Master is responsible for is the
-assignment of Tablets to TabletServers. Before a TabletServer can be brought online,
+assignment of Tablets to TabletServers. Before a Tablet can be brought online,
the tablet must not have any outstanding logs as this represents a need to perform
recovery (the tablet was not unloaded cleanly). This process can take some time for
large write-ahead log files and is performed on a TabletServer to keep the Master
light and agile.
-Assignments, whether the Tablets need to perform recovery or not, share the same
+Assignment of Tablets, whether those Tablets need to perform recovery or not, shares the same
threadpool in the Master. This means that when a large number of TabletServers are
available, too few threads dedicated to assignment can restrict the speed at which
assignments can be performed. [ACCUMULO-1085][ACCUMULO-1085] allows the size of the
threadpool used in the Master for assignments to be configurable which can be
-dynamically altered to remove the artificial limitation when sufficient servers are available.
+dynamically altered to remove the limitation when sufficient servers are available.
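
As an illustrative sketch only: the release notes in this hunk do not name the new property, so the property name below is hypothetical; consult the 1.7.0 configuration documentation for the actual name introduced by [ACCUMULO-1085][ACCUMULO-1085]. A larger threadpool could then be configured from the Accumulo shell:

```
# Illustrative only: the property name is hypothetical, and 16 is an
# example value, not a recommendation.
root@instance> config -s master.tablet.assignment.threads=16
```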
### Group-Commit Threshold as a Factor of Data Size
@@ -244,24 +244,27 @@ log. As such, this is a common place tha
is the notion of "group-commit". When multiple clients are writing data to the same
Accumulo Tablet, it is not efficient for each of them to synchronize the WAL, flush their
updates to disk for durability, and then release the lock. The idea of group-commit
-is that multiple writers can queue their write their mutations to the WAL and perform
-then wait for a sync that could satisfy the durability constraints of multiple clients
-instead of just one. This has a drastic improvement on performance.
+is that multiple writers can queue their mutations to the WAL and
+then wait for a sync that will satisfy the durability constraints of their batch of
+updates. This drastically improves performance, as many threads writing batches
+concurrently can "share" the same `fsync`.
In previous versions, Accumulo controlled the frequency in which this group-commit
-sync was performed as a factor of clients writing to Accumulo. This was both confusing
-to correctly configure and also encouraged sub-par performance with fewer writers.
+sync was performed as a factor of the number of clients writing to Accumulo. This was both confusing
+to correctly configure and also encouraged sub-par performance with few write threads.
[ACCUMULO-1950][ACCUMULO-1950] introduced a new configuration property `tserver.total.mutation.queue.max`
which defines the amount of data that is queued before a group-commit is performed
in a way that is agnostic of the number of writers. This new configuration property
-is much easier to reason about than the previous, now deprecated, `tserver.mutation.queue.max`.
+is much easier to reason about than the previous (now deprecated) `tserver.mutation.queue.max`.
+Users who have altered `tserver.mutation.queue.max` in the past are encouraged to start
+using the new `tserver.total.mutation.queue.max` property.
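
For example, the new property can be set from the Accumulo shell (the value shown is illustrative, not a recommended default):

```
# Total bytes of queued mutations across all writers that trigger a
# group-commit sync; 50M is an example value, not a recommendation.
root@instance> config -s tserver.total.mutation.queue.max=50M
```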
## Notable Bug Fixes
### SourceSwitchingIterator Deadlock
An instance of SourceSwitchingIterator, the Accumulo iterator which transparently
-manages whether data for a Tablet is in memory (the in-memory map) or disk (HDFS
+manages whether data for a Tablet is read from memory (the in-memory map) or disk (HDFS
after a minor compaction), was found deadlocked in a production system.
This deadlock prevented the scan and the minor compaction from ever successfully
@@ -269,13 +272,15 @@ completing without restarting the Tablet
fixes the inconsistent synchronization inside of the SourceSwitchingIterator
to prevent this deadlock from happening in the future.
+The only mitigation of this bug is to restart the TabletServer that is deadlocked.
+
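
A minimal sketch of such a restart (the host name is an example; a truly deadlocked server may instead require killing the JVM process directly):

```
# Ask the affected TabletServer to stop; host and port are examples.
$ accumulo admin stop tserver-host.example.com:9997
# If the process is deadlocked and does not exit, kill it directly,
# then restart it via your usual init mechanism.
```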
### Table flush blocked indefinitely
While running the Accumulo Randomwalk distributed test, it was observed
that all activity in Accumulo had stopped and there was an offline
-Accumulo metadata table tablet. The system first tried to flush a user
-tablet but the metadata table was not online (likely due to the agitation
+Accumulo Metadata table tablet. The system first tried to flush a user
+tablet, but the metadata table was not online (likely due to the agitation
process which stops and starts Accumulo processes during the test). After
this call, a call to load the metadata tablet was queued but could not
complete until the previous flush call returned. Thus, a deadlock occurred.
@@ -284,23 +289,27 @@ This deadlock happened because the synch
before the load tablet call completed, but the load tablet call couldn't
run because of connection caching we perform in Accumulo's RPC layer
to reduce the quantity of sockets we need to create to send data.
-[ACCUMULO-3597][ACCUMULO-3597] prevents this dealock by forcing a
-non-cached connection for the message requesting loads of metadata tablets,
-we can ensure that this deadlock won't occur.
+[ACCUMULO-3597][ACCUMULO-3597] prevents this deadlock by forcing the use of a
+non-cached connection for the RPCs requesting a load of a metadata tablet. While
+this fix does result in additional network resources being used, the concern is minimal
+because the number of metadata tablets is typically very small with respect to the
+total number of tablets in the system.
+
+The only mitigation of this bug is to restart the TabletServer that is hung.
## Other changes
### VERSIONS file present in binary distribution
-In the pre-built binary distribution, or distributions built by users from the
+In the pre-built binary distribution or distributions built by users from the
official source release, users will now see a `VERSIONS` file present in the lib
directory alongside the Accumulo server-side jars. Because the created tarball
-strips off versions from the jar file names, it can be extra work to actually
-find what the version of the deployed jars is.
+strips off versions from the jar file names, it can require extra work to
+find the version of each dependent jar.
-[ACCUMULO-2863][ACCUMULO-2863] adds this `VERSIONS` file to the `lib/` directory
+[ACCUMULO-2863][ACCUMULO-2863] adds a `VERSIONS` file to the `lib/` directory
which contains the Maven groupId, artifactId, and version (GAV) information for
-each jar file.
+each jar file included in the distribution.
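
As a sketch of how the file might be used — the exact line format of `VERSIONS` is an assumption here (one `groupId:artifactId:version` GAV entry per jar), so a stand-in file is created for illustration:

```shell
# Create a stand-in VERSIONS file; the real one ships in lib/ and its
# exact format may differ -- one GAV line per bundled jar is assumed.
printf 'org.apache.accumulo:accumulo-core:1.7.0\norg.apache.thrift:libthrift:0.9.1\n' > VERSIONS

# Find which version of a given dependency was bundled, despite the
# version-less jar file names in lib/.
grep 'libthrift' VERSIONS
```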
### Per-Table Volume Chooser
@@ -310,7 +319,7 @@ HDFS instances are available. By default
is used to evenly balance files across all HDFS instances.
Previously, this VolumeChooser logic was instance-wide which meant that it would
-affect system tables. This is potentially undesirable as it might unintentionally
+affect all tables. This is potentially undesirable as it might unintentionally
impact other users in a multi-tenant system. [ACCUMULO-3177][ACCUMULO-3177] introduces
a new per-table property which supports configuration of a `VolumeChooser`. This
ensures that the implementation to choose how HDFS utilization happens when multiple