Posted to commits@accumulo.apache.org by el...@apache.org on 2014/09/19 23:10:48 UTC

svn commit: r1626335 - /accumulo/site/trunk/content/release_notes/1.5.2.mdtext

Author: elserj
Date: Fri Sep 19 21:10:47 2014
New Revision: 1626335

URL: http://svn.apache.org/r1626335
Log:
More 1.5.2 release note additions

Modified:
    accumulo/site/trunk/content/release_notes/1.5.2.mdtext

Modified: accumulo/site/trunk/content/release_notes/1.5.2.mdtext
URL: http://svn.apache.org/viewvc/accumulo/site/trunk/content/release_notes/1.5.2.mdtext?rev=1626335&r1=1626334&r2=1626335&view=diff
==============================================================================
--- accumulo/site/trunk/content/release_notes/1.5.2.mdtext (original)
+++ accumulo/site/trunk/content/release_notes/1.5.2.mdtext Fri Sep 19 21:10:47 2014
@@ -31,14 +31,12 @@ who cannot or do not want to upgrade to 
 over earlier versions in the 1.5 line.
 
 
-## Notable Improvements
+## Performance Improvements
 
-While new features are typically not added in a bug-fix release as 1.5.2, the
-community does create a variety of improvements that are API compatible. Contained
-here are some of the more notable improvements.
+Apache Accumulo 1.5.2 includes a number of performance-related fixes over previous versions.
 
 
-### Performance improvements
+### Write-Ahead Log sync performance
 
 The Write-Ahead Log (WAL) files are used to ensure durability of updates made to Accumulo.
 A "sync" is called on the file in HDFS to make sure that the changes to the WAL are persisted
@@ -46,6 +44,50 @@ to disk, which allows Accumulo to recove
 an issue where an operation against a WAL would unnecessarily wait for multiple syncs, slowing
 down the ingest on the system.
 
+### Minor-Compactions not aggressive enough
+
+On a system with ample memory provided to Accumulo, long hold-times were observed, which
+blocked the ingest of new updates. Freeing server-side memory sooner by running minor
+compactions more frequently increased the overall throughput on the node. These changes
+were made in [ACCUMULO-2905][10].
+
+### HeapIterator optimization
+
+Iterators, a notable feature of Accumulo, are provided to users as a server-side programming
+construct, but are also used internally for numerous server operations. One of these system iterators
+is the HeapIterator, which implements a PriorityQueue of other Iterators. One way this iterator is
+used is to merge multiple files in HDFS to present a single, sorted stream of Key-Value pairs. [ACCUMULO-2827][11]
+introduces a performance optimization to the HeapIterator which can improve its speed
+in common cases.
+
+### Write-Ahead Log sync implementation
+
+In Hadoop-2, two implementations of "sync" are provided: hflush and hsync. Both of these
+methods provide a way to request that the Datanodes write the data to the underlying
+medium and not just hold it in memory (the 'fsync' syscall). While both of these methods
+inform the Datanodes to sync the relevant block(s), hflush does not wait for acknowledgement
+from the Datanodes that the sync finished, whereas hsync does. To provide the most reliable system
+"out of the box", Accumulo defaults to hsync so that your data is as secure as possible in
+a variety of situations (notably, unexpected power outages).
+
+The downside is that performance tends to suffer because waiting for a sync to disk is a very
+expensive operation. [ACCUMULO-2842][12] introduces a new system property, tserver.wal.sync.method,
+that lets users change the HDFS sync implementation from 'hsync' to 'hflush'. Using 'hflush' instead
+of 'hsync' should result in about a 30% increase in ingest performance.
+
+For users upgrading from Hadoop-1 or Hadoop-0.20 releases, "hflush" is equivalent to how
+sync was implemented in those releases and should give comparable performance.
+
+### Server-side mutation queue size
+
+When users desire writes to be as durable as possible, using 'hsync', the ingest performance
+of the system can be improved by increasing the tserver.mutation.queue.max property. The cost
+of this change is that it will cause TabletServers to use additional memory per writer. In 1.5.1,
+the value of this parameter defaulted to a conservative 256K, which resulted in sub-par ingest
+performance.
+
+In 1.5.2, [ACCUMULO-3018][13] increases this buffer to 1M, which has a noticeable impact on
+ingest performance with a minimal increase in TabletServer memory usage.
 
 ## Notable Bug Fixes
 
@@ -84,6 +126,13 @@ The Writable interface methods on the Ra
 calls to serialize the IteratorSettings configured for the Job. [ACCUMULO-2962][8]
 fixes the serialization and adds some additional tests.
 
+### Constraint violation causes hung scans
+
+A failed bulk import transaction had the ability to create an infinitely retrying
+loop due to a constraint violation. This directly prevented scans from completing,
+and also hung compactions. [ACCUMULO-3096][14] fixes the issue so that the
+constraint violation no longer hangs the entire system.
+
 ## Documentation
 
 The following documentation updates were made: 
@@ -130,4 +179,9 @@ and, in HDFS High-Availability instances
 [6]: https://issues.apache.org/jira/browse/ACCUMULO-2985
 [7]: https://issues.apache.org/jira/browse/ACCUMULO-3055
 [8]: https://issues.apache.org/jira/browse/ACCUMULO-2962
-[9]: https://issues.apache.org/jira/browse/ACCUMULO-2766
\ No newline at end of file
+[9]: https://issues.apache.org/jira/browse/ACCUMULO-2766
+[10]: https://issues.apache.org/jira/browse/ACCUMULO-2905
+[11]: https://issues.apache.org/jira/browse/ACCUMULO-2827
+[12]: https://issues.apache.org/jira/browse/ACCUMULO-2842
+[13]: https://issues.apache.org/jira/browse/ACCUMULO-3018
+[14]: https://issues.apache.org/jira/browse/ACCUMULO-3096
\ No newline at end of file
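
The HeapIterator section in the diff above describes merging several sorted files into a single sorted stream by keeping the underlying iterators in a PriorityQueue. The following is a minimal, self-contained sketch of that general technique only. It is not Accumulo's HeapIterator and does not show the optimization from ACCUMULO-2827; the class name KWayMerge, the Source helper, and the use of plain strings in place of Key-Value pairs are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration of a heap-based merge of sorted inputs.
// Plain strings stand in for Accumulo Key-Value pairs.
public class KWayMerge {

    // One sorted input together with its current smallest (head) element.
    private static final class Source implements Comparable<Source> {
        final Iterator<String> rest;
        String head;

        Source(Iterator<String> it) {
            this.rest = it;
            this.head = it.next(); // caller guarantees the input is non-empty
        }

        @Override
        public int compareTo(Source other) {
            return head.compareTo(other.head);
        }
    }

    // Merges already-sorted inputs into one sorted list.
    public static List<String> merge(List<List<String>> sortedInputs) {
        PriorityQueue<Source> heap = new PriorityQueue<>();
        for (List<String> input : sortedInputs) {
            Iterator<String> it = input.iterator();
            if (it.hasNext()) {
                heap.add(new Source(it));
            }
        }

        List<String> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            // The source with the smallest head is always at the top of the heap.
            Source top = heap.poll();
            out.add(top.head);
            // Advance that source and re-insert it if it has more elements.
            if (top.rest.hasNext()) {
                top.head = top.rest.next();
                heap.add(top);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> files = Arrays.asList(
            Arrays.asList("a", "d", "g"),
            Arrays.asList("b", "e"),
            Arrays.asList("c", "f", "h"));
        System.out.println(merge(files)); // prints [a, b, c, d, e, f, g, h]
    }
}
```

Each emitted element costs one poll and at most one add on the heap, so the work per element grows with the logarithm of the number of inputs rather than with their total size.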