Posted to commits@accumulo.apache.org by kt...@apache.org on 2013/06/20 15:31:28 UTC
svn commit: r1494983 - /accumulo/site/trunk/content/notable_features.mdtext
Author: kturner
Date: Thu Jun 20 13:31:28 2013
New Revision: 1494983
URL: http://svn.apache.org/r1494983
Log:
updated w/ some info about 1.5 improvements
Modified:
accumulo/site/trunk/content/notable_features.mdtext
Modified: accumulo/site/trunk/content/notable_features.mdtext
URL: http://svn.apache.org/viewvc/accumulo/site/trunk/content/notable_features.mdtext?rev=1494983&r1=1494982&r2=1494983&view=diff
==============================================================================
--- accumulo/site/trunk/content/notable_features.mdtext (original)
+++ accumulo/site/trunk/content/notable_features.mdtext Thu Jun 20 13:31:28 2013
@@ -136,7 +136,8 @@ Scans will not see data inserted into a
If consecutive keys have identical portions (row, colf, colq, or colvis), there
is a flag to indicate that a portion is the same as that of the previous key.
This is applied when keys are stored on disk and when transferred over the
-network.
+network. Starting with 1.5, prefix erasure is supported. When it is cost
+effective, prefixes repeated in subsequent key fields are not stored again.
### Native In-Memory Map
@@ -170,6 +171,16 @@ written. When an index block exceeds the
written out between data blocks. The size of index blocks is configurable on a
per table basis.
+### Binary search in RFile blocks (1.5)
+
+RFile uses its index to locate a block of key values. Once it reaches a block,
+it performs a linear scan to find a key of interest. Starting with 1.5, Accumulo
+will generate indexes of cached blocks in an adaptive manner. Accumulo indexes
+the blocks that are read most frequently. When a block is read a few times, a
+small index is generated. As a block is read more, larger indexes are generated,
+making future seeks faster. This strategy allows Accumulo to dynamically respond
+to read patterns without precomputing block indexes when RFiles are written.
+
## Testing <a id="testing"></a>
### Mock
@@ -177,6 +188,13 @@ per table basis.
The Accumulo client API has a mock implementation that is useful for writing unit
tests against Accumulo. Mock Accumulo is in memory and in process.
+### Mini Accumulo Cluster (1.5 & 1.4.4)
+
+Mini Accumulo cluster is a set of utility code that makes it easy to spin up
+a local Accumulo instance running against the local filesystem. Mini Accumulo
+is slower than Mock Accumulo, but its behavior mirrors a real Accumulo
+instance more closely.
+
### Functional Test
Small, system-level tests of basic Accumulo features run in a test harness,
@@ -236,6 +254,13 @@ could be different from the Accumulo nod
Accumulo can be a source and/or sink for map reduce jobs.
+### Thrift Proxy (1.5 & 1.4.4)
+
+The Accumulo client code contains a lot of complexity. For example, the
+client code locates tablets, retries in the case of failures, and supports
+concurrent reading and writing. All of this is written in Java. The Thrift
+proxy wraps the Accumulo client API with Thrift, making this API easily
+available to other languages such as Python, Ruby, and C++.
## Extensible Behaviors <a id="behaviors"></a>
@@ -327,6 +352,12 @@ was growing. Without this feature, inge
constant rate, even as scan performance decreases because tablets have too many
files.
+### Loading jars using VFS (1.5)
+
+User-written iterators are a useful way to manipulate data in Accumulo.
+Before 1.5, users had to copy their iterators to each tablet server. Starting
+with 1.5, Accumulo can load iterators from HDFS using Apache Commons VFS.
+
## On-demand Data Management <a id="ondemand_dm"></a>
### Compactions
@@ -335,7 +366,8 @@ Ability to force tablets to compact to o
compacted. This is useful for improving query performance, permanently
applying iterators, or using a new locality group configuration. One example
of using iterators is applying a filtering iterator to remove data from a
-table.
+table. As of 1.5, users can initiate a compaction with iterators that apply
+only to that compaction.
### Split points
@@ -356,6 +388,11 @@ mutated independently. Testing was the m
feature. For example, to test a new filtering iterator, clone the table, add the
filter to the clone, and force a major compaction.
+### Import/Export Table (1.5)
+
+An offline table's metadata and files can easily be copied to another cluster and
+imported.
+
### Compact Range (1.4)
Compact each tablet that falls within a row range down to a single file.
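
The adaptive block indexing described in the "Binary search in RFile blocks (1.5)" addition above can be illustrated with a small standalone sketch. This is not Accumulo's RFile code: the class name `AdaptiveBlockIndex`, the seek thresholds, and the doubling of the index size are all assumptions chosen for illustration. The idea shown is only that a cached block of sorted keys starts with no index (a pure linear scan), gains a small sampled index after a few seeks, and gains larger ones as it is read more.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: a cached block of sorted keys that grows a sparse index as it is read. */
class AdaptiveBlockIndex {
    private final List<String> keys;   // sorted keys held by the cached block
    private int[] index = new int[0];  // sampled positions into keys; empty means no index yet
    private int seekCount = 0;
    private int nextThreshold = 3;     // assumed: build the first index on the 3rd seek
    private int nextIndexSize = 4;     // assumed: each rebuild doubles the index size

    AdaptiveBlockIndex(List<String> sortedKeys) {
        this.keys = new ArrayList<>(sortedKeys);
    }

    /** Returns the position of the first key >= target, or keys.size() if there is none. */
    int seek(String target) {
        seekCount++;
        if (seekCount >= nextThreshold && index.length < keys.size()) {
            rebuildIndex();
        }
        // Binary search the sampled positions for the last sample whose key is < target.
        int start = 0;
        int lo = 0;
        int hi = index.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (keys.get(index[mid]).compareTo(target) < 0) {
                start = index[mid];
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        // Finish with a linear scan, which is short once the index is dense enough.
        int pos = start;
        while (pos < keys.size() && keys.get(pos).compareTo(target) < 0) {
            pos++;
        }
        return pos;
    }

    /** Samples evenly spaced positions, then raises the bar for the next, larger index. */
    private void rebuildIndex() {
        int size = Math.min(nextIndexSize, keys.size());
        int[] newIndex = new int[size];
        for (int i = 0; i < size; i++) {
            newIndex[i] = (int) ((long) i * keys.size() / size);
        }
        index = newIndex;
        nextThreshold *= 2;
        nextIndexSize *= 2;
    }

    int indexSize() {
        return index.length;
    }

    public static void main(String[] args) {
        AdaptiveBlockIndex block =
            new AdaptiveBlockIndex(Arrays.asList("a", "c", "e", "g", "i", "k", "m", "o"));
        for (String target : new String[] {"e", "f", "z", "a", "j", "p"}) {
            System.out.println(target + " -> position " + block.seek(target)
                + ", index size " + block.indexSize());
        }
    }
}
```

In this sketch the index is built lazily inside `seek`, so blocks that are never re-read pay nothing, which mirrors the trade-off the text describes: no index is precomputed at write time.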