Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2021/06/23 15:47:53 UTC

[GitHub] [accumulo-website] keith-turner commented on a change in pull request #286: External Compaction Blog post

keith-turner commented on a change in pull request #286:
URL: https://github.com/apache/accumulo-website/pull/286#discussion_r657239352



##########
File path: _posts/blog/2021-06-21-external-compactions.md
##########
@@ -0,0 +1,490 @@
+---
+title: External Compactions
+author: Dave Marion, Keith Turner
+---
+
+External compactions are a new feature in Accumulo 2.1.0 which allows
+compaction work to run outside of Tablet Servers.
+
+## Overview
+
+There are two types of [compactions][1] in Accumulo - Minor and Major. Minor
+compactions flush recently written data from memory to a new file. Major
+compactions merge two or more Tablet files together into one new file. Starting
+in 2.1, Tablet Servers can run multiple major compactions for a Tablet
+concurrently; there is no longer a single thread pool per Tablet Server that
+runs compactions. Major compactions can be resource intensive and may run for a
+long time depending on several factors, including the number and size of the
+input files and the iterators configured to run during major compaction.
+Additionally, the Tablet Server does not currently have a mechanism in place to
+stop a major compaction that is taking too long or using too many resources.
+There is a mechanism to throttle the read and write speed of major compactions
+as a way to reduce the resource contention on a Tablet Server where many
+concurrent compactions are running. However, throttling compactions on a busy
+system will just lead to a growing number of queued compactions. Finally,
+major compaction work can be wasted in the event of an untimely death of the
+Tablet Server or if a Tablet is migrated to another Tablet Server.
+
+
+An external compaction is a major compaction that occurs outside of a Tablet
+Server. The external compaction feature is an extension of the major compaction
+service in the Tablet Server and is configured as part of the system's
+compaction service configuration. Thus, it is an optional feature. The goal of
+the external compaction feature is to overcome some of the drawbacks of the
+major compactions that happen inside the Tablet Server. Specifically, external
+compactions:
+
+ * Allow major compactions to continue when the originating Tablet Server dies
+ * Allow major compactions to occur while a Tablet migrates to a new Tablet Server
+ * Reduce the load on the Tablet Server, giving it more cycles to insert mutations and respond to scans (assuming the external compactions run on different hosts).  MapReduce jobs and compactions can lower the effectiveness of processor caches for scans, so moving them off the Tablet Server's host can be beneficial.

Review comment:
       Could say something like `processor caches (L1, L2, L3, and TLB)`. Maybe that covers both data and page caches. Alternatively, say `processor data and page caches`.



