Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2019/12/03 00:10:57 UTC

[GitHub] [accumulo] etseidl opened a new pull request #1445: WIP add erasure coding and storage policy to table/namespace settings

URL: https://github.com/apache/accumulo/pull/1445
 
 
   This is a first attempt at adding two table properties: table.hdfs.policy.encoding and table.hdfs.policy.storage.  The first controls the erasure coding policy used for tablet directories; the second controls the HDFS storage policy.  The intent is to make it easy to tune HDFS performance on a per-table basis.  For instance, frequently used tables with strict row-lookup latency requirements could be configured to use HOT storage and default replication, while tables holding archival data could be marked COLD and use erasure coding to save space.
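
   To make the two settings concrete, here is a minimal sketch of how the properties for one table might be read into a pair of optional policy names.  Only the two property names and the HOT/COLD values come from the description above.  The TablePolicies class, the treatment of an unset property as "leave the HDFS default alone", and the example erasure coding policy name RS-6-3-1024k are assumptions made for illustration, not the code in this pull request.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical holder for the two per-table settings; not an Accumulo class. */
final class TablePolicies {
  // Property names as given in the pull request description.
  static final String ENCODING_PROP = "table.hdfs.policy.encoding";
  static final String STORAGE_PROP = "table.hdfs.policy.storage";

  final Optional<String> storagePolicy;
  final Optional<String> erasureCodingPolicy;

  private TablePolicies(Optional<String> storagePolicy, Optional<String> erasureCodingPolicy) {
    this.storagePolicy = storagePolicy;
    this.erasureCodingPolicy = erasureCodingPolicy;
  }

  /** Assumption: a missing or blank value means "do not override the HDFS default". */
  static TablePolicies fromProperties(Map<String, String> props) {
    return new TablePolicies(clean(props.get(STORAGE_PROP)), clean(props.get(ENCODING_PROP)));
  }

  private static Optional<String> clean(String value) {
    return Optional.ofNullable(value).map(String::trim).filter(s -> !s.isEmpty());
  }
}

class TablePoliciesDemo {
  public static void main(String[] args) {
    // A latency-sensitive table: HOT storage, replication left at the default.
    TablePolicies lookup = TablePolicies.fromProperties(
        Map.of(TablePolicies.STORAGE_PROP, "HOT"));

    // An archival table: COLD storage plus an erasure coding policy (example name).
    TablePolicies archive = TablePolicies.fromProperties(
        Map.of(TablePolicies.STORAGE_PROP, "COLD",
               TablePolicies.ENCODING_PROP, "RS-6-3-1024k"));

    System.out.println("lookup storage policy set:  " + lookup.storagePolicy.isPresent());
    System.out.println("lookup encoding policy set: " + lookup.erasureCodingPolicy.isPresent());
    System.out.println("archive storage policy:     " + archive.storagePolicy.orElse("(default)"));
    System.out.println("archive encoding policy:    " + archive.erasureCodingPolicy.orElse("(default)"));
  }
}
```

   Modelling "not set" as an empty Optional keeps the default-replication case distinct from any named policy; the actual change may represent this differently.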
   
   The basic approach is to replace calls to VolumeManager.mkdirs(Path) with a new version of mkdirs that takes the table's storage and encoding policies as arguments.  After the directory is created, a new method, VolumeManager.checkDirPolicies(), makes the HDFS calls needed to set the policies on the tablet directory.  When the policy properties are changed via the client, a "property changed" FATE operation is sent to the master, which then calls checkDirPolicies() for the affected tablet directories.
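
   The flow described above can be sketched roughly as follows.  This is an illustration under stated assumptions, not the code in this pull request: the names mkdirs and checkDirPolicies come from the description, but their signatures here are guesses, and PolicyFs, InMemoryFs, PolicyAwareVolumeManager and onPoliciesChanged are hypothetical names.  An in-memory stand-in replaces HDFS so the sketch runs without a cluster; against real HDFS the setters would presumably map to FileSystem.setStoragePolicy(Path, String) and DistributedFileSystem.setErasureCodingPolicy(Path, String).

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for the HDFS calls; not a Hadoop or Accumulo type. */
interface PolicyFs {
  boolean mkdirs(String dir);
  String getStoragePolicy(String dir); // null when nothing has been set
  void setStoragePolicy(String dir, String policy);
  String getErasureCodingPolicy(String dir); // null when nothing has been set
  void setErasureCodingPolicy(String dir, String policy);
}

/** In-memory fake that records policies per directory and counts set calls. */
final class InMemoryFs implements PolicyFs {
  private final Set<String> dirs = new HashSet<>();
  private final Map<String, String> storage = new HashMap<>();
  private final Map<String, String> encoding = new HashMap<>();
  int setCalls = 0;

  public boolean mkdirs(String dir) { dirs.add(dir); return true; }
  public String getStoragePolicy(String dir) { return storage.get(dir); }
  public void setStoragePolicy(String dir, String policy) { storage.put(dir, policy); setCalls++; }
  public String getErasureCodingPolicy(String dir) { return encoding.get(dir); }
  public void setErasureCodingPolicy(String dir, String policy) { encoding.put(dir, policy); setCalls++; }
}

/** Sketch of the policy-aware directory handling (signatures are assumed). */
final class PolicyAwareVolumeManager {
  private final PolicyFs fs;

  PolicyAwareVolumeManager(PolicyFs fs) { this.fs = fs; }

  /** Creates the directory, then makes sure its policies match the table's settings. */
  boolean mkdirs(String dir, String storagePolicy, String encodingPolicy) {
    boolean ok = fs.mkdirs(dir);
    checkDirPolicies(dir, storagePolicy, encodingPolicy);
    return ok;
  }

  /** Sets a policy only when it is requested (non-null) and differs from the current one. */
  void checkDirPolicies(String dir, String storagePolicy, String encodingPolicy) {
    if (storagePolicy != null && !storagePolicy.equals(fs.getStoragePolicy(dir))) {
      fs.setStoragePolicy(dir, storagePolicy);
    }
    if (encodingPolicy != null && !encodingPolicy.equals(fs.getErasureCodingPolicy(dir))) {
      fs.setErasureCodingPolicy(dir, encodingPolicy);
    }
  }

  /** What a "property changed" operation might do: re-check every affected tablet directory. */
  void onPoliciesChanged(List<String> tabletDirs, String storagePolicy, String encodingPolicy) {
    for (String dir : tabletDirs) {
      checkDirPolicies(dir, storagePolicy, encodingPolicy);
    }
  }
}

class PolicyFlowDemo {
  public static void main(String[] args) {
    InMemoryFs fs = new InMemoryFs();
    PolicyAwareVolumeManager vm = new PolicyAwareVolumeManager(fs);
    String dir = "/accumulo/tables/1/t-0001"; // example path only

    vm.mkdirs(dir, "HOT", null); // new tablet directory, default replication
    vm.onPoliciesChanged(List.of(dir), "COLD", "RS-6-3-1024k"); // table re-marked as archival

    System.out.println("storage policy:  " + fs.getStoragePolicy(dir));
    System.out.println("encoding policy: " + fs.getErasureCodingPolicy(dir));
  }
}
```

   Comparing against the current policy before setting it makes the check idempotent, so repeating it for every tablet directory after a property change costs nothing when the policies already match.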
    
   This is very much a work in progress, and I'm not at all sure that it is a) desired, or b) the correct approach.  I'm particularly worried about my use of FATE to enact the policy changes.  In an earlier iteration I used the ConfigurationObserver.propertiesChanged() override to let each tablet keep the on-disk policies in sync with the properties.  That mechanism disappeared in 2.1, which is when I switched to FATE.  I'm also not sure that the way I'm doing the table and namespace locking is correct.
   
   This also lacks any sort of testing code. I'm assuming this would need a mix of unit tests in the accumulo tree, and then some live testing in accumulo-tests.  Any tips on this would be appreciated.
