Posted to dev@accumulo.apache.org by "Keith Turner (JIRA)" <ji...@apache.org> on 2012/05/22 23:01:41 UTC

[jira] [Commented] (ACCUMULO-420) Allow per compaction iterator settings

    [ https://issues.apache.org/jira/browse/ACCUMULO-420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13281209#comment-13281209 ] 

Keith Turner commented on ACCUMULO-420:
---------------------------------------

This feature needs to work correctly in the case of concurrent compaction requests.  If two users request compaction of a table at around the same time and one passes iterators and one does not, then iterators should be applied to all relevant tablets for the user that passed iterators.  Also, tablet servers can decide to compact all files in a tablet at any time.  This is another form of concurrency that must be considered.

The current design is that there is a compaction counter in zookeeper.  A compaction operation will increment this.  As each tablet is compacted it records what the compaction counter was when it started.  A compaction operation waits for the compaction counter of each relevant tablet in the metadata table to reach at least the value the operation obtained when it incremented the counter.
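
The counter handshake described above can be sketched with an in-memory model.  This is only an illustration, not Accumulo code: the Cluster class and the function names are hypothetical, and a plain dict stands in for the metadata table.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical stand-in for ZooKeeper plus the metadata table."""
    zk_compaction_counter: int = 0
    # tablet id -> counter value recorded by that tablet's last full compaction
    tablet_counters: dict = field(default_factory=dict)


def start_compaction_op(cluster):
    """A compaction operation increments the counter and remembers its value."""
    cluster.zk_compaction_counter += 1
    return cluster.zk_compaction_counter


def compact_tablet(cluster, tablet):
    """A tablet records the counter value it saw when its compaction started."""
    started_at = cluster.zk_compaction_counter
    # ... the actual compaction work would happen here ...
    cluster.tablet_counters[tablet] = started_at


def op_is_done(cluster, op_count, tablets):
    """The operation is done once every relevant tablet has reached its count."""
    return all(cluster.tablet_counters.get(t, 0) >= op_count for t in tablets)
```

A tablet that was compacted before the operation incremented the counter still carries an older value, so the operation keeps waiting for it.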

The following is a possible new design.

compact FATE op
 # set compaction iterators in zookeeper
 # increment compaction counter in zookeeper
 # wait until all tablets in compaction range have at least this operation's compaction counter
 # remove compaction iterators in zookeeper

TServer compact all operation
 # get compaction iterators from zookeeper
 # get compaction counter in zookeeper
 # compact with current compaction iterators
 # write compaction counter to metadata table
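
The two step lists above can be sketched as follows.  This is a minimal single-threaded model under stated assumptions, not the proposed implementation: FakeZooKeeper, the function names, the dict used as the metadata table, and the compact callback are all hypothetical.

```python
class FakeZooKeeper:
    """Hypothetical in-memory stand-in for the two ZooKeeper entries."""

    def __init__(self):
        self.iterators = []  # compaction iterators currently set
        self.counter = 0     # compaction counter


def fate_start(zk, iterators):
    zk.iterators = list(iterators)  # 1. set compaction iterators
    zk.counter += 1                 # 2. increment compaction counter
    return zk.counter


def fate_try_finish(zk, metadata, tablets, op_count):
    # 3. every tablet in the range must have reached this operation's counter
    if any(metadata.get(t, 0) < op_count for t in tablets):
        return False
    zk.iterators = []               # 4. remove compaction iterators
    return True


def tserver_compact_all(zk, metadata, tablet, compact):
    iterators = list(zk.iterators)  # 1. get compaction iterators
    count = zk.counter              # 2. get compaction counter
    compact(tablet, iterators)      # 3. compact with those iterators
    metadata[tablet] = count        # 4. write compaction counter to metadata
```

A real FATE op would repeat the check in fate_try_finish until it succeeds.  The sketch runs the steps one after another, so it shows the handshake but none of the concurrency questions raised in this comment.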

Considerations
 * setting compaction iterators in zookeeper must accommodate concurrent requests
 * the iterators are intentionally read from zookeeper before the counter.  The iterators are written before the counter.  Does zookeeper guarantee that iterators written before the counter will always be seen?  If not there is a race condition. 
 * If two users submit compaction requests with iterators, this design could run compactions with both users' iterator settings.  Does anyone see an issue w/ this?
 * Iterators from one user compaction request could be applied to multiple compactions of a single tablet.
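
One way to explore the ordering question in the second consideration is to enumerate every interleaving of the two writes with the two reads and look for schedules where a tablet server would see the new counter together with the old iterators.  The sketch below does this in a toy model that assumes a single, sequentially consistent view of zookeeper, which is exactly what that consideration leaves open.  All names are hypothetical.

```python
from itertools import combinations

# The compact FATE op writes the iterators first, then the counter.
WRITER = ["write_iterators", "write_counter"]


def interleavings(writer, reader):
    """Yield every merge of the two step lists that keeps each list's order."""
    n = len(writer) + len(reader)
    for writer_slots in combinations(range(n), len(writer)):
        w, r = iter(writer), iter(reader)
        yield [next(w) if i in writer_slots else next(r) for i in range(n)]


def stale_iterator_schedules(reader):
    """Return schedules where the reader sees the new counter but old iterators."""
    bad = []
    for schedule in interleavings(WRITER, reader):
        state = {"iterators": "old", "counter": "old"}
        seen = {}
        for step in schedule:
            action, key = step.split("_")
            if action == "write":
                state[key] = "new"
            else:
                seen[key] = state[key]
        if seen["counter"] == "new" and seen["iterators"] == "old":
            bad.append(schedule)
    return bad


iterators_first = stale_iterator_schedules(["read_iterators", "read_counter"])
counter_first = stale_iterator_schedules(["read_counter", "read_iterators"])
```

In this toy model, reading the iterators first leaves one such schedule (both writes land between the two reads), while reading the counter first leaves none.  The model says nothing about what a real zookeeper client may observe, and it ignores the later removal of the iterators.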
                
> Allow per compaction iterator settings
> --------------------------------------
>
>                 Key: ACCUMULO-420
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-420
>             Project: Accumulo
>          Issue Type: New Feature
>            Reporter: Keith Turner
>            Assignee: Sapan Shah
>             Fix For: 1.5.0
>
>
> It may be useful to allow the compact command to specify an iterator to be used for that compaction.  For example if someone wanted to apply a filter once to a table, they could force a compaction with that filter.
