Posted to issues@hbase.apache.org by "Gary Helmling (JIRA)" <ji...@apache.org> on 2011/08/02 02:58:27 UTC
[jira] [Issue Comment Edited] (HBASE-3842) Refactor Coprocessor Compaction API

    [ https://issues.apache.org/jira/browse/HBASE-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13075985#comment-13075985 ] 

Gary Helmling edited comment on HBASE-3842 at 8/2/11 12:57 AM:
---------------------------------------------------------------

I think the stacking issue is key here:  are we expecting the common case to be loading a single "CompactionObserver" that overrides the compaction implementation, or loading multiple that each override/customize compaction policy but not necessarily behavior?

I agree on the one hand that a {{KeyValue}}-oriented interface for {{preCompactWrite()}} and {{postCompactWrite()}} may not be sufficient.  At the same time, I don't think we want to force implementations to write their own {{StoreFiles}}, as that would be massively inefficient -- for N loaded coprocessors this becomes N compactions being written (assuming we bypass the core compaction code at the end of chaining).

One alternative would be to have {{preCompact}} take the scanner to be used as a parameter, as suggested, and return a scanner instance that would allow overriding policy and mutating KVs, while still relying on the core writer functionality.  This would allow wrapping the store scanner with a custom scanner that inspects and emits KVs as needed on the fly.  In this case, {{preCompact}} would look like:

{code}
StoreScanner preCompact(ObserverContext<~> context, Store store, StoreScanner scanner);
{code}
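
To make the wrapping idea a bit more concrete, here is a rough, self-contained sketch.  The types in it ({{Cell}}, {{CompactionScanner}}, {{ListScanner}}, {{FilteringScanner}}, {{CoreWriter}}) are simplified stand-ins invented for illustration, not the actual HBase classes.  It only shows the shape of the approach: the observer decides which KVs to drop and how to mutate the rest, while the writing stays in one core code path.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a KeyValue: just a row key and a value.
final class Cell {
    final String row;
    final String value;
    Cell(String row, String value) { this.row = row; this.value = value; }
}

// Hypothetical stand-in for the scanner type handed to preCompact.
interface CompactionScanner {
    // Returns the next cell, or null once the scanner is exhausted.
    Cell next();
}

// List-backed scanner playing the role of the core store scanner.
final class ListScanner implements CompactionScanner {
    private final Iterator<Cell> it;
    ListScanner(List<Cell> cells) { this.it = cells.iterator(); }
    public Cell next() { return it.hasNext() ? it.next() : null; }
}

// The kind of wrapper an observer could return from preCompact: it drops
// cells whose row starts with a prefix (policy) and trims the values of
// the cells it keeps (mutation), one cell at a time.
final class FilteringScanner implements CompactionScanner {
    private final CompactionScanner delegate;
    private final String droppedPrefix;

    FilteringScanner(CompactionScanner delegate, String droppedPrefix) {
        this.delegate = delegate;
        this.droppedPrefix = droppedPrefix;
    }

    public Cell next() {
        Cell c;
        while ((c = delegate.next()) != null) {
            if (c.row.startsWith(droppedPrefix)) {
                continue; // skip this cell, keep scanning
            }
            return new Cell(c.row, c.value.trim());
        }
        return null; // underlying scanner is exhausted
    }
}

// Stand-in for the core writer: it drains whatever scanner it is given,
// so the write happens once regardless of how the scanner was wrapped.
final class CoreWriter {
    static List<String> drain(CompactionScanner scanner) {
        List<String> written = new ArrayList<>();
        for (Cell c = scanner.next(); c != null; c = scanner.next()) {
            written.add(c.row + "=" + c.value);
        }
        return written;
    }
}
```

Under these assumptions, the core code would call something like {{CoreWriter.drain(new FilteringScanner(storeScanner, "tmp-"))}} and never needs to know what the wrapper does internally.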

Wrapping the scanner seems much easier for chaining multiple observers.  On the other hand, we lose the clean {{boolean}} return to indicate that core compaction processing should be skipped.  Are there cases that would still want to handle the store file writing portion of the implementation entirely in the coprocessor?  If so, can we still emit a flag to skip normal processing another way?  We could skip normal processing if {{null}} is returned.  Seems a little clunky, but it could work with appropriate documentation.
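
For illustration only, the chaining with a {{null}} "skip" signal might look roughly like the sketch below.  {{KvScanner}}, {{CompactionHook}} and {{HookChain}} are hypothetical names made up for this example, and the hook is a cut-down form of the {{preCompact}} signature with the context and store parameters left out.

```java
import java.util.List;

// Hypothetical stand-in for the scanner passed through preCompact.
interface KvScanner {
    // Returns the next KV (a plain String here), or null when exhausted.
    String next();
}

// Hypothetical observer hook, reduced to the scanner-in, scanner-out part.
interface CompactionHook {
    // Returns the scanner to use from here on (possibly a wrapper around
    // the argument), or null to signal that core compaction is skipped.
    KvScanner preCompact(KvScanner scanner);
}

final class HookChain {
    // Feeds each hook's result into the next hook. Stops at the first
    // null and returns null, which the caller reads as "skip the core
    // compaction write"; otherwise returns the fully wrapped scanner.
    static KvScanner run(List<CompactionHook> hooks, KvScanner initial) {
        KvScanner current = initial;
        for (CompactionHook hook : hooks) {
            current = hook.preCompact(current);
            if (current == null) {
                return null;
            }
        }
        return current;
    }
}
```

One consequence of this particular sketch is that an observer returning {{null}} also prevents any later observers in the chain from running, which is the sort of behavior the documentation would have to spell out.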

> Refactor Coprocessor Compaction API
> -----------------------------------
>
>                 Key: HBASE-3842
>                 URL: https://issues.apache.org/jira/browse/HBASE-3842
>             Project: HBase
>          Issue Type: Improvement
>          Components: coprocessors, regionserver
>    Affects Versions: 0.92.0
>            Reporter: Nicolas Spiegelberg
>            Assignee: Nicolas Spiegelberg
>              Labels: compaction
>             Fix For: 0.92.0
>
>
> After HBASE-3797, the compaction logic flow has been significantly altered.  Because of this, the current compaction coprocessor API is insufficient for gaining full insight into compaction requests/results.  Refactor coprocessor API after HBASE-3797 is committed to be more extensible and increase visibility.
