Posted to issues@hbase.apache.org by "Sergey Shelukhin (JIRA)" <ji...@apache.org> on 2012/11/29 00:51:59 UTC

[jira] [Created] (HBASE-7236) add per-table/per-cf compaction configuration via metadata

Sergey Shelukhin created HBASE-7236:
---------------------------------------

             Summary: add per-table/per-cf compaction configuration via metadata
                 Key: HBASE-7236
                 URL: https://issues.apache.org/jira/browse/HBASE-7236
             Project: HBase
          Issue Type: New Feature
          Components: Compaction
    Affects Versions: 0.96.0
            Reporter: Sergey Shelukhin
            Assignee: Sergey Shelukhin


Regardless of the compaction policy, it makes sense to have separate configuration for compactions for different tables and column families, as their access patterns and workloads can be different. In particular, for tiered compactions that are being ported from 0.89-fb branch it is necessary to have, to use these properly.
We might want to add support for compaction configuration via metadata on table/cf.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-7236) add per-table/per-cf compaction configuration via metadata

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507019#comment-13507019 ] 

Ted Yu commented on HBASE-7236:
-------------------------------

I saw quite a few changes that only affect whitespace.
Can you simplify your patch so that it is easier to review?

Thanks
                

[jira] [Updated] (HBASE-7236) add per-table/per-cf compaction configuration via metadata

Posted by "Sergey Shelukhin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HBASE-7236:
------------------------------------

    Attachment: HBASE-7236-PROTOTYPE-v1.patch
    

[jira] [Comment Edited] (HBASE-7236) add per-table/per-cf compaction configuration via metadata

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507920#comment-13507920 ] 

Andrew Purtell edited comment on HBASE-7236 at 12/1/12 10:28 AM:
-----------------------------------------------------------------

The per-CF settings/overrides are kept in the descriptor, which is the right place for that IMO. The below are points I don't feel particularly strongly about but think should be raised.

The descriptor attribute convention is rightly called out as sloppy. That should be cleaned up. However, I'm not sure adding the concept of "configuration override" to either CompoundConfiguration or descriptor attributes is better.

Regarding descriptor attributes, a "configuration override" is just another attribute. Does it make sense to go in the other direction and fix the places where descriptors have metadata that are really configuration overrides with custom names, meaning: rename them to the convention for Configuration? Otherwise we have not only attributes, some of which override settings in the XML configuration, but now also "configuration overrides" that do the same.

Regarding CompoundConfiguration, as an API user why should I care about tagging whether something I add to CompoundConfiguration is an 'override' or not? It seems any .add() should simply override values added to the configuration by a previous .add(). Or are some overrides special, in that they will continue to override values even if those values are provided in a subsequent .add()? Then some of the values in the .add() will override previous values from an earlier .add(), as I would expect, but there are these other values changed with an .addOverride() that I don't know about. Will a second addOverride override the previous addOverride overrides? Confusing, see?
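
To make the "any .add() simply overrides a previous .add()" semantics concrete, here is a minimal sketch. LayeredConfig and the key names are hypothetical stand-ins invented for illustration; this is not the CompoundConfiguration API or the code in the patch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration; this is not HBase's CompoundConfiguration.
class LayeredConfig {
    // Layers in insertion order; a later layer shadows an earlier one.
    private final List<Map<String, String>> layers = new ArrayList<>();

    // Every add() is treated the same way: it pushes one more layer on top.
    LayeredConfig add(Map<String, String> layer) {
        layers.add(new HashMap<>(layer));
        return this;
    }

    // Search from the newest layer down to the oldest; the first hit wins.
    String get(String key, String defaultValue) {
        for (int i = layers.size() - 1; i >= 0; i--) {
            String value = layers.get(i).get(key);
            if (value != null) {
                return value;
            }
        }
        return defaultValue;
    }

    public static void main(String[] args) {
        // Made-up keys; real configuration key names may differ.
        Map<String, String> xmlConfig = new HashMap<>();
        xmlConfig.put("compaction.min.files", "3");
        xmlConfig.put("compaction.max.files", "10");

        Map<String, String> familyAttributes = new HashMap<>();
        familyAttributes.put("compaction.min.files", "5");

        LayeredConfig conf = new LayeredConfig().add(xmlConfig).add(familyAttributes);
        System.out.println(conf.get("compaction.min.files", "?")); // family layer shadows XML
        System.out.println(conf.get("compaction.max.files", "?")); // falls through to XML
    }
}
```

With a single rule like this there is nothing to tag: precedence depends only on the order of the add() calls.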

                
      was (Author: apurtell):
    The per-CF settings/overrides are kept in the descriptor, which is the right place for that IMO. The below are points I don't feel particularly strongly about but think should be raised.

Rightly descriptor attribute convention is called out as sloppy. That should be cleaned up. However I'm not sure adding the concept of "configuration override" to either CompoundConfiguration or descriptor attributes is better.

Regards descriptor attributes, a "configuration override" is just another attribute. Does it make sense to go in the other direction and fix where descriptors have metadata which are configuration overrides with custom names, meaning: rename them to the convention for Configuration? Otherwise now we have not only attributes, some of which override settings in the XML configuration, but now also "configuration overrides" that also do so?

Regards CompoundConfiguration, as an API user why should I care about tagging if something I add to CompoundConfiguration is an 'override' or not. Seems any .add() should simply override values added to the configuration by a previous .add() ? Or are some overrides special that will continue to override values even if they are provided in a subsequent .add(), so some of those values in the .add() will continue to override previous values from an earlier .add() but not .addOverride()? Will an second addOverride override the previous addOverride overrides? And/or any configuration .add()ed in between -- See? 

                  

[jira] [Updated] (HBASE-7236) add per-table/per-cf compaction configuration via metadata

Posted by "Sergey Shelukhin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HBASE-7236:
------------------------------------

    Description: 
Regardless of the compaction policy, it makes sense to have separate configuration for compactions for different tables and column families, as their access patterns and workloads can be different. In particular, for tiered compactions that are being ported from 0.89-fb branch it is necessary to have, to use it properly.
We might want to add support for compaction configuration via metadata on table/cf.

  was:
Regardless of the compaction policy, it makes sense to have separate configuration for compactions for different tables and column families, as their access patterns and workloads can be different. In particular, for tiered compactions that are being ported from 0.89-fb branch it is necessary to have, to use these properly.
We might want to add support for compaction configuration via metadata on table/cf.

    

[jira] [Commented] (HBASE-7236) add per-table/per-cf compaction configuration via metadata

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506761#comment-13506761 ] 

Andrew Purtell commented on HBASE-7236:
---------------------------------------

Encode compaction selection as JSON in the attribute I'd say.
                

[jira] [Commented] (HBASE-7236) add per-table/per-cf compaction configuration via metadata

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509298#comment-13509298 ] 

Hadoop QA commented on HBASE-7236:
----------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12555806/HBASE-7236-PROTOTYPE-v1.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 3 new or modified tests.

    {color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 2.0 profile.

    {color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 101 warning messages.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:red}-1 findbugs{color}.  The patch appears to introduce 26 new Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

     {color:red}-1 core tests{color}.  The patch failed these unit tests:
                       org.apache.hadoop.hbase.regionserver.TestSplitTransaction

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3440//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3440//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3440//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3440//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3440//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3440//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3440//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3440//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3440//console

This message is automatically generated.
                

[jira] [Commented] (HBASE-7236) add per-table/per-cf compaction configuration via metadata

Posted by "Sergey Shelukhin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507594#comment-13507594 ] 

Sergey Shelukhin commented on HBASE-7236:
-----------------------------------------

[~stack]
bq. IIUC, you are expanding CompoundConfiguration to do table and 'overrides'?
Currently it already has metadata; I am adding overrides. The method name is special only because Java won't allow two methods differing only by the typename part of the parameter. It's no different from the other "add"-s :)
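
The constraint mentioned here is presumably Java's generic type erasure: two overloads whose parameters differ only in their type arguments have the same erased signature and cannot coexist. A minimal sketch, assuming that reading; the class, method names, and parameter types below are hypothetical and not taken from the patch.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the erasure clash; not code from the patch.
class ErasureDemo {
    private final Map<String, String> merged = new HashMap<>();

    // Accepts plain string settings.
    void add(Map<String, String> settings) {
        merged.putAll(settings);
    }

    // Declaring a second overload
    //     void add(Map<byte[], byte[]> settings)
    // would not compile: both signatures erase to add(Map), so the compiler
    // reports a name clash. A differently named method avoids the problem.
    void addBytes(Map<byte[], byte[]> settings) {
        for (Map.Entry<byte[], byte[]> e : settings.entrySet()) {
            merged.put(new String(e.getKey(), StandardCharsets.UTF_8),
                       new String(e.getValue(), StandardCharsets.UTF_8));
        }
    }

    String get(String key) {
        return merged.get(key);
    }
}
```

Apart from the name, addBytes behaves like any other add: values added later replace values added earlier.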

bq. How would this help us get to changing configuration on the fly? Looks like it doesn't change our current story. CompoundConfiguration is setup in HRegion or ColumnFamily setup still..
It doesn't (well, other than setting overrides after disabling the table, which is more dynamic than XML config replacement).
For XML config, there's HBASE-3909 to solve that problem; for column updates, I'd prefer to make online-alter "bulletproof" - that way we'd get this config to be fully dynamic as a side benefit.
Two ideas we had were: tracking the schema version on master and server, and validating on master what the server has, but that's inconvenient right now given how it's stored, and may lead to other race conditions; or somehow using the new 2pc-ish barrier mechanism that is being introduced for snapshots.
Regardless, in this JIRA I do not aim to solve this problem...


bq. If we start to record metadata on a table, say column types, would we use this mechanism?
You mean CompoundConfiguration mechanism, or overrides mechanism?
Not overrides (they are intended for config), but as CompoundConfiguration is already used for cf metadata, I don't see why not use it also for table metadata if appropriate.

bq. How would 'overrides' be specified in the shell say? (Patch doesn't seem to say) We have means of altering table and column descriptors. Where would 'overrides' go?
Similarly to metadata. I am going to rename "CONFIG" to something indicating it's actually user metadata, and use a name like CONFIGURATION_OVERRIDES: a separate Ruby hash with values.
I didn't want to go through that before getting confirmation on the viability of this approach :)

bq. bq. however, making it explicit and separate from miscellaneous metadata would be cleaner imho.
bq. Can you say more on above?
See below.
bq. HTableDescriptor and HColumnDescriptor dictionaries are key/value maps that get persisted into the filesystem when changed. We read them throughout the codebase and we list them in master UIs, etc. Will they blow up under this new use case? HTD and HCD are mostly schema with a little config. This direction would seem to be using these descriptors to add table or column scoped configs. Should we be working to undo schema and config conflation rather than compound it?
They may blow up (depending on how much config one wants to override), this is one of the reasons I want to keep it separate, to have finer grained control for things like UI or shell.
Do you suggest we keep them even more separate, e.g. separate class/serialization mechanism? It seems it would add complexity, esp. given that some of the existing METADATA is config additions/overrides with custom names, essentially.
Can you elaborate on example mechanism (e.g. meta-like table for tables/cfs, different file from tableinfo, ...)?

The other thing about the current map is that it's already multi-purpose (see HBASE-7237; in the discussion in the shell issue when I wanted to keep CONFIG separate from system stuff I didn't realize they were still stored in the same map in HTD).
Dumping config overrides there in addition to system metadata and user metadata is inelegant imho.

Finally, the third thing is that if it's separate we can have stricter validation - e.g. whitelist what can and cannot be overridden, or even validate values.
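
As a rough sketch of what such stricter validation could look like if overrides live in their own map: the class below, the allowed key, and the positive-integer rule are all assumptions made up for illustration, not something the patch defines.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of whitelist-based validation; not part of the patch.
class OverrideValidator {
    private final Set<String> allowedKeys;

    OverrideValidator(String... allowedKeys) {
        this.allowedKeys = new HashSet<>(Arrays.asList(allowedKeys));
    }

    // Returns a copy of the overrides, or throws IllegalArgumentException if a
    // key is not whitelisted or a value is not a positive integer.
    Map<String, String> validate(Map<String, String> overrides) {
        Map<String, String> accepted = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : overrides.entrySet()) {
            if (!allowedKeys.contains(e.getKey())) {
                throw new IllegalArgumentException("Key may not be overridden: " + e.getKey());
            }
            int parsed;
            try {
                parsed = Integer.parseInt(e.getValue());
            } catch (NumberFormatException ex) {
                throw new IllegalArgumentException("Not an integer: " + e.getKey(), ex);
            }
            if (parsed <= 0) {
                throw new IllegalArgumentException("Value must be positive: " + e.getKey());
            }
            accepted.put(e.getKey(), e.getValue());
        }
        return accepted;
    }
}
```

A check like this is hard to apply when overrides share one map with system and user metadata, because arbitrary user keys would have to be let through.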
                

[jira] [Updated] (HBASE-7236) add per-table/per-cf compaction configuration via metadata

Posted by "Sergey Shelukhin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HBASE-7236:
------------------------------------

    Attachment: HBASE-7236-PROTOTYPE.patch

Hmm... most good diff tools have an option to ignore whitespace changes. Attaching the patch w/o whitespace cleanup.
                

[jira] [Commented] (HBASE-7236) add per-table/per-cf compaction configuration via metadata

Posted by "Sergey Shelukhin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506644#comment-13506644 ] 

Sergey Shelukhin commented on HBASE-7236:
-----------------------------------------

Yes; the main difference is that the default compaction config can have several parameters, and tiered compaction selection can have up to 20 (several parameters x a few tiers); this will make metadata hard to manage. So it might make sense to make it nested in some form.
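
To show how quickly a flat layout grows, here is a small sketch that enumerates one flat metadata key per (tier, parameter) pair. The key layout and the parameter names are invented for illustration; they are not the names used by the tiered compaction code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical key layout; the parameter names are made up for illustration.
class FlatTierKeys {
    // Builds one flat key per (tier, parameter) combination.
    static List<String> flatKeys(String prefix, int tiers, String... params) {
        List<String> keys = new ArrayList<>();
        for (int tier = 0; tier < tiers; tier++) {
            for (String param : params) {
                keys.add(prefix + ".tier." + tier + "." + param);
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        List<String> keys = flatKeys("compaction", 4,
                "minFiles", "maxFiles", "ratio", "maxSize", "maxAge");
        // 4 tiers x 5 parameters: every one is a separate metadata entry.
        System.out.println(keys.size() + " flat keys, e.g. " + keys.get(0));
    }
}
```

A nested form would instead keep one entry per tier (or one per column family) and put the parameters inside it.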

                

[jira] [Commented] (HBASE-7236) add per-table/per-cf compaction configuration via metadata

Posted by "Sergey Shelukhin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506060#comment-13506060 ] 

Sergey Shelukhin commented on HBASE-7236:
-----------------------------------------

Details: 
We want to do it in metadata as it makes logical sense; using the XML config would split table/CF configuration into two places.
Given that the config for both the current and the tiered policy can be quite long, it probably makes sense to make it nested, as opposed to a set of flat values inside metadata.
I immediately start thinking of a generic XML config override mechanism for tables/CFs, but I am not sure this is necessary.

What do you think?
                

[jira] [Updated] (HBASE-7236) add per-table/per-cf compaction configuration via metadata

Posted by "Sergey Shelukhin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HBASE-7236:
------------------------------------

    Attachment: HBASE-7236-PROTOTYPE.patch

After poking around a bit, I think it's a good idea to have direct config key override.
The justification is that separately named metadata fields will be hard to manage given the potential number of fields.
Ditto for JSON serialization - it would require special logic to validate/use overrides, and shell support would be painful.
Perhaps we can white-list configs that can be overridden this way, too, inside HTableDescriptor.

Here's the prototype, for table level only and with no shell support or tests for now.
Technically, this approach already works in Store for column family (CompoundConfiguration is created with family metadata overriding Configuration, so if someone adds a key with correct name to family metadata it will override xml config within Store); however, making it explicit and separate from miscellaneous metadata would be cleaner imho.
Please comment... Thanks!

                

[jira] [Commented] (HBASE-7236) add per-table/per-cf compaction configuration via metadata

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507160#comment-13507160 ] 

stack commented on HBASE-7236:
------------------------------

[~sershe] IIUC, you are expanding CompoundConfiguration to do table and 'overrides'?

pros:
1. You go one place to get your config., the 'Configuration' (which could be a CompoundConfiguration w/ family or table specifics and overrides)?
2. Generalizes what Andrew did doing cache configuration over in hbase-6114?

How would this help us get to changing configuration on the fly?  Looks like it doesn't change our current story.  CompoundConfiguration is setup in HRegion or ColumnFamily setup still..

If we start to record metadata on a table, say column types, would we use this mechanism?

How would 'overrides' be specified in the shell say?  (Patch doesn't seem to say)  We have means of altering table and column descriptors.  Where would 'overrides' go?

bq. however, making it explicit and separate from miscellaneous metadata would be cleaner imho.

Can you say more on above?

HTableDescriptor and HColumnDescriptor dictionaries are key/value maps that get persisted into the filesystem when changed. We read them throughout the codebase and we list them in master UIs, etc. Will they blow up under this new use case? HTD and HCD are mostly schema with a little config. This direction would seem to be using these descriptors to add table or column scoped configs. Should we be working to undo schema and config conflation rather than compound it?
                
> add per-table/per-cf compaction configuration via metadata
> ----------------------------------------------------------
>
>                 Key: HBASE-7236
>                 URL: https://issues.apache.org/jira/browse/HBASE-7236
>             Project: HBase
>          Issue Type: New Feature
>          Components: Compaction
>    Affects Versions: 0.96.0
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HBASE-7236-PROTOTYPE.patch, HBASE-7236-PROTOTYPE.patch
>
>
> Regardless of the compaction policy, it makes sense to have separate configuration for compactions for different tables and column families, as their access patterns and workloads can be different. In particular, for tiered compactions that are being ported from 0.89-fb branch it is necessary to have, to use it properly.
> We might want to add support for compaction configuration via metadata on table/cf.


[jira] [Updated] (HBASE-7236) add per-table/per-cf compaction configuration via metadata

Posted by "Sergey Shelukhin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HBASE-7236:
------------------------------------

    Status: Patch Available  (was: Open)
    

[jira] [Commented] (HBASE-7236) add per-table/per-cf compaction configuration via metadata

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506220#comment-13506220 ] 

Andrew Purtell commented on HBASE-7236:
---------------------------------------

[~sershe] We moved per-table/per-cf blockcache configuration into the descriptors in HBASE-6114. This is quite a similar case, right?
                

[jira] [Commented] (HBASE-7236) add per-table/per-cf compaction configuration via metadata

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507920#comment-13507920 ] 

Andrew Purtell commented on HBASE-7236:
---------------------------------------

The per-CF settings/overrides are kept in the descriptor, which is the right place for that IMO. The below are points I don't feel particularly strongly about but think should be raised.

The descriptor attribute convention is rightly called out as sloppy. That should be cleaned up. However, I'm not sure that adding the concept of a "configuration override" to either CompoundConfiguration or descriptor attributes is better.

Regarding descriptor attributes, a "configuration override" is just another attribute. Does it make sense to go in the other direction and fix the places where descriptors carry metadata that are really configuration overrides under custom names, that is, rename them to follow the convention for Configuration? Otherwise we have not only attributes, some of which override settings in the XML configuration, but also "configuration overrides" that do the same.

Regarding CompoundConfiguration: as an API user, why should I care about tagging whether something I add to CompoundConfiguration is an 'override' or not? It seems any .add() should simply override values added to the configuration by a previous .add(). Or are some overrides special, so that they continue to override values even when those values are provided in a subsequent .add()? In that case some values in a later .add() would override values from an earlier .add() but not those from an .addOverride(). Will a second addOverride override the previous addOverride's overrides? And/or any configuration .add()ed in between? See?
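
The simpler rule suggested here, where every .add() shadows whatever was added before it, could look roughly like the sketch below. `LayeredConfig` is a hypothetical class written for this illustration, not the actual CompoundConfiguration from the patch, and it deliberately has no separate addOverride().

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;

// Hypothetical layered configuration with a single rule: the most recent add() wins.
final class LayeredConfig {
    private final Deque<Map<String, String>> layers = new ArrayDeque<>();

    // Each new layer is placed in front of the older ones. The map is referenced, not copied.
    LayeredConfig add(Map<String, String> layer) {
        layers.addFirst(layer);
        return this;
    }

    // Walk the layers from newest to oldest and return the first value found.
    String get(String key) {
        for (Map<String, String> layer : layers) {
            if (layer.containsKey(key)) {
                return layer.get(key);
            }
        }
        return null; // no layer defines the key
    }
}
```

With one ordering rule there is nothing for the caller to tag: the order of the add() calls alone decides which value is seen.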

                

[jira] [Commented] (HBASE-7236) add per-table/per-cf compaction configuration via metadata

Posted by "Sergey Shelukhin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13510973#comment-13510973 ] 

Sergey Shelukhin commented on HBASE-7236:
-----------------------------------------

Hi; are there any responses/objections? I'd like to go ahead with making a commit-ready patch ASAP :)
                
> add per-table/per-cf compaction configuration via metadata
> ----------------------------------------------------------
>
>                 Key: HBASE-7236
>                 URL: https://issues.apache.org/jira/browse/HBASE-7236
>             Project: HBase
>          Issue Type: New Feature
>          Components: Compaction
>    Affects Versions: 0.96.0
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HBASE-7236-PROTOTYPE.patch, HBASE-7236-PROTOTYPE.patch, HBASE-7236-PROTOTYPE-v1.patch
>
>
> Regardless of the compaction policy, it makes sense to have separate configuration for compactions for different tables and column families, as their access patterns and workloads can be different. In particular, for tiered compactions that are being ported from 0.89-fb branch it is necessary to have, to use it properly.
> We might want to add support for compaction configuration via metadata on table/cf.


[jira] [Commented] (HBASE-7236) add per-table/per-cf compaction configuration via metadata

Posted by "Sergey Shelukhin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508937#comment-13508937 ] 

Sergey Shelukhin commented on HBASE-7236:
-----------------------------------------

bq. Regards descriptor attributes, a "configuration override" is just another attribute. Does it make sense to go in the other direction and fix where descriptors have metadata which are configuration overrides with custom names, meaning: rename them to the convention for Configuration? Otherwise now we have not only attributes, some of which override settings in the XML configuration, but now also "configuration overrides" that also do so?
Do you want to store them in the attributes dictionary though? Some attributes are not config overrides (e.g. IS_ROOT/IS_META, in-memory, etc.), and there are also user attributes; the above problems with sharing the same dictionary will remain.
I can move the existing attributes into config overrides instead; some backward-compatibility handling might be necessary.
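
As a rough sketch of what that backward compatibility could involve, the lookup below keeps config overrides in their own map and falls back to a legacy attribute name when no override is set. `DescriptorSketch`, the `example.max.file.size` key, and the `LEGACY_MAX_FILESIZE` attribute name are all hypothetical and chosen only for illustration; they are not taken from the patch or from the real descriptors.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical descriptor that separates config overrides from other attributes.
final class DescriptorSketch {
    // Flags, user attributes, and values still stored under legacy attribute names.
    final Map<String, String> attributes = new HashMap<>();
    // Overrides stored under keys that follow the Configuration naming convention.
    final Map<String, String> configOverrides = new HashMap<>();

    // Hypothetical mapping from a Configuration-style key to its legacy attribute name.
    private static final Map<String, String> LEGACY_NAMES = new HashMap<>();
    static {
        LEGACY_NAMES.put("example.max.file.size", "LEGACY_MAX_FILESIZE");
    }

    // Prefer the explicit override; otherwise read the legacy attribute, if one is mapped.
    String getOverride(String configKey) {
        String value = configOverrides.get(configKey);
        if (value != null) {
            return value;
        }
        String legacyName = LEGACY_NAMES.get(configKey);
        return legacyName == null ? null : attributes.get(legacyName);
    }
}
```

Descriptors written before the change would keep working through the fallback, while new ones would only populate the overrides map.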

bq. Regards CompoundConfiguration, as an API user why should I care about tagging if something I add to CompoundConfiguration is an 'override' or not. Seems any .add() should simply override values added to the configuration by a previous .add() ? Or are some overrides special that will continue to override values even if they are provided in a subsequent .add(), so some of those values in the .add() will override previous values from an earlier .add() as I would expect but there are these other values changed with an .addOverride() that I don't know about? Will an second addOverride override the previous addOverride overrides? Confusing – See? 
The problem here is that you cannot have both .add(List<A>) and .add(List<B>) because of type erasure on generics. I will rename both methods to more descriptive names with similar semantics.
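
For readers less familiar with the erasure issue, a minimal illustration follows. `ErasureDemo`, `addNames`, and `addCounts` are hypothetical names; the actual method names and parameter types in the patch may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class showing why two generic overloads need distinct names.
final class ErasureDemo {
    private final List<String> log = new ArrayList<>();

    // Declaring both
    //   void add(List<String> names)  and  void add(List<Integer> counts)
    // does not compile: after type erasure both have the signature add(List),
    // so the compiler reports a name clash.

    // Distinct method names avoid the clash while keeping the same behaviour.
    void addNames(List<String> names) {
        log.add("names:" + names.size());
    }

    void addCounts(List<Integer> counts) {
        log.add("counts:" + counts.size());
    }

    List<String> log() {
        return log;
    }
}
```

Because the two methods differ only by name, they can still follow one ordering rule internally, which is the point of keeping their semantics similar.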
                