Posted to notifications@accumulo.apache.org by "Christopher Tubbs (JIRA)" <ji...@apache.org> on 2013/04/03 21:53:15 UTC

[jira] [Commented] (ACCUMULO-1228) Allow clients to disable column families and locality groups

    [ https://issues.apache.org/jira/browse/ACCUMULO-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621261#comment-13621261 ] 

Christopher Tubbs commented on ACCUMULO-1228:
---------------------------------------------

For the first proposal, I think it might be useful to add to the API the ability to exclude columns or column families.
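
A minimal sketch of what excluding column families could look like from the client's point of view. This is not Accumulo's API: the Entry class, the scan helper, and the family names are hypothetical stand-ins, and the filtering runs over an in-memory list rather than on a tablet server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class ColumnFamilyExclusion {

    /** Hypothetical stand-in for one key/value entry; not an Accumulo type. */
    static final class Entry {
        final String row;
        final String family;
        final String qualifier;
        final String value;

        Entry(String row, String family, String qualifier, String value) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.value = value;
        }
    }

    /**
     * Returns entries whose family is in {@code fetched} (an empty set means
     * "all families") and is not in {@code excluded}.
     */
    static List<Entry> scan(List<Entry> data, Set<String> fetched, Set<String> excluded) {
        List<Entry> result = new ArrayList<>();
        for (Entry e : data) {
            boolean wanted = fetched.isEmpty() || fetched.contains(e.family);
            if (wanted && !excluded.contains(e.family)) {
                result.add(e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Entry> data = List.of(
            new Entry("r1", "meta", "size", "10"),
            new Entry("r1", "content", "body", "large payload"),
            new Entry("r2", "meta", "size", "20"));

        // Exclude the "content" family without enumerating every other family.
        for (Entry e : scan(data, Set.of(), Set.of("content"))) {
            System.out.println(e.row + " " + e.family + ":" + e.qualifier + " = " + e.value);
        }
    }
}
```

The point of the exclusion set is that the caller does not need to know the full list of families in the table, which a fetch-only interface would require.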

For the second proposal, however, allowing the client to specify details of locality groups means making locality groups a client-visible feature instead of a server-side optimization. Users don't "fetch a locality group"; they query for "data identified by (in this case) a set of column families", regardless of how it is stored. Blurring the line between storage optimization and the data model is confusing and risky, and I think it's a bad idea. Further, it's very poorly defined: would you use the locality groups as determined by the current configuration, or the locality groups recorded in the RFiles on disk before the last locality group change? After all, you can have 10 different files with 10 different locality group layouts. That works fine if you ask for the data, but it would be confusing and unpredictable from a user's perspective if you ask for a locality group. This is a bit like asking "give me all data in my files stored with snappy compression vs. gzip".
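
To make the ambiguity concrete, here is a small sketch under stated assumptions. StoredFile is a hypothetical model of a file that remembers the locality group layout it was written with; it is not an RFile reader, and the group and family names are made up. Asking for column families gives one well-defined answer across files, while asking for a group by name resolves differently per file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class LocalityGroupAmbiguity {

    /** Hypothetical model of one file on disk; not an RFile reader. */
    static final class StoredFile {
        /** Group name -> column families, as configured when this file was written. */
        final Map<String, Set<String>> localityGroups;
        /** Column family -> values held in this file. */
        final Map<String, List<String>> valuesByFamily;

        StoredFile(Map<String, Set<String>> localityGroups,
                   Map<String, List<String>> valuesByFamily) {
            this.localityGroups = localityGroups;
            this.valuesByFamily = valuesByFamily;
        }
    }

    /** Query by column family: the result ignores every file's group layout. */
    static List<String> fetchFamilies(List<StoredFile> files, Set<String> families) {
        List<String> result = new ArrayList<>();
        for (StoredFile file : files) {
            // TreeSet gives a stable iteration order within each file.
            for (String family : new TreeSet<>(families)) {
                result.addAll(file.valuesByFamily.getOrDefault(family, List.of()));
            }
        }
        return result;
    }

    /** Query by group name: each file may resolve the name to different families. */
    static Set<String> resolveGroup(StoredFile file, String groupName) {
        return file.localityGroups.getOrDefault(groupName, Set.of());
    }

    public static void main(String[] args) {
        // Written while group "g1" covered families a and b.
        StoredFile older = new StoredFile(
            Map.of("g1", Set.of("a", "b")),
            Map.of("a", List.of("a-old"), "b", List.of("b-old")));
        // Written after "g1" was reconfigured to cover only family a.
        StoredFile newer = new StoredFile(
            Map.of("g1", Set.of("a")),
            Map.of("a", List.of("a-new"), "b", List.of("b-new")));

        System.out.println("families a,b: "
            + fetchFamilies(List.of(older, newer), Set.of("a", "b")));
        System.out.println("g1 in older file: " + new TreeSet<>(resolveGroup(older, "g1")));
        System.out.println("g1 in newer file: " + new TreeSet<>(resolveGroup(newer, "g1")));
    }
}
```

A "fetch group g1" request would have to pick one of the two resolutions, or silently mix them, which is the unpredictability described above.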
                
> Allow clients to disable column families and locality groups
> ------------------------------------------------------------
>
>                 Key: ACCUMULO-1228
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1228
>             Project: Accumulo
>          Issue Type: New Feature
>          Components: client, tserver
>    Affects Versions: 1.5.0
>            Reporter: William Slacum
>            Priority: Minor
>             Fix For: 1.6.0
>
>
> There's an inconsistency between what a server is capable of and what a client can tell it to do with respect to fetching column families.
> Currently, a user can tell a {{Scanner}} to fetch some set of column families. The iterators support not only this, but also the converse, where a user wants to exclude a set of column families. An iterator implementation can do this by hand, but a client cannot tell a Scanner to not return data from a set of column families. Clients should be able to specify this option.
> There also seems to be an inconsistency between how locality groups are defined and how they are used. To specify a set of column families as part of a locality group, I have to provide a mapping of locality group name to a list of column families. To fetch a locality group, I have to get that mapping first, rather than just naming the locality group I want. It'd be more convenient to just tell the scanner which locality groups to fetch, and have the server know which column families that means.
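
The client-side workaround described in the issue (get the mapping, then fetch the families) can be sketched as a small helper. This is an illustration only: familiesFor and the sample group names are hypothetical, and the mapping uses plain strings. In Accumulo the mapping would come from the table's configuration (TableOperations.getLocalityGroups, as far as I know), and each resulting family would be handed to the Scanner's fetchColumnFamily call.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class LocalityGroupExpansion {

    /**
     * Expands locality group names into the column families they cover,
     * using a mapping the client has already fetched.
     * Unknown group names are rejected rather than silently ignored.
     */
    static Set<String> familiesFor(Map<String, Set<String>> groups, Set<String> groupNames) {
        Set<String> families = new TreeSet<>();
        for (String name : groupNames) {
            Set<String> members = groups.get(name);
            if (members == null) {
                throw new IllegalArgumentException("Unknown locality group: " + name);
            }
            families.addAll(members);
        }
        return families;
    }

    public static void main(String[] args) {
        // Hypothetical mapping; in practice it would come from the table's configuration.
        Map<String, Set<String>> groups = Map.of(
            "metadata", Set.of("meta", "attrs"),
            "payload", Set.of("content"));

        // Each resulting family would then be passed to the scanner individually.
        System.out.println(familiesFor(groups, Set.of("metadata")));
    }
}
```

Note that this resolves names against whatever mapping the client holds at that moment, so the request that reaches the server is still expressed purely in column families.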
