Posted to issues@hbase.apache.org by "Eran Kampf (Created) (JIRA)" <ji...@apache.org> on 2011/11/18 08:51:52 UTC

[jira] [Created] (HBASE-4818) HBase Shell - Add support for formatting row keys before output

HBase Shell - Add support for formatting row keys before output
---------------------------------------------------------------

                 Key: HBASE-4818
                 URL: https://issues.apache.org/jira/browse/HBASE-4818
             Project: HBase
          Issue Type: Improvement
          Components: shell
            Reporter: Eran Kampf
            Priority: Trivial


Many HBase users use binary row keys rather than strings to optimize memory consumption, so displaying an escaped string in the HBase shell isn't useful (and takes a lot of screen space).
Allowing users to provide a row key formatter as part of the scan/get commands would let developers display the row key in a way that makes sense for them.

Example:
scan 'stats', { ROWFORMATTER => MyRowFormatter.new }

The row formatter simply takes the row key as a byte array and formats it as a string.
It's an easy change to make with simple monkey-patching of the shell commands, but I would be happy to see it as part of the shell itself.
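
As a rough illustration of the idea, here is a minimal sketch of such a formatter in plain Ruby. The `format` method name and the assumption that the key arrives as an array of byte values are hypothetical; the issue does not define the interface the shell would call.

```ruby
# Hypothetical formatter matching the MyRowFormatter name in the example above.
# The method name `format` is an assumption made for this sketch.
class MyRowFormatter
  # key: the raw row key as an array of byte values.
  # Masking with 0xff also handles signed Java bytes (-128..127).
  def format(key)
    key.map { |b| '%02x' % (b & 0xff) }.join
  end
end

formatter = MyRowFormatter.new
puts formatter.format([0x00, 0x01, 0xab, 0xff])  # prints 0001abff
```

A compact hex string like this is shorter than the escaped form and is only one possible choice; a real formatter would decode whatever structure the application packs into its keys.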

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4818) HBase Shell - Add support for formatting row keys before output

Posted by "Ben West (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240451#comment-13240451 ] 

Ben West commented on HBASE-4818:
---------------------------------

The ReverseIDFormatter in that patch overrides the default formatter to display row keys in reverse order.

Something which we will have to think about is how we can maintain usability with these new formatters. Scans, for example, might not go in the order the user predicts because the stored format is different from the displayed one. Similarly with where regions split and so forth. Maybe we should require sort order to be constant across formatted and unformatted row keys (which would make the ReverseIDFormatter and probably most formatters impossible).
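
To make the ordering concern concrete, here is a small plain-Ruby sketch. The keys and the reversing lambda are made up for illustration and only stand in for a formatter like ReverseIDFormatter; they are not taken from the patch.

```ruby
# Keys in the byte order a scan would return them (made-up sample data).
stored = ['ab', 'ba', 'ca'].sort

# A stand-in display formatter that reverses each key.
reverse_formatter = ->(key) { key.reverse }
displayed = stored.map { |key| reverse_formatter.call(key) }

puts stored.inspect                # ["ab", "ba", "ca"]
puts displayed.inspect             # ["ba", "ab", "ac"]
puts displayed == displayed.sort   # false: the displayed keys look unsorted
```

The scan still walks the stored order, so the user sees rows that appear out of sequence even though nothing is wrong with the table.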

I'm not super familiar with the web UI, but it looks like the only spots where we display row keys are the start and end rows of each region, and when we issue splits/compactions. So that shouldn't be too bad to change.
                
> HBase Shell - Add support for formatting row keys before output
> ---------------------------------------------------------------
>
>                 Key: HBASE-4818
>                 URL: https://issues.apache.org/jira/browse/HBASE-4818
>             Project: HBase
>          Issue Type: Improvement
>          Components: shell
>            Reporter: Eran Kampf
>            Priority: Trivial
>         Attachments: format3.patch, hbase-4818.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h

[jira] [Updated] (HBASE-4818) HBase Shell - Add support for formatting row keys before output

Posted by "Ben West (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ben West updated HBASE-4818:
----------------------------

    Attachment: hbase-4818.patch

Attaching a patch which includes:

1. The ability to specify a custom formatter on the command line
2. A sample custom formatter which reverses the keys before printing them in a scan 

You can use the new formatter by running:
{code}
hbase shell --format=Shell::Formatter::ReverseID.new
{code}

We have an existing shell variable (JRUBY_OPTS) which you can set in your config script to persist your options, as Lars suggested. I'm not sure how to implement Eran's suggestion of per-table formatters using the command line; maybe we should deprecate the command line option since it doesn't do anything anyway and store this in .irbrc.

Also, the reverse ID formatter works by a kind of hack. 

I'd like to hear from people more familiar with the shell on how to make this better. 
                

[jira] [Commented] (HBASE-4818) HBase Shell - Add support for formatting row keys before output

Posted by "Ben West (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247246#comment-13247246 ] 

Ben West commented on HBASE-4818:
---------------------------------

I can work on adding this to the web UI if someone can suggest a place to store the formatter preference. 

Should it just be in hbase-site.xml?
                

[jira] [Commented] (HBASE-4818) HBase Shell - Add support for formatting row keys before output

Posted by "Lars George (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152725#comment-13152725 ] 

Lars George commented on HBASE-4818:
------------------------------------

I would also like to see this persisted, i.e. in a simple text property file, or in the .irbrc where you can define this per table, so that these classes are loaded implicitly.
                

[jira] [Updated] (HBASE-4818) HBase Shell - Add support for formatting row keys before output

Posted by "Ben West (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ben West updated HBASE-4818:
----------------------------

    Attachment: format3.patch

New patch is a lot cleaner. It moves some formatting from table.rb to HTableFormatter.java like Todd suggested, so it can be used elsewhere. 

There is also scope creep: it parses input as well as formats output (so if you do a get it will translate the rowkey into an internal format first). This is just because it made my head hurt to have the output of scans be one format but the input another.
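
A minimal sketch of that two-way idea in plain Ruby, assuming row keys that are 8-byte big-endian integers. The class name `LongKeyCodec` and its `format`/`parse` methods are hypothetical and are not the API of HTableFormatter.java in the patch.

```ruby
# Hypothetical two-way codec: the same object formats stored keys for display
# and parses user input back into the stored form.
class LongKeyCodec
  # bytes: array of 8 byte values (0..255), big-endian.
  def format(bytes)
    bytes.pack('C*').unpack('Q>').first.to_s
  end

  # text: decimal string typed by the user, e.g. in a get.
  def parse(text)
    [Integer(text)].pack('Q>').bytes
  end
end

codec = LongKeyCodec.new
stored = codec.parse('42')
puts stored.inspect        # [0, 0, 0, 0, 0, 0, 0, 42]
puts codec.format(stored)  # 42
```

Keeping both directions in one place means a key copied from scan output can be pasted straight into a get.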

Right now there is only one formatter which is set via a shell param, but could be set at a table level - just wasn't sure if putting it in .irbrc was best or if there was a way we could do it in Java so non-shell would work too. Todd said to make it a "table property", but I don't know what this means.
                

[jira] [Commented] (HBASE-4818) HBase Shell - Add support for formatting row keys before output

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239943#comment-13239943 ] 

stack commented on HBASE-4818:
------------------------------

Patch looks nice, Ben.  Do you have illustrations of it in action?  It looks like it keeps the default behavior.  The default htformatter just does the bytes thing we currently have, plus the formatting of HRegionInfo cells such as happens up in .META.

What would it take to have this htformatter work in the UI too, as per Todd's suggestion?
                

[jira] [Commented] (HBASE-4818) HBase Shell - Add support for formatting row keys before output

Posted by "Todd Lipcon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226311#comment-13226311 ] 

Todd Lipcon commented on HBASE-4818:
------------------------------------

I think this should be a table property, and refer to a Java class name, rather than doing it in ruby. Doing it in ruby only helps with shell, but doing it in Java means we can also use it in the UIs, etc. ACCUMULO-303 is helpful reference material.
                

[jira] [Commented] (HBASE-4818) HBase Shell - Add support for formatting row keys before output

Posted by "Ben West (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226347#comment-13226347 ] 

Ben West commented on HBASE-4818:
---------------------------------

Todd: I think since we're using JRuby the formatter can be a java class, right? You'd just have --format=org.apache....

But I guess we could store it as a table property. 

(Btw: if the formatters are to be useful outside of shell, we'll need a revamp of how they work. Right now, it just formats text without much knowledge of what the text is - we'd probably want to have FormatKey() FormatColumn() etc. methods. Which is a good idea anyway.)
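
One way such an interface could look, sketched in plain Ruby. The class names, the snake_case method names `format_key` and `format_column`, and the escaping rule are all assumptions for illustration; the escaping only roughly imitates the shell's current escaped-bytes output.

```ruby
# Hypothetical base formatter with one method per kind of value.
class DefaultFormatter
  def format_key(bytes)
    escape(bytes)
  end

  def format_column(family, qualifier)
    "#{escape(family)}:#{escape(qualifier)}"
  end

  private

  # Printable ASCII is kept as-is; everything else becomes \xNN.
  def escape(bytes)
    bytes.map do |b|
      b &= 0xff
      printable = b.between?(0x20, 0x7e) && b != 0x5c
      printable ? b.chr : '\\x%02X' % b
    end.join
  end
end

# A subclass overrides only the part it cares about.
class ReverseKeyFormatter < DefaultFormatter
  def format_key(bytes)
    super(bytes.reverse)
  end
end

puts DefaultFormatter.new.format_key('row1'.bytes)              # row1
puts ReverseKeyFormatter.new.format_key('row1'.bytes)           # 1wor
puts DefaultFormatter.new.format_column('cf'.bytes, [0, 1])     # cf:\x00\x01
```

Because each method knows what it is formatting, the same object could be reused outside the shell without guessing at the meaning of a piece of text.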
                

[jira] [Commented] (HBASE-4818) HBase Shell - Add support for formatting row keys before output

Posted by "Eran Kampf (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152766#comment-13152766 ] 

Eran Kampf commented on HBASE-4818:
-----------------------------------

That's a good idea!
A simple global hash that maps a table name to a row key formatter, so that all operations on the table use that formatter unless explicitly given one.
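
A minimal sketch of that lookup in plain Ruby. The names `FORMATTERS` and `formatter_for`, and the use of lambdas as formatters, are hypothetical choices for this example rather than anything in the shell.

```ruby
# Hypothetical global registry: table name => formatter.
# Tables without an entry fall back to a default that shows the escaped string.
FORMATTERS = Hash.new(->(key) { key.inspect })
FORMATTERS['stats'] = ->(key) { key.unpack('H*').first }

# An explicitly supplied formatter wins over the per-table one.
def formatter_for(table, explicit = nil)
  explicit || FORMATTERS[table]
end

key = "\x00\x2a".b
puts formatter_for('stats').call(key)                             # 002a
puts formatter_for('other').call(key)                             # default escaped form
puts formatter_for('stats', ->(k) { k.bytesize.to_s }).call(key)  # 2
```

Such a hash could be filled in from .irbrc so the per-table choices persist between shell sessions.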
                