Posted to issues@hbase.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2011/05/27 20:30:48 UTC

[jira] [Created] (HBASE-3929) Add option to HFile tool to produce basic stats

Add option to HFile tool to produce basic stats
-----------------------------------------------

                 Key: HBASE-3929
                 URL: https://issues.apache.org/jira/browse/HBASE-3929
             Project: HBase
          Issue Type: New Feature
          Components: io
    Affects Versions: 0.92.0
            Reporter: Todd Lipcon
            Assignee: Todd Lipcon
             Fix For: 0.92.0


In looking at HBASE-3241 I wrote a small tool to scan an HFile and produce some basic statistics about it:
- min/mean/max key size, value size (uncompressed)
- min/mean/max number of columns per row (uncompressed)
- min/mean/max number of bytes per row (uncompressed)
- the key of the largest row
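
For illustration only, here is a minimal sketch of how statistics like the ones listed above could be gathered in a single pass over cells sorted by row. It is not the attached draft: the Cell class, the HFileStatsSketch class, and the way the key bytes are built are all hypothetical stand-ins, and java.util.LongSummaryStatistics is used in place of whatever accumulator the real tool has.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.LongSummaryStatistics;

// Hypothetical stand-in for one stored cell; this is not the HBase KeyValue class.
final class Cell {
    final byte[] row;
    final byte[] key;    // simplified key: row plus column qualifier
    final byte[] value;

    Cell(String row, String qualifier, String value) {
        this.row = row.getBytes(StandardCharsets.UTF_8);
        this.key = (row + "/" + qualifier).getBytes(StandardCharsets.UTF_8);
        this.value = value.getBytes(StandardCharsets.UTF_8);
    }
}

final class HFileStatsSketch {
    final LongSummaryStatistics keySize = new LongSummaryStatistics();
    final LongSummaryStatistics valueSize = new LongSummaryStatistics();
    final LongSummaryStatistics columnsPerRow = new LongSummaryStatistics();
    final LongSummaryStatistics bytesPerRow = new LongSummaryStatistics();
    byte[] largestRow;                  // row key of the row with the most bytes
    private long largestRowBytes = -1;

    private Cell prev;                  // previous cell, used to detect row boundaries
    private long curColumns;
    private long curBytes;

    // Cells are assumed to arrive sorted by row.
    void collect(Cell c) {
        if (prev != null && !Arrays.equals(prev.row, c.row)) {
            finishRow();
        }
        keySize.accept(c.key.length);
        valueSize.accept(c.value.length);
        curColumns++;
        curBytes += c.key.length + c.value.length;
        prev = c;
    }

    // Call once after the last cell so the final row is counted.
    void finish() {
        if (prev != null) {
            finishRow();
            prev = null;
        }
    }

    private void finishRow() {
        columnsPerRow.accept(curColumns);
        bytesPerRow.accept(curBytes);
        if (curBytes > largestRowBytes) {
            largestRowBytes = curBytes;
            largestRow = prev.row;
        }
        curColumns = 0;
        curBytes = 0;
    }

    static HFileStatsSketch of(List<Cell> sortedCells) {
        HFileStatsSketch stats = new HFileStatsSketch();
        for (Cell c : sortedCells) {
            stats.collect(c);
        }
        stats.finish();
        return stats;
    }

    public static void main(String[] args) {
        HFileStatsSketch s = of(Arrays.asList(
            new Cell("r1", "a", "x"),
            new Cell("r1", "b", "yy"),
            new Cell("r2", "a", "zzzz")));
        System.out.println("key size:        " + s.keySize);
        System.out.println("value size:      " + s.valueSize);
        System.out.println("columns per row: " + s.columnsPerRow);
        System.out.println("bytes per row:   " + s.bytesPerRow);
        System.out.println("largest row:     " + new String(s.largestRow, StandardCharsets.UTF_8));
    }
}
```

The row boundary is detected by comparing each cell's row with the previous cell's row, which only works because the input is sorted.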

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-3929) Add option to HFile tool to produce basic stats

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-3929:
--------------------------

    Fix Version/s:     (was: 0.92.0)
                   0.94.0

> Add option to HFile tool to produce basic stats
> -----------------------------------------------
>
>                 Key: HBASE-3929
>                 URL: https://issues.apache.org/jira/browse/HBASE-3929
>             Project: HBase
>          Issue Type: New Feature
>          Components: io
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.94.0
>
>         Attachments: hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row

[jira] [Updated] (HBASE-3929) Add option to HFile tool to produce basic stats

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-3929:
-------------------------

    Fix Version/s: 0.92.0

> Add option to HFile tool to produce basic stats
> -----------------------------------------------
>
>                 Key: HBASE-3929
>                 URL: https://issues.apache.org/jira/browse/HBASE-3929
>             Project: HBase
>          Issue Type: New Feature
>          Components: io
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.92.0
>
>         Attachments: hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row

[jira] [Commented] (HBASE-3929) Add option to HFile tool to produce basic stats

Posted by "Matteo Bertozzi (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129088#comment-13129088 ] 

Matteo Bertozzi commented on HBASE-3929:
----------------------------------------

Currently HFilePrettyPrinter raises a couple of exceptions if the HFile is empty, because it doesn't check whether seekTo() returns true or false, and the first call after seekTo() is scanner.getKeyValue(), so you get an NPE.

I've added a v2 patch with the pkv rename, count == 0 handled, and seekTo checked to fix the NPE.
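
As a rough sketch of the seekTo check described above (this is not the v2 patch itself): SimpleScanner and ListScanner below are hypothetical, in-memory stand-ins that only mimic the seekTo()/getKeyValue()/next() shape mentioned in this comment, so the guard can be shown without depending on HBase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified scanner; it is not the HBase HFileScanner interface.
interface SimpleScanner {
    boolean seekTo();        // false when there are no entries at all
    String getKeyValue();    // null unless positioned on an entry
    boolean next();          // false when there are no more entries
}

// In-memory implementation used only to exercise the guard.
final class ListScanner implements SimpleScanner {
    private final List<String> entries;
    private int pos = -1;

    ListScanner(List<String> entries) {
        this.entries = entries;
    }

    public boolean seekTo() {
        if (entries.isEmpty()) {
            return false;
        }
        pos = 0;
        return true;
    }

    public String getKeyValue() {
        return (pos >= 0 && pos < entries.size()) ? entries.get(pos) : null;
    }

    public boolean next() {
        if (pos < 0 || pos + 1 >= entries.size()) {
            return false;
        }
        pos++;
        return true;
    }
}

final class SeekToGuardSketch {
    // Returns the length of every entry, or an empty list for an empty file.
    static List<Integer> scanLengths(SimpleScanner scanner) {
        List<Integer> lengths = new ArrayList<>();
        if (!scanner.seekTo()) {
            return lengths;  // empty file: getKeyValue() is never called
        }
        do {
            lengths.add(scanner.getKeyValue().length());
        } while (scanner.next());
        return lengths;
    }

    public static void main(String[] args) {
        System.out.println(scanLengths(new ListScanner(Collections.<String>emptyList())));
        System.out.println(scanLengths(new ListScanner(Arrays.asList("ab", "cde"))));
    }
}
```

Without the early return, the first getKeyValue() call on an empty scanner would return null here and the length() call would throw a NullPointerException.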
                
> Add option to HFile tool to produce basic stats
> -----------------------------------------------
>
>                 Key: HBASE-3929
>                 URL: https://issues.apache.org/jira/browse/HBASE-3929
>             Project: HBase
>          Issue Type: New Feature
>          Components: io
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.94.0
>
>         Attachments: HBASE-3929-v2.patch, hbase-3929-draft.patch, hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-3929) Add option to HFile tool to produce basic stats

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13135115#comment-13135115 ] 

Hudson commented on HBASE-3929:
-------------------------------

Integrated in HBase-0.92 #79 (See [https://builds.apache.org/job/HBase-0.92/79/])
    HBASE-3929  Add option to HFile tool to produce basic stats

todd : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java

                
> Add option to HFile tool to produce basic stats
> -----------------------------------------------
>
>                 Key: HBASE-3929
>                 URL: https://issues.apache.org/jira/browse/HBASE-3929
>             Project: HBase
>          Issue Type: New Feature
>          Components: io
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Matteo Bertozzi
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3929-v2.patch, HBASE-3929-v3.patch, hbase-3929-draft.patch, hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row

[jira] [Commented] (HBASE-3929) Add option to HFile tool to produce basic stats

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049577#comment-13049577 ] 

stack commented on HBASE-3929:
------------------------------

Makes sense.  np.

> Add option to HFile tool to produce basic stats
> -----------------------------------------------
>
>                 Key: HBASE-3929
>                 URL: https://issues.apache.org/jira/browse/HBASE-3929
>             Project: HBase
>          Issue Type: New Feature
>          Components: io
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.92.0
>
>         Attachments: hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row

[jira] [Updated] (HBASE-3929) Add option to HFile tool to produce basic stats

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HBASE-3929:
-------------------------------

    Attachment: hbase-3929-draft.txt

Here's the draft of the code.

Before commit, I think we should refactor all of the HFile "Tool" stuff out of HFile into a new class.

> Add option to HFile tool to produce basic stats
> -----------------------------------------------
>
>                 Key: HBASE-3929
>                 URL: https://issues.apache.org/jira/browse/HBASE-3929
>             Project: HBase
>          Issue Type: New Feature
>          Components: io
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.92.0
>
>         Attachments: hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row

[jira] [Updated] (HBASE-3929) Add option to HFile tool to produce basic stats

Posted by "Todd Lipcon (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HBASE-3929:
-------------------------------

      Resolution: Fixed
    Hadoop Flags: Reviewed
          Status: Resolved  (was: Patch Available)

Committed to trunk and 92. Thanks Matteo
                
> Add option to HFile tool to produce basic stats
> -----------------------------------------------
>
>                 Key: HBASE-3929
>                 URL: https://issues.apache.org/jira/browse/HBASE-3929
>             Project: HBase
>          Issue Type: New Feature
>          Components: io
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.94.0
>
>         Attachments: HBASE-3929-v2.patch, HBASE-3929-v3.patch, hbase-3929-draft.patch, hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row

[jira] [Commented] (HBASE-3929) Add option to HFile tool to produce basic stats

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13040385#comment-13040385 ] 

stack commented on HBASE-3929:
------------------------------

This looks great. +1 on commit whether or not we wait on factoring out the cmd-line hfile stuff.

> Add option to HFile tool to produce basic stats
> -----------------------------------------------
>
>                 Key: HBASE-3929
>                 URL: https://issues.apache.org/jira/browse/HBASE-3929
>             Project: HBase
>          Issue Type: New Feature
>          Components: io
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.92.0
>
>         Attachments: hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row

[jira] [Commented] (HBASE-3929) Add option to HFile tool to produce basic stats

Posted by "Todd Lipcon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129035#comment-13129035 ] 

Todd Lipcon commented on HBASE-3929:
------------------------------------

Thanks for updating the patch to trunk. A couple of comments (fun to look back over my own code from a few months back):

- let's rename {{pkv}} to {{prevKV}}
- in the case of an empty HFile, we would currently throw a divide-by-zero. In LongStats.toString, we should check for count == 0 and return "no data" or something
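
To make the second point concrete, here is a minimal sketch of a min/mean/max accumulator with the suggested guard. Only the LongStats name and the count == 0 / "no data" idea come from the comment above; the fields, the collect() method, and the output format are assumptions, not the code in the patch.

```java
// Sketch of a min/mean/max accumulator; field and method names are assumptions.
final class LongStats {
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;
    private long sum = 0;
    private long count = 0;

    void collect(long d) {
        if (d < min) min = d;
        if (d > max) max = d;
        sum += d;
        count++;
    }

    @Override
    public String toString() {
        if (count == 0) {
            return "no data";  // nothing collected, so there is no mean to compute
        }
        return "count: " + count + "\tmin: " + min + "\tmax: " + max
            + "\tmean: " + ((double) sum / count);
    }

    public static void main(String[] args) {
        LongStats empty = new LongStats();
        System.out.println(empty);

        LongStats stats = new LongStats();
        stats.collect(2);
        stats.collect(4);
        System.out.println(stats);
    }
}
```

If the mean were computed with long arithmetic (sum / count), an empty file would throw an ArithmeticException; with double arithmetic it would print NaN. The guard avoids both.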

                
> Add option to HFile tool to produce basic stats
> -----------------------------------------------
>
>                 Key: HBASE-3929
>                 URL: https://issues.apache.org/jira/browse/HBASE-3929
>             Project: HBase
>          Issue Type: New Feature
>          Components: io
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.94.0
>
>         Attachments: hbase-3929-draft.patch, hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row

[jira] [Updated] (HBASE-3929) Add option to HFile tool to produce basic stats

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HBASE-3929:
-------------------------------

    Description: 
In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it:
- min/mean/max key size, value size (uncompressed)
- min/mean/max number of columns per row (uncompressed)
- min/mean/max number of bytes per row (uncompressed)
- the key of the largest row

  was:
In looking at HBASE-3241 I wrote a small tool to scan an HFile and produce some basic statistics about it:
- min/mean/max key size, value size (uncompressed)
- min/mean/max number of columns per row (uncompressed)
- min/mean/max number of bytes per row (uncompressed)
- the key of the largest row


> Add option to HFile tool to produce basic stats
> -----------------------------------------------
>
>                 Key: HBASE-3929
>                 URL: https://issues.apache.org/jira/browse/HBASE-3929
>             Project: HBase
>          Issue Type: New Feature
>          Components: io
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.92.0
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row

[jira] [Updated] (HBASE-3929) Add option to HFile tool to produce basic stats

Posted by "Todd Lipcon (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HBASE-3929:
-------------------------------

    Fix Version/s:     (was: 0.94.0)
                   0.92.0
    
> Add option to HFile tool to produce basic stats
> -----------------------------------------------
>
>                 Key: HBASE-3929
>                 URL: https://issues.apache.org/jira/browse/HBASE-3929
>             Project: HBase
>          Issue Type: New Feature
>          Components: io
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Matteo Bertozzi
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3929-v2.patch, HBASE-3929-v3.patch, hbase-3929-draft.patch, hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row

[jira] [Updated] (HBASE-3929) Add option to HFile tool to produce basic stats

Posted by "Matteo Bertozzi (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matteo Bertozzi updated HBASE-3929:
-----------------------------------

    Attachment: HBASE-3929-v3.patch

Added v3 that applies to trunk after HBASE-4595 integration.
                
> Add option to HFile tool to produce basic stats
> -----------------------------------------------
>
>                 Key: HBASE-3929
>                 URL: https://issues.apache.org/jira/browse/HBASE-3929
>             Project: HBase
>          Issue Type: New Feature
>          Components: io
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.94.0
>
>         Attachments: HBASE-3929-v2.patch, HBASE-3929-v3.patch, hbase-3929-draft.patch, hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row

[jira] [Updated] (HBASE-3929) Add option to HFile tool to produce basic stats

Posted by "Matteo Bertozzi (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matteo Bertozzi updated HBASE-3929:
-----------------------------------

    Attachment: HBASE-3929-v2.patch
    
> Add option to HFile tool to produce basic stats
> -----------------------------------------------
>
>                 Key: HBASE-3929
>                 URL: https://issues.apache.org/jira/browse/HBASE-3929
>             Project: HBase
>          Issue Type: New Feature
>          Components: io
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.94.0
>
>         Attachments: HBASE-3929-v2.patch, hbase-3929-draft.patch, hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row

[jira] [Commented] (HBASE-3929) Add option to HFile tool to produce basic stats

Posted by "Todd Lipcon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132029#comment-13132029 ] 

Todd Lipcon commented on HBASE-3929:
------------------------------------

+1, looks good to me. I'll commit to 92 and trunk since this won't affect stability in any way.
                
> Add option to HFile tool to produce basic stats
> -----------------------------------------------
>
>                 Key: HBASE-3929
>                 URL: https://issues.apache.org/jira/browse/HBASE-3929
>             Project: HBase
>          Issue Type: New Feature
>          Components: io
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.94.0
>
>         Attachments: HBASE-3929-v2.patch, hbase-3929-draft.patch, hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row

[jira] [Commented] (HBASE-3929) Add option to HFile tool to produce basic stats

Posted by "Matteo Bertozzi (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128224#comment-13128224 ] 

Matteo Bertozzi commented on HBASE-3929:
----------------------------------------

As of 0.92, HFile.main() contains just a call to HFilePrettyPrinter.run(),
so there is no more "Tool" code inside HFile.java.

This is probably not the refactor that Todd had in mind, but it addresses Todd's first thought:
'we should refactor all of the HFile "Tool" stuff out of HFile into a new class.'
                
> Add option to HFile tool to produce basic stats
> -----------------------------------------------
>
>                 Key: HBASE-3929
>                 URL: https://issues.apache.org/jira/browse/HBASE-3929
>             Project: HBase
>          Issue Type: New Feature
>          Components: io
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.94.0
>
>         Attachments: hbase-3929-draft.patch, hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row

[jira] [Commented] (HBASE-3929) Add option to HFile tool to produce basic stats

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133050#comment-13133050 ] 

Hudson commented on HBASE-3929:
-------------------------------

Integrated in HBase-TRUNK #2352 (See [https://builds.apache.org/job/HBase-TRUNK/2352/])
    HBASE-3929  Add option to HFile tool to produce basic stats

todd : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java

                
> Add option to HFile tool to produce basic stats
> -----------------------------------------------
>
>                 Key: HBASE-3929
>                 URL: https://issues.apache.org/jira/browse/HBASE-3929
>             Project: HBase
>          Issue Type: New Feature
>          Components: io
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Matteo Bertozzi
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3929-v2.patch, HBASE-3929-v3.patch, hbase-3929-draft.patch, hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row

[jira] [Commented] (HBASE-3929) Add option to HFile tool to produce basic stats

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049544#comment-13049544 ] 

Todd Lipcon commented on HBASE-3929:
------------------------------------

My anti-commit rationale is that, if we commit this, then there won't be any motivation to refactor later. If we make this commit conditional on cleaning it up first, we're more likely to actually clean up!

> Add option to HFile tool to produce basic stats
> -----------------------------------------------
>
>                 Key: HBASE-3929
>                 URL: https://issues.apache.org/jira/browse/HBASE-3929
>             Project: HBase
>          Issue Type: New Feature
>          Components: io
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.92.0
>
>         Attachments: hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row

[jira] [Commented] (HBASE-3929) Add option to HFile tool to produce basic stats

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049517#comment-13049517 ] 

Todd Lipcon commented on HBASE-3929:
------------------------------------

We can always leave a 'main()' for one version which just calls through to the new tool after printing a deprecation message.
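
A minimal sketch of that kind of shim, with hypothetical class names (NewHFileTool and OldHFileEntryPoint are illustrative stand-ins, not classes from HBase):

```java
// Hypothetical new home for the command-line code.
final class NewHFileTool {
    static int run(String[] args) {
        System.out.println("would inspect " + args.length + " file(s)");
        return 0;
    }
}

// Hypothetical old entry point, kept for one release so existing scripts keep working.
final class OldHFileEntryPoint {
    static final String DEPRECATION =
        "WARNING: this entry point is deprecated; use NewHFileTool instead.";

    // Prints the deprecation message, then calls through to the new tool.
    static int delegate(String[] args) {
        System.err.println(DEPRECATION);
        return NewHFileTool.run(args);
    }

    public static void main(String[] args) {
        System.exit(delegate(args));
    }
}
```

The warning goes to stderr so that anything parsing the tool's stdout is not affected.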

> Add option to HFile tool to produce basic stats
> -----------------------------------------------
>
>                 Key: HBASE-3929
>                 URL: https://issues.apache.org/jira/browse/HBASE-3929
>             Project: HBase
>          Issue Type: New Feature
>          Components: io
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.92.0
>
>         Attachments: hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row

[jira] [Updated] (HBASE-3929) Add option to HFile tool to produce basic stats

Posted by "Matteo Bertozzi (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matteo Bertozzi updated HBASE-3929:
-----------------------------------

    Attachment: hbase-3929-draft.patch

patch updated for hbase trunk
                
> Add option to HFile tool to produce basic stats
> -----------------------------------------------
>
>                 Key: HBASE-3929
>                 URL: https://issues.apache.org/jira/browse/HBASE-3929
>             Project: HBase
>          Issue Type: New Feature
>          Components: io
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.94.0
>
>         Attachments: hbase-3929-draft.patch, hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row

[jira] [Commented] (HBASE-3929) Add option to HFile tool to produce basic stats

Posted by "Todd Lipcon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132037#comment-13132037 ] 

Todd Lipcon commented on HBASE-3929:
------------------------------------

Actually, it looks like HBASE-4595 conflicts with this. Would you mind updating one last time for trunk?
                
> Add option to HFile tool to produce basic stats
> -----------------------------------------------
>
>                 Key: HBASE-3929
>                 URL: https://issues.apache.org/jira/browse/HBASE-3929
>             Project: HBase
>          Issue Type: New Feature
>          Components: io
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.94.0
>
>         Attachments: HBASE-3929-v2.patch, hbase-3929-draft.patch, hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row


[jira] [Assigned] (HBASE-3929) Add option to HFile tool to produce basic stats

Posted by "Todd Lipcon (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon reassigned HBASE-3929:
----------------------------------

    Assignee: Matteo Bertozzi  (was: Todd Lipcon)
    
> Add option to HFile tool to produce basic stats
> -----------------------------------------------
>
>                 Key: HBASE-3929
>                 URL: https://issues.apache.org/jira/browse/HBASE-3929
>             Project: HBase
>          Issue Type: New Feature
>          Components: io
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Matteo Bertozzi
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3929-v2.patch, HBASE-3929-v3.patch, hbase-3929-draft.patch, hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row


[jira] [Commented] (HBASE-3929) Add option to HFile tool to produce basic stats

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128208#comment-13128208 ] 

Ted Yu commented on HBASE-3929:
-------------------------------

@Matteo:
Are you going to perform the refactoring Todd mentioned?

Thanks
                
> Add option to HFile tool to produce basic stats
> -----------------------------------------------
>
>                 Key: HBASE-3929
>                 URL: https://issues.apache.org/jira/browse/HBASE-3929
>             Project: HBase
>          Issue Type: New Feature
>          Components: io
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.94.0
>
>         Attachments: hbase-3929-draft.patch, hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row


[jira] [Updated] (HBASE-3929) Add option to HFile tool to produce basic stats

Posted by "stack (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-3929:
-------------------------

    Status: Patch Available  (was: Open)
    
> Add option to HFile tool to produce basic stats
> -----------------------------------------------
>
>                 Key: HBASE-3929
>                 URL: https://issues.apache.org/jira/browse/HBASE-3929
>             Project: HBase
>          Issue Type: New Feature
>          Components: io
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.94.0
>
>         Attachments: HBASE-3929-v2.patch, hbase-3929-draft.patch, hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row


[jira] [Commented] (HBASE-3929) Add option to HFile tool to produce basic stats

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049498#comment-13049498 ] 

stack commented on HBASE-3929:
------------------------------

bq. before commit, I think we should refactor all of the HFile "Tool" stuff out of HFile into a new class.

How about doing this some other time in another issue and committing what we have here?

If we change the door into the HFile, we'll need to update docs, Lars will have to change his book, the world will end, don't you think?

> Add option to HFile tool to produce basic stats
> -----------------------------------------------
>
>                 Key: HBASE-3929
>                 URL: https://issues.apache.org/jira/browse/HBASE-3929
>             Project: HBase
>          Issue Type: New Feature
>          Components: io
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.92.0
>
>         Attachments: hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row


[jira] [Commented] (HBASE-3929) Add option to HFile tool to produce basic stats

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049540#comment-13049540 ] 

stack commented on HBASE-3929:
------------------------------

@Todd Patch as is adds value.  The refactor adds value, but it's minuscule by comparison.  You are busy.  I don't want the minuscule getting in the way of our getting value.  That's my rationale.

> Add option to HFile tool to produce basic stats
> -----------------------------------------------
>
>                 Key: HBASE-3929
>                 URL: https://issues.apache.org/jira/browse/HBASE-3929
>             Project: HBase
>          Issue Type: New Feature
>          Components: io
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.92.0
>
>         Attachments: hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row


[jira] [Commented] (HBASE-3929) Add option to HFile tool to produce basic stats

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049535#comment-13049535 ] 

stack commented on HBASE-3929:
------------------------------

Yeah, you could, in a new JIRA? (smile)

> Add option to HFile tool to produce basic stats
> -----------------------------------------------
>
>                 Key: HBASE-3929
>                 URL: https://issues.apache.org/jira/browse/HBASE-3929
>             Project: HBase
>          Issue Type: New Feature
>          Components: io
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.92.0
>
>         Attachments: hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row
