
Posted to notifications@accumulo.apache.org by "Sean Busbey (JIRA)" <ji...@apache.org> on 2014/06/09 15:37:01 UTC

[jira] [Commented] (ACCUMULO-2873) Create utility that generates single line tablet information

    [ https://issues.apache.org/jira/browse/ACCUMULO-2873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025186#comment-14025186 ] 

Sean Busbey commented on ACCUMULO-2873:
---------------------------------------

If we used Avro for this output, it would easily handle the binary/text issue. We could also use the existing avro-tools utilities to get a textual representation or to project a subset of the data.

The output would then also be easy to work with programmatically.

It would also leverage Avro's extensive schema evolution support, so later changes to the output would be easy to make.
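
As a rough sketch of what such a schema might look like (nothing here is an agreed design: the record name, field names, and types are all hypothetical, loosely based on the columns in the example output quoted below), here is an Avro record schema written as a Python dict and dumped to the JSON form Avro uses for schemas. The end row is typed as bytes so binary rows need no escaping, and the nullable fields carry a null default, which is the mechanism Avro's schema resolution relies on when a reader expects a field the writer did not have.

```python
import json

# Hypothetical schema for one tablet's info. Field names are assumptions
# for illustration only, not part of any existing Accumulo utility.
TABLET_INFO_SCHEMA = {
    "type": "record",
    "name": "TabletInfo",
    "fields": [
        {"name": "files", "type": "int"},
        {"name": "walogs", "type": "int"},
        {"name": "entries", "type": "long"},
        {"name": "sizeBytes", "type": "long"},
        {"name": "status", "type": "string"},
        # Nullable, assuming a tablet may have no current location.
        {"name": "location", "type": ["null", "string"], "default": None},
        {"name": "tableId", "type": "string"},
        # bytes, so an end row containing arbitrary characters is safe;
        # nullable, assuming a table's last tablet has no end row.
        {"name": "endRow", "type": ["null", "bytes"], "default": None},
    ],
}

# Serialize to the JSON text that would go into an .avsc schema file.
schema_json = json.dumps(TABLET_INFO_SCHEMA, indent=2)
print(schema_json)
```

A column added in a later version would be appended to "fields" with a default value, so older data could still be read with the newer schema.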

> Create utility that generates single line tablet information
> ------------------------------------------------------------
>
>                 Key: ACCUMULO-2873
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2873
>             Project: Accumulo
>          Issue Type: New Feature
>            Reporter: Keith Turner
>              Labels: newbie
>             Fix For: 1.7.0
>
>
> It would be very useful to have a utility that generates single line tablet info.  The output of this could be fed to sort, awk, grep, etc. in order to answer questions like which tablets have the most files.
> The output could look something like the following
> {noformat}
> $accumulo admin listTablets --table bigTable3
> #files #walogs #entries #size #status #location #tableid #endrow
> 6 2 40,001 50M ASSIGNED 10.1.9.9 4:9997[abc]  3 admin
> 3 1 50,002 40M ASSIGNED 10.1.9.9 5:9997[abc]  3 helpful
> {noformat}
> All of the information can be obtained by scanning the metadata table and looking into zookeeper.  Could possibly contact tablet servers to get info about entries in memory.
> The order of the columns in the example above is arbitrary, except for end row.  Maybe the end row column should come last because it can be of arbitrary length.  Also, the end row could contain any character, so we could look into using a CSV library.  It would be nice to design the utility so that columns can be added in future versions without impacting current scripts that use the utility.
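
To illustrate why putting the end row last helps consumers of the text format, here is a minimal sketch of a parser for output shaped like the example in the description. The sample lines are made up, and the single-space delimiter, plain byte counts for size, and "host:port" location are assumptions, since the real format is not decided. Splitting at most seven times leaves everything after the seventh space in the end row, so an end row containing spaces survives intact.

```python
from typing import Iterable, List, NamedTuple


class TabletLine(NamedTuple):
    files: int
    walogs: int
    entries: int
    size_bytes: int
    status: str
    location: str
    table_id: str
    end_row: str


def parse_line(line: str) -> TabletLine:
    # Split on the first 7 spaces only; the remainder is the end row,
    # which may itself contain spaces.
    parts = line.rstrip("\n").split(" ", 7)
    if len(parts) != 8:
        raise ValueError(f"expected 8 columns, got {len(parts)}: {line!r}")
    files, walogs, entries, size, status, location, table_id, end_row = parts
    return TabletLine(
        int(files),
        int(walogs),
        int(entries.replace(",", "")),  # tolerate thousands separators
        int(size),
        status,
        location,
        table_id,
        end_row,
    )


def most_files(lines: Iterable[str], n: int = 1) -> List[TabletLine]:
    # Skip blank lines and the '#' header, then rank tablets by file count.
    rows = [parse_line(l) for l in lines if l and not l.startswith("#")]
    return sorted(rows, key=lambda r: r.files, reverse=True)[:n]


# Made-up sample in the assumed format (not real utility output).
SAMPLE = """\
#files #walogs #entries #size #status #location #tableid #endrow
6 2 40,001 52428800 ASSIGNED 10.1.9.9:9997 3 admin
3 1 50,002 41943040 ASSIGNED 10.1.9.9:9997 3 helpful row
"""

top = most_files(SAMPLE.splitlines())
print(top[0].end_row, top[0].files)
```

This only copes with spaces in the end row; newlines or other binary data in a row would still need an escaping scheme or a CSV library, as the description suggests.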



--
This message was sent by Atlassian JIRA
(v6.2#6252)