Posted to reviews@kudu.apache.org by "Dan Burkert (Code Review)" <ge...@cloudera.org> on 2017/12/22 03:48:17 UTC

[kudu-CR] Add 'kudu fs list' tool

Dan Burkert has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8911 )


Change subject: Add 'kudu fs list' tool
......................................................................

Add 'kudu fs list' tool

This tool aims to replace most of the pure-metadata usages of
'kudu fs dump' and 'kudu local_replica dump'. 'kudu fs list' is more
flexible, easier to use, and can show more information.

Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
---
M src/kudu/tablet/rowset_metadata.h
M src/kudu/tools/tool_action.cc
M src/kudu/tools/tool_action_fs.cc
M src/kudu/tools/tool_action_tserver.cc
4 files changed, 450 insertions(+), 12 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/11/8911/1
-- 
To view, visit http://gerrit.cloudera.org:8080/8911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Gerrit-Change-Number: 8911
Gerrit-PatchSet: 1
Gerrit-Owner: Dan Burkert <da...@apache.org>

[kudu-CR] Add 'kudu fs list' tool

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/8911 )

Change subject: Add 'kudu fs list' tool
......................................................................


Patch Set 6:

(2 comments)

One question and one typo, then I'm +2

http://gerrit.cloudera.org:8080/#/c/8911/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8911/6//COMMIT_MSG@9
PS6, Line 9: replace exploratory usages of 'kudu fs dump' and
           : 'kudu local_replica dump'
What's our stability policy for tools? Should we add (or plan to add after this tool has been around for a bit) a deprecation warning message to fs dump and local_replica dump?


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc
File src/kudu/tools/tool_action_fs.cc:

http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@85
PS6, Line 85: tablet
rowset




Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Gerrit-Change-Number: 8911
Gerrit-PatchSet: 6
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Wed, 27 Dec 2017 20:29:36 +0000
Gerrit-HasComments: Yes

[kudu-CR] Add 'kudu fs list' tool

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/8911 )

Change subject: Add 'kudu fs list' tool
......................................................................


Patch Set 6:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/8911/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8911/6//COMMIT_MSG@9
PS6, Line 9: replace exploratory usages of 'kudu fs dump' and
           : 'kudu local_replica dump'
> What's our stability policy for tools? Should we add (or plan to add after 
I'm not sure. I think they aren't considered as stable as other APIs, but I would definitely shy away from removing one without deprecation. In this case the new tool doesn't completely replace the functionality of the dump tools; instead, it's meant as a kind of go-to Swiss Army knife for exploring on-disk metadata. The dump tools are still useful when the exact data on disk needs to be accessed.


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/kudu-tool-test.cc@404
PS6, Line 404:     const vector<string> kFsModeRegexes = {
> This should be updated.
Done


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc
File src/kudu/tools/tool_action_fs.cc:

http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@85
PS6, Line 85: tablet
> rowset
Done


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@396
PS6, Line 396:   switch (group) {
> What happens if I add an entry to FieldGroup but forgot to update this swit
We get a warning.  I've also added the LOG(FATAL) to make sure it doesn't go off the rails.
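
For illustration, the pattern described in this reply can be sketched as follows. This is a minimal sketch, not the code from the patch: the `FieldGroup` enumerators and the `FieldGroupName` function are hypothetical, and `std::abort()` stands in for Kudu's `LOG(FATAL)`. It assumes a compiler with `-Wswitch` enabled (included in `-Wall` for GCC and Clang).

```cpp
#include <cassert>
#include <cstdlib>
#include <iostream>
#include <string>

// Hypothetical stand-in for the tool's FieldGroup enum; the real
// enumerators in tool_action_fs.cc may differ.
enum class FieldGroup { kTablet, kRowset, kBlock, kCFile };

// There is deliberately no `default:` label, so -Wswitch can flag any
// enumerator that is added later without a matching case.
std::string FieldGroupName(FieldGroup group) {
  switch (group) {
    case FieldGroup::kTablet: return "tablet";
    case FieldGroup::kRowset: return "rowset";
    case FieldGroup::kBlock:  return "block";
    case FieldGroup::kCFile:  return "cfile";
  }
  // Reached only if `group` holds a value with no case above. Kudu code
  // would use LOG(FATAL); std::abort() stands in for it in this sketch.
  std::cerr << "unhandled FieldGroup value" << std::endl;
  std::abort();
}
```

Leaving out `default:` is what keeps the compiler warning alive; the abort after the switch only covers the case where the warning was ignored.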


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@415
PS6, Line 415:   static const Field kFieldVariants[] = {
> Could you use the enum macros from gutil/casts.h to avoid this?
This doesn't appear to be possible with enum classes.


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@483
PS6, Line 483: // Returns rowset info for the field.
> Worth logging the min/max keys of the rowset?
It turns out that the min/max keys aren't part of the RowSetMetadata class, so it would require some non-trivial changes to expose them.  I'd prefer to leave that to a follow-up commit if we want it.  I have gone ahead and exposed per-cfile min/max key, since that was easier, as well as cfile delta stats.


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@584
PS6, Line 584:     CHECK_OK(fs_manager->OpenBlock(block, &readable_block));
> CHECK_OK seems wrong for a CLI tool; why not RETURN_NOT_OK and return an er
Done


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@620
PS6, Line 620:     tablet_ids.emplace_back(FLAGS_tablet_id);
> Do we need to ToLowerCase this?
Done
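
A minimal sketch of the normalization being asked about, assuming tablet IDs are ASCII hex strings. `NormalizeTabletId` is a hypothetical helper written with the standard library; it is not the `ToLowerCase` utility the reviewer refers to.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Returns a lowercased copy of a tablet ID so that a user-supplied value
// such as "C3CE..." compares equal to the on-disk "c3ce...".
std::string NormalizeTabletId(std::string id) {
  std::transform(id.begin(), id.end(), id.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return id;
}
```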


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@633
PS6, Line 633:     WARN_NOT_OK(TabletMetadata::Load(&fs_manager, tablet_id, &tablet_metadata),
> Why not RETURN_NOT_OK on this?
My reasoning was that if there is a corrupt tablet for whatever reason, we still want the tool to work. However, it can still work by filtering using the --tablet-id flag, so I've changed it to RETURN_NOT_OK.
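
The trade-off discussed here (and in the CHECK_OK comment above) can be sketched with a toy status type. Everything below is an assumption made for illustration: `Status` is a minimal stand-in for `kudu::Status`, the macro is a simplified analogue of Kudu's `RETURN_NOT_OK`, and `LoadTablet` / `ListTablets` are hypothetical functions, not the ones in the patch.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Minimal stand-in for kudu::Status; the real class is much richer.
class Status {
 public:
  static Status OK() { return Status(true, ""); }
  static Status Corruption(std::string msg) {
    return Status(false, std::move(msg));
  }
  bool ok() const { return ok_; }
  const std::string& message() const { return msg_; }

 private:
  Status(bool ok, std::string msg) : ok_(ok), msg_(std::move(msg)) {}
  bool ok_;
  std::string msg_;
};

// Simplified analogue of RETURN_NOT_OK: propagate the first failure to the
// caller instead of crashing (CHECK_OK) or only logging (WARN_NOT_OK).
#define RETURN_NOT_OK(expr)       \
  do {                            \
    const Status s_ = (expr);     \
    if (!s_.ok()) return s_;      \
  } while (0)

// Hypothetical loader that pretends tablet "bad" has corrupt metadata.
Status LoadTablet(const std::string& tablet_id) {
  if (tablet_id == "bad") {
    return Status::Corruption("corrupt tablet " + tablet_id);
  }
  return Status::OK();
}

// Stops at the first tablet that fails to load and returns its error.
// *num_listed reports how many tablets were processed before that point.
Status ListTablets(const std::vector<std::string>& tablet_ids,
                   int* num_listed) {
  *num_listed = 0;
  for (const auto& id : tablet_ids) {
    RETURN_NOT_OK(LoadTablet(id));
    ++*num_listed;
  }
  return Status::OK();
}
```

With this shape, a corrupt tablet makes the whole listing fail with a descriptive error, and the user can still inspect the healthy tablets by passing only their IDs.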


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@652
PS6, Line 652:       if (FLAGS_rowset_id > 0 && FLAGS_rowset_id != rowset.id()) {
> Shouldn't this be FLAGS_rowset_id != -1 && FLAGS_rowset_id != rowset.id() ?
Done
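
To make the difference between the two conditions concrete, here is a small sketch. It assumes that -1 is the "no filter" default of the rowset ID flag, as the reviewer's suggestion implies; the function names and the constant are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed sentinel meaning "no --rowset-id filter was given".
constexpr int64_t kNoRowsetFilter = -1;

// Suggested condition: skip a rowset only when a filter is set and the
// rowset does not match it. Rowset 0 remains filterable.
bool ShouldSkipRowset(int64_t flag_rowset_id, int64_t rowset_id) {
  return flag_rowset_id != kNoRowsetFilter && flag_rowset_id != rowset_id;
}

// Patch Set 6 condition, kept for comparison: a filter value of 0 is
// treated as "no filter", so asking for rowset 0 lists every rowset.
bool ShouldSkipRowsetOld(int64_t flag_rowset_id, int64_t rowset_id) {
  return flag_rowset_id > 0 && flag_rowset_id != rowset_id;
}
```

Since rowset IDs start at 0 in the example output of the commit message, the `> 0` form would make it impossible to select only the first rowset.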


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@688
PS6, Line 688:       // TODO(dan): should orphaned blocks be included, perhaps behind a flag?
> Perhaps, but this comment is slightly misplaced; orphaned blocks are a tabl
Done


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@786
PS6, Line 786:                                    "Possible values: table, table-id, tablet-id, partition, "
> Seems like it might be easy to accidentally omit an entry; could we constru
Done
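
One way to build such a list on the fly is sketched below. It is not the code from the patch: the `Field` enumerators shown are a hypothetical subset, and `ToString` and `PossibleValuesHelp` are illustrative names. The idea is only that the help text is derived from the same array of variants that the rest of the tool iterates over.

```cpp
#include <cassert>
#include <string>

// Hypothetical subset of the tool's Field enum class.
enum class Field { kTable, kTableId, kTabletId, kPartition };

// Single list of enumerators, in the spirit of kFieldVariants in the patch.
static const Field kFieldVariants[] = {
    Field::kTable, Field::kTableId, Field::kTabletId, Field::kPartition};

// Maps an enumerator to its spelling on the command line.
const char* ToString(Field field) {
  switch (field) {
    case Field::kTable:     return "table";
    case Field::kTableId:   return "table-id";
    case Field::kTabletId:  return "tablet-id";
    case Field::kPartition: return "partition";
  }
  return "<unknown>";
}

// Builds the help text by iterating over kFieldVariants, so a field that
// is added to the array shows up in the help automatically.
std::string PossibleValuesHelp() {
  std::string help = "Possible values: ";
  bool first = true;
  for (Field field : kFieldVariants) {
    if (!first) help += ", ";
    help += ToString(field);
    first = false;
  }
  return help;
}
```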




Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Gerrit-Change-Number: 8911
Gerrit-PatchSet: 6
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Sat, 06 Jan 2018 00:28:53 +0000
Gerrit-HasComments: Yes

[kudu-CR] Add 'kudu fs list' tool

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/8911 )

Change subject: Add 'kudu fs list' tool
......................................................................


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8911/8/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/8911/8/src/kudu/tools/kudu-tool-test.cc@409
PS8, Line 409:         "list.*List metadata for on-disk tablets, rowsets, blocks"
Nit: could you preserve the alphabetical ordering that was here previously?




Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Gerrit-Change-Number: 8911
Gerrit-PatchSet: 9
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Sat, 06 Jan 2018 00:33:41 +0000
Gerrit-HasComments: Yes

[kudu-CR] Add 'kudu fs list' tool

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8911 )

Change subject: Add 'kudu fs list' tool
......................................................................

Add 'kudu fs list' tool

This tool aims to replace exploratory usages of 'kudu fs dump' and
'kudu local_replica dump' with an improved, unified tool. 'kudu fs list' is
more flexible, easier to use, and can show more information.

Output is formatted using the DataTable abstraction, which gives it
good default pretty-printing, with options to output in CSV and JSON for
scripts. Results can easily be filtered to a specific table, tablet, column,
rowset, or block using flags.

The tool can output many different fields: table, table-id, tablet-id,
partition, rowset-id, block-id, block-kind, column, column-id,
cfile-data-type, cfile-encoding, cfile-compression, cfile-num-values,
cfile-size, cfile-incompatible-features, cfile-compatible-features,
cfile-min-key, cfile-max-key, and cfile-delta-stats. More fields should
be straightforward to add.

The tool transparently joins information from tablet superblocks with
CFile footers, only materializing the metadata necessary to satisfy the
requested fields and filters.
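
As a rough illustration of "only materializing the metadata necessary", the sketch below reads a simulated CFile footer only when a requested field needs it. All names are hypothetical (`FakeBlockStore`, `NeedsCFileFooter`, `BuildRow`), and the fixed encoding string is a placeholder; the real tool works against tablet superblocks and CFile readers.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical subset of the fields listed above.
enum class Field { kTabletId, kBlockId, kCFileEncoding };

// True if the field can only be answered from a CFile footer.
bool NeedsCFileFooter(Field f) { return f == Field::kCFileEncoding; }

// Counts simulated footer reads so the saving is observable.
struct FakeBlockStore {
  int footer_reads = 0;
  std::string ReadEncoding(int /*block_id*/) {
    ++footer_reads;
    return "BIT_SHUFFLE";  // placeholder value for the sketch
  }
};

// Builds one output row, touching the footer only if a field requires it.
std::vector<std::string> BuildRow(const std::vector<Field>& fields,
                                  const std::string& tablet_id,
                                  int block_id,
                                  FakeBlockStore* store) {
  const bool need_footer =
      std::any_of(fields.begin(), fields.end(), NeedsCFileFooter);
  std::string encoding;
  if (need_footer) encoding = store->ReadEncoding(block_id);

  std::vector<std::string> row;
  for (Field f : fields) {
    switch (f) {
      case Field::kTabletId:      row.push_back(tablet_id); break;
      case Field::kBlockId:       row.push_back(std::to_string(block_id)); break;
      case Field::kCFileEncoding: row.push_back(encoding); break;
    }
  }
  return row;
}
```

A query for tablet-level fields therefore never opens a block, while a query that includes a `cfile-*` field pays for one footer read per block.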

Examples:

To get our bearings, let's look at what tablets are stored on a local
tablet server:

```bash
$ kudu fs list --fs-wal-dir=/data/kudu/tserver \
    --columns="table, table-id, tablet-id, partition"

                     table                     |             table-id             |            tablet-id             |                        partition
-----------------------------------------------+----------------------------------+----------------------------------+---------------------------------------------------------
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 2a631714f2d243ff92bf525630baa1ec | HASH (key) PARTITION 7, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 36827286a00049bc8b242243c6728157 | HASH (key) PARTITION 0, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 3880b30ccebd4ede867febd9c7d5580f | HASH (key) PARTITION 0, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 39436e9e17d84884b1cb689e88b8415f | HASH (key) PARTITION 5, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 44252efb9aaa4c2c963cf6dd5e875c04 | HASH (key) PARTITION 3, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 57c92ed8391b4d2bbfdeb339f9fb59fd | HASH (key) PARTITION 2, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 68a64aba5917499ebb7773f16bcd6f6d | HASH (key) PARTITION 7, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 6b5f0729a9bf454791239f77b0912f4e | HASH (key) PARTITION 1, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 8a2d120bd6984144ae963bfe8435206e | HASH (key) PARTITION 4, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 8b3ba4f415f945849a6a690a142cf1e4 | HASH (key) PARTITION 5, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 9656be3aa07248a69e3ad6edaa0048cb | HASH (key) PARTITION 1, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 9e8e444d079842a9b4a83ee9f8bed633 | HASH (key) PARTITION 6, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | a794a8e5d3f24e70a96b0beb5a355823 | HASH (key) PARTITION 3, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | bfb8f24b91cd4ecf924aacbb37125041 | HASH (key) PARTITION 2, RANGE (key) PARTITION UNBOUNDED
 foo                                           | e184a99893b44b17a7b2131123c6de0e | c3ce418c72ab4fea8548387f236dd1fa |
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | e00a284081ca468a994a3609a511e886 | HASH (key) PARTITION 4, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | efa22fc899a44bb2a16f620464a15c60 | HASH (key) PARTITION 6, RANGE (key) PARTITION UNBOUNDED
```

The 'foo' table looks interesting; let's drill down into its tablet, and
see what rowsets and blocks it has, and some of their associated metadata:

```bash
$ kudu fs list --fs-wal-dir=/data/kudu/tserver \
    --columns="rowset-id, column, column-id, block-kind, block-id" \
    --tablet-id=c3ce418c72ab4fea8548387f236dd1fa

 rowset-id | column | column-id | block-kind  |    block-id
-----------+--------+-----------+-------------+----------------
 0         | k1     | 10        | column      | 90680632611552
 0         | k2     | 11        | column      | 90680632611553
 0         | k3     | 12        | column      | 90680632611554
 0         | k4     | 13        | column      | 90680632611555
 0         | v1     | 14        | column      | 90680632611556
 0         | v2     | 15        | column      | 90680632611557
 0         | v3     | 16        | column      | 90680632611558
 0         | v4     | 17        | column      | 90680632611559
 0         |        |           | bloom       | 90680632611560
 0         |        |           | adhoc-index | 90680632611561
 1         | k1     | 10        | column      | 90680632611564
 1         | k2     | 11        | column      | 90680632611565
 1         | k3     | 12        | column      | 90680632611566
 1         | k4     | 13        | column      | 90680632611567
 1         | v1     | 14        | column      | 90680632611568
 1         | v2     | 15        | column      | 90680632611569
 1         | v3     | 16        | column      | 90680632611570
 1         | v4     | 17        | column      | 90680632611571
 1         |        |           | bloom       | 90680632611572
 1         |        |           | adhoc-index | 90680632611573
```

We can immediately see that this tablet has two rowsets, each of which
has 8 column blocks, a bloom block, and an ad-hoc index block. Let's
drill down further and inspect the 'v4' column:

```bash
$ kudu fs list --fs-wal-dir=<> \
    --columns="block-id, cfile-data-type, cfile-encoding, cfile-compression, cfile-num-values, cfile-size" \
    --tablet-id=c3ce418c72ab4fea8548387f236dd1fa \
    --column-id=17

    block-id    | cfile-data-type | cfile-encoding | cfile-compression | cfile-num-values | cfile-size
----------------+-----------------+----------------+-------------------+------------------+------------
 90680632611555 | int64           | BIT_SHUFFLE    | NO_COMPRESSION    | 5.09M            | 782.6K
 90680632611567 | int64           | BIT_SHUFFLE    | NO_COMPRESSION    | 5.40M            | 830.1K
```

And we can immediately see the CFile's on-disk encoding and compression,
the number of cells, and the CFile/block size.

Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Reviewed-on: http://gerrit.cloudera.org:8080/8911
Reviewed-by: Adar Dembo <ad...@cloudera.com>
Tested-by: Kudu Jenkins
---
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_reader.h
M src/kudu/gutil/strings/join.h
M src/kudu/tablet/rowset_metadata.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action.cc
M src/kudu/tools/tool_action_fs.cc
M src/kudu/tools/tool_action_tserver.cc
8 files changed, 587 insertions(+), 23 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved
  Kudu Jenkins: Verified


Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Gerrit-Change-Number: 8911
Gerrit-PatchSet: 11
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] Add 'kudu fs list' tool

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/8911 )

Change subject: Add 'kudu fs list' tool
......................................................................


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8911/8/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/8911/8/src/kudu/tools/kudu-tool-test.cc@409
PS8, Line 409:         "list.*List metadata for on-disk tablets, rowsets, blocks"
> Nit: could you preserve the alphabetical ordering that was here previously?
Done




Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Gerrit-Change-Number: 8911
Gerrit-PatchSet: 8
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Sat, 06 Jan 2018 00:40:03 +0000
Gerrit-HasComments: Yes

[kudu-CR] Add 'kudu fs list' tool

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/8911 )

Change subject: Add 'kudu fs list' tool
......................................................................


Patch Set 6:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/kudu-tool-test.cc@404
PS6, Line 404:     const vector<string> kFsModeRegexes = {
This should be updated.


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc
File src/kudu/tools/tool_action_fs.cc:

http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@396
PS6, Line 396:   switch (group) {
What happens if I add an entry to FieldGroup but forget to update this switch? Do I get a compile error? A warning? Nothing?

If it's not a compile error, is there anything we should add here to guarantee good behavior? Like a default statement, a LOG(FATAL), or something like that?


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@415
PS6, Line 415:   static const Field kFieldVariants[] = {
Could you use the enum macros from gutil/casts.h to avoid this?


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@483
PS6, Line 483: // Returns rowset info for the field.
Worth logging the min/max keys of the rowset?


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@584
PS6, Line 584:     CHECK_OK(fs_manager->OpenBlock(block, &readable_block));
CHECK_OK seems wrong for a CLI tool; why not RETURN_NOT_OK and return an error from List if one of these fails?


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@620
PS6, Line 620:     tablet_ids.emplace_back(FLAGS_tablet_id);
Do we need to ToLowerCase this?


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@633
PS6, Line 633:     WARN_NOT_OK(TabletMetadata::Load(&fs_manager, tablet_id, &tablet_metadata),
Why not RETURN_NOT_OK on this?


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@652
PS6, Line 652:       if (FLAGS_rowset_id > 0 && FLAGS_rowset_id != rowset.id()) {
Shouldn't this be FLAGS_rowset_id != -1 && FLAGS_rowset_id != rowset.id() ?


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@688
PS6, Line 688:       // TODO(dan): should orphaned blocks be included, perhaps behind a flag?
Perhaps, but this comment is slightly misplaced; orphaned blocks are a tablet-level thing, not a rowset-level thing, so the comment should be outside the inner (rowset_metadata) loop.


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@786
PS6, Line 786:                                    "Possible values: table, table-id, tablet-id, partition, "
Seems like it might be easy to accidentally omit an entry; could we construct this list on-the-fly by iterating on the Field enum class?




Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Gerrit-Change-Number: 8911
Gerrit-PatchSet: 6
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Thu, 04 Jan 2018 00:22:53 +0000
Gerrit-HasComments: Yes

[kudu-CR] Add 'kudu fs list' tool

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Hello Will Berkeley, Tidy Bot, Mike Percy, Kudu Jenkins, Adar Dembo, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8911

to look at the new patch set (#7).

Change subject: Add 'kudu fs list' tool
......................................................................

Add 'kudu fs list' tool

This tool aims to replace exploratory usages of 'kudu fs dump' and
'kudu local_replica dump' with an improved, unified tool. 'kudu fs list' is
more flexible, easier to use, and can show more information.

Output is formatted using the DataTable abstraction, which gives it
good default pretty-printing, with options to output in CSV and JSON for
scripts. Results can easily be filtered to a specific table, tablet, column,
rowset, or block using flags.

The tool can output many different fields: table, table-id, tablet-id,
partition, rowset-id, block-id, block-kind, column, column-id,
cfile-data-type, cfile-encoding, cfile-compression, cfile-num-values,
cfile-size, cfile-incompatible-features, cfile-compatible-features,
cfile-min-key, cfile-max-key, and cfile-delta-stats. More fields should
be straightforward to add.

The tool transparently joins information from tablet superblocks with
CFile footers, only materializing the metadata necessary to satisfy the
requested fields and filters.

Examples:

To get our bearings, let's look at what tablets are stored on a local
tablet server:

```bash
$ kudu fs list --fs-wal-dir=/data/kudu/tserver \
    --columns="table, table-id, tablet-id, partition"

                     table                     |             table-id             |            tablet-id             |                        partition
-----------------------------------------------+----------------------------------+----------------------------------+---------------------------------------------------------
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 2a631714f2d243ff92bf525630baa1ec | HASH (key) PARTITION 7, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 36827286a00049bc8b242243c6728157 | HASH (key) PARTITION 0, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 3880b30ccebd4ede867febd9c7d5580f | HASH (key) PARTITION 0, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 39436e9e17d84884b1cb689e88b8415f | HASH (key) PARTITION 5, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 44252efb9aaa4c2c963cf6dd5e875c04 | HASH (key) PARTITION 3, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 57c92ed8391b4d2bbfdeb339f9fb59fd | HASH (key) PARTITION 2, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 68a64aba5917499ebb7773f16bcd6f6d | HASH (key) PARTITION 7, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 6b5f0729a9bf454791239f77b0912f4e | HASH (key) PARTITION 1, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 8a2d120bd6984144ae963bfe8435206e | HASH (key) PARTITION 4, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 8b3ba4f415f945849a6a690a142cf1e4 | HASH (key) PARTITION 5, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 9656be3aa07248a69e3ad6edaa0048cb | HASH (key) PARTITION 1, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 9e8e444d079842a9b4a83ee9f8bed633 | HASH (key) PARTITION 6, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | a794a8e5d3f24e70a96b0beb5a355823 | HASH (key) PARTITION 3, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | bfb8f24b91cd4ecf924aacbb37125041 | HASH (key) PARTITION 2, RANGE (key) PARTITION UNBOUNDED
 foo                                           | e184a99893b44b17a7b2131123c6de0e | c3ce418c72ab4fea8548387f236dd1fa |
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | e00a284081ca468a994a3609a511e886 | HASH (key) PARTITION 4, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | efa22fc899a44bb2a16f620464a15c60 | HASH (key) PARTITION 6, RANGE (key) PARTITION UNBOUNDED
```

The 'foo' table looks interesting; let's drill down into its tablet, and
see what rowsets and blocks it has, and some of their associated metadata:

```bash
$ kudu fs list --fs-wal-dir=/data/kudu/tserver \
    --columns="rowset-id, column, column-id, block-kind, block-id" \
    --tablet-id=c3ce418c72ab4fea8548387f236dd1fa

 rowset-id | column | column-id | block-kind  |    block-id
-----------+--------+-----------+-------------+----------------
 0         | k1     | 10        | column      | 90680632611552
 0         | k2     | 11        | column      | 90680632611553
 0         | k3     | 12        | column      | 90680632611554
 0         | k4     | 13        | column      | 90680632611555
 0         | v1     | 14        | column      | 90680632611556
 0         | v2     | 15        | column      | 90680632611557
 0         | v3     | 16        | column      | 90680632611558
 0         | v4     | 17        | column      | 90680632611559
 0         |        |           | bloom       | 90680632611560
 0         |        |           | adhoc-index | 90680632611561
 1         | k1     | 10        | column      | 90680632611564
 1         | k2     | 11        | column      | 90680632611565
 1         | k3     | 12        | column      | 90680632611566
 1         | k4     | 13        | column      | 90680632611567
 1         | v1     | 14        | column      | 90680632611568
 1         | v2     | 15        | column      | 90680632611569
 1         | v3     | 16        | column      | 90680632611570
 1         | v4     | 17        | column      | 90680632611571
 1         |        |           | bloom       | 90680632611572
 1         |        |           | adhoc-index | 90680632611573
```

We can immediately see that this tablet has two rowsets, each of which
has 8 column blocks, a bloom block, and an ad-hoc index block. Let's
drill down further and inspect the 'v4' column:

```bash
$ kudu fs list --fs-wal-dir=<> \
    --columns="block-id, cfile-data-type, cfile-encoding, cfile-compression, cfile-num-values, cfile-size" \
    --tablet-id=c3ce418c72ab4fea8548387f236dd1fa \
    --column-id=17

    block-id    | cfile-data-type | cfile-encoding | cfile-compression | cfile-num-values | cfile-size
----------------+-----------------+----------------+-------------------+------------------+------------
 90680632611555 | int64           | BIT_SHUFFLE    | NO_COMPRESSION    | 5.09M            | 782.6K
 90680632611567 | int64           | BIT_SHUFFLE    | NO_COMPRESSION    | 5.40M            | 830.1K
```

And we can immediately see the CFile's on-disk encoding and compression,
the number of cells, and the CFile/block size.

Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
---
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_reader.h
M src/kudu/gutil/strings/join.h
M src/kudu/tablet/rowset_metadata.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action.cc
M src/kudu/tools/tool_action_fs.cc
M src/kudu/tools/tool_action_tserver.cc
8 files changed, 583 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/11/8911/7

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Gerrit-Change-Number: 8911
Gerrit-PatchSet: 7
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] Add 'kudu fs list' tool

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/8911 )

Change subject: Add 'kudu fs list' tool
......................................................................


Patch Set 3:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/8911/2/src/kudu/tools/tool_action_fs.cc
File src/kudu/tools/tool_action_fs.cc:

http://gerrit.cloudera.org:8080/#/c/8911/2/src/kudu/tools/tool_action_fs.cc@580
PS2, Line 580: 
> Further below I got confused for a second because InfoRow returns a row to 
Done


http://gerrit.cloudera.org:8080/#/c/8911/2/src/kudu/tools/tool_action_fs.cc@612
PS2, Line 612: 
> nit: missing space
Done


http://gerrit.cloudera.org:8080/#/c/8911/2/src/kudu/tools/tool_action_fs.cc@645
PS2, Line 645:       table.AddRow(BuildInfoRow(TabletInfo, fields, tablet));
             :       continue;
             :    
> I think we can get rid of this and simplify a little bit since the for loop
Done


http://gerrit.cloudera.org:8080/#/c/8911/2/src/kudu/tools/tool_action_fs.cc@666
PS2, Line 666:       AddBlockInfoRow(&ta
> Would it be simpler to do
great idea


http://gerrit.cloudera.org:8080/#/c/8911/2/src/kudu/tools/tool_action_fs.cc@674
PS2, Line 674: for (const auto& block :
> ditto (mutatis mutandis)
Done


http://gerrit.cloudera.org:8080/#/c/8911/2/src/kudu/tools/tool_action_fs.cc@703
PS2, Line 703: ter("fs_wal_dir")
> I'd say only by setting a flag to show them
I'm going to leave this TODO here, since I don't think it's crucial for a V1, but it's maybe something we'd eventually want to add.



-- 
To view, visit http://gerrit.cloudera.org:8080/8911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Gerrit-Change-Number: 8911
Gerrit-PatchSet: 3
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Fri, 22 Dec 2017 21:37:05 +0000
Gerrit-HasComments: Yes

[kudu-CR] Add 'kudu fs list' tool

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/8911 )

Change subject: Add 'kudu fs list' tool
......................................................................


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/8911/2/src/kudu/tools/tool_action_fs.cc
File src/kudu/tools/tool_action_fs.cc:

http://gerrit.cloudera.org:8080/#/c/8911/2/src/kudu/tools/tool_action_fs.cc@580
PS2, Line 580: BlockInfoRow
Further below I got confused for a second because InfoRow returns a row to add to a DataTable while BlockInfoRow adds the row to the DataTable (and returns nothing). Could you rename BlockInfoRow accordingly? Maybe "AddBlockInfoRow".
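
To make the naming distinction concrete, here is a minimal sketch. The `Table` struct and both function bodies are hypothetical stand-ins, not Kudu's actual DataTable or the code in this patch; only the names InfoRow and AddBlockInfoRow come from the comment above.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a table of string rows; not Kudu's DataTable.
struct Table {
  std::vector<std::vector<std::string>> rows;
  void AddRow(std::vector<std::string> row) { rows.push_back(std::move(row)); }
};

// Returns a row and leaves it to the caller to add it to a table.
std::vector<std::string> InfoRow(const std::string& tablet_id) {
  return {tablet_id};
}

// Appends a row to the table itself and returns nothing, so the name
// starts with "Add" to signal the side effect.
void AddBlockInfoRow(Table* table, const std::string& tablet_id,
                     const std::string& block_id) {
  table->AddRow({tablet_id, block_id});
}
```

With these names, a call site such as `table.AddRow(InfoRow(...))` next to `AddBlockInfoRow(&table, ...)` makes it clear which call mutates the table.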


http://gerrit.cloudera.org:8080/#/c/8911/2/src/kudu/tools/tool_action_fs.cc@612
PS2, Line 612: ,"
nit: missing space


http://gerrit.cloudera.org:8080/#/c/8911/2/src/kudu/tools/tool_action_fs.cc@645
PS2, Line 645:   if (tablet_ids.empty()) {
             :     return table.PrintTo(cout);
             :   }
I think we can get rid of this and simplify a little bit since the for loop below will have 0 iterations if tablet_ids is empty.
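
A small sketch of why the guard is redundant, assuming a hypothetical `BuildRows` helper rather than the real tool code: a range-based for loop over an empty container runs zero times, so the empty case falls through to the same return as the general case.

```cpp
#include <string>
#include <vector>

// Hypothetical helper that builds one output row per tablet id.
// There is no explicit tablet_ids.empty() check: the loop body simply
// never runs for an empty input, and an empty result is returned.
std::vector<std::string> BuildRows(const std::vector<std::string>& tablet_ids) {
  std::vector<std::string> rows;
  for (const auto& id : tablet_ids) {
    rows.push_back("tablet " + id);
  }
  return rows;
}
```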


http://gerrit.cloudera.org:8080/#/c/8911/2/src/kudu/tools/tool_action_fs.cc@666
PS2, Line 666: if (requires_rowset_info)
Would it be simpler to do

  if (!requires_rowset_info) {
    table.AddRow(InfoRow(TabletInfo, columns, tablet));
    continue;
  }

so the level of indentation is reduced and it's more obvious that everything below in the loop body depends on rowset info?


http://gerrit.cloudera.org:8080/#/c/8911/2/src/kudu/tools/tool_action_fs.cc@674
PS2, Line 674: if (requires_block_info)
ditto (mutatis mutandis)


http://gerrit.cloudera.org:8080/#/c/8911/2/src/kudu/tools/tool_action_fs.cc@703
PS2, Line 703: should the tablet's orphaned blocks be included
I'd say only by setting a flag to show them



-- 
To view, visit http://gerrit.cloudera.org:8080/8911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Gerrit-Change-Number: 8911
Gerrit-PatchSet: 2
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Fri, 22 Dec 2017 17:46:36 +0000
Gerrit-HasComments: Yes

[kudu-CR] Add 'kudu fs list' tool

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/8911 )

Change subject: Add 'kudu fs list' tool
......................................................................


Patch Set 10: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/8911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Gerrit-Change-Number: 8911
Gerrit-PatchSet: 10
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Sat, 06 Jan 2018 00:50:15 +0000
Gerrit-HasComments: No

[kudu-CR] Add 'kudu fs list' tool

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Hello Will Berkeley, Tidy Bot, Mike Percy, Kudu Jenkins, Adar Dembo, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8911

to look at the new patch set (#2).

Change subject: Add 'kudu fs list' tool
......................................................................

Add 'kudu fs list' tool

This tool aims to replace exploratory usages of 'kudu fs dump' and
'kudu local_replica dump' with an improved, unified tool. 'kudu fs list' is
more flexible, easier to use, and can show more information.

Output is formatted using the DataTable abstraction, which gives it
good default pretty-printing, with options to output in CSV and JSON for
scripts. Results can easily be filtered to a specific table, tablet, column,
rowset, or block using flags.

The tool can output many different fields: table, table-id, tablet-id,
partition, rowset-id, block-id, block-kind, column, column-id,
cfile-data-type, cfile-encoding, cfile-compression, cfile-num-values,
cfile-size, cfile-incompatible-features, and cfile-compatible-features.
More fields should be straightforward to add.

The tool transparently joins information from tablet superblocks with
CFile footers, only materializing the metadata necessary to satisfy the
requested fields and filters.
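
As a rough illustration of materializing only the necessary metadata, the sketch below computes the deepest metadata level that a set of requested fields needs. The `Level` enum, the function names, and the assignment of each field to a level are assumptions made for illustration; they are not taken from the patch.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical metadata levels, ordered from cheapest to most expensive.
enum class Level { kTablet = 0, kRowset = 1, kBlock = 2, kCFile = 3 };

// Assumed mapping from a field name to the deepest level it requires.
Level LevelForField(const std::string& field) {
  // Fields prefixed with "cfile-" are assumed to need the CFile footer.
  if (field.rfind("cfile-", 0) == 0) return Level::kCFile;
  if (field == "block-id" || field == "block-kind" ||
      field == "column" || field == "column-id") {
    return Level::kBlock;
  }
  if (field == "rowset-id") return Level::kRowset;
  return Level::kTablet;  // table, table-id, tablet-id, partition
}

// Returns the deepest level needed by any requested field, so that
// anything below that level never has to be read.
Level RequiredLevel(const std::vector<std::string>& fields) {
  Level required = Level::kTablet;
  for (const auto& field : fields) {
    required = std::max(required, LevelForField(field));
  }
  return required;
}
```

Under these assumptions, a request for only "table, tablet-id" stops at the tablet superblock, while adding "cfile-size" forces the CFile footers to be opened.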

Examples:

To get our bearings, let's look at what tablets are stored on a local
tablet server:

```bash
$ kudu fs list --fs-wal-dir=/data/kudu/tserver \
    --columns="table, table-id, tablet-id, partition"

                     table                     |             table-id             |            tablet-id             |                        partition
-----------------------------------------------+----------------------------------+----------------------------------+---------------------------------------------------------
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 2a631714f2d243ff92bf525630baa1ec | HASH (key) PARTITION 7, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 36827286a00049bc8b242243c6728157 | HASH (key) PARTITION 0, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 3880b30ccebd4ede867febd9c7d5580f | HASH (key) PARTITION 0, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 39436e9e17d84884b1cb689e88b8415f | HASH (key) PARTITION 5, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 44252efb9aaa4c2c963cf6dd5e875c04 | HASH (key) PARTITION 3, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 57c92ed8391b4d2bbfdeb339f9fb59fd | HASH (key) PARTITION 2, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 68a64aba5917499ebb7773f16bcd6f6d | HASH (key) PARTITION 7, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 6b5f0729a9bf454791239f77b0912f4e | HASH (key) PARTITION 1, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 8a2d120bd6984144ae963bfe8435206e | HASH (key) PARTITION 4, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 8b3ba4f415f945849a6a690a142cf1e4 | HASH (key) PARTITION 5, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 9656be3aa07248a69e3ad6edaa0048cb | HASH (key) PARTITION 1, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 9e8e444d079842a9b4a83ee9f8bed633 | HASH (key) PARTITION 6, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | a794a8e5d3f24e70a96b0beb5a355823 | HASH (key) PARTITION 3, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | bfb8f24b91cd4ecf924aacbb37125041 | HASH (key) PARTITION 2, RANGE (key) PARTITION UNBOUNDED
 foo                                           | e184a99893b44b17a7b2131123c6de0e | c3ce418c72ab4fea8548387f236dd1fa |
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | e00a284081ca468a994a3609a511e886 | HASH (key) PARTITION 4, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | efa22fc899a44bb2a16f620464a15c60 | HASH (key) PARTITION 6, RANGE (key) PARTITION UNBOUNDED
```

The 'foo' table looks interesting; let's drill down into its tablet, and
see what rowsets and blocks it has, and some of their associated metadata:

```bash
$ kudu fs list --fs-wal-dir=/data/kudu/tserver \
    --columns="rowset-id, column, column-id, block-kind, block-id" \
    --tablet-id=c3ce418c72ab4fea8548387f236dd1fa

 rowset-id | column | column-id | block-kind  |    block-id
-----------+--------+-----------+-------------+----------------
 0         | k1     | 10        | column      | 90680632611552
 0         | k2     | 11        | column      | 90680632611553
 0         | k3     | 12        | column      | 90680632611554
 0         | k4     | 13        | column      | 90680632611555
 0         | v1     | 14        | column      | 90680632611556
 0         | v2     | 15        | column      | 90680632611557
 0         | v3     | 16        | column      | 90680632611558
 0         | v4     | 17        | column      | 90680632611559
 0         |        |           | bloom       | 90680632611560
 0         |        |           | adhoc-index | 90680632611561
 1         | k1     | 10        | column      | 90680632611564
 1         | k2     | 11        | column      | 90680632611565
 1         | k3     | 12        | column      | 90680632611566
 1         | k4     | 13        | column      | 90680632611567
 1         | v1     | 14        | column      | 90680632611568
 1         | v2     | 15        | column      | 90680632611569
 1         | v3     | 16        | column      | 90680632611570
 1         | v4     | 17        | column      | 90680632611571
 1         |        |           | bloom       | 90680632611572
 1         |        |           | adhoc-index | 90680632611573
```

We can immediately see that this tablet has two rowsets, each of which
has 8 column blocks, a bloom block, and an ad-hoc index block. Let's
drill down further and inspect the 'v4' column:

```bash
$ kudu fs list --fs-wal-dir=<> \
    --columns="block-id, cfile-data-type, cfile-encoding, cfile-compression, cfile-num-values, cfile-size" \
    --tablet-id=c3ce418c72ab4fea8548387f236dd1fa \
    --column-id=17

    block-id    | cfile-data-type | cfile-encoding | cfile-compression | cfile-num-values | cfile-size
----------------+-----------------+----------------+-------------------+------------------+------------
 90680632611555 | int64           | BIT_SHUFFLE    | NO_COMPRESSION    | 5.09M            | 782.6K
 90680632611567 | int64           | BIT_SHUFFLE    | NO_COMPRESSION    | 5.40M            | 830.1K
```

And we can immediately see the CFile's on-disk encoding and compression,
the number of cells, and the CFile/block size.

Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
---
M src/kudu/tablet/rowset_metadata.h
M src/kudu/tools/tool_action.cc
M src/kudu/tools/tool_action_fs.cc
M src/kudu/tools/tool_action_tserver.cc
4 files changed, 473 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/11/8911/2
-- 
To view, visit http://gerrit.cloudera.org:8080/8911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Gerrit-Change-Number: 8911
Gerrit-PatchSet: 2
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] Add 'kudu fs list' tool

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Hello Will Berkeley, Tidy Bot, Mike Percy, Kudu Jenkins, Adar Dembo, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8911

to look at the new patch set (#8).

Change subject: Add 'kudu fs list' tool
......................................................................

Add 'kudu fs list' tool

This tool aims to replace exploratory usages of 'kudu fs dump' and
'kudu local_replica dump' with an improved, unified tool. 'kudu fs list' is
more flexible, easier to use, and can show more information.

Output is formatted using the DataTable abstraction, which gives it
good default pretty-printing, with options to output in CSV and JSON for
scripts. Results can easily be filtered to a specific table, tablet, column,
rowset, or block using flags.

The tool can output many different fields: table, table-id, tablet-id,
partition, rowset-id, block-id, block-kind, column, column-id,
cfile-data-type, cfile-encoding, cfile-compression, cfile-num-values,
cfile-size, cfile-incompatible-features, cfile-compatible-features,
cfile-min-key, cfile-max-key, and cfile-delta-stats. More fields should
be straightforward to add.

The tool transparently joins information from tablet superblocks with
CFile footers, only materializing the metadata necessary to satisfy the
requested fields and filters.

Examples:

To get our bearings, let's look at what tablets are stored on a local
tablet server:

```bash
$ kudu fs list --fs-wal-dir=/data/kudu/tserver \
    --columns="table, table-id, tablet-id, partition"

                     table                     |             table-id             |            tablet-id             |                        partition
-----------------------------------------------+----------------------------------+----------------------------------+---------------------------------------------------------
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 2a631714f2d243ff92bf525630baa1ec | HASH (key) PARTITION 7, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 36827286a00049bc8b242243c6728157 | HASH (key) PARTITION 0, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 3880b30ccebd4ede867febd9c7d5580f | HASH (key) PARTITION 0, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 39436e9e17d84884b1cb689e88b8415f | HASH (key) PARTITION 5, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 44252efb9aaa4c2c963cf6dd5e875c04 | HASH (key) PARTITION 3, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 57c92ed8391b4d2bbfdeb339f9fb59fd | HASH (key) PARTITION 2, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 68a64aba5917499ebb7773f16bcd6f6d | HASH (key) PARTITION 7, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 6b5f0729a9bf454791239f77b0912f4e | HASH (key) PARTITION 1, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 8a2d120bd6984144ae963bfe8435206e | HASH (key) PARTITION 4, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 8b3ba4f415f945849a6a690a142cf1e4 | HASH (key) PARTITION 5, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 9656be3aa07248a69e3ad6edaa0048cb | HASH (key) PARTITION 1, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 9e8e444d079842a9b4a83ee9f8bed633 | HASH (key) PARTITION 6, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | a794a8e5d3f24e70a96b0beb5a355823 | HASH (key) PARTITION 3, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | bfb8f24b91cd4ecf924aacbb37125041 | HASH (key) PARTITION 2, RANGE (key) PARTITION UNBOUNDED
 foo                                           | e184a99893b44b17a7b2131123c6de0e | c3ce418c72ab4fea8548387f236dd1fa |
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | e00a284081ca468a994a3609a511e886 | HASH (key) PARTITION 4, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | efa22fc899a44bb2a16f620464a15c60 | HASH (key) PARTITION 6, RANGE (key) PARTITION UNBOUNDED
```

The 'foo' table looks interesting; let's drill down into its tablet, and
see what rowsets and blocks it has, and some of their associated metadata:

```bash
$ kudu fs list --fs-wal-dir=/data/kudu/tserver \
    --columns="rowset-id, column, column-id, block-kind, block-id" \
    --tablet-id=c3ce418c72ab4fea8548387f236dd1fa

 rowset-id | column | column-id | block-kind  |    block-id
-----------+--------+-----------+-------------+----------------
 0         | k1     | 10        | column      | 90680632611552
 0         | k2     | 11        | column      | 90680632611553
 0         | k3     | 12        | column      | 90680632611554
 0         | k4     | 13        | column      | 90680632611555
 0         | v1     | 14        | column      | 90680632611556
 0         | v2     | 15        | column      | 90680632611557
 0         | v3     | 16        | column      | 90680632611558
 0         | v4     | 17        | column      | 90680632611559
 0         |        |           | bloom       | 90680632611560
 0         |        |           | adhoc-index | 90680632611561
 1         | k1     | 10        | column      | 90680632611564
 1         | k2     | 11        | column      | 90680632611565
 1         | k3     | 12        | column      | 90680632611566
 1         | k4     | 13        | column      | 90680632611567
 1         | v1     | 14        | column      | 90680632611568
 1         | v2     | 15        | column      | 90680632611569
 1         | v3     | 16        | column      | 90680632611570
 1         | v4     | 17        | column      | 90680632611571
 1         |        |           | bloom       | 90680632611572
 1         |        |           | adhoc-index | 90680632611573
```

We can immediately see that this tablet has two rowsets, each of which
has 8 column blocks, a bloom block, and an ad-hoc index block. Let's
drill down further and inspect the 'v4' column:

```bash
$ kudu fs list --fs-wal-dir=<> \
    --columns="block-id, cfile-data-type, cfile-encoding, cfile-compression, cfile-num-values, cfile-size" \
    --tablet-id=c3ce418c72ab4fea8548387f236dd1fa \
    --column-id=17

    block-id    | cfile-data-type | cfile-encoding | cfile-compression | cfile-num-values | cfile-size
----------------+-----------------+----------------+-------------------+------------------+------------
 90680632611555 | int64           | BIT_SHUFFLE    | NO_COMPRESSION    | 5.09M            | 782.6K
 90680632611567 | int64           | BIT_SHUFFLE    | NO_COMPRESSION    | 5.40M            | 830.1K
```

And we can immediately see the CFile's on-disk encoding and compression,
the number of cells, and the CFile/block size.

Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
---
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_reader.h
M src/kudu/gutil/strings/join.h
M src/kudu/tablet/rowset_metadata.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action.cc
M src/kudu/tools/tool_action_fs.cc
M src/kudu/tools/tool_action_tserver.cc
8 files changed, 585 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/11/8911/8
-- 
To view, visit http://gerrit.cloudera.org:8080/8911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Gerrit-Change-Number: 8911
Gerrit-PatchSet: 8
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] Add 'kudu fs list' tool

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Hello Will Berkeley, Tidy Bot, Mike Percy, Kudu Jenkins, Adar Dembo, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8911

to look at the new patch set (#9).

Change subject: Add 'kudu fs list' tool
......................................................................

Add 'kudu fs list' tool

This tool aims to replace exploratory usages of 'kudu fs dump' and
'kudu local_replica dump' with an improved, unified tool. 'kudu fs list' is
more flexible, easier to use, and can show more information.

Output is formatted using the DataTable abstraction, which gives it
good default pretty-printing, with options to output in CSV and JSON for
scripts. Results can easily be filtered to a specific table, tablet, column,
rowset, or block using flags.

The tool can output many different fields: table, table-id, tablet-id,
partition, rowset-id, block-id, block-kind, column, column-id,
cfile-data-type, cfile-encoding, cfile-compression, cfile-num-values,
cfile-size, cfile-incompatible-features, cfile-compatible-features,
cfile-min-key, cfile-max-key, and cfile-delta-stats. More fields should
be straightforward to add.

The tool transparently joins information from tablet superblocks with
CFile footers, only materializing the metadata necessary to satisfy the
requested fields and filters.

Examples:

To get our bearings, let's look at what tablets are stored on a local
tablet server:

```bash
$ kudu fs list --fs-wal-dir=/data/kudu/tserver \
    --columns="table, table-id, tablet-id, partition"

                     table                     |             table-id             |            tablet-id             |                        partition
-----------------------------------------------+----------------------------------+----------------------------------+---------------------------------------------------------
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 2a631714f2d243ff92bf525630baa1ec | HASH (key) PARTITION 7, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 36827286a00049bc8b242243c6728157 | HASH (key) PARTITION 0, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 3880b30ccebd4ede867febd9c7d5580f | HASH (key) PARTITION 0, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 39436e9e17d84884b1cb689e88b8415f | HASH (key) PARTITION 5, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 44252efb9aaa4c2c963cf6dd5e875c04 | HASH (key) PARTITION 3, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 57c92ed8391b4d2bbfdeb339f9fb59fd | HASH (key) PARTITION 2, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 68a64aba5917499ebb7773f16bcd6f6d | HASH (key) PARTITION 7, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 6b5f0729a9bf454791239f77b0912f4e | HASH (key) PARTITION 1, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 8a2d120bd6984144ae963bfe8435206e | HASH (key) PARTITION 4, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 8b3ba4f415f945849a6a690a142cf1e4 | HASH (key) PARTITION 5, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 9656be3aa07248a69e3ad6edaa0048cb | HASH (key) PARTITION 1, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 9e8e444d079842a9b4a83ee9f8bed633 | HASH (key) PARTITION 6, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | a794a8e5d3f24e70a96b0beb5a355823 | HASH (key) PARTITION 3, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | bfb8f24b91cd4ecf924aacbb37125041 | HASH (key) PARTITION 2, RANGE (key) PARTITION UNBOUNDED
 foo                                           | e184a99893b44b17a7b2131123c6de0e | c3ce418c72ab4fea8548387f236dd1fa |
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | e00a284081ca468a994a3609a511e886 | HASH (key) PARTITION 4, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | efa22fc899a44bb2a16f620464a15c60 | HASH (key) PARTITION 6, RANGE (key) PARTITION UNBOUNDED
```

The 'foo' table looks interesting; let's drill down into its tablet, and
see what rowsets and blocks it has, and some of their associated metadata:

```bash
$ kudu fs list --fs-wal-dir=/data/kudu/tserver \
    --columns="rowset-id, column, column-id, block-kind, block-id" \
    --tablet-id=c3ce418c72ab4fea8548387f236dd1fa

 rowset-id | column | column-id | block-kind  |    block-id
-----------+--------+-----------+-------------+----------------
 0         | k1     | 10        | column      | 90680632611552
 0         | k2     | 11        | column      | 90680632611553
 0         | k3     | 12        | column      | 90680632611554
 0         | k4     | 13        | column      | 90680632611555
 0         | v1     | 14        | column      | 90680632611556
 0         | v2     | 15        | column      | 90680632611557
 0         | v3     | 16        | column      | 90680632611558
 0         | v4     | 17        | column      | 90680632611559
 0         |        |           | bloom       | 90680632611560
 0         |        |           | adhoc-index | 90680632611561
 1         | k1     | 10        | column      | 90680632611564
 1         | k2     | 11        | column      | 90680632611565
 1         | k3     | 12        | column      | 90680632611566
 1         | k4     | 13        | column      | 90680632611567
 1         | v1     | 14        | column      | 90680632611568
 1         | v2     | 15        | column      | 90680632611569
 1         | v3     | 16        | column      | 90680632611570
 1         | v4     | 17        | column      | 90680632611571
 1         |        |           | bloom       | 90680632611572
 1         |        |           | adhoc-index | 90680632611573
```

We can immediately see that this tablet has two rowsets, each of which
has 8 column blocks, a bloom block, and an ad-hoc index block. Let's
drill down further and inspect the 'v4' column:

```bash
$ kudu fs list --fs-wal-dir=<> \
    --columns="block-id, cfile-data-type, cfile-encoding, cfile-compression, cfile-num-values, cfile-size" \
    --tablet-id=c3ce418c72ab4fea8548387f236dd1fa \
    --column-id=17

    block-id    | cfile-data-type | cfile-encoding | cfile-compression | cfile-num-values | cfile-size
----------------+-----------------+----------------+-------------------+------------------+------------
 90680632611555 | int64           | BIT_SHUFFLE    | NO_COMPRESSION    | 5.09M            | 782.6K
 90680632611567 | int64           | BIT_SHUFFLE    | NO_COMPRESSION    | 5.40M            | 830.1K
```

And we can immediately see the CFile's on-disk encoding and compression,
the number of cells, and the CFile/block size.

Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
---
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_reader.h
M src/kudu/gutil/strings/join.h
M src/kudu/tablet/rowset_metadata.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action.cc
M src/kudu/tools/tool_action_fs.cc
M src/kudu/tools/tool_action_tserver.cc
8 files changed, 587 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/11/8911/9
-- 
To view, visit http://gerrit.cloudera.org:8080/8911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Gerrit-Change-Number: 8911
Gerrit-PatchSet: 9
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] Add 'kudu fs list' tool

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Hello Will Berkeley, Tidy Bot, Mike Percy, Kudu Jenkins, Adar Dembo, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8911

to look at the new patch set (#5).

Change subject: Add 'kudu fs list' tool
......................................................................

Add 'kudu fs list' tool

This tool aims to replace exploratory usages of 'kudu fs dump' and
'kudu local_replica dump' with an improved, unified tool. 'kudu fs list' is
more flexible, easier to use, and can show more information.

Output is formatted using the DataTable abstraction, which gives it
good default pretty-printing, with options to output in CSV and JSON for
scripts. Results can easily be filtered to a specific table, tablet, column,
rowset, or block using flags.

The tool can output many different fields: table, table-id, tablet-id,
partition, rowset-id, block-id, block-kind, column, column-id,
cfile-data-type, cfile-encoding, cfile-compression, cfile-num-values,
cfile-size, cfile-incompatible-features, and cfile-compatible-features.
More fields should be straightforward to add.

The tool transparently joins information from tablet superblocks with
CFile footers, only materializing the metadata necessary to satisfy the
requested fields and filters.

Examples:

To get our bearings, let's look at what tablets are stored on a local
tablet server:

```bash
$ kudu fs list --fs-wal-dir=/data/kudu/tserver \
    --columns="table, table-id, tablet-id, partition"

                     table                     |             table-id             |            tablet-id             |                        partition
-----------------------------------------------+----------------------------------+----------------------------------+---------------------------------------------------------
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 2a631714f2d243ff92bf525630baa1ec | HASH (key) PARTITION 7, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 36827286a00049bc8b242243c6728157 | HASH (key) PARTITION 0, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 3880b30ccebd4ede867febd9c7d5580f | HASH (key) PARTITION 0, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 39436e9e17d84884b1cb689e88b8415f | HASH (key) PARTITION 5, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 44252efb9aaa4c2c963cf6dd5e875c04 | HASH (key) PARTITION 3, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 57c92ed8391b4d2bbfdeb339f9fb59fd | HASH (key) PARTITION 2, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 68a64aba5917499ebb7773f16bcd6f6d | HASH (key) PARTITION 7, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 6b5f0729a9bf454791239f77b0912f4e | HASH (key) PARTITION 1, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 8a2d120bd6984144ae963bfe8435206e | HASH (key) PARTITION 4, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 8b3ba4f415f945849a6a690a142cf1e4 | HASH (key) PARTITION 5, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 9656be3aa07248a69e3ad6edaa0048cb | HASH (key) PARTITION 1, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 9e8e444d079842a9b4a83ee9f8bed633 | HASH (key) PARTITION 6, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | a794a8e5d3f24e70a96b0beb5a355823 | HASH (key) PARTITION 3, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | bfb8f24b91cd4ecf924aacbb37125041 | HASH (key) PARTITION 2, RANGE (key) PARTITION UNBOUNDED
 foo                                           | e184a99893b44b17a7b2131123c6de0e | c3ce418c72ab4fea8548387f236dd1fa |
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | e00a284081ca468a994a3609a511e886 | HASH (key) PARTITION 4, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | efa22fc899a44bb2a16f620464a15c60 | HASH (key) PARTITION 6, RANGE (key) PARTITION UNBOUNDED
```
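
For scripts that consume the default pretty-printed output rather than CSV or JSON, the table above can be filtered with a few lines of awk. This is only a sketch: `tablets_for_table` is a hypothetical helper, not part of the kudu CLI, and it assumes the column order `table, table-id, tablet-id, partition` requested in the command above.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of kudu): read pretty-printed
# 'kudu fs list' output on stdin and print the tablet-id of every
# row whose table name equals the first argument.
# Assumes the columns are: table | table-id | tablet-id | partition.
tablets_for_table() {
  awk -F'|' -v want="$1" '
    NF >= 4 {
      table = $1; tablet = $3
      # Strip the padding added by the pretty-printer.
      gsub(/^[ \t]+|[ \t]+$/, "", table)
      gsub(/^[ \t]+|[ \t]+$/, "", tablet)
      # Skip the header row, then match on the exact table name.
      if (tablet != "tablet-id" && table == want) print tablet
    }
  '
}

# Example usage (assumes a tablet server data directory at this path):
#   kudu fs list --fs-wal-dir=/data/kudu/tserver \
#       --columns="table, table-id, tablet-id, partition" | tablets_for_table foo
```

The separator line contains no `|` characters, so the `NF >= 4` guard skips it without any special casing.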

The 'foo' table looks interesting; let's drill down into its tablet, and
see what rowsets and blocks it has, and some of their associated metadata:

```bash
$ kudu fs list --fs-wal-dir=/data/kudu/tserver \
    --columns="rowset-id, column, column-id, block-kind, block-id" \
    --tablet-id=c3ce418c72ab4fea8548387f236dd1fa

 rowset-id | column | column-id | block-kind  |    block-id
-----------+--------+-----------+-------------+----------------
 0         | k1     | 10        | column      | 90680632611552
 0         | k2     | 11        | column      | 90680632611553
 0         | k3     | 12        | column      | 90680632611554
 0         | k4     | 13        | column      | 90680632611555
 0         | v1     | 14        | column      | 90680632611556
 0         | v2     | 15        | column      | 90680632611557
 0         | v3     | 16        | column      | 90680632611558
 0         | v4     | 17        | column      | 90680632611559
 0         |        |           | bloom       | 90680632611560
 0         |        |           | adhoc-index | 90680632611561
 1         | k1     | 10        | column      | 90680632611564
 1         | k2     | 11        | column      | 90680632611565
 1         | k3     | 12        | column      | 90680632611566
 1         | k4     | 13        | column      | 90680632611567
 1         | v1     | 14        | column      | 90680632611568
 1         | v2     | 15        | column      | 90680632611569
 1         | v3     | 16        | column      | 90680632611570
 1         | v4     | 17        | column      | 90680632611571
 1         |        |           | bloom       | 90680632611572
 1         |        |           | adhoc-index | 90680632611573
```
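
The per-rowset block counts can also be tallied mechanically from the listing above. The following is a minimal sketch, assuming the pretty-printed layout shown here with `rowset-id` in the first column and `block-kind` in the fourth; `count_blocks` is a hypothetical helper, not something the tool provides.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of kudu): read pretty-printed
# 'kudu fs list' output on stdin and print one line per
# (rowset-id, block-kind) pair with the number of blocks seen.
# Assumes columns: rowset-id | column | column-id | block-kind | block-id.
count_blocks() {
  awk -F'|' '
    NF >= 5 {
      rowset = $1; kind = $4
      # Strip the padding added by the pretty-printer.
      gsub(/^[ \t]+|[ \t]+$/, "", rowset)
      gsub(/^[ \t]+|[ \t]+$/, "", kind)
      # Only data rows have a numeric rowset-id; this skips the header.
      if (rowset ~ /^[0-9]+$/) counts[rowset " " kind]++
    }
    END { for (key in counts) print key, counts[key] }
  ' | LC_ALL=C sort
}

# Example usage (assumes a tablet server data directory at this path):
#   kudu fs list --fs-wal-dir=/data/kudu/tserver \
#       --columns="rowset-id, column, column-id, block-kind, block-id" \
#       --tablet-id=c3ce418c72ab4fea8548387f236dd1fa | count_blocks
```

The output is sorted because awk does not guarantee any iteration order over array keys.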

We can immediately see that this tablet has two rowsets, each of which
has 8 column blocks, a bloom block, and an ad-hoc index block. Let's
drill down further and inspect the 'v4' column:

```bash
$ kudu fs list --fs-wal-dir=<> \
    --columns="block-id, cfile-data-type, cfile-encoding, cfile-compression, cfile-num-values, cfile-size" \
    --tablet-id=c3ce418c72ab4fea8548387f236dd1fa \
    --column-id=17

    block-id    | cfile-data-type | cfile-encoding | cfile-compression | cfile-num-values | cfile-size
----------------+-----------------+----------------+-------------------+------------------+------------
 90680632611555 | int64           | BIT_SHUFFLE    | NO_COMPRESSION    | 5.09M            | 782.6K
 90680632611567 | int64           | BIT_SHUFFLE    | NO_COMPRESSION    | 5.40M            | 830.1K
```

And we can immediately see the CFile's on-disk encoding and compression,
the number of cells, and the CFile/block size.

Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
---
M src/kudu/tablet/rowset_metadata.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action.cc
M src/kudu/tools/tool_action_fs.cc
M src/kudu/tools/tool_action_tserver.cc
5 files changed, 512 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/11/8911/5

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Gerrit-Change-Number: 8911
Gerrit-PatchSet: 5
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] Add 'kudu fs list' tool

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/8911 )

Change subject: Add 'kudu fs list' tool
......................................................................


Patch Set 2:

(2 comments)

Actively ignoring the ColumnId copying hints, since ColumnId is trivially copyable and smaller than a pointer.

http://gerrit.cloudera.org:8080/#/c/8911/1/src/kudu/tools/tool_action_fs.cc
File src/kudu/tools/tool_action_fs.cc:

http://gerrit.cloudera.org:8080/#/c/8911/1/src/kudu/tools/tool_action_fs.cc@99
PS1, Line 99: using cfile::ReaderOptions;
> warning: using decl 'ColumnDataPB' is unused [misc-unused-using-decls]
Done


http://gerrit.cloudera.org:8080/#/c/8911/1/src/kudu/tools/tool_action_fs.cc@567
PS1, Line 567: // repeatedly to build up a row.
> warning: the const qualified parameter 'columns' is copied for each invocat
Done




Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Gerrit-Change-Number: 8911
Gerrit-PatchSet: 2
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Fri, 22 Dec 2017 08:01:35 +0000
Gerrit-HasComments: Yes

[kudu-CR] Add 'kudu fs list' tool

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Hello Will Berkeley, Tidy Bot, Mike Percy, Kudu Jenkins, Adar Dembo, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8911

to look at the new patch set (#10).

Change subject: Add 'kudu fs list' tool
......................................................................

Add 'kudu fs list' tool

This tool aims to replace exploratory usages of 'kudu fs dump' and
'kudu local_replica dump' with an improved, unified tool. 'kudu fs list' is
more flexible, easier to use, and can show more information.

Output is formatted using the DataTable abstraction, which gives it
good default pretty-printing, with options to output in CSV and JSON for
scripts. Results can easily be filtered to a specific table, tablet, column,
rowset, or block using flags.

The tool can output many different fields: table, table-id, tablet-id,
partition, rowset-id, block-id, block-kind, column, column-id,
cfile-data-type, cfile-encoding, cfile-compression, cfile-num-values,
cfile-size, cfile-incompatible-features, cfile-compatible-features,
cfile-min-key, cfile-max-key, and cfile-delta-stats. More fields should
be straightforward to add.

The tool transparently joins information from tablet superblocks with
CFile footers, only materializing the metadata necessary to satisfy the
requested fields and filters.

Examples:

To get our bearings, let's look at what tablets are stored on a local
tablet server:

```bash
$ kudu fs list --fs-wal-dir=/data/kudu/tserver \
    --columns="table, table-id, tablet-id, partition"

                     table                     |             table-id             |            tablet-id             |                        partition
-----------------------------------------------+----------------------------------+----------------------------------+---------------------------------------------------------
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 2a631714f2d243ff92bf525630baa1ec | HASH (key) PARTITION 7, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 36827286a00049bc8b242243c6728157 | HASH (key) PARTITION 0, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 3880b30ccebd4ede867febd9c7d5580f | HASH (key) PARTITION 0, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 39436e9e17d84884b1cb689e88b8415f | HASH (key) PARTITION 5, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 44252efb9aaa4c2c963cf6dd5e875c04 | HASH (key) PARTITION 3, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 57c92ed8391b4d2bbfdeb339f9fb59fd | HASH (key) PARTITION 2, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 68a64aba5917499ebb7773f16bcd6f6d | HASH (key) PARTITION 7, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 6b5f0729a9bf454791239f77b0912f4e | HASH (key) PARTITION 1, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 8a2d120bd6984144ae963bfe8435206e | HASH (key) PARTITION 4, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 8b3ba4f415f945849a6a690a142cf1e4 | HASH (key) PARTITION 5, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 9656be3aa07248a69e3ad6edaa0048cb | HASH (key) PARTITION 1, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 9e8e444d079842a9b4a83ee9f8bed633 | HASH (key) PARTITION 6, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | a794a8e5d3f24e70a96b0beb5a355823 | HASH (key) PARTITION 3, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | bfb8f24b91cd4ecf924aacbb37125041 | HASH (key) PARTITION 2, RANGE (key) PARTITION UNBOUNDED
 foo                                           | e184a99893b44b17a7b2131123c6de0e | c3ce418c72ab4fea8548387f236dd1fa |
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | e00a284081ca468a994a3609a511e886 | HASH (key) PARTITION 4, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | efa22fc899a44bb2a16f620464a15c60 | HASH (key) PARTITION 6, RANGE (key) PARTITION UNBOUNDED
```

The 'foo' table looks interesting; let's drill down into its tablet, and
see what rowsets and blocks it has, and some of their associated metadata:

```bash
$ kudu fs list --fs-wal-dir=/data/kudu/tserver \
    --columns="rowset-id, column, column-id, block-kind, block-id" \
    --tablet-id=c3ce418c72ab4fea8548387f236dd1fa

 rowset-id | column | column-id | block-kind  |    block-id
-----------+--------+-----------+-------------+----------------
 0         | k1     | 10        | column      | 90680632611552
 0         | k2     | 11        | column      | 90680632611553
 0         | k3     | 12        | column      | 90680632611554
 0         | k4     | 13        | column      | 90680632611555
 0         | v1     | 14        | column      | 90680632611556
 0         | v2     | 15        | column      | 90680632611557
 0         | v3     | 16        | column      | 90680632611558
 0         | v4     | 17        | column      | 90680632611559
 0         |        |           | bloom       | 90680632611560
 0         |        |           | adhoc-index | 90680632611561
 1         | k1     | 10        | column      | 90680632611564
 1         | k2     | 11        | column      | 90680632611565
 1         | k3     | 12        | column      | 90680632611566
 1         | k4     | 13        | column      | 90680632611567
 1         | v1     | 14        | column      | 90680632611568
 1         | v2     | 15        | column      | 90680632611569
 1         | v3     | 16        | column      | 90680632611570
 1         | v4     | 17        | column      | 90680632611571
 1         |        |           | bloom       | 90680632611572
 1         |        |           | adhoc-index | 90680632611573
```

We can immediately see that this tablet has two rowsets, each of which
has 8 column blocks, a bloom block, and an ad-hoc index block. Let's
drill down further and inspect the 'v4' column:

```bash
$ kudu fs list --fs-wal-dir=<> \
    --columns="block-id, cfile-data-type, cfile-encoding, cfile-compression, cfile-num-values, cfile-size" \
    --tablet-id=c3ce418c72ab4fea8548387f236dd1fa \
    --column-id=17

    block-id    | cfile-data-type | cfile-encoding | cfile-compression | cfile-num-values | cfile-size
----------------+-----------------+----------------+-------------------+------------------+------------
 90680632611555 | int64           | BIT_SHUFFLE    | NO_COMPRESSION    | 5.09M            | 782.6K
 90680632611567 | int64           | BIT_SHUFFLE    | NO_COMPRESSION    | 5.40M            | 830.1K
```

And we can immediately see the CFile's on-disk encoding and compression,
the number of cells, and the CFile/block size.

Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
---
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_reader.h
M src/kudu/gutil/strings/join.h
M src/kudu/tablet/rowset_metadata.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action.cc
M src/kudu/tools/tool_action_fs.cc
M src/kudu/tools/tool_action_tserver.cc
8 files changed, 587 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/11/8911/10

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Gerrit-Change-Number: 8911
Gerrit-PatchSet: 10
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] Add 'kudu fs list' tool

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Hello Will Berkeley, Tidy Bot, Mike Percy, Kudu Jenkins, Adar Dembo, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8911

to look at the new patch set (#6).

Change subject: Add 'kudu fs list' tool
......................................................................

Add 'kudu fs list' tool

This tool aims to replace exploratory usages of 'kudu fs dump' and
'kudu local_replica dump' with an improved, unified tool. 'kudu fs list' is
more flexible, easier to use, and can show more information.

Output is formatted using the DataTable abstraction, which gives it
good default pretty-printing, with options to output in CSV and JSON for
scripts. Results can easily be filtered to a specific table, tablet, column,
rowset, or block using flags.

The tool can output many different fields: table, table-id, tablet-id,
partition, rowset-id, block-id, block-kind, column, column-id,
cfile-data-type, cfile-encoding, cfile-compression, cfile-num-values,
cfile-size, cfile-incompatible-features, and cfile-compatible-features.
More fields should be straightforward to add.

The tool transparently joins information from tablet superblocks with
CFile footers, only materializing the metadata necessary to satisfy the
requested fields and filters.

Examples:

To get our bearings, let's look at what tablets are stored on a local
tablet server:

```bash
$ kudu fs list --fs-wal-dir=/data/kudu/tserver \
    --columns="table, table-id, tablet-id, partition"

                     table                     |             table-id             |            tablet-id             |                        partition
-----------------------------------------------+----------------------------------+----------------------------------+---------------------------------------------------------
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 2a631714f2d243ff92bf525630baa1ec | HASH (key) PARTITION 7, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 36827286a00049bc8b242243c6728157 | HASH (key) PARTITION 0, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 3880b30ccebd4ede867febd9c7d5580f | HASH (key) PARTITION 0, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 39436e9e17d84884b1cb689e88b8415f | HASH (key) PARTITION 5, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 44252efb9aaa4c2c963cf6dd5e875c04 | HASH (key) PARTITION 3, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 57c92ed8391b4d2bbfdeb339f9fb59fd | HASH (key) PARTITION 2, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 68a64aba5917499ebb7773f16bcd6f6d | HASH (key) PARTITION 7, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 6b5f0729a9bf454791239f77b0912f4e | HASH (key) PARTITION 1, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 8a2d120bd6984144ae963bfe8435206e | HASH (key) PARTITION 4, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 8b3ba4f415f945849a6a690a142cf1e4 | HASH (key) PARTITION 5, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 9656be3aa07248a69e3ad6edaa0048cb | HASH (key) PARTITION 1, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 9e8e444d079842a9b4a83ee9f8bed633 | HASH (key) PARTITION 6, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | a794a8e5d3f24e70a96b0beb5a355823 | HASH (key) PARTITION 3, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | bfb8f24b91cd4ecf924aacbb37125041 | HASH (key) PARTITION 2, RANGE (key) PARTITION UNBOUNDED
 foo                                           | e184a99893b44b17a7b2131123c6de0e | c3ce418c72ab4fea8548387f236dd1fa |
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | e00a284081ca468a994a3609a511e886 | HASH (key) PARTITION 4, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | efa22fc899a44bb2a16f620464a15c60 | HASH (key) PARTITION 6, RANGE (key) PARTITION UNBOUNDED
```

The 'foo' table looks interesting; let's drill down into its tablet, and
see what rowsets and blocks it has, and some of their associated metadata:

```bash
$ kudu fs list --fs-wal-dir=/data/kudu/tserver \
    --columns="rowset-id, column, column-id, block-kind, block-id" \
    --tablet-id=c3ce418c72ab4fea8548387f236dd1fa

 rowset-id | column | column-id | block-kind  |    block-id
-----------+--------+-----------+-------------+----------------
 0         | k1     | 10        | column      | 90680632611552
 0         | k2     | 11        | column      | 90680632611553
 0         | k3     | 12        | column      | 90680632611554
 0         | k4     | 13        | column      | 90680632611555
 0         | v1     | 14        | column      | 90680632611556
 0         | v2     | 15        | column      | 90680632611557
 0         | v3     | 16        | column      | 90680632611558
 0         | v4     | 17        | column      | 90680632611559
 0         |        |           | bloom       | 90680632611560
 0         |        |           | adhoc-index | 90680632611561
 1         | k1     | 10        | column      | 90680632611564
 1         | k2     | 11        | column      | 90680632611565
 1         | k3     | 12        | column      | 90680632611566
 1         | k4     | 13        | column      | 90680632611567
 1         | v1     | 14        | column      | 90680632611568
 1         | v2     | 15        | column      | 90680632611569
 1         | v3     | 16        | column      | 90680632611570
 1         | v4     | 17        | column      | 90680632611571
 1         |        |           | bloom       | 90680632611572
 1         |        |           | adhoc-index | 90680632611573
```

We can immediately see that this tablet has two rowsets, each of which
has 8 column blocks, a bloom block, and an ad-hoc index block. Let's
drill down further and inspect the 'v4' column:

```bash
$ kudu fs list --fs-wal-dir=<> \
    --columns="block-id, cfile-data-type, cfile-encoding, cfile-compression, cfile-num-values, cfile-size" \
    --tablet-id=c3ce418c72ab4fea8548387f236dd1fa \
    --column-id=17

    block-id    | cfile-data-type | cfile-encoding | cfile-compression | cfile-num-values | cfile-size
----------------+-----------------+----------------+-------------------+------------------+------------
 90680632611555 | int64           | BIT_SHUFFLE    | NO_COMPRESSION    | 5.09M            | 782.6K
 90680632611567 | int64           | BIT_SHUFFLE    | NO_COMPRESSION    | 5.40M            | 830.1K
```

And we can immediately see the CFile's on-disk encoding and compression,
the number of cells, and the CFile/block size.

Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
---
M src/kudu/tablet/rowset_metadata.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action.cc
M src/kudu/tools/tool_action_fs.cc
M src/kudu/tools/tool_action_tserver.cc
5 files changed, 517 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/11/8911/6

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Gerrit-Change-Number: 8911
Gerrit-PatchSet: 6
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] Add 'kudu fs list' tool

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/8911 )

Change subject: Add 'kudu fs list' tool
......................................................................


Patch Set 2:

> (6 comments)

Also could use a few tests.



Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Gerrit-Change-Number: 8911
Gerrit-PatchSet: 2
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Fri, 22 Dec 2017 17:47:16 +0000
Gerrit-HasComments: No

[kudu-CR] Add 'kudu fs list' tool

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Hello Will Berkeley, Tidy Bot, Mike Percy, Kudu Jenkins, Adar Dembo, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8911

to look at the new patch set (#4).

Change subject: Add 'kudu fs list' tool
......................................................................

Add 'kudu fs list' tool

This tool aims to replace exploratory usages of 'kudu fs dump' and
'kudu local_replica dump' with an improved, unified tool. 'kudu fs list' is
more flexible, easier to use, and can show more information.

Output is formatted using the DataTable abstraction, which gives it
good default pretty-printing, with options to output in CSV and JSON for
scripts. Results can easily be filtered to a specific table, tablet, column,
rowset, or block using flags.

The tool can output many different fields: table, table-id, tablet-id,
partition, rowset-id, block-id, block-kind, column, column-id,
cfile-data-type, cfile-encoding, cfile-compression, cfile-num-values,
cfile-size, cfile-incompatible-features, and cfile-compatible-features.
More fields should be straightforward to add.

The tool transparently joins information from tablet superblocks with
CFile footers, only materializing the metadata necessary to satisfy the
requested fields and filters.

Examples:

To get our bearings, let's look at what tablets are stored on a local
tablet server:

```bash
$ kudu fs list --fs-wal-dir=/data/kudu/tserver \
    --columns="table, table-id, tablet-id, partition"

                     table                     |             table-id             |            tablet-id             |                        partition
-----------------------------------------------+----------------------------------+----------------------------------+---------------------------------------------------------
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 2a631714f2d243ff92bf525630baa1ec | HASH (key) PARTITION 7, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 36827286a00049bc8b242243c6728157 | HASH (key) PARTITION 0, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 3880b30ccebd4ede867febd9c7d5580f | HASH (key) PARTITION 0, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 39436e9e17d84884b1cb689e88b8415f | HASH (key) PARTITION 5, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 44252efb9aaa4c2c963cf6dd5e875c04 | HASH (key) PARTITION 3, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 57c92ed8391b4d2bbfdeb339f9fb59fd | HASH (key) PARTITION 2, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 68a64aba5917499ebb7773f16bcd6f6d | HASH (key) PARTITION 7, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 6b5f0729a9bf454791239f77b0912f4e | HASH (key) PARTITION 1, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 8a2d120bd6984144ae963bfe8435206e | HASH (key) PARTITION 4, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 8b3ba4f415f945849a6a690a142cf1e4 | HASH (key) PARTITION 5, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 9656be3aa07248a69e3ad6edaa0048cb | HASH (key) PARTITION 1, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 9e8e444d079842a9b4a83ee9f8bed633 | HASH (key) PARTITION 6, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | a794a8e5d3f24e70a96b0beb5a355823 | HASH (key) PARTITION 3, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | bfb8f24b91cd4ecf924aacbb37125041 | HASH (key) PARTITION 2, RANGE (key) PARTITION UNBOUNDED
 foo                                           | e184a99893b44b17a7b2131123c6de0e | c3ce418c72ab4fea8548387f236dd1fa |
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | e00a284081ca468a994a3609a511e886 | HASH (key) PARTITION 4, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | efa22fc899a44bb2a16f620464a15c60 | HASH (key) PARTITION 6, RANGE (key) PARTITION UNBOUNDED
```

The 'foo' table looks interesting; let's drill down into its tablet, and
see what rowsets and blocks it has, and some of their associated metadata:

```bash
$ kudu fs list --fs-wal-dir=/data/kudu/tserver \
    --columns="rowset-id, column, column-id, block-kind, block-id" \
    --tablet-id=c3ce418c72ab4fea8548387f236dd1fa

 rowset-id | column | column-id | block-kind  |    block-id
-----------+--------+-----------+-------------+----------------
 0         | k1     | 10        | column      | 90680632611552
 0         | k2     | 11        | column      | 90680632611553
 0         | k3     | 12        | column      | 90680632611554
 0         | k4     | 13        | column      | 90680632611555
 0         | v1     | 14        | column      | 90680632611556
 0         | v2     | 15        | column      | 90680632611557
 0         | v3     | 16        | column      | 90680632611558
 0         | v4     | 17        | column      | 90680632611559
 0         |        |           | bloom       | 90680632611560
 0         |        |           | adhoc-index | 90680632611561
 1         | k1     | 10        | column      | 90680632611564
 1         | k2     | 11        | column      | 90680632611565
 1         | k3     | 12        | column      | 90680632611566
 1         | k4     | 13        | column      | 90680632611567
 1         | v1     | 14        | column      | 90680632611568
 1         | v2     | 15        | column      | 90680632611569
 1         | v3     | 16        | column      | 90680632611570
 1         | v4     | 17        | column      | 90680632611571
 1         |        |           | bloom       | 90680632611572
 1         |        |           | adhoc-index | 90680632611573
```

We can immediately see that this tablet has two rowsets, each of which
has 8 column blocks, a bloom block, and an ad-hoc index block. Let's
drill down further and inspect the 'v4' column:

```bash
$ kudu fs list --fs-wal-dir=<> \
    --columns="block-id, cfile-data-type, cfile-encoding, cfile-compression, cfile-num-values, cfile-size" \
    --tablet-id=c3ce418c72ab4fea8548387f236dd1fa \
    --column-id=17

    block-id    | cfile-data-type | cfile-encoding | cfile-compression | cfile-num-values | cfile-size
----------------+-----------------+----------------+-------------------+------------------+------------
 90680632611555 | int64           | BIT_SHUFFLE    | NO_COMPRESSION    | 5.09M            | 782.6K
 90680632611567 | int64           | BIT_SHUFFLE    | NO_COMPRESSION    | 5.40M            | 830.1K
```

And we can immediately see the CFile's on-disk encoding and compression,
the number of cells, and the CFile/block size.
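
For scripting, the CSV and JSON output modes mentioned in the commit message
are the better fit. If only the pretty-printed form is at hand (for example,
pasted into a bug report), a minimal sketch along these lines could turn it
into records. The helper name parse_pretty_table is hypothetical and is not
part of Kudu; the sample text is copied from the listing above.

```python
# Hypothetical helper, not part of Kudu: parses the pipe-delimited,
# pretty-printed table shown above into a list of dicts.
SAMPLE = """
    block-id    | cfile-data-type | cfile-encoding | cfile-compression | cfile-num-values | cfile-size
----------------+-----------------+----------------+-------------------+------------------+------------
 90680632611555 | int64           | BIT_SHUFFLE    | NO_COMPRESSION    | 5.09M            | 782.6K
 90680632611567 | int64           | BIT_SHUFFLE    | NO_COMPRESSION    | 5.40M            | 830.1K
"""


def parse_pretty_table(text):
    """Return one dict per data row, keyed by the header cells."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = [cell.strip() for cell in lines[0].split("|")]
    rows = []
    for ln in lines[1:]:
        # Skip the separator line, which contains only '-' and '+'.
        if set(ln.strip()) <= set("-+"):
            continue
        cells = [cell.strip() for cell in ln.split("|")]
        rows.append(dict(zip(header, cells)))
    return rows


for row in parse_pretty_table(SAMPLE):
    print(row["block-id"], row["cfile-encoding"], row["cfile-size"])
```

This assumes no cell contains a '|' character, which holds for the listings
in this message.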

Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
---
M src/kudu/tablet/rowset_metadata.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action.cc
M src/kudu/tools/tool_action_fs.cc
M src/kudu/tools/tool_action_tserver.cc
5 files changed, 512 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/11/8911/4
-- 
To view, visit http://gerrit.cloudera.org:8080/8911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Gerrit-Change-Number: 8911
Gerrit-PatchSet: 4
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] Add 'kudu fs list' tool

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Hello Will Berkeley, Tidy Bot, Mike Percy, Kudu Jenkins, Adar Dembo, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8911

to look at the new patch set (#3).

Change subject: Add 'kudu fs list' tool
......................................................................

Add 'kudu fs list' tool

This tool aims to replace exploratory usages of 'kudu fs dump' and
'kudu local_replica dump' with an improved, unified tool. 'kudu fs list' is
more flexible, easier to use, and can show more information.

Output is formatted using the DataTable abstraction, which gives it
good default pretty-printing, with options to output in CSV and JSON for
scripts. Results can easily be filtered to a specific table, tablet, column,
rowset, or block using flags.

The tool can output many different fields: table, table-id, tablet-id,
partition, rowset-id, block-id, block-kind, column, column-id,
cfile-data-type, cfile-encoding, cfile-compression, cfile-num-values,
cfile-size, cfile-incompatible-features, and cfile-compatible-features.
More fields should be straightforward to add.

The tool transparently joins information from tablet superblocks with
CFile footers, only materializing the metadata necessary to satisfy the
requested fields and filters.
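
As a rough illustration of the lazy join described above, the sketch below
reads CFile footers only when a requested field needs them. It is written in
Python for brevity and is not the tool's actual C++ implementation. The split
of fields between superblock and footer is an assumption based on the field
names, and parse_columns, needs_cfile_footers, list_rows, and read_footer are
hypothetical names.

```python
# Hypothetical sketch, not Kudu's implementation. The assignment of fields
# to metadata sources below is an assumption based on the field names.
SUPERBLOCK_FIELDS = {
    "table", "table-id", "tablet-id", "partition", "rowset-id",
    "block-id", "block-kind", "column", "column-id",
}
CFILE_FIELDS = {
    "cfile-data-type", "cfile-encoding", "cfile-compression",
    "cfile-num-values", "cfile-size",
    "cfile-incompatible-features", "cfile-compatible-features",
}


def parse_columns(flag_value):
    """Split a comma-separated --columns value into field names."""
    return [c.strip() for c in flag_value.split(",") if c.strip()]


def needs_cfile_footers(columns):
    """True if any requested field can only come from a CFile footer."""
    unknown = [c for c in columns
               if c not in SUPERBLOCK_FIELDS | CFILE_FIELDS]
    if unknown:
        raise ValueError("unknown fields: " + ", ".join(unknown))
    return any(c in CFILE_FIELDS for c in columns)


def list_rows(columns, superblock_rows, read_footer):
    """Yield one output row per block, reading footers only if needed.

    superblock_rows: iterable of dicts built from superblock metadata.
    read_footer: callable taking a block id and returning a dict of
        cfile-* fields; it is never called for superblock-only queries.
    """
    join_footers = needs_cfile_footers(columns)
    for row in superblock_rows:
        merged = dict(row)
        if join_footers:
            merged.update(read_footer(row["block-id"]))
        yield [merged.get(c, "") for c in columns]
```

With this shape, a query such as --columns="rowset-id, block-id" would never
touch a CFile, while adding cfile-encoding would cost one footer read per
listed block.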

Examples:

To get our bearings, let's look at what tablets are stored on a local
tablet server:

```bash
$ kudu fs list --fs-wal-dir=/data/kudu/tserver \
    --columns="table, table-id, tablet-id, partition"

                     table                     |             table-id             |            tablet-id             |                        partition
-----------------------------------------------+----------------------------------+----------------------------------+---------------------------------------------------------
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 2a631714f2d243ff92bf525630baa1ec | HASH (key) PARTITION 7, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 36827286a00049bc8b242243c6728157 | HASH (key) PARTITION 0, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 3880b30ccebd4ede867febd9c7d5580f | HASH (key) PARTITION 0, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 39436e9e17d84884b1cb689e88b8415f | HASH (key) PARTITION 5, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 44252efb9aaa4c2c963cf6dd5e875c04 | HASH (key) PARTITION 3, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 57c92ed8391b4d2bbfdeb339f9fb59fd | HASH (key) PARTITION 2, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 68a64aba5917499ebb7773f16bcd6f6d | HASH (key) PARTITION 7, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 6b5f0729a9bf454791239f77b0912f4e | HASH (key) PARTITION 1, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | 8a2d120bd6984144ae963bfe8435206e | HASH (key) PARTITION 4, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 8b3ba4f415f945849a6a690a142cf1e4 | HASH (key) PARTITION 5, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 9656be3aa07248a69e3ad6edaa0048cb | HASH (key) PARTITION 1, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | 9e8e444d079842a9b4a83ee9f8bed633 | HASH (key) PARTITION 6, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | a794a8e5d3f24e70a96b0beb5a355823 | HASH (key) PARTITION 3, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | bfb8f24b91cd4ecf924aacbb37125041 | HASH (key) PARTITION 2, RANGE (key) PARTITION UNBOUNDED
 foo                                           | e184a99893b44b17a7b2131123c6de0e | c3ce418c72ab4fea8548387f236dd1fa |
 loadgen_auto_fa03aaf7bdf54bb4896c534f38d177a1 | 800af247c1424ecd8e96a37b5ee4d311 | e00a284081ca468a994a3609a511e886 | HASH (key) PARTITION 4, RANGE (key) PARTITION UNBOUNDED
 loadgen_auto_06c8038c02da40048397e4f6ad1662c3 | 84ff589b979e4f90aa630e7179fcb644 | efa22fc899a44bb2a16f620464a15c60 | HASH (key) PARTITION 6, RANGE (key) PARTITION UNBOUNDED
```

The 'foo' table looks interesting; let's drill down into its tablet, and
see what rowsets and blocks it has, and some of their associated metadata:

```bash
$ kudu fs list --fs-wal-dir=/data/kudu/tserver \
    --columns="rowset-id, column, column-id, block-kind, block-id" \
    --tablet-id=c3ce418c72ab4fea8548387f236dd1fa

 rowset-id | column | column-id | block-kind  |    block-id
-----------+--------+-----------+-------------+----------------
 0         | k1     | 10        | column      | 90680632611552
 0         | k2     | 11        | column      | 90680632611553
 0         | k3     | 12        | column      | 90680632611554
 0         | k4     | 13        | column      | 90680632611555
 0         | v1     | 14        | column      | 90680632611556
 0         | v2     | 15        | column      | 90680632611557
 0         | v3     | 16        | column      | 90680632611558
 0         | v4     | 17        | column      | 90680632611559
 0         |        |           | bloom       | 90680632611560
 0         |        |           | adhoc-index | 90680632611561
 1         | k1     | 10        | column      | 90680632611564
 1         | k2     | 11        | column      | 90680632611565
 1         | k3     | 12        | column      | 90680632611566
 1         | k4     | 13        | column      | 90680632611567
 1         | v1     | 14        | column      | 90680632611568
 1         | v2     | 15        | column      | 90680632611569
 1         | v3     | 16        | column      | 90680632611570
 1         | v4     | 17        | column      | 90680632611571
 1         |        |           | bloom       | 90680632611572
 1         |        |           | adhoc-index | 90680632611573
```

We can immediately see that this tablet has two rowsets, each of which
has 8 column blocks, a bloom block, and an ad-hoc index block. Let's
drill down further and inspect the 'v4' column:

```bash
$ kudu fs list --fs-wal-dir=<> \
    --columns="block-id, cfile-data-type, cfile-encoding, cfile-compression, cfile-num-values, cfile-size" \
    --tablet-id=c3ce418c72ab4fea8548387f236dd1fa \
    --column-id=17

    block-id    | cfile-data-type | cfile-encoding | cfile-compression | cfile-num-values | cfile-size
----------------+-----------------+----------------+-------------------+------------------+------------
 90680632611555 | int64           | BIT_SHUFFLE    | NO_COMPRESSION    | 5.09M            | 782.6K
 90680632611567 | int64           | BIT_SHUFFLE    | NO_COMPRESSION    | 5.40M            | 830.1K
```

And we can immediately see the CFile's on-disk encoding and compression,
the number of cells, and the CFile/block size.

Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
---
M src/kudu/tablet/rowset_metadata.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action.cc
M src/kudu/tools/tool_action_fs.cc
M src/kudu/tools/tool_action_tserver.cc
5 files changed, 512 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/11/8911/3
-- 
To view, visit http://gerrit.cloudera.org:8080/8911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Gerrit-Change-Number: 8911
Gerrit-PatchSet: 3
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>