Posted to commits@cassandra.apache.org by "Paulo Motta (Jira)" <ji...@apache.org> on 2021/04/11 18:02:00 UTC

[jira] [Commented] (CASSANDRA-16513) Add tool to display or export the contents of a virtual table

    [ https://issues.apache.org/jira/browse/CASSANDRA-16513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17318912#comment-17318912 ] 

Paulo Motta commented on CASSANDRA-16513:
-----------------------------------------

Hi [~Si_d], thanks for your interest in working on this ticket. I updated the ticket description to reflect the previous discussion.

Currently you can query a [virtual table|https://cassandra.apache.org/doc/latest/new/virtualtables.html] with [cqlsh|https://cassandra.apache.org/doc/latest/tools/cqlsh.html]:
{code:java}
cqlsh:system_views> SELECT * FROM sstable_tasks;
keyspace_name | table_name | task_id                              | kind       | progress | total    | unit
---------------+------------+--------------------------------------+------------+----------+----------+-------
       basic |      wide2 | c3909740-cdf7-11e9-a8ed-0f03de2d9ae1 | compaction | 60418761 | 70882110 | bytes
       basic |      wide2 | c7556770-cdf7-11e9-a8ed-0f03de2d9ae1 | compaction |  2995623 | 40314679 | bytes
{code}
However, before virtual tables were a thing, operators could query this information more easily with [nodetool|https://cassandra.apache.org/doc/latest/tools/nodetool/nodetool.html]:
{code:java}
nodetool compactionstats
pending tasks: 5
          compaction type        keyspace           table       completed           total      unit  progress
               Compaction       Keyspace1       Standard1       282310680       302170540     bytes    93.43%
               Compaction       Keyspace1       Standard1        58457931       307520780     bytes    19.01%
Active compaction remaining time :   0h00m16s
{code}
The benefit of the first approach is that it's very easy for developers to add new virtual tables to Cassandra, since this only requires changes on the server side, and the client can simply query the virtual table with cqlsh.

The benefit of the second approach is that operators are already used to discovering and querying system information via nodetool, and the information comes nicely formatted for human consumption, potentially in different formats. The downside is that every new piece of information added on the server requires a new client-side nodetool command.

The idea here is to add a new tool with the best of both worlds: allow developers to easily add new information that operators can query via virtual tables, while providing a simple way for operators to query this data and export it to different formats (such as JSON or YAML) via a CLI.

The good thing about this task is that we can do it very incrementally: we can start by implementing "admintool show sstable_tasks" to display the information above, and then add support for more virtual tables over time. All virtual tables should expose their information in tabular, JSON and YAML formats (depending on the format parameter), but in the future each virtual table can optionally implement a new data formatter to pretty-print its data in other formats.
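
To make the formatter idea more concrete, here is a minimal Python sketch of how a format parameter could select a generic formatter while still leaving a hook for table-specific ones. Everything in it (the names render, FORMATTERS and TABLE_FORMATTERS, and the hand-rolled YAML emitter) is hypothetical and only for illustration; it is not part of any existing Cassandra tool, and the tabular format is left out to keep it short.

```python
import json


def to_json(rows):
    """Serialize rows (a list of dicts) as a JSON array."""
    # default=str covers values such as UUIDs that json cannot encode natively.
    return json.dumps(rows, indent=2, default=str)


def to_yaml(rows):
    """Serialize flat rows of scalars as a YAML sequence of mappings."""
    # Minimal emitter for illustration only; a real tool would more likely
    # rely on a YAML library. JSON scalars are valid YAML scalars.
    lines = []
    for row in rows:
        for i, (key, value) in enumerate(row.items()):
            prefix = "- " if i == 0 else "  "
            lines.append(f"{prefix}{key}: {json.dumps(value, default=str)}")
    return "\n".join(lines)


# Generic formatters, keyed by the value of the format parameter.
FORMATTERS = {"json": to_json, "yaml": to_yaml}

# Optional per-table overrides, keyed by (table, format). Hypothetical hook.
TABLE_FORMATTERS = {}


def render(table, rows, fmt):
    """Pick a table-specific formatter if one is registered, else a generic one."""
    formatter = TABLE_FORMATTERS.get((table, fmt)) or FORMATTERS.get(fmt)
    if formatter is None:
        raise ValueError(f"unsupported format: {fmt}")
    return formatter(rows)
```

With this shape, supporting a new virtual table needs no extra code, and a table that wants a nicer rendering only has to register one entry in TABLE_FORMATTERS.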

I think a good way to start is to create a simple tool that just fetches and displays the contents of the "system_views.sstable_tasks" table in tabular format.
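
As a rough starting point, the fetch-and-display step could look like the Python sketch below. It assumes a session object whose execute() method returns named-tuple rows (which I believe is the default row factory of the DataStax Python driver) and it uses the system_views keyspace shown in the cqlsh prompt above. The function names fetch_rows and format_table are made up for this example, and the layout only loosely imitates cqlsh.

```python
import re

IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def fetch_rows(session, table, keyspace="system_views"):
    """Return all rows of a virtual table as a list of plain dicts."""
    # Identifiers cannot be bound as query parameters, so validate them
    # before interpolating them into the CQL string.
    for name in (keyspace, table):
        if not IDENTIFIER.fullmatch(name):
            raise ValueError(f"invalid identifier: {name!r}")
    result = session.execute(f"SELECT * FROM {keyspace}.{table}")
    # Assumption: rows are named tuples, so _asdict() is available.
    return [row._asdict() for row in result]


def format_table(rows):
    """Render rows as an aligned text table."""
    if not rows:
        return "(0 rows)"
    columns = list(rows[0].keys())
    cells = [[str(row[c]) for c in columns] for row in rows]
    # Each column is as wide as its longest value or its header.
    widths = [max(len(c), *(len(r[i]) for r in cells))
              for i, c in enumerate(columns)]
    header = " | ".join(c.ljust(w) for c, w in zip(columns, widths))
    rule = "-+-".join("-" * w for w in widths)
    body = [" | ".join(v.rjust(w) for v, w in zip(r, widths)) for r in cells]
    return "\n".join([header, rule, *body])
```

Given a connected session, print(format_table(fetch_rows(session, "sstable_tasks"))) would then print the table to standard output.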

> Add tool to display or export the contents of a virtual table
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-16513
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16513
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Observability/Metrics, Tool/nodetool
>            Reporter: Paulo Motta
>            Priority: Normal
>              Labels: gsoc2021, mentor
>
> Several virtual tables were recently added, but they're currently only accessible via cqlsh or programmatically. While this is valuable for many use cases, operators are accustomed to the convenience of querying system metrics with a simple nodetool command.
> In addition to that, a relatively common request is to provide nodetool output in different formats (JSON, YAML and even XML) (CASSANDRA-5977, CASSANDRA-12035, CASSANDRA-12486, CASSANDRA-12698, CASSANDRA-12503). However, this requires lots of manual labor, as each nodetool subcommand needs to be adapted to support new output formats.
> I propose adding a new CLI tool that will consistently print to the standard output the contents of a virtual table. By default the command will print the output in a tabular format similar to cqlsh, but a "--format" parameter can be specified to modify the output to some other format like JSON or YAML.
> It should be possible to limit the number of rows displayed and to filter the output to show only rows with specific keys (i.e. keyspace or table). The command should be flexible and provide simple hooks for registration and customization of new virtual tables.
> My vision is that this is a path towards deprecating JMX and toward CQL for management, as we move information currently available through JMX to virtual tables (as CASSANDRA-14457 did with compactionstats) and easily expose them in this new tool as more virtual tables are added. Eventually we can also add setters when we start supporting writeable virtual tables.
> I propose calling this tool admintool (naming bikeshedding welcome), for example:
> {noformat}
> admintool help
> admintool <subcommand> <entity>
> Available subcommands and entities are:
> subcommands:
>  - show
>  - set (future)
> entities:
>  - caches
>  - internode_inbound
>  - internode_outbound
>  - settings
>  - sstable_tasks
>  - system_properties
>  - thread_pools
> admintool show clients --format yaml
> ...
> admintool show internode_outbound --format json
> ...
> admintool show sstable_tasks --filter keyspace=my_ks --filter table=my_table
> ...
> {noformat}
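
The command-line shape proposed in the description could be prototyped along the following lines in Python. This is only a sketch under assumptions: the "--limit" flag name, the format choices, and the helpers parse_filter, build_parser and apply_filters are invented here, and filtering is done on the client against column names exactly as typed, which may not be how the final tool ends up working.

```python
import argparse


def parse_filter(text):
    """Turn 'key=value' into a (key, value) tuple."""
    key, sep, value = text.partition("=")
    if not sep or not key or not value:
        raise argparse.ArgumentTypeError(f"expected key=value, got {text!r}")
    return key, value


def build_parser():
    parser = argparse.ArgumentParser(prog="admintool")
    sub = parser.add_subparsers(dest="subcommand", required=True)
    show = sub.add_parser("show", help="display the contents of a virtual table")
    show.add_argument("entity", help="virtual table name, e.g. sstable_tasks")
    show.add_argument("--format", choices=["table", "json", "yaml"],
                      default="table")
    # --filter may be repeated; each occurrence adds one (key, value) pair.
    show.add_argument("--filter", action="append", type=parse_filter,
                      default=[], metavar="KEY=VALUE")
    show.add_argument("--limit", type=int, default=None)
    return parser


def apply_filters(rows, filters, limit=None):
    """Keep rows matching every filter, then truncate to at most limit rows."""
    selected = [row for row in rows
                if all(str(row.get(key)) == value for key, value in filters)]
    return selected if limit is None else selected[:limit]
```

For example, build_parser().parse_args(["show", "sstable_tasks", "--filter", "keyspace=my_ks"]) yields a namespace whose filter attribute is [("keyspace", "my_ks")], which can be passed straight to apply_filters.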



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
