Posted to commits@cassandra.apache.org by "Jon Haddad (JIRA)" <ji...@apache.org> on 2019/08/02 19:47:00 UTC
[jira] [Commented] (CASSANDRA-15194) Improve readability of Table metrics Virtual tables units

    [ https://issues.apache.org/jira/browse/CASSANDRA-15194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16899164#comment-16899164 ] 

Jon Haddad commented on CASSANDRA-15194:
----------------------------------------

Just took a look at the patch and noticed a few things.

I ran a simple test using tlp-stress, a basic key value run:
{noformat}
bin/tlp-stress run KeyValue -d 1h --rate 1000 -r .5
{noformat}
When I checked the virtual tables, I saw a p99 of 73 seconds:
{noformat}
cqlsh:system_views> select * from local_read_latency  where keyspace_name = 'tlp_stress' allow filtering;

 keyspace_name | table_name  | 99th_ms | count | max_ms     | median_ms | per_second
---------------+-------------+---------+-------+------------+-----------+------------
    tlp_stress |    keyvalue |   73457 | 86523 | 1.3581e+06 |     11864 |    205.219
    tlp_stress | sensor_data |       0 |     0 |          0 |         0 |          0
{noformat}
However, I don't see anything close to that in tlp-stress:
{noformat}
                  Writes                                    Reads                  Errors
  Count  Latency (p99)  1min (req/s) |   Count  Latency (p99)  1min (req/s) |   Count  1min (errors/s)
  76557           0.38        501.04 |   76620           0.36        501.75 |       0                0
  78058           0.38        500.41 |   78119           0.36        502.15 |       0                0
  79503           0.37        500.41 |   79671           0.36        502.15 |       0                0
  80999           0.37        499.96 |   81174           0.37         502.4 |       0                0
  82553           0.37         500.4 |   82623           0.37        501.77 |       0                0
  84085           0.36         500.4 |   84092           0.38        501.77 |       0                0
  85544           0.37        500.49 |   85633           0.38         501.5 |       0                0
{noformat}
Nodetool tablehistograms agrees with tlp-stress that my laptop isn't performing that poorly:
{noformat}
$ nodetool tablehistograms tlp_stress keyvalue
tlp_stress/keyvalue histograms
Percentile      Read Latency     Write Latency          SSTables    Partition Size        Cell Count
                    (micros)          (micros)                             (bytes)
50%                    11.86              8.24              0.00               215                 1
75%                    17.08              9.89              0.00               215                 1
95%                    42.51             29.52              0.00               258                 1
98%                    61.21             35.43              0.00               258                 1
99%                    73.46             35.43              0.00               258                 1
Min                     3.97              2.76              0.00               125                 0
Max                   152.32             73.46              1.00               258                 1
{noformat}
The same issue pops up for other tables as well:
{noformat}
cqlsh:system_views> select * from local_scan_latency where keyspace_name = 'tlp_stress' allow filtering;

 keyspace_name | table_name  | 99th_ms    | count | max_ms     | median_ms  | per_second
---------------+-------------+------------+-------+------------+------------+------------
    tlp_stress |    keyvalue | 1.1318e+06 |    16 | 1.1318e+06 | 5.4579e+05 |       0.05
    tlp_stress | sensor_data |          0 |     0 |          0 |          0 |          0
{noformat}
I think starting at line 205 of TableMetricTables.java you'd want this:
{noformat}
add(result, MEDIAN + suffix, snapshot.getMedian() / NS_TO_MS);
add(result, P99 + suffix, snapshot.get99thPercentile() / NS_TO_MS);
add(result, MAX + suffix, (double) snapshot.getMax() / NS_TO_MS);
{noformat}
When I apply that change and rerun things, I get output that makes a lot more sense:
{noformat}
cqlsh:system_views> select * from local_read_latency  where keyspace_name = 'tlp_stress' allow filtering;

 keyspace_name | table_name  | 99th_ms | count | max_ms | median_ms | per_second
---------------+-------------+---------+-------+--------+-----------+------------
    tlp_stress |    keyvalue |   0.786 | 17660 |  7.008 |      0.03 |     53.331
    tlp_stress | sensor_data |       0 |     0 |      0 |         0 |          0
{noformat}
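
The original numbers are consistent with nanoseconds being reported under an "_ms" label: the virtual table's 99th_ms of 73457 and median_ms of 11864 match nodetool's 73.46 and 11.86 micros if the raw values are nanoseconds. Below is a minimal sketch of that arithmetic, not the patch itself. It assumes NS_TO_MS is 1,000,000 (its definition in the patch isn't shown here), and the class and method names are made up for illustration. It also shows why the cast on getMax() may matter: Snapshot.getMax() returns a long, so if the divisor were an integer type the division would truncate.

```java
public final class LatencyUnits {
    // Assumption: nanoseconds per millisecond. The patch's own NS_TO_MS
    // definition is not shown in this comment.
    public static final double NS_TO_MS = 1_000_000.0;

    // Assumption: nanoseconds per microsecond, used only for the cross-check
    // against nodetool tablehistograms, which reports micros.
    public static final double NS_TO_MICROS = 1_000.0;

    public static double nanosToMillis(double nanos) {
        return nanos / NS_TO_MS;
    }

    public static double nanosToMicros(double nanos) {
        return nanos / NS_TO_MICROS;
    }

    // Hypothetical helper showing the pitfall the (double) cast avoids:
    // long / long discards the fractional part.
    public static long truncatingMillis(long nanos) {
        return nanos / 1_000_000L;
    }

    public static void main(String[] args) {
        // Raw values copied from the first local_read_latency query above.
        double rawP99 = 73457;
        double rawMedian = 11864;

        // Read as nanoseconds, these line up with nodetool's micros columns.
        System.out.println("p99 micros:    " + nanosToMicros(rawP99));
        System.out.println("median micros: " + nanosToMicros(rawMedian));

        // The same values expressed in the unit the column name promises.
        System.out.println("p99 ms:        " + nanosToMillis(rawP99));
        System.out.println("median ms:     " + nanosToMillis(rawMedian));

        // Hypothetical max just under 1 ms, to show integer truncation.
        long rawMax = 999_999L;
        System.out.println("long division:   " + truncatingMillis(rawMax));
        System.out.println("double division: " + nanosToMillis((double) rawMax));
    }
}
```

If NS_TO_MS is already a double in the patch, the cast on getMax() is redundant but harmless.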
Other than that, I don't have any issues. We can merge once that's addressed.

We should create a follow-up JIRA to document all the virtual tables.

> Improve readability of Table metrics Virtual tables units
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-15194
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15194
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Feature/Virtual Tables
>            Reporter: Jon Haddad
>            Assignee: Chris Lohfink
>            Priority: Normal
>             Fix For: 4.0
>
>
> I just noticed this strange output from coordinator_reads:
> {code}
> cqlsh:system_views> select * from coordinator_reads ;
>  count | keyspace_name      | table_name                     | 99th | max | median | per_second
> -------+--------------------+--------------------------------+------+-----+--------+------------
>   7573 |         tlp_stress |                       keyvalue |    0 |   0 |      0 | 2.2375e-16
>   6076 |         tlp_stress |                  random_access |    0 |   0 |      0 | 7.4126e-12
>    390 |         tlp_stress |                sensor_data_udt |    0 |   0 |      0 | 1.7721e-64
>     30 |             system |                          local |    0 |   0 |      0 |   0.006406
>     11 |      system_schema |                        columns |    0 |   0 |      0 | 1.1192e-16
>     11 |      system_schema |                        indexes |    0 |   0 |      0 | 1.1192e-16
>     11 |      system_schema |                         tables |    0 |   0 |      0 | 1.1192e-16
>     11 |      system_schema |                          views |    0 |   0 |      0 | 1.1192e-16
> {code}
> cc [~cnlwsu]
> btw I realize the output is technically correct, but it's not very readable.  For practical purposes this should just say 0.


