Posted to commits@cassandra.apache.org by "Stephen Mallette (Jira)" <ji...@apache.org> on 2020/05/19 18:11:00 UTC

[jira] [Created] (CASSANDRA-15821) Metrics Documentation Enhancements

Stephen Mallette created CASSANDRA-15821:
--------------------------------------------

             Summary: Metrics Documentation Enhancements
                 Key: CASSANDRA-15821
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15821
             Project: Cassandra
          Issue Type: Improvement
          Components: Documentation/Website
            Reporter: Stephen Mallette


CASSANDRA-15582 involves quality around metrics, and it was mentioned that reviewing and [improving documentation|https://github.com/apache/cassandra/blob/trunk/doc/source/operating/metrics.rst] around metrics would fall into that scope. Please consider the analysis below when determining what improvements to make here:

Please see [this spreadsheet|https://docs.google.com/spreadsheets/d/1iPWfCMIG75CI6LbYuDtCTjEOvZw-5dyH-e08bc63QnI/edit?usp=sharing] that itemizes almost all of Cassandra's metrics and whether they are documented or not (along with other notes). The spreadsheet covers "almost all" because some metrics don't seem to initialize as part of Cassandra startup (I was able to trigger some of them to initialize, but how to trigger the rest was not immediately obvious). The missing metrics seem to be related to the following:

* ThreadPool metrics - only some initialize at startup; those are listed below
* Streaming Metrics
* HintedHandoff Metrics
* HintsService Metrics

Here are the ThreadPool scopes that get listed:

{code}
AntiEntropyStage
CacheCleanupExecutor
CompactionExecutor
GossipStage
HintsDispatcher
MemtableFlushWriter
MemtablePostFlush
MemtableReclaimMemory
MigrationStage
MutationStage
Native-Transport-Requests
PendingRangeCalculator
PerDiskMemtableFlushWriter_0
ReadStage
Repair-Task
RequestResponseStage
Sampler
SecondaryIndexManagement
ValidationExecutor
ViewBuildExecutor
{code}

I noticed that Keyspace Metrics have this note: "Most of these metrics are the same as the Table Metrics above, only they are aggregated at the Keyspace level." I think I've isolated the metrics that exist on Table but not on Keyspace to be specifically:

{code}
BloomFilterFalsePositives
BloomFilterFalseRatio
BytesAnticompacted
BytesFlushed
BytesMutatedAnticompaction
BytesPendingRepair
BytesRepaired
BytesUnrepaired
CompactionBytesWritten
CompressionRatio
CoordinatorReadLatency
CoordinatorScanLatency
CoordinatorWriteLatency
EstimatedColumnCountHistogram
EstimatedPartitionCount
EstimatedPartitionSizeHistogram
KeyCacheHitRate
LiveSSTableCount
MaxPartitionSize
MeanPartitionSize
MinPartitionSize
MutatedAnticompactionGauge
PercentRepaired
RowCacheHitOutOfRange
RowCacheHit
RowCacheMiss
SpeculativeSampleLatencyNanos
SyncTime
WaitingOnFreeMemtableSpace
DroppedMutations
{code}

Someone with greater knowledge of this area might consider it worth the effort to check whether any of these metrics should be aggregated to the keyspace level, in case they were inadvertently missed. In any case, perhaps the documentation could now easily reflect which metric names can be expected on Keyspace.
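
The comparison behind the list above is a plain set difference. As a minimal sketch, assuming the metric names for each type have already been exported from JMX into two collections (the sample names and the helper `table_only` are illustrative, not part of Cassandra):

```python
# Hypothetical inputs: metric names per type, e.g. as exported from JMX.
# Only a handful of names are shown here for illustration.
table_metrics = {"BytesFlushed", "CompressionRatio", "ReadLatency", "WriteLatency"}
keyspace_metrics = {"ReadLatency", "WriteLatency"}


def table_only(table, keyspace):
    """Return the sorted metric names present on Table but absent on Keyspace."""
    return sorted(set(table) - set(keyspace))


missing = table_only(table_metrics, keyspace_metrics)
print(missing)
```

Running the same difference in the other direction would show any Keyspace metrics that have no Table counterpart.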

The DroppedMessage metrics have a much larger set of scopes than what is documented:

{code}
ASYMMETRIC_SYNC_REQ
BATCH_REMOVE_REQ
BATCH_REMOVE_RSP
BATCH_STORE_REQ
BATCH_STORE_RSP
CLEANUP_MSG
COUNTER_MUTATION_REQ
COUNTER_MUTATION_RSP
ECHO_REQ
ECHO_RSP
FAILED_SESSION_MSG
FAILURE_RSP
FINALIZE_COMMIT_MSG
FINALIZE_PROMISE_MSG
FINALIZE_PROPOSE_MSG
GOSSIP_DIGEST_ACK
GOSSIP_DIGEST_ACK2
GOSSIP_DIGEST_SYN
GOSSIP_SHUTDOWN
HINT_REQ
HINT_RSP
INTERNAL_RSP
MUTATION_REQ
MUTATION_RSP
PAXOS_COMMIT_REQ
PAXOS_COMMIT_RSP
PAXOS_PREPARE_REQ
PAXOS_PREPARE_RSP
PAXOS_PROPOSE_REQ
PAXOS_PROPOSE_RSP
PING_REQ
PING_RSP
PREPARE_CONSISTENT_REQ
PREPARE_CONSISTENT_RSP
PREPARE_MSG
RANGE_REQ
RANGE_RSP
READ_REPAIR_REQ
READ_REPAIR_RSP
READ_REQ
READ_RSP
REPAIR_RSP
REPLICATION_DONE_REQ
REPLICATION_DONE_RSP
REQUEST_RSP
SCHEMA_PULL_REQ
SCHEMA_PULL_RSP
SCHEMA_PUSH_REQ
SCHEMA_PUSH_RSP
SCHEMA_VERSION_REQ
SCHEMA_VERSION_RSP
SNAPSHOT_MSG
SNAPSHOT_REQ
SNAPSHOT_RSP
STATUS_REQ
STATUS_RSP
SYNC_REQ
SYNC_RSP
TRUNCATE_REQ
TRUNCATE_RSP
VALIDATION_REQ
VALIDATION_RSP
_SAMPLE
_TEST_1
_TEST_2
_TRACE
{code}

I suppose I may yet be missing some metrics, as my knowledge of what's available is limited to what I can get from JMX after Cassandra initialization (and some initial starting commands) and what's in the documentation. If a metric is missing from both, then I won't know it's there. Anyway, perhaps this issue can help build some discussion around the improvements that might be made given the analysis that has been provided so far.
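
For anyone who wants to repeat the JMX side of this analysis, the scope lists above can be rebuilt from the metric MBean names. The following is a rough sketch only: it assumes the names were already dumped from JMX as plain strings of the form `org.apache.cassandra.metrics:type=...,scope=...,name=...`, the helper names are hypothetical, and the simple splitting does not handle quoted property values that contain commas.

```python
from collections import defaultdict


def parse_object_name(name):
    """Split 'domain:key=value,...' into (domain, {key: value})."""
    domain, _, props = name.partition(":")
    pairs = (item.split("=", 1) for item in props.split(",") if "=" in item)
    return domain, {k.strip(): v.strip() for k, v in pairs}


def scopes_by_type(names, domain="org.apache.cassandra.metrics"):
    """Group the 'scope' property of each metric MBean name by its 'type'."""
    grouped = defaultdict(set)
    for name in names:
        dom, props = parse_object_name(name)
        if dom == domain and "type" in props and "scope" in props:
            grouped[props["type"]].add(props["scope"])
    return {t: sorted(s) for t, s in grouped.items()}


# Illustrative sample only; a real dump would contain many more names.
sample = [
    "org.apache.cassandra.metrics:type=ThreadPools,path=internal,scope=GossipStage,name=ActiveTasks",
    "org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=ReadStage,name=PendingTasks",
    "org.apache.cassandra.metrics:type=DroppedMessage,scope=READ_REQ,name=Dropped",
]
print(scopes_by_type(sample))
```

Comparing the grouped scopes against the documented ones then shows which scopes are undocumented. It still cannot reveal metrics that never registered with JMX in the first place.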



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
