Posted to commits@cassandra.apache.org by "mck (JIRA)" <ji...@apache.org> on 2018/05/03 01:09:00 UTC

[jira] [Commented] (CASSANDRA-13459) Diag. Events: Native transport integration

    [ https://issues.apache.org/jira/browse/CASSANDRA-13459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461807#comment-16461807 ] 

mck commented on CASSANDRA-13459:
---------------------------------

Providing node-based diagnostics via the native transport raises some questions:
 - diagnostic events are subscribed to on a per-node basis (there is no server-side messaging of diagnostic events between nodes),
 - a CQL (native transport) driver must be forced to connect to a specific host (i.e. one contact host, with all other addresses blacklisted) to receive diagnostics for just that host, and
 - a separate CQL (native transport) driver configuration per node, each forced to a different host, is required to receive diagnostics for the whole cluster.
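
To make the last two points concrete, here is a minimal sketch of how a client would have to derive one pinned configuration per node. It does not use any real driver API: PinnedConfig and pinnedConfigs are hypothetical names, and the addresses are made up. A real driver would presumably express the blacklist through its load-balancing policy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PinnedConnectionsSketch {
    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("10.0.0.1", "10.0.0.2", "10.0.0.3");
        for (PinnedConfig c : pinnedConfigs(nodes)) {
            System.out.println(c.contactPoint + " blacklist=" + c.blacklisted);
        }
    }

    // Builds one configuration per node: that node is the only contact
    // point and every other known address is blacklisted.
    static List<PinnedConfig> pinnedConfigs(List<String> nodes) {
        List<PinnedConfig> configs = new ArrayList<>();
        for (String node : nodes) {
            List<String> others = new ArrayList<>(nodes);
            others.remove(node);
            configs.add(new PinnedConfig(node, others));
        }
        return configs;
    }
}

// Hypothetical value type standing in for a driver configuration.
final class PinnedConfig {
    final String contactPoint;
    final List<String> blacklisted;

    PinnedConfig(String contactPoint, List<String> blacklisted) {
        this.contactPoint = contactPoint;
        this.blacklisted = blacklisted;
    }
}
```

The number of driver instances grows with the cluster size, which is the burden described above.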

Drivers could work around this by maintaining "control connections" to every node, but that goes against the design of the drivers and puts a fair burden on them.

Having server-side messaging of diagnostic events (where just one connection could still receive cluster-wide diagnostics) pushes the overhead back into the cluster, which I presume we want to avoid.

Integrating diagnostics via the JMX port is an alternative.
For the client it is more intuitive, as the JMX port on a node today provides information and notifications for just that node. For example, it aligns with the concept of per-node agents. While we lose the benefits of Netty (performance and load-shedding) and cross-language support (although that is still solvable), it fits into the current design of C*. Maybe I am missing other reasons JMX was not considered in the proposal?
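
As a rough illustration of the per-node notification model, the sketch below uses only the standard javax.management classes. NodeDiagnosticsEmitter, the "diagnostic." type prefix and "HypotheticalEvent" are assumptions made for this example, not existing Cassandra code. The emitter is used in-process here; in a real node it would presumably be registered with the platform MBeanServer so that remote JMX clients can subscribe.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.Notification;
import javax.management.NotificationBroadcasterSupport;
import javax.management.NotificationFilterSupport;

class JmxDiagnosticsSketch {
    public static void main(String[] args) {
        System.out.println(demo());
    }

    // Subscribes to a single event class on one node-local emitter and
    // returns the messages of the notifications that passed the filter.
    static List<String> demo() {
        NodeDiagnosticsEmitter emitter = new NodeDiagnosticsEmitter();
        List<String> received = new ArrayList<>();

        // Standard JMX filter: enables notification types starting with this prefix.
        NotificationFilterSupport filter = new NotificationFilterSupport();
        filter.enableType("diagnostic.GossiperEvent");

        emitter.addNotificationListener(
                (notification, handback) -> received.add(notification.getMessage()),
                filter, null);

        emitter.publish("GossiperEvent", "MAJOR_STATE_CHANGE_HANDLED");
        emitter.publish("HypotheticalEvent", "EXAMPLE"); // rejected by the filter
        return received;
    }
}

// Hypothetical per-node emitter; it only ever reports events of its own node.
class NodeDiagnosticsEmitter extends NotificationBroadcasterSupport {
    private final AtomicLong sequence = new AtomicLong();

    // Sends one event as a JMX notification of type "diagnostic.<eventClass>".
    void publish(String eventClass, String eventType) {
        sendNotification(new Notification(
                "diagnostic." + eventClass,  // type, matched by subscriber filters
                this,                        // source
                sequence.incrementAndGet(),  // sequence number
                System.currentTimeMillis(),  // timestamp
                eventType));                 // message carries the event sub-type
    }
}
```

The subscription granularity in this sketch comes for free from the notification type prefix, so a client can pick an event class without any protocol change.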

If a JMX implementation came first, it could be purely a layer on top of the diagnostics event model in the server, so that a better transport layer could be fitted later. While it would help to document some idea of how many events per second we could be seeing, an initial JMX implementation can give us more production feedback from users before committing to a native transport or other implementation. The cost of starting with JMX, however, is that it means no native protocol changes in 4.x releases.
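
The layering idea could look roughly like the sketch below: the server-side event model knows nothing about transports, and each transport is an adapter attached to it. All names here (DiagnosticEvent, DiagnosticTransport, DiagnosticEventBus) are hypothetical and chosen for this example only, and the two lambdas merely stand in for a JMX adapter and a later native transport adapter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class DiagnosticsLayeringSketch {
    public static void main(String[] args) {
        System.out.println(demo());
    }

    // Attaches a first transport, publishes, then attaches a second
    // transport without touching the event model, and publishes again.
    static List<String> demo() {
        DiagnosticEventBus bus = new DiagnosticEventBus();
        List<String> delivered = new ArrayList<>();

        // Stand-in for an initial JMX adapter.
        bus.attach(e -> delivered.add("jmx:" + e.eventClass + "/" + e.eventType));
        bus.publish(new DiagnosticEvent("GossiperEvent", "MAJOR_STATE_CHANGE_HANDLED"));

        // Stand-in for a transport fitted later.
        bus.attach(e -> delivered.add("native:" + e.eventClass + "/" + e.eventType));
        bus.publish(new DiagnosticEvent("GossiperEvent", "MAJOR_STATE_CHANGE_HANDLED"));
        return delivered;
    }
}

// Transport-independent event: an event class plus a sub-type.
final class DiagnosticEvent {
    final String eventClass;
    final String eventType;

    DiagnosticEvent(String eventClass, String eventType) {
        this.eventClass = eventClass;
        this.eventType = eventType;
    }
}

// Anything that can carry events to clients implements this.
interface DiagnosticTransport {
    void deliver(DiagnosticEvent event);
}

// Node-local fan-out of events to every attached transport.
final class DiagnosticEventBus {
    private final List<DiagnosticTransport> transports = new CopyOnWriteArrayList<>();

    void attach(DiagnosticTransport transport) {
        transports.add(transport);
    }

    void publish(DiagnosticEvent event) {
        for (DiagnosticTransport transport : transports) {
            transport.deliver(event);
        }
    }
}
```

With this shape, replacing or adding a transport is a matter of attaching another adapter, which is what would let production feedback from a JMX-first release inform the later choice.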

> Diag. Events: Native transport integration
> ------------------------------------------
>
>                 Key: CASSANDRA-13459
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13459
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: CQL
>            Reporter: Stefan Podkowinski
>            Assignee: Stefan Podkowinski
>            Priority: Major
>              Labels: client-impacting
>
> Events should be consumable by clients that would receive subscribed events from the connected node. This functionality is designed to work on top of native transport with minor modifications to the protocol standard (see [original proposal|https://docs.google.com/document/d/1uEk7KYgxjNA0ybC9fOuegHTcK3Yi0hCQN5nTp5cNFyQ/edit?usp=sharing] for further considered options). First we have to add another value for existing event types. Also, we have to extend the protocol a bit to be able to specify a sub-class and sub-type value. E.g. {{DIAGNOSTIC_EVENT(GossiperEvent, MAJOR_STATE_CHANGE_HANDLED)}}. This still has to be worked out and I'd appreciate any feedback.


