Posted to commits@cassandra.apache.org by "Benedict (JIRA)" <ji...@apache.org> on 2018/09/05 09:15:00 UTC

[jira] [Comment Edited] (CASSANDRA-14406) Transient Replication: Implement cheap quorum write optimizations

    [ https://issues.apache.org/jira/browse/CASSANDRA-14406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16602945#comment-16602945 ] 

Benedict edited comment on CASSANDRA-14406 at 9/5/18 9:14 AM:
--------------------------------------------------------------

Fixes:
 # StorageProxy.mutate would have attempted a standardWritePerformer maybeTryAdditionalReplicas for counters
 # assureSufficientLiveReplicas and blockFor were not transient replication (or pending replicas) aware
 ** in the case of transient replication, this would mean we did not send enough initial writes, because we capped ourselves to blockFor recipients
 # AbstractWriteResponseHandler
 ** was sending to all remaining replicas in case of failure to meet consistency, not only those relevant for consistency
 ** hasTransientResponse was racy - could have a transient response arrive after checking condition
 *** Have introduced {{Accumulator.snapshot}} to make working with it safely more obvious
 *** We take a snapshot, and look inside the list to decide if we have a transient response
 # sendMessagesToNonLocalDC was asserting no transient replicas - simply removed the assertions, as logic is consistent
 # Hint handling was not implemented, but the fix mostly involved filtering hints out for transient replicas; batch log will be less trivial when implemented, as it currently must hint
 # On write, we were filtering {{pending}} replicas to only full ones, which would have broken our consistency guarantees
 # This patch was also affected by [transient<->full ring ownership movements|https://issues.apache.org/jira/browse/CASSANDRA-14409?focusedCommentId=16602977&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16602977]
 # Introduced separate threshold for cheap quorum upgrades
 # There was a rare possible race condition when removing transient replication from a keyspace, during which period we would not handle transient replicas correctly
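
The {{hasTransientResponse}} race in the list above can be illustrated with a minimal Java sketch. This is not the patch's code: {{SimpleAccumulator}}, {{Response}} and the method bodies are hypothetical stand-ins, and only the idea (take a snapshot, then look inside that fixed list) comes from the comment.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SnapshotDemo {
    /** Hypothetical response record: which replica answered and whether it is transient. */
    static final class Response {
        final String replica;
        final boolean isTransient;

        Response(String replica, boolean isTransient) {
            this.replica = replica;
            this.isTransient = isTransient;
        }
    }

    /** Simplified append-only accumulator; an assumption for illustration, not Cassandra's class. */
    static final class SimpleAccumulator<T> {
        private final List<T> items = new ArrayList<>();

        synchronized void add(T item) {
            items.add(item);
        }

        /** Returns an immutable copy of everything added so far. */
        synchronized List<T> snapshot() {
            return Collections.unmodifiableList(new ArrayList<>(items));
        }
    }

    /** Decides from a fixed snapshot, so later arrivals cannot change the answer mid-check. */
    static boolean hasTransientResponse(List<Response> snapshot) {
        for (Response r : snapshot) {
            if (r.isTransient) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        SimpleAccumulator<Response> responses = new SimpleAccumulator<>();
        responses.add(new Response("full-1", false));

        List<Response> snap = responses.snapshot();
        // A transient response arrives after the snapshot was taken.
        responses.add(new Response("transient-1", true));

        System.out.println(hasTransientResponse(snap));                 // false
        System.out.println(hasTransientResponse(responses.snapshot())); // true
    }
}
```

Deciding on the snapshot means a response that arrives later cannot alter the result halfway through the check; it is simply seen by the next snapshot.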

Nits:
 # StorageProxy.mutate used a HashMap, when a List would suffice

Follow-ups pre-4.0:
 # We should rename speculative_write_threshold (I thought we agreed on transient_write_threshold)?

Follow-ups:
 # EACH_QUORUM not implemented for transient replication; must either error or implement before release
 # we don’t limit our cheap quorum upgrade to the minimum number of additional transient replicas, so a single missing response will result in all DCs receiving an extra full write mutation, doubling cross-DC traffic for that write
 # maybeTryAdditionalReplicas / sendMessagesToNonLocalDC are not DC aware in their interactions, so transient writes incur more cross-DC traffic (ideally, the proxies would be able to coordinate upgrading to a transient write)
 # we don’t expose metrics around success/failure of cheap quorum
 # transient write count isn’t incremented when we perform a non-additional write (i.e. due to down full node)


was (Author: benedict):
Fixes:
# StorageProxy.mutate would have attempted a standardWritePerformer maybeTryAdditionalReplicas for counters	
# assureSufficientLiveReplicas and blockFor were not transient replication (or pending replicas) aware
#* in the case of transient replication, this would mean we did not send enough initial writes, because we capped ourselves to blockFor recipients
# AbstractWriteResponseHandler
#* was sending to all remaining replicas in case of failure to meet consistency, not only those relevant for consistency
#* hasTransientResponse was racy - could have a transient response arrive after checking condition
#** Have introduced {{Accumulator.snapshot}} to make working with it safely more obvious	
#** We take a snapshot, and look inside the list to decide if we have a transient response
# sendMessagesToNonLocalDC was asserting no transient replicas - simply removed the assertions, as logic is consistent
# Hints were not implemented, but mostly involved filtering them out; batch log will be less trivial when implemented, as currently must hint
# Introduced separate threshold for cheap quorum upgrades
# There was a rare possible race condition when removing transient replication from a keyspace, during which period we would not handle transient replicas correctly

Nits:
# StorageProxy.mutate used a HashMap, when a List would suffice

Follow-ups pre-4.0:
# We should rename speculative_write_threshold (I thought we agreed on transient_write_threshold)?

Follow-ups:
# EACH_QUORUM not implemented for transient replication; must either error or implement before release
# we don’t limit our cheap quorum upgrade to the minimum number of additional transient replicas, so a single missing response will result in all DCs receiving an extra full write mutation, doubling cross-dc traffic for that write
# maybeTryAdditionalReplicas / sendMessagesToNonLocalDC are not DC aware in their interactions, so transient writes incur more cross-DC traffic (ideally, the proxies would be able to coordinate upgrading to a transient write) 
# we don’t expose metrics around success/failure of cheap quorum
# transient write count isn’t incremented when we perform a non-additional write (i.e. due to down full node)

> Transient Replication: Implement cheap quorum write optimizations
> -----------------------------------------------------------------
>
>                 Key: CASSANDRA-14406
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14406
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Coordination
>            Reporter: Ariel Weisberg
>            Assignee: Blake Eggleston
>            Priority: Major
>             Fix For: 4.0
>
>
> Writes should never be sent to transient replicas unless necessary to satisfy the requested consistency level, for example when RF is not sufficient for strong consistency or not enough full replicas are marked as alive.
> If a write doesn't receive sufficient responses in time, additional replicas should be sent the write, similar to Rapid Read Protection.
> Hints should never be written for a transient replica.
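
The recipient rule in the description above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the actual Cassandra implementation: {{Replica}}, {{initialRecipients}} and the {{blockFor}} parameter as used here are hypothetical names, and pending replicas and datacenter awareness are left out.

```java
import java.util.ArrayList;
import java.util.List;

public class WriteRecipientSketch {
    /** Hypothetical replica descriptor used only for this sketch. */
    static final class Replica {
        final String name;
        final boolean isTransient;
        final boolean isAlive;

        Replica(String name, boolean isTransient, boolean isAlive) {
            this.name = name;
            this.isTransient = isTransient;
            this.isAlive = isAlive;
        }
    }

    /**
     * Picks the initial write recipients: every live full replica, plus only as many
     * live transient replicas as are needed to reach blockFor acknowledgements.
     */
    static List<Replica> initialRecipients(List<Replica> replicas, int blockFor) {
        List<Replica> recipients = new ArrayList<>();
        // Full replicas always receive the write; they are not capped at blockFor.
        for (Replica r : replicas) {
            if (r.isAlive && !r.isTransient) {
                recipients.add(r);
            }
        }
        // Transient replicas are added only to cover a shortfall of live full replicas.
        for (Replica r : replicas) {
            if (recipients.size() >= blockFor) {
                break;
            }
            if (r.isAlive && r.isTransient) {
                recipients.add(r);
            }
        }
        return recipients;
    }

    public static void main(String[] args) {
        List<Replica> replicas = new ArrayList<>();
        replicas.add(new Replica("full-1", false, true));
        replicas.add(new Replica("full-2", false, false)); // down
        replicas.add(new Replica("transient-1", true, true));

        // With blockFor = 2 and one full replica down, the transient replica is included.
        for (Replica r : initialRecipients(replicas, 2)) {
            System.out.println(r.name);
        }
    }
}
```

When both full replicas are alive, the same call returns only the two full replicas, so the transient replica receives no write.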


