Posted to commits@cassandra.apache.org by "Alexey Zotov (JIRA)" <ji...@apache.org> on 2012/10/05 09:30:47 UTC

[jira] [Created] (CASSANDRA-4769) Prevent parallel hint delivery to the node

Alexey Zotov created CASSANDRA-4769:
---------------------------------------

             Summary: Prevent parallel hint delivery to the node 
                 Key: CASSANDRA-4769
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4769
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
    Affects Versions: 1.1.2
            Reporter: Alexey Zotov


This matters only in a sufficiently large cluster. After a node fails and comes back, the other nodes all try to deliver their hints to the restored node at the same time, so in theory this can degrade the restored node's performance.
I suggest adding a mechanism to synchronize hint delivery to a restored node.

Could you please explain how this could be implemented?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-4769) Prevent parallel hint delivery to the node

Posted by "Alexey Zotov (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-4769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473295#comment-13473295 ] 

Alexey Zotov commented on CASSANDRA-4769:
-----------------------------------------

Yes, I know. Each DC manages the full range of keys, so each key must be written to some node in each DC (assuming RF is 1). In each DC there is therefore a node responsible for key K. If the node responsible for key K is down in one DC, nodes from the other DCs have to store hints for that node.

So at least one node in each DC (with RF 1) will send hints to the node once it has been repaired. That's why I'm talking about DCs instead of nodes.

With RF = N, each DC will send hints from N nodes.
                

[jira] [Resolved] (CASSANDRA-4769) Prevent parallel hint delivery to the node

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-4769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-4769.
---------------------------------------

    Resolution: Not A Problem

Set your hint throttle level appropriately instead of trying to synchronize.
                

[jira] [Commented] (CASSANDRA-4769) Prevent parallel hint delivery to the node

Posted by "Alexey Zotov (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-4769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473886#comment-13473886 ] 

Alexey Zotov commented on CASSANDRA-4769:
-----------------------------------------

Ok, thanks.
                

[jira] [Commented] (CASSANDRA-4769) Prevent parallel hint delivery to the node

Posted by "Alexey Zotov (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-4769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470583#comment-13470583 ] 

Alexey Zotov commented on CASSANDRA-4769:
-----------------------------------------

Yes, it's the simplest way. But what about the following use case:
There are N remote data centers, and each data center sends hints to the restored node. That node will then receive throttle_in_kb*N*Nrf of data per second (Nrf is the replication factor in the Nth data center).
You could reduce the throttle threshold, but for a node in the local data center a very small threshold is not a good idea.

I think the mechanism should be more configurable.
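
For illustration only, the arithmetic in this use case can be sketched as follows. The function name and the example numbers are hypothetical and not part of Cassandra; the sketch assumes the worst case in which every replica in every remote data center holds hints and replays them at the full throttle simultaneously.

```python
def aggregate_hint_rate_kb(throttle_in_kb, replication_factors):
    """Worst-case inbound hint rate (KB/s) at the restored node.

    throttle_in_kb: per-sender hint throttle, in KB/s.
    replication_factors: one entry per remote data center (the Nrf values).
    Hypothetical helper: assumes all senders replay at once at full throttle.
    """
    if throttle_in_kb < 0:
        raise ValueError("throttle_in_kb must be non-negative")
    # With the same Nrf in all N data centers this is throttle_in_kb * N * Nrf.
    return throttle_in_kb * sum(replication_factors)


# Example: 3 remote data centers, RF 2 in each, 1024 KB/s per sender.
rate = aggregate_hint_rate_kb(1024, [2, 2, 2])
print(rate)  # 6144
```

Summing the per-DC replication factors generalizes the N*Nrf product to data centers with different replication factors.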
                

[jira] [Commented] (CASSANDRA-4769) Prevent parallel hint delivery to the node

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-4769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473400#comment-13473400 ] 

Brandon Williams commented on CASSANDRA-4769:
---------------------------------------------

I think phrasing this in terms of DCs is overcomplicating things; it's sufficient to imagine one large ring and realize that a recovering node is going to receive throttle_in_kb*N, since all other nodes are going to replay hints to it.  There's no way to prevent this, though; the closest thing we had in the past was a random sleep to stagger the beginning of the replay.  Tuning the throttle appropriately is the best option now.
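
As a rough sketch of the two ideas in this comment (tuning the throttle against the number of senders, and staggering replay starts with a random sleep), the following may help. Both helpers and all numbers are hypothetical illustrations, not Cassandra code or configuration; they assume every one of the N other nodes replays hints to the recovering node at the same time.

```python
import random


def per_sender_throttle_kb(inbound_budget_kb, sender_count):
    """Per-sender throttle (KB/s) that keeps throttle * N within a budget.

    Hypothetical helper: inverts the throttle_in_kb * N estimate.
    """
    if sender_count <= 0:
        raise ValueError("sender_count must be positive")
    return inbound_budget_kb // sender_count


def staggered_start_delays(sender_count, max_delay_s, seed=None):
    """Random start delay (seconds) per sender, to spread out replay starts.

    This only staggers the beginning of replay; it does not cap the
    steady-state rate once all senders are active.
    """
    rng = random.Random(seed)
    return [rng.uniform(0, max_delay_s) for _ in range(sender_count)]


# Example: allow about 10 MB/s of hints into the node from 20 senders.
print(per_sender_throttle_kb(10240, 20))  # 512
delays = staggered_start_delays(5, 60.0, seed=1)
```

The stagger reduces the initial burst but not the sustained load, which is consistent with the comment's conclusion that tuning the throttle is the more reliable control.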
                

[jira] [Commented] (CASSANDRA-4769) Prevent parallel hint delivery to the node

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-4769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470603#comment-13470603 ] 

Jonathan Ellis commented on CASSANDRA-4769:
-------------------------------------------

Hints are per node, not per datacenter.
                