Posted to commits@cassandra.apache.org by "Nick Bailey (JIRA)" <ji...@apache.org> on 2011/09/21 23:15:08 UTC

[jira] [Created] (CASSANDRA-3238) Issue with multi region ec2 and replication updates

Issue with multi region ec2 and replication updates
---------------------------------------------------

                 Key: CASSANDRA-3238
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3238
             Project: Cassandra
          Issue Type: Bug
    Affects Versions: 0.8.2
            Reporter: Nick Bailey
             Fix For: 0.8.7


Using the Ec2MultiRegionSnitch and updating replication settings for a keyspace seems to cause some issues that require a rolling restart to fix. The following was observed when updating a keyspace from SimpleStrategy to NTS in a multi region environment:

* All repairs would hang, even repairs run only against a keyspace that was not updated.
* Reads at CL.ONE would start to go across regions.

After a rolling restart of the cluster, repairs started working correctly again and reads stayed local to the region.
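
For context, here is a minimal sketch of how the two strategies pick replicas, which is why a switch from SimpleStrategy to NTS can change which nodes own a key. It is a simplified model, not Cassandra's code: the ring, tokens, node names, and region names are made up, racks are ignored, and each region is treated as a data center (as Ec2MultiRegionSnitch does).

```python
from bisect import bisect_left

# Hypothetical four-node ring spanning two regions; tokens and names are made up.
RING = [
    (0, "east-1", "us-east"),
    (25, "west-1", "us-west"),
    (50, "east-2", "us-east"),
    (75, "west-2", "us-west"),
]


def ring_walk(token):
    """Yield (name, dc) clockwise, starting at the first node whose token >= token."""
    tokens = [t for t, _, _ in RING]
    start = bisect_left(tokens, token) % len(RING)
    for i in range(len(RING)):
        _, name, dc = RING[(start + i) % len(RING)]
        yield name, dc


def simple_strategy(token, rf):
    """Take the first rf nodes clockwise, ignoring data centers entirely."""
    return [name for name, _ in ring_walk(token)][:rf]


def network_topology_strategy(token, rf_per_dc):
    """Walk the ring, taking a node whenever its DC still needs replicas (racks ignored)."""
    remaining = dict(rf_per_dc)
    replicas = []
    for name, dc in ring_walk(token):
        if remaining.get(dc, 0) > 0:
            replicas.append(name)
            remaining[dc] -= 1
    return replicas
```

With this ring, `simple_strategy(10, 2)` places both replicas on the next two nodes regardless of region, while `network_topology_strategy(10, {"us-east": 2})` keeps them inside one region.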

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-3238) Issue with multi region ec2 and replication updates

Posted by "Vijay (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13112384#comment-13112384 ] 

Vijay commented on CASSANDRA-3238:
----------------------------------

Steps to reproduce:
I think it is a generic problem when switching from SimpleStrategy (SS) to NTS.

Connect your clients to the server and start inserting data with SS. Then, without stopping the clients, update the keyspace to NTS: you will see that reads still go across regions, and restarting the nodes fixes it. All of the issues mentioned seem to be related to that. I am still trying to find the root cause; a fix will follow soon.


[jira] [Updated] (CASSANDRA-3238) Issue with multi region ec2 and replication updates

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3238:
--------------------------------------

    Component/s: Core
       Priority: Minor  (was: Major)


[jira] [Resolved] (CASSANDRA-3238) Issue with multi region ec2 and replication updates

Posted by "Sylvain Lebresne (Resolved) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne resolved CASSANDRA-3238.
-----------------------------------------

    Resolution: Not A Problem

Since the problem goes away once dynamic_snitch_badness_threshold is set, resolving as Not A Problem. Feel free to reopen if you think there is still something to fix here.
                

[jira] [Commented] (CASSANDRA-3238) Issue with multi region ec2 and replication updates

Posted by "Vijay (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13113687#comment-13113687 ] 

Vijay commented on CASSANDRA-3238:
----------------------------------

Talked with the user, and it seems we will have to set dynamic_snitch_badness_threshold to at least 0.01. So far, with that setting, the problem seems to go away.
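
A rough sketch of why a non-zero threshold can keep reads local, assuming the dynamic snitch behaves roughly as follows: at a threshold of 0 it orders replicas purely by latency score, and above 0 it keeps the underlying snitch's ordering unless the first choice is worse than another replica by more than the threshold. This is a simplified model, not Cassandra's actual code; the function name, host names, and scores are hypothetical.

```python
def sort_by_proximity(static_order, scores, badness_threshold):
    """Return replicas in the order reads would try them.

    static_order: replicas as ordered by the underlying snitch (local region first).
    scores: latency score per replica; lower is better.
    """
    # sorted() is stable, so equal scores keep their static ordering.
    by_score = sorted(static_order, key=lambda host: scores[host])
    if badness_threshold == 0:
        # Pure dynamic ordering: the lowest score wins, even across regions.
        return by_score

    first = scores[static_order[0]]
    if first <= 0:
        # Nothing meaningful to compare against; keep the static ordering.
        return list(static_order)

    for host in static_order:
        # Re-sort by score only if some replica beats the static first choice
        # by more than the threshold (relative to the first choice's score).
        if (first - scores[host]) / first > badness_threshold:
            return by_score
    return list(static_order)


# Hypothetical scores: the remote replica is only marginally faster.
order = ["local-a", "local-b", "remote-a"]
scores = {"local-a": 1.000, "local-b": 1.002, "remote-a": 0.995}
```

Here `sort_by_proximity(order, scores, 0)` puts `remote-a` first, while a threshold of 0.01 keeps the local replicas first because the 0.5% difference is below the threshold.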


[jira] [Updated] (CASSANDRA-3238) Issue with multi region ec2 and replication updates

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-3238:
----------------------------------------

    Affects Version/s:     (was: 0.8.2)
                       1.0.0
        Fix Version/s:     (was: 0.8.7)
                       1.0.0
             Assignee: Vijay


[jira] [Commented] (CASSANDRA-3238) Issue with multi region ec2 and replication updates

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13112463#comment-13112463 ] 

Brandon Williams commented on CASSANDRA-3238:
---------------------------------------------

It sounds to me like something that needs to call ARS.clearEndpointCache isn't doing so.
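
To illustrate the kind of bug hypothesized here, the toy model below caches computed endpoints per token and shows how skipping the cache clear after a replication change leaves stale replicas in use until a restart. The class and method names are hypothetical and do not reflect Cassandra's actual classes; only the idea of clearing an endpoint cache comes from the comment above.

```python
class CachingStrategy:
    """Toy stand-in for a replication strategy that caches computed endpoints."""

    def __init__(self, compute):
        self._compute = compute  # function: token -> list of endpoints
        self._cache = {}

    def natural_endpoints(self, token):
        """Return the endpoints for a token, computing them only on a cache miss."""
        if token not in self._cache:
            self._cache[token] = self._compute(token)
        return list(self._cache[token])

    def update_replication(self, compute, clear_cache=True):
        """Swap in a new placement function; clear_cache=False models the suspected bug."""
        self._compute = compute
        if clear_cache:
            self._cache.clear()


# Hypothetical placements before and after the replication update.
strategy = CachingStrategy(lambda token: ["east-1", "west-1"])
before = strategy.natural_endpoints(42)

# Replication settings change, but the cache is not cleared.
strategy.update_replication(lambda token: ["east-1", "east-2"], clear_cache=False)
stale = strategy.natural_endpoints(42)   # still served from the old cache entry
fresh = strategy.natural_endpoints(43)   # uncached token uses the new placement

# Clearing the cache (or restarting the node) makes the new placement visible.
strategy.update_replication(lambda token: ["east-1", "east-2"])
after = strategy.natural_endpoints(42)
```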
