
Posted to commits@cassandra.apache.org by "Marcus Eriksson (JIRA)" <ji...@apache.org> on 2011/09/09 14:47:08 UTC

[jira] [Created] (CASSANDRA-3166) Rolling upgrades from 0.7 to 0.8 not possible

Rolling upgrades from 0.7 to 0.8 not possible
---------------------------------------------

                 Key: CASSANDRA-3166
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3166
             Project: Cassandra
          Issue Type: Bug
    Affects Versions: 0.8.4, 0.7.9, 0.7.5
            Reporter: Marcus Eriksson


We are in the process of upgrading to 0.8 and need to do a rolling upgrade. This fails miserably, and it is reproducible:

1. Set up a 3-node cluster with 0.7.9 and rf=3; read and write at QUORUM.
2. Upgrade one of the nodes (I upgraded a seed node; not sure if that is important).
3. Continue reading/writing.
4. See the logs on the 0.7 node fill up with: INFO 12:36:08,240 Received connection from newer protocol version. Ignorning message.


It does work if I start the 0.7.9 nodes *after* the 0.8.4 node, which makes me think it matters whether the 0.8 node connects to the 0.7 nodes or the other way round.

Debug logging on the 0.8 node shows:
/var/log/cassandra/system.log.9:DEBUG [pool-2-thread-82] 2011-09-09 11:55:06,067 StorageProxy.java (line 178) Write timeout java.util.concurrent.TimeoutException for one (or more) of: 
/var/log/cassandra/system.log.9:DEBUG [pool-2-thread-76] 2011-09-09 11:55:06,067 StorageProxy.java (line 584) Read timeout: java.util.concurrent.TimeoutException: Operation timed out - received only 1 responses from /193.182.3.92,  .

Nothing except the "newer protocol version..." message shows up in the 0.7 logs.
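
For reference, the read timeout above is consistent with plain quorum arithmetic: with rf=3 a QUORUM operation needs 2 replica responses, and the coordinator reports only 1. A minimal sketch of that arithmetic follows; the class and method names are made up for illustration and are not Cassandra code.

```java
// Hypothetical helper, not Cassandra source: shows why a QUORUM request
// at rf=3 times out when only one replica answers.
final class QuorumSketch {
    // A quorum is a strict majority of the replicas.
    static int quorumFor(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // True if enough replicas answered to satisfy QUORUM.
    static boolean quorumMet(int replicationFactor, int responses) {
        return responses >= quorumFor(replicationFactor);
    }

    public static void main(String[] args) {
        int rf = 3;
        int responses = 1; // "received only 1 responses" in the read timeout above
        System.out.println("required=" + quorumFor(rf)
                + " met=" + quorumMet(rf, responses));
        // prints "required=2 met=false"
    }
}
```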

I will continue to look at this issue, but if anyone has a quick patch, let me know.



--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-3166) Rolling upgrades from 0.7 to 0.8 not possible

Posted by "Marcus Eriksson (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101181#comment-13101181 ] 

Marcus Eriksson commented on CASSANDRA-3166:
--------------------------------------------

Oh, note that it fails all the way to the client as well: timeouts in Hector.


[jira] [Commented] (CASSANDRA-3166) Rolling upgrades from 0.7 to 0.8 not possible

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101264#comment-13101264 ] 

Jonathan Ellis commented on CASSANDRA-3166:
-------------------------------------------

That's right.  This is why, when we do get a message from a newer-version host, we make sure to add it to the gossiper so we connect back to it.

Not sure if that fix got applied to 0.7 -- if not, making the 0.8 node a seed should work around it.
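
To make the sequence concrete, here is a toy model of that behaviour. It is only a sketch under stated assumptions: the class names, addresses, and version numbers (1 standing in for 0.7, 2 for 0.8) are invented, and the rule that a node sends at the lower of its own and the peer's version is an assumption for illustration, not a description of the actual messaging or gossip code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the behaviour described above; not Cassandra source.
final class SketchNode {
    final String address;
    final int version; // protocol version this node speaks
    // Peers this node has heard from, with the version each one speaks.
    final Map<String, Integer> peers = new HashMap<>();

    SketchNode(String address, int version) {
        this.address = address;
        this.version = version;
    }

    // Version used when sending to a peer: never newer than what the peer
    // is known to speak. Unknown peers get our own version (assumption).
    int versionFor(String peer) {
        return Math.min(version, peers.getOrDefault(peer, version));
    }

    // Returns true if the message is processed, false if it is ignored.
    boolean receive(SketchNode from, int messageVersion) {
        // Record the sender even when the message itself is unreadable,
        // so this node can connect back to it later.
        peers.put(from.address, from.version);
        return messageVersion <= version;
    }

    boolean sendTo(SketchNode to) {
        return to.receive(this, versionFor(to.address));
    }
}

final class RollingUpgradeSketch {
    public static void main(String[] args) {
        SketchNode oldNode = new SketchNode("10.0.0.1", 1); // stands in for 0.7
        SketchNode newNode = new SketchNode("10.0.0.2", 2); // stands in for 0.8

        System.out.println(newNode.sendTo(oldNode)); // newer node speaks first: false (ignored)
        System.out.println(oldNode.sendTo(newNode)); // older node connects back: true
        System.out.println(newNode.sendTo(oldNode)); // newer node now downgrades: true
    }
}
```

In this model the first message from the newer node is dropped, which matches the "newer protocol version" log line, and traffic only flows once the older node has connected back. That is also why starting the 0.7 nodes after the 0.8 node would avoid the problem in the model.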
