Posted to commits@cassandra.apache.org by "T Jake Luciani (JIRA)" <ji...@apache.org> on 2011/09/16 19:40:08 UTC

[jira] [Created] (CASSANDRA-3219) Nodes started at the same time end up with the same token

Nodes started at the same time end up with the same token
---------------------------------------------------------

                 Key: CASSANDRA-3219
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3219
             Project: Cassandra
          Issue Type: Bug
    Affects Versions: 1.0.0
            Reporter: T Jake Luciani


Since autobootstrap now defaults to on, when you start all the nodes of a cluster at once (http://screenr.com/5G6) you can end up with nodes being assigned the same token.

{code}
INFO 17:34:55,688 Node /67.23.43.14 is now part of the cluster
 INFO 17:34:55,698 InetAddress /67.23.43.14 is now UP
 INFO 17:34:55,698 Nodes /67.23.43.14 and tjake2/67.23.43.15 have the same token 8823900603000512634329811229926543166.  Ignoring /67.23.43.14
 INFO 17:34:55,698 Node /98.129.220.182 is now part of the cluster
 INFO 17:34:55,698 InetAddress /98.129.220.182 is now UP
 INFO 17:34:55,698 Nodes /98.129.220.182 and tjake2/67.23.43.15 have the same token 8823900603000512634329811229926543166.  Ignoring /98.129.220.182
{code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-3219) Nodes started at the same time end up with the same token

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13106721#comment-13106721 ] 

Jonathan Ellis commented on CASSANDRA-3219:
-------------------------------------------

That is what "auto" does, with the caveat that nodes need to be started 2 minutes apart so they don't race as in Jake's example here.
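The race can be illustrated with a toy model. This is only a sketch: the thread does not show the real "auto" selection logic, so `pick_balanced_token` here is a hypothetical helper that bisects the widest range it can see, and the 2^127 token space is an assumption based on RandomPartitioner. The point is only that two nodes reading the same ring view compute the same token.

```python
RING_SIZE = 2 ** 127  # assumed RandomPartitioner token space


def pick_balanced_token(ring_view):
    """Return the midpoint of the widest range in the ring this node can see.

    Hypothetical stand-in for "auto" token selection, not Cassandra's code.
    """
    tokens = sorted(set(ring_view))
    if not tokens:
        return 0
    best_start, best_width = tokens[0], 0
    for i, start in enumerate(tokens):
        end = tokens[(i + 1) % len(tokens)]
        # With a single token the gap is zero, meaning it owns the whole ring.
        width = (end - start) % RING_SIZE or RING_SIZE
        if width > best_width:
            best_start, best_width = start, width
    return (best_start + best_width // 2) % RING_SIZE


existing_ring = [0]

# Started together: both nodes read the ring before either has announced
# its token, so they arrive at the same answer.
node_a = pick_balanced_token(existing_ring)
node_b = pick_balanced_token(existing_ring)
print("simultaneous start, same token:", node_a == node_b)

# Started apart: the second node already sees the first node's token.
node_c = pick_balanced_token(existing_ring + [node_a])
print("staggered start, same token:", node_c == node_a)
```

Staggering the starts works only because the later node's view includes the earlier node's token, which is why the delay has to be long enough for that token to propagate.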


        

[jira] [Updated] (CASSANDRA-3219) Nodes started at the same time end up with the same token

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-3219:
----------------------------------------

    Attachment: 3219_v2.patch

bq. auto and random initial_token modes added.

I hate it and I'm -1 on that idea.

Basically, I think it's more complicated to explain and understand how to choose between those two options than it was to explain the "old" auto-bootstrap option, even though it's essentially the same option. Defaulting to random would also make it more likely that people leave it at that when bootstrapping new nodes, while random is really the worst possible algorithm you can use, except maybe for the first 2-3 nodes of a cluster (and even then it's only admissible because the balanced token algorithm doesn't work in that case and picking a random token is the only simple choice we have).

I'd rather add back the auto-bootstrap option than set initial_token to random.

As for alternatives, I can propose one of:
  * Decide whether we are really bootstrapping based on whether a keyspace is already defined. If we are, a balanced token is the "right" automatic choice; otherwise we have no choice but to fall back to random tokens. That is the same test we use to decide whether we actually do any bootstrap streaming, so this doesn't seem too far-fetched (I'm attaching a v2 patch for that to make clear what I mean).
  * Stop pretending we know how to pick tokens automatically and just force the user to set a token. We can default that token to 0 so that you can start a single-node cluster with zero configuration, and we can ship a small script that computes the tokens for an n-node initial cluster if we want people to be able to do without a calculator.
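The small script mentioned in the second alternative could look roughly like the following. It is a sketch, not something attached to this ticket: the function name is made up, and it assumes RandomPartitioner with a token space of 0 to 2^127, spacing the n tokens evenly across it.

```python
RING_SIZE = 2 ** 127  # assumed RandomPartitioner token space


def initial_tokens(node_count):
    """Return evenly spaced initial_token values for a new n-node cluster."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # Integer arithmetic keeps the tokens exact; floats would lose precision.
    return [i * RING_SIZE // node_count for i in range(node_count)]


# Example: tokens for a 4-node initial cluster, one per line.
for token in initial_tokens(4):
    print(token)
```

A single-node cluster gets token 0, which matches the proposed default.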


> Nodes started at the same time end up with the same token
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-3219
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3219
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.0.0
>            Reporter: T Jake Luciani
>            Assignee: Jonathan Ellis
>              Labels: bootstrap
>             Fix For: 1.0.0
>
>         Attachments: 3219.txt, 3219_v2.patch

        

[jira] [Updated] (CASSANDRA-3219) Nodes started at the same time end up with the same token

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3219:
--------------------------------------

    Attachment: 3219.txt

auto and random initial_token modes added.


        

[jira] [Commented] (CASSANDRA-3219) Nodes started at the same time end up with the same token

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107882#comment-13107882 ] 

Sylvain Lebresne commented on CASSANDRA-3219:
---------------------------------------------


bq. Further, the old behavior doesn't let you specify bootstrap/random

That depends on the definition of valid. In my opinion, picking a random token is *always* stupid; it will always result in a crappy distribution (it just happens to be less stupid than "auto" in the nobootstrap case, which imho is and should remain the only reason we ever generate a random token). If you bootstrap, it means you have an existing cluster with data in it (I'm not saying you *have to*, I'm saying this is what bootstrap is for, so it should be the case unless you are doing something wrong). In that situation, I don't see why you would want to pick a random token. If some people like to live on the edge, they can write a random token generator and use that, but I don't see why we would want to expose an option for it, which suggests it could be something useful ...

bq. or nobootstrap/auto, both of which are valid things to do

Again, it really depends on the definition of "valid". First, if you start two nodes at the same time with that, you end up in this ticket's situation. Sure, the patch adds a "don't do it" comment, but it doesn't really fix it beyond that. Second, nobootstrap (when token selection is involved, i.e., not a replace_token) is mainly useful for setting up an initial cluster, that is, when nodes don't have any data at all (otherwise you want to bootstrap the node). In that situation, auto is unlikely to do anything useful (it's a completely degenerate case for the algorithm). That the nobootstrap/auto pair doesn't work correctly is actually the only reason I can come up with for our picking a random token in versions <= 0.8.

Besides, when was the last time a user asked to do either bootstrap/random or nobootstrap/auto, or we recommended anything other than 'pick your token yourself'?

bq. I don't see how this helps the situation Jake describes

I'm willing to bet that when Jake encountered that problem he was trying to set up an initial cluster *before* having set up schemas and inserted some data. In that case, the second patch would pick a random token, so there wouldn't be a problem.

The thing is, there are not too many ways to create a Cassandra cluster. First you create a cluster with n initial machines. For that you want to be in nobootstrap/random mode (nobootstrap/auto doesn't really work too well with no data; and by nobootstrap I don't mean the auto-bootstrap=false option so much as not doing data streaming). Once you have data in the cluster, you want to bootstrap, and then auto is always less stupid than random (IMHO). Hence the rationale for the v2 patch.

bq. This is a non-starter

I find it weird to consider that a non-starter so rapidly when we all know that the very first advice we give is to hand-pick tokens, and that it's unreasonable to use auto (let's not even talk about random) tokens in any real-life situation (even the config file basically says so). But I'm willing to accept that this is not the right time to discuss it and to discard that solution, at least for now.



        

[jira] [Commented] (CASSANDRA-3219) Nodes started at the same time end up with the same token

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107935#comment-13107935 ] 

Sylvain Lebresne commented on CASSANDRA-3219:
---------------------------------------------

bq. Which as I pointed out in chat is NOT a new problem, but it's one we should address.

Agreed, but I suspect this is due (in the "not a new problem" case) to races in BootStrapper.getBootstrapSource()'s detection of already-bootstrapping nodes. We should fix that if possible, which the patch doesn't really do, since you will still potentially have those races with auto. But note that this problem is present in 0.8, and I think it is not a top priority because it's relatively rare to actually start bootstrapping 2 nodes at the same time in real life. Again, I'm not saying we shouldn't fix it, but it's ok to say that as long as it's no worse than in 0.8, the fix can wait until after 1.0.0.

Now there is an actual new problem with 1.0.0. When you start an initial cluster, i.e., when in 0.8 you would start nodes with auto-bootstrap=false, you often do end up starting nodes simultaneously. That is why older versions used random tokens when auto-bootstrap was false. This problem does need to be fixed for 1.0.0 because it is a serious regression. However, my argument is that even though we now default to auto-bootstrap=true, that doesn't mean there is no difference between setting up the initial nodes of a cluster and later bootstrapping nodes to add capacity to an existing cluster. Indeed, in 1.0.0 we decided to draw this line based on whether a schema has been created or not (we call the bootstrap() method based on that). Imho, this means that we have no bootstrap option, and "I have no schema" is the old auto-bootstrap=false. So we should use a random token in that case and a balanced one otherwise, the same way we do in 0.8.

And I'm saying that I would prefer we do that and postpone fixing BootStrapper.getBootstrapSource(), rather than exposing the random choice of tokens (and making it the default), which in my opinion is a bad idea.
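The rule argued for in this comment can be sketched as below. This is an illustration of the idea only, not the contents of the v2 patch: `choose_token`, `has_schema` and the `pick_balanced_token` callback are hypothetical names, and the 2^127 token space assumes RandomPartitioner.

```python
import random

RING_SIZE = 2 ** 127  # assumed RandomPartitioner token space


def choose_token(has_schema, pick_balanced_token, rng=None):
    """Pick a token for a joining node (hypothetical helper, not Cassandra API)."""
    if has_schema:
        # Existing cluster with a schema: really bootstrap, so take a
        # balanced token from whatever algorithm the caller supplies.
        return pick_balanced_token()
    # No schema yet: treat this as initial cluster setup. Independent random
    # draws make it very unlikely that simultaneously started nodes collide.
    rng = rng or random.SystemRandom()
    return rng.randrange(RING_SIZE)


# Initial cluster, no schema: each node draws its own random token.
print(choose_token(False, lambda: 0))

# Cluster with a schema: the balanced algorithm decides (stubbed out here).
print(choose_token(True, lambda: 2 ** 126))
```

The random branch trades distribution quality for collision safety, which is the same trade 0.8 made when auto-bootstrap was false.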


        

[jira] [Commented] (CASSANDRA-3219) Nodes started at the same time end up with the same token

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108004#comment-13108004 ] 

Jonathan Ellis commented on CASSANDRA-3219:
-------------------------------------------

v2 actually breaks things because getNewToken doesn't repeat the test for seed-ness.  I reverted things and went with a simpler change to accomplish the same goal in r1172717.

> Nodes started at the same time end up with the same token
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-3219
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3219
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.0.0
>            Reporter: T Jake Luciani
>            Assignee: Sylvain Lebresne
>              Labels: bootstrap
>             Fix For: 1.0.0
>
>         Attachments: 3219.txt, 3219_v2.patch

        

[jira] [Commented] (CASSANDRA-3219) Nodes started at the same time end up with the same token

Posted by "Vijay (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107892#comment-13107892 ] 

Vijay commented on CASSANDRA-3219:
----------------------------------

IMO... Instead of random we should actually try and balance the tokens when no keyspace is defined... By which I mean moving the nodes around, as there is no data to stream and at that point it will be more predictable... This will give a better distribution...





        

[jira] [Commented] (CASSANDRA-3219) Nodes started at the same time end up with the same token

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13106686#comment-13106686 ] 

Jonathan Ellis commented on CASSANDRA-3219:
-------------------------------------------

We should add special values "auto" and "random" to initial_token, so you can have random with bootstrap and auto-selected w/o.

Of course both of those are not recommended vs picking your own.


        

[jira] [Commented] (CASSANDRA-3219) Nodes started at the same time end up with the same token

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107890#comment-13107890 ] 

Jonathan Ellis commented on CASSANDRA-3219:
-------------------------------------------

bq. when Jake encountered that problem he was trying to set up an initial cluster before having set up schemas and inserted some data

Maybe, but it sounded to me like the situation was "I was adding new nodes to an existing cluster, and they picked the same token."  Which as I pointed out in chat is NOT a new problem, but it's one we should address.

Another way of looking at my patch is, it's okay for defaults to give you something suboptimal (random tokens) but it's not okay for it to give you something broken (two nodes w/ same token).  If you want auto token picking and its potential downsides, you need to opt in.  (And hopefully read the comments and go with manual token assignment instead.)

bq. I find it weird to consider that a non-starter so rapidly

Because demo-ability matters.


        

[jira] [Commented] (CASSANDRA-3219) Nodes started at the same time end up with the same token

Posted by "Vijay (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13106701#comment-13106701 ] 

Vijay commented on CASSANDRA-3219:
----------------------------------

Can we also have something like "equal split", which will try to split the token ranges into perfect halves? This will work well for bootstrapping rings of size 2^n.
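One way to read the "equal split" suggestion is that each new node halves the currently widest range. The sketch below follows that reading, which is an assumption since the comment does not spell out the algorithm; the function name and the 2^127 token space are likewise illustrative. It shows why the result is perfectly even only when the ring size is a power of two.

```python
RING_SIZE = 2 ** 127  # assumed RandomPartitioner token space


def range_widths_after_bisection(node_count):
    """Add nodes one at a time, each halving the currently widest range.

    Returns the sorted widths of the ranges owned by the nodes.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    widths = [RING_SIZE]  # a single node owns the whole ring
    for _ in range(node_count - 1):
        widest = max(widths)
        widths.remove(widest)
        widths += [widest // 2, widest - widest // 2]
    return sorted(widths)


for n in (3, 4, 5, 8):
    widths = range_widths_after_bisection(n)
    # The ratio is 1 when every node owns the same share of the ring.
    print(n, "nodes, widest/narrowest =", widths[-1] // widths[0])
```

For ring sizes that are not a power of two, some node ends up owning twice the range of another under this scheme.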


        

[jira] [Commented] (CASSANDRA-3219) Nodes started at the same time end up with the same token

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107905#comment-13107905 ] 

Jonathan Ellis commented on CASSANDRA-3219:
-------------------------------------------

Sure, in magic fairy land I'd love that too, but the question here is what can we improve for 1.0.


        

[jira] [Commented] (CASSANDRA-3219) Nodes started at the same time end up with the same token

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107826#comment-13107826 ] 

Jonathan Ellis commented on CASSANDRA-3219:
-------------------------------------------

bq. I think that it's more complicated to explain/understand how to choose between those two options

Huh?  This is a HUGE simplification because initial_token behavior depends only on initial_token.  The old behavior (where initial_token=empty behavior does one thing with auto_bootstrap=true, and another with a_b=false) was ENORMOUSLY confusing: EVERY training class I taught was baffled by this.

Further, the old behavior doesn't let you specify bootstrap/random or nobootstrap/auto, both of which are valid things to do.

bq. based on whether there is a keyspace defined already

I don't see how this helps the situation Jake describes.

bq. just force user to set a token

This is a non-starter.
