Posted to user@cassandra.apache.org by Edward Evans <ee...@gmail.com> on 2010/08/26 07:03:27 UTC

Ordered Partitioner load balance problem

I am currently using Cassandra 0.6.2 on four virtual nodes in two different
data centers (A, B). My initial testing used the Random Partitioner and
everything behaved as expected. I then moved to the Ordered Partitioner using
SHA256 hashes as the keys, so the keys are effectively the tokens (if the
stories I am told are true). My hope was that, by defining the initial tokens
correctly, I would see random and even load balancing.
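
For concreteness, here is a rough Python sketch of the key scheme I mean
(the record identifier is made up; the point is just that every key is a
64-character lowercase hex digest, so under an ordered partitioner the key
string itself effectively is the token):

import hashlib

# Row keys are lowercase SHA-256 hex digests; with an order-preserving
# partitioner the key string doubles as the token.
def make_key(record_id):
    return hashlib.sha256(record_id.encode("utf-8")).hexdigest()

print(make_key("some-record-id"))  # 64 lowercase hex characters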

My keyspace is defined rather simply as:

<Keyspace Name="cc-count5">
  <ColumnFamily Name="StandardCount"/>
  <ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>
  <!-- Number of replicas of the data -->
  <ReplicationFactor>2</ReplicationFactor>
  <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
</Keyspace>

Since in 0.6.2 token comparison is a UTF-8 string comparison, I defined my
initial tokens as follows; a sketch of the routing rule this implies follows
the ring output below. (Note: the 3rd octet of the Address identifies the DC.)

Address          Status  Load     Range                                                              Ring
                                  ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
192.168.152.237  Up      0 bytes  3fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff  |<--|
192.168.136.179  Up      0 bytes  7fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff  |   |
192.168.152.238  Up      0 bytes  bfffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff  |   |
192.168.146.254  Up      0 bytes  ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff  |-->|
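
If my understanding is right, a node with token T owns the range
(previous token, T], and for equal-length lowercase hex strings plain
string comparison agrees with the UTF-8 byte comparison. A small Python
sketch of that routing rule (not Cassandra's actual code):

# The four initial tokens quarter the hex token space; each node owns
# (previous_token, token] under plain string comparison.
TOKENS = {
    "3" + "f" * 63: "192.168.152.237",
    "7" + "f" * 63: "192.168.136.179",
    "b" + "f" * 63: "192.168.152.238",
    "f" * 64:       "192.168.146.254",
}

def primary_for(key):
    # First node clockwise whose token is >= the key, wrapping past the top.
    for token in sorted(TOKENS):
        if key <= token:
            return TOKENS[token]
    return TOKENS[min(TOKENS)]

print(primary_for("a213c14d" + "0" * 56))  # -> 192.168.152.238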

Using clustertool to confirm this looks correct (a sketch of the replica
placement rule follows the output):

[ee@priam cassandra]$ ./bin/clustertool -h localhost get_endpoints cc-count5 a213c14de9c1d4464aedd84eff70ae91b83b6937d11feaeb02d25f36a622d05c
Key              : a213c14de9c1d4464aedd84eff70ae91b83b6937d11feaeb02d25f36a622d05c
Endpoints        : [/192.168.152.238, /192.168.146.254]
[ee@priam cassandra]$ ./bin/clustertool -h localhost get_endpoints cc-count5 1ad31e4f9ed64d9d056ceb9363c8ceeb7c8e65fb0931d43f85dac7aff9152f43
Key              : 1ad31e4f9ed64d9d056ceb9363c8ceeb7c8e65fb0931d43f85dac7aff9152f43
Endpoints        : [/192.168.152.237, /192.168.136.179]
[ee@priam cassandra]$ ./bin/clustertool -h localhost get_endpoints cc-count5 e71ca1e6af17d22c40c93bd3e9814f66dd371d00e366988a5d44701dc809bd45
Key              : e71ca1e6af17d22c40c93bd3e9814f66dd371d00e366988a5d44701dc809bd45
Endpoints        : [/192.168.146.254, /192.168.152.237]
[ee@priam cassandra]$ ./bin/clustertool -h localhost get_endpoints cc-count5 7becffa44913d0b4d5200b763c88dc9251f3796586f160dc260482a6777ea696
Key              : 7becffa44913d0b4d5200b763c88dc9251f3796586f160dc260482a6777ea696
Endpoints        : [/192.168.136.179, /192.168.152.238]
[ee@priam cassandra]$ ./bin/clustertool -h localhost get_endpoints cc-count5 7ffffff44913d0b4d5200b763c88dc9251f3796586f160dc260482a6777ea696
Key              : 7ffffff44913d0b4d5200b763c88dc9251f3796586f160dc260482a6777ea696
Endpoints        : [/192.168.136.179, /192.168.152.238]
[ee@priam cassandra]$ ./bin/clustertool -h localhost get_endpoints cc-count5 fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffe
Key              : fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffe
Endpoints        : [/192.168.146.254, /192.168.152.237]
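
These endpoint pairs are what I would expect if RackUnawareStrategy simply
places the second replica on the next node clockwise, ignoring the DC. A
Python sketch of that placement rule (my understanding of it, not
Cassandra's code) reproduces every pair above:

# Same ring as before; with RF=2 the replicas are the primary node plus
# the next node clockwise, regardless of data center.
TOKENS = {
    "3" + "f" * 63: "192.168.152.237",
    "7" + "f" * 63: "192.168.136.179",
    "b" + "f" * 63: "192.168.152.238",
    "f" * 64:       "192.168.146.254",
}
RING = sorted(TOKENS)

def endpoints_for(key, rf=2):
    # Primary = first node whose token is >= the key, else wrap to index 0.
    idx = next((i for i, t in enumerate(RING) if key <= t), 0)
    return [TOKENS[RING[(idx + n) % len(RING)]] for n in range(rf)]

print(endpoints_for("1ad31e4f" + "0" * 56))
# -> ['192.168.152.237', '192.168.136.179'], matching clustertool above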

Looks good. Now the ring after doing some (~5M) inserts:

192.168.152.237  Up  1.22 GB    3fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff  |<--|
192.168.136.179  Up  2.16 GB    7fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff  |   |
192.168.152.238  Up  1.23 GB    bfffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff  |   |
192.168.146.254  Up  148.97 MB  ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff  |-->|

Definitely out of balance: with evenly distributed keys and RF=2 I would
expect roughly equal load on all four nodes, yet 192.168.136.179 holds
about 15x what 192.168.146.254 does.

Some command-line Perl shows the tokens are evenly distributed through the
token space; unfortunately, they are not going to the correct nodes. This is
a count of keys, by first hex digit, from the commit log on host
192.168.152.237 (a Python equivalent of the count follows the list):
^1 77068
^2 76977
^3 77280
^4 77065
^5 77038
^6 76728
^7 76976
^8 76824
^9 76792
^a 76794
^b 77178
^c 76917
^d 77074
^e 76921
^f 76848
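
(The Perl one-liner itself is not shown; this Python equivalent, assuming
the keys have already been dumped one per line into a file such as
keys.txt, produces the same histogram.)

from collections import Counter

# Count extracted row keys by their first hex digit; an even spread here
# means the keys themselves are uniformly distributed over the token space.
with open("keys.txt") as f:
    counts = Counter(line[0] for line in f if line.strip())

for digit in sorted(counts):
    print("^%s %d" % (digit, counts[digit]))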

Finally, looking at StandardCount-8-Index.db (again on 192.168.152.237) I
see keys that I *think* should not exist on this host; a sketch of that
check follows the list. I also see these keys in the Data file itself.

H2600704.07cb1d3ba32940d82d168c4e8ad30511927752f3d66dba369bb4377ddec31e40
H2600705.070b88ff8bad566a92f6090e678cc2ed734d29cdf2cfb1fbf4ef69cd45968b12
H2600706.8a53bbc1725f81d0dcf945f29646352ad61a7092f4d1476f03c52f73cf9d0964
H2600707.6938f1810c443e4007c65e97997b64c886a83788f9fae7a72339fd0df702dd0d
H2600708.a8cfa25e72d68edfe2a552a34145f403d5d0b977ff245b053b2f1fef9aebf3e3
H2600709.11513c343acc885f0f315e8c0d05a71c3f5369d2339ad2860740dfe416a129d8
H2600710.bcc4519c8c20d5ac1565752a749500b2138982c1595bed0c19d6582f6d1f364b
H2600711.a55a2077cb887d9510645d4814e37b70341981a8b112474b179ea6a20ce4d061
H2600712.c0be35e21309c5138c623033f5e2d795475fdcb19d6fcf925d66a892bba3e511
H2600713.62d8ad66d5bc65f597ac469089e252d3034777553fb3fc9ea3b966fa6a1a5922
H2600714.c7af0277f8e2aea7d26701648708d1698dde849af547af216d5bdeab63acec60
H2600715.fe3bab952d2cae53bb06e9c2ef8abcbf5f08bf02009422d55e82b58f04ecdcb0
H2600716.9cd05c7f6b806d34498eac313d375431a00dce3cc64e4d1a18827687a44ca99a
H2600717.086f36abf2b5bc9234a89028ffcccd1607ba1253acf46d8c61e46c3e392438ff
H2600718.fbacc6bd2a6572cb3b8b2c1f5b8ce9e99cc619b9d76695e4c3cb0c01be2a4bc8
H2600719.69d0b298a89b85394c328ca87c12577adb2f7e289bc50b2eb03885b3b7641f00
H2600720.c52e0c1ec88836d45fcee9a0f65539e7b9a9a9ef29ad7e3befa1add81b0822a9
H2600721.74e776b48ae66fc4ebbb68dbd94c7470de1e39dca508538866747d8a7a55815d
H2600722.6f8b1bd5c70ab8b3a2716b980f1ffe0885444d98a9b748aca03625709904e1ef
H2600723.a14f4a929b83782a3643122981f5eacd1e36ce48c95cd550ad4b068215b93d93
H2600724.6463a74790168c31b3f0a4edf71487f67df36026ee7bda552ba74b4e60afc119
H2600725.7f87d7980f636cff5904f46ce86f8b530cdd745b13cd0e1aba9b76980ae6fda4
H2600726.2ab8636baf10a6da9d232033c0dbd060d4ae208ec39db2e029a73694d051453c
H2600727.91cf1ae3e787f569756ae8ec12766cdd3a5dc3454b232c8f42df41aa5615bead
H2600728.71846f16f1b21b002226e65b4ed072377bc187762611ad8e6a09aee57461a48c
H2600729.598f2c23e9262fa141fee951cadcdde67396c166845d110526f0f210c6519edc
H2600730.e9fa44e98b30758b0f7af98169e14774fbfa2b017efd7c5425d40a662f92b0cd
H2600731.00e7562b73f924259f96e39454aff9801806dcb7dabc790501a7fb558dcad55e
H2600732.a7a9288c6258fb1a538059dddd773cca3db4827b74a8e5d6d075a0035126e2d0
H2600733.b675424304c639925ebf1afa501d6fdfedcb418a0b609aba21284edc8cf90fd3
H2600734.296b6da84fdf575736c004d69e2d7d4f4d0cc1caf9a31e53402f8753cdf07015
H2600735.3de5a1a34ebae67f7981cd0e294a24e11f056cb8fd66226136cfa933e356d26d
H2600736.032d351b7d596584de11f4499f85f732a1b38a9eee9dcc2b0df4896b10e67ab8
H2600737.5b5015c97e780bf8ca073a660af913aa524ae1122d8a5ae14ce26f74a8b43b65
H2600738.73d14b7553b2f82c7d06d3a832c1307a9b0cd7bf3f24ec1bcc389b010d7d52fd
H2600739.b7166b6fd2c85eae52760e26957c1fccbdc8c7ed61dffa244966ef18572185ca
H2600740.dff5e7342254303453d22675c363c43a7bc871c6580d6eb81380014ee4e2223b
H2600741.f07cbdfb65a31182aacdcb64b1c32d8f8909bf47b6a56d6a4cf14e8e23536be7
H2600742.996389925b25c66e9aa4d8e807cdc403d7853787e55eed83ff66e35c62d22b2d
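
To spell out the check I am doing by eye: given this node's token
(3fff...f) and RF=2 RackUnaware placement, 192.168.152.237 should be
primary for keys whose first hex digit is 0-3 and second replica for the
preceding node's range (first digit c-f); anything starting 4-b should not
be here. A sketch, using truncated keys from the list above:

# On 192.168.152.237 (token 3fff...f) I expect only keys it is primary
# for (first hex digit 0-3) or second replica for (first digit c-f).
EXPECTED = set("0123cdef")

for key in ["07cb1d3ba32940d8", "8a53bbc1725f81d0", "a8cfa25e72d68edf"]:
    print(key, "ok" if key[0] in EXPECTED else "should NOT be on this host")
# -> 07cb... ok, 8a53... and a8cf... should NOT be on this host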

What am I doing wrong here? Thanks.

-ee-

Re: Ordered Partitioner load balance problem

Posted by Edward Evans <ee...@gmail.com>.
Upgraded to 0.6.4 and still seeing this behavior. Any help is appreciated.
