Posted to user@cassandra.apache.org by Alain RODRIGUEZ <ar...@gmail.com> on 2013/06/04 12:08:10 UTC

Can't reach itself

Hi,

I have had an issue since switching to multiple DCs. I use AWS EC2 instances,
C* 1.2.2, with 12 nodes in eu-west and 6 nodes in us-east (the new DC).

Datacenter: eu-west
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address           Load       Owns   Host ID
UN  public ip      133.43 GB  8.3%   ae33d60c-1c24-4c10-b58c-59d06faac5ca
UN  public ip      171.3 GB   8.3%   bb94c428-c98d-454d-af80-6612548a8125
UN  public ip      140.26 GB  8.3%   136bbced-25ed-4a37-abd9-7ab0d146d1c7
UN  public ip      132.14 GB  8.3%   086ebf3e-c58f-4b76-b4d5-6600f7b79cf7
UN  public ip      178.26 GB  8.3%   9255d30f-848f-4251-800b-2c61b4e0cfbf
UN  public ip      153.79 GB  8.3%   7b4fd83a-ca9c-4115-b146-222ab040abd6
UN  public ip      146.82 GB  8.3%   bf233d59-d7a4-482f-adaf-d48531d16305
UN  public ip      151.1 GB   8.3%   fa3b617d-5d31-4db2-87bf-494ee8a9f95f
UN  public ip      131.78 GB  8.3%   dac399dc-ac7c-4ee3-9503-f55e8a9f1675
UN  public ip      130.18 GB  8.3%   56b8654a-f8b3-43d4-8b15-2e74d5dfe81b
UN  public ip     161.96 GB  8.3%   97624d02-ba48-42e7-88f7-2d3b0175d6ef
UN  public ip     130.26 GB  8.3%   868c45b3-4afc-43db-b2d0-5c0f89d018fb
Datacenter: us-east
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address           Load       Owns   Host ID
UN  public ip        246.74 GB  0.0%   212888f6-ecf8-4953-8f83-c5653fb176cb
UN  public ip        320.15 GB  0.0%   bcd696da-433b-4e6b-8030-11629eaf5b84
UN  public ip        353.22 GB  0.0%   3f5cb04a-3ac3-46f3-b101-31a9ae7682bc
UN  public ip        348.91 GB  0.0%   836b3b76-418a-4a22-bab4-c1a0bd49de65
UN  public ip        269.37 GB  0.0%   9408c7ff-ec47-4824-af81-92aa311a1984
UN  public ip        244.94 GB  0.0%   668eb3ca-8ee4-40ae-98e7-987c471bd675

Each node of the new DC owns 0% (according to the status view). Running
nodetool ring myks gives me:

Datacenter: eu-west
==========
Replicas: 3

Address      Rack  Status  State   Load       Owns    Token
public ip    1b    Up      Normal  131.78 GB  25.00%  113427455640312821154458202477256070485
public ip    1b    Up      Normal  161.96 GB  25.00%  141784319550391026443072753096570088106
public ip    1b    Up      Normal  153.43 GB  25.00%  70892159775195513221536376548285044053
public ip    1b    Up      Normal  151.1 GB   25.00%  99249023685273718510150927167599061674
public ip    1b    Up      Normal  130.26 GB  25.00%  155962751505430129087380028406227096917
public ip    1b    Up      Normal  146.82 GB  25.00%  85070591730234615865843651857942052864
public ip    1b    Up      Normal  171.35 GB  25.00%  14178431955039102644307275309657008810
public ip    1b    Up      Normal  132.14 GB  25.00%  42535295865117307932921825928971026432
public ip    1b    Up      Normal  140.26 GB  25.00%  28356863910078205288614550619314017621
public ip    1b    Up      Normal  133.43 GB  25.00%  0
public ip    1b    Up      Normal  130.18 GB  25.00%  127605887595351923798765477786913079296
public ip    1b    Up      Normal  178.27 GB  25.00%  56713727820156410577229101238628035242

Datacenter: us-east
==========
Replicas: 3

Address      Rack  Status  State   Load       Owns    Token
                                                      100
public ip    1b    Up      Normal  320.15 GB  50.00%  28356863910078205288614550619314017721
public ip    1b    Up      Normal  353.14 GB  50.00%  56713727820156410577229101238628035342
public ip    1b    Up      Normal  348.35 GB  50.00%  85070591730234615865843651857942052964
public ip    1b    Up      Normal  269.35 GB  50.00%  113427455640312821154458202477256070585
public ip    1b    Up      Normal  244.94 GB  50.00%  141784319550391026443072753096570088206
public ip    1b    Up      Normal  246.74 GB  50.00%  100

This seems to be ok.
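
The 0.0% in the status view is probably not a problem in itself. Assuming
ownership without a keyspace argument is computed from raw token ranges only
(ignoring replication), each us-east token sits just 100 above an eu-west
token, so a us-east range is only 100 tokens wide, while eu-west ranges stay
at roughly 2^127/12 (14178431955039102644307275309657008810). A rough check of
that arithmetic follows; the helper name owns_pct is made up for illustration,
and awk floating point stands in for exact math.

```shell
#!/bin/bash
# owns_pct WIDTH: percentage of the RandomPartitioner ring (0 .. 2^127)
# covered by a token range of the given width.
# Hypothetical helper; uses awk doubles, so the result is approximate.
owns_pct() {
  awk -v width="$1" 'BEGIN { printf "%.1f\n", width / (2 ^ 127) * 100 }'
}

# A us-east range: 100 tokens wide.
owns_pct 100
# An eu-west range: about 2^127 / 12 tokens wide.
owns_pct 14178431955039102644307275309657008810
```

This prints 0.0 and then 8.3, the same figures as in the status view.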

When I run "describe cluster;" in cassandra-cli on an eu-west node:

[default@unknown] describe cluster;
Cluster Information:
   Snitch: org.apache.cassandra.locator.Ec2MultiRegionSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
        e968865b-3b96-3c87-af0a-6294067a832f: [My 18 publics ip]

So far so good.
From a us-east node now:

[default@unknown] describe cluster;
Cluster Information:
   Snitch: org.apache.cassandra.locator.Ec2MultiRegionSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
        UNREACHABLE: [public ip of the node itself]

        e968865b-3b96-3c87-af0a-6294067a832f: [17 others publics ip]
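
To pull the unreachable addresses out of this output on each node without
reading it by hand, something like the following could work. It is only a
sketch: the function name unreachable_nodes and the file name
describe_cluster.txt are made up, and it assumes the output format shown
above, with the addresses in square brackets on the UNREACHABLE line.

```shell
#!/bin/bash
# unreachable_nodes: read "describe cluster;" output on stdin and print the
# contents of the brackets on the UNREACHABLE line, if there is one.
# Hypothetical helper; prints nothing when no UNREACHABLE line is present.
unreachable_nodes() {
  sed -n 's/^[[:space:]]*UNREACHABLE:[[:space:]]*\[\(.*\)][[:space:]]*$/\1/p'
}

# Example, assuming the output was saved to describe_cluster.txt:
# unreachable_nodes < describe_cluster.txt
```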


Why isn't this node able to see itself? Which port or service is used when
describing the cluster? I have tried opening all ports, with no success.
I also tried the following script to help the node find itself, but it
doesn't seem to work...

--------------------- script ---------------------
#!/bin/bash
# Look up this instance's public IPv4 address from the EC2 metadata service.
PUBLIC_IP=$(wget -qO- http://instance-data/latest/meta-data/public-ipv4)
# Add the public address as an alias on eth0 so the node can address itself by it.
/sbin/ifconfig eth0:1 "$PUBLIC_IP" netmask 255.255.255.255 broadcast "$PUBLIC_IP"

--------------------- end of script ---------------------

eth0:1    Link encap:Ethernet  HWaddr 12:31:39:22:c1:41
          inet addr:xx.xx.xx.xx  Bcast:xx.xx.xx.xx  Mask:255.255.255.255
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:47
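
One check that might narrow this down is whether the node can open a TCP
connection to its own public address on the inter-node ports. The sketch
below assumes the cassandra.yaml defaults (storage_port 7000 and
ssl_storage_port 7001), assumes that the schema check behind "describe
cluster;" travels over those ports, and uses a made-up helper can_reach built
on bash's /dev/tcp and coreutils timeout. If the probe fails only for the
node's own public IP, the security group rules for that address may be worth
a look.

```shell
#!/bin/bash
# can_reach HOST PORT: succeed if a TCP connection to HOST:PORT opens within
# 3 seconds. Hypothetical helper; relies on bash's /dev/tcp pseudo-device and
# on timeout(1) from coreutils.
can_reach() {
  local host="$1" port="$2"
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Example (not run here): probe this node's own public address on the
# assumed default inter-node ports.
# PUBLIC_IP=$(wget -qO- http://instance-data/latest/meta-data/public-ipv4)
# for port in 7000 7001; do
#   if can_reach "$PUBLIC_IP" "$port"; then
#     echo "$port reachable"
#   else
#     echo "$port NOT reachable"
#   fi
# done
```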


I see a lot of hinted handoff compactions too.

Any clue as to what's wrong?

Re: Can't reach itself

Posted by Alain RODRIGUEZ <ar...@gmail.com>.
"I see a lot of hinted handoff compactions too."

I might not have been clear enough: I see a lot of "compaction of
system.hints" activity, which I interpret as a sign that a lot of data
couldn't reach its destination.

