Posted to dev@zookeeper.apache.org by "Mark Harwood (JIRA)" <ji...@apache.org> on 2008/08/27 15:19:45 UTC

[jira] Created: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Use of non-standard election ports in config breaks services
------------------------------------------------------------

                 Key: ZOOKEEPER-127
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
             Project: Zookeeper
          Issue Type: Bug
          Components: quorum
    Affects Versions: 3.0.0
            Reporter: Mark Harwood
            Priority: Minor


In QuorumCnxManager.toSend there is a call to create a connection as follows:
    channel = SocketChannel.open(new InetSocketAddress(addr, port));

Unfortunately "addr" is the IP address of a remote server while "port" is the electionPort of *this* server.
As an example, given this configuration (taken from my zoo.cfg):
  server.1=10.20.9.254:2881
  server.2=10.20.9.9:2882
  server.3=10.20.9.254:2883
Server 3 was observed trying to make a connection to host 10.20.9.9 on port 2883 and obviously failing.

In tests where all machines use the same electionPort this bug would not manifest itself.
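
The mismatch can be shown without any network. The following is a minimal sketch, not the actual QuorumCnxManager code: the class ElectionPortBug and its method are hypothetical names, and it assumes (as the report states) that each server.N port in the config above is that server's election port.

```java
import java.net.InetSocketAddress;

// Hypothetical illustration of the reported behavior; not ZooKeeper source.
class ElectionPortBug {
    // Mirrors the reported call: the REMOTE host is paired with the LOCAL election port.
    static InetSocketAddress buggyTarget(String remoteAddr, int localElectionPort) {
        // createUnresolved avoids any DNS lookup; we only want to inspect host and port.
        return InetSocketAddress.createUnresolved(remoteAddr, localElectionPort);
    }

    public static void main(String[] args) {
        // Server 3 (election port 2883) wants to reach server 2 at 10.20.9.9,
        // which is configured with port 2882.
        InetSocketAddress target = buggyTarget("10.20.9.9", 2883);
        System.out.println("dialing " + target.getHostString() + ":" + target.getPort()
                + " although server 2 is configured with port 2882");
    }
}
```

With identical election ports on every server the local and remote values coincide, which is consistent with the observation that such test setups would not expose the problem.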

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Benjamin Reed (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626243#action_12626243 ] 

Benjamin Reed commented on ZOOKEEPER-127:
-----------------------------------------

This needs to be fixed. I'm not sure how the tests are passing since all the peers are going to bind to the same port.

I would propose removing the electionPort property and adding a port to the server configuration line:  server.X=addr:quorum_port:le_port

so 
  server.1=10.20.9.254:2881
  server.2=10.20.9.9:2882
  server.3=10.20.9.254:2883

becomes
  server.1=10.20.9.254:2881:3881
  server.2=10.20.9.9:2882:3882
  server.3=10.20.9.254:2883:3883

for example.

If the le_port isn't specified, should we assume the le_port is one more than the quorum port? That is, would server.1=10.20.9.254:2881 be the same as server.1=10.20.9.254:2881:2882?

The first order of business should probably be to figure out why the test cases are not broken...
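
The server.X=addr:quorum_port:le_port format proposed above, including the open question about a default, could be parsed along these lines. This is only a sketch: ServerLine and parse are hypothetical names, and the "quorum port plus one" fallback is just the suggestion raised in this comment, not settled behavior.

```java
// Hypothetical parser for the proposed "addr:quorum_port[:le_port]" value.
// Assumes IPv4 addresses or host names (no ':' inside the address part).
class ServerLine {
    final String addr;
    final int quorumPort;
    final int electionPort;

    ServerLine(String addr, int quorumPort, int electionPort) {
        this.addr = addr;
        this.quorumPort = quorumPort;
        this.electionPort = electionPort;
    }

    static ServerLine parse(String value) {
        String[] parts = value.trim().split(":");
        if (parts.length < 2 || parts.length > 3) {
            throw new IllegalArgumentException(
                    "expected addr:quorum_port[:le_port], got: " + value);
        }
        int quorumPort = Integer.parseInt(parts[1]);
        // Assumption: fall back to quorum_port + 1 when le_port is omitted.
        int electionPort = parts.length == 3 ? Integer.parseInt(parts[2]) : quorumPort + 1;
        return new ServerLine(parts[0], quorumPort, electionPort);
    }
}
```

For example, ServerLine.parse("10.20.9.254:2881") would yield election port 2882 under that assumed default, while ServerLine.parse("10.20.9.254:2881:3881") keeps the explicit 3881.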



[jira] Updated: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Mark Harwood (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Harwood updated ZOOKEEPER-127:
-----------------------------------

    Attachment: mhPortChanges.patch

This is a snapshot of the changes I had been thinking of making.

Note: this DOES NOT COMPILE!

If you apply the patch you can see the compile error at the point in RecvWorker where this ran out of steam. 
It seems that in order to know which QuorumPeer was talking to another you would need to pass "myid" as part of the comms protocol. There would be no other way for server A to know whether server B or server C had connected to it if we allow more than one QuorumPeer per machine: the source IP address obtained from SocketChannel.socket().getInetAddress() is not sufficient to distinguish between them.

That seems like a much bigger change that I don't feel particularly well equipped to take on just yet.
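
One way to picture the protocol change described above is for the connecting peer to announce its server id as the first bytes on the channel. The sketch below is an assumption about how that might look, not the committed fix; MyIdHandshake and its methods are hypothetical, and a byte array stands in for the socket.

```java
import java.nio.ByteBuffer;

// Hypothetical handshake: the first 8 bytes on a new connection carry the sender's id.
class MyIdHandshake {
    // Connecting side: encode our own server id before any election message.
    static byte[] encodeHello(long myId) {
        return ByteBuffer.allocate(Long.BYTES).putLong(myId).array();
    }

    // Accepting side: read the id instead of inferring the peer from its source IP.
    static long decodeHello(byte[] firstBytes) {
        if (firstBytes.length < Long.BYTES) {
            throw new IllegalArgumentException("handshake too short: " + firstBytes.length);
        }
        return ByteBuffer.wrap(firstBytes).getLong();
    }

    public static void main(String[] args) {
        long sender = decodeHello(encodeHello(3L));
        System.out.println("connection is from server " + sender);
    }
}
```

With the id known, two peers sharing one IP address can be told apart, which the source address alone does not allow.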



[jira] Commented: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Austin Shoemaker (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632111#action_12632111 ] 

Austin Shoemaker commented on ZOOKEEPER-127:
--------------------------------------------

After about 6 runs of our unit test using the patched algorithm 3, the test hangs as the service repeatedly tries to reelect the killed leader. This behavior is similar to ZOOKEEPER-131, which we had experienced using algorithms 0 and 1.

Server 10 is 10.50.65.40 and has been explicitly killed. The following log is from server 5, which mirrors logs on all the other servers.

Any idea what's happening here?

2008-09-18 00:28:20,029 - INFO  [QuorumPeer:QuorumPeer@394] - LOOKING
2008-09-18 00:28:20,029 - WARN  [QuorumPeer:ZooKeeperServer@198] - unable to parse zxid string into long: txt
2008-09-18 00:28:20,029 - WARN  [QuorumPeer:FastLeaderElection@493] - New election: 8589935405
2008-09-18 00:28:20,031 - WARN  [WorkerSender Thread:QuorumCnxManager@381] - Cannot open channel to 10( java.net.ConnectException: Connection refused)
2008-09-18 00:28:20,031 - INFO  [QuorumPeer:QuorumPeer@403] - FOLLOWING
2008-09-18 00:28:20,031 - INFO  [QuorumPeer:ZooKeeperServer@166] - Created server with dataDir:/zookeeper_data/5_data dataLogDir:/zookeeper_data/5_data tickTime:2000
2008-09-18 00:28:20,031 - INFO  [QuorumPeer:Follower@128] - Following /10.50.65.40:2888

[[[ exception below repeats 5 times ]]]

2008-09-18 00:28:20,032 - WARN  [QuorumPeer:Follower@145] - Unexpected exception
java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
        at java.net.Socket.connect(Socket.java:519)
        at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:137)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:405)

[[[ then the follower is restarted ]]]

2008-09-18 00:28:24,049 - ERROR [QuorumPeer:Follower@370] - FIXMSG
java.lang.Exception: shutdown Follower
        at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:370)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:409)

[[[ at this point the log repeats from the beginning ]]]


> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch
>
>
> In QuorumCnxManager.toSend there is a call to create a connection as follows:
>     channel = SocketChannel.open(new InetSocketAddress(addr, port));
> Unfortunately "addr" is the ip address of a remote server while "port" is the electionPort of *this* server.
> As an example, given this configuration (taken from my zoo.cfg)
>   server.1=10.20.9.254:2881
>   server.2=10.20.9.9:2882
>   server.3=10.20.9.254:2883
> Server 3 was observed trying to make a connection to host 10.20.9.9 on port 2883 and obviously failing.
> In tests where all machines use the same electionPort this bug would not manifest itself.



[jira] Commented: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12631319#action_12631319 ] 

Flavio Paiva Junqueira commented on ZOOKEEPER-127:
--------------------------------------------------

Mahadev, thanks for reviewing the patch. Here are my answers to your points:

1- Server ID has always been a long in QuorumPeer and QuorumPeerConfig, so it was not my choice, and I've only followed what was in place already. If you feel that there is a need to make server ID an integer all over, then I suggest that you open a JIRA to discuss it;
2- We can combine if we create a new data structure to hold both the quorum address and the election address, so that we have either an array of this new data structure or a hashmap from server id to this new data structure. I prefer the way it is in the patch;
3- That's a good point, I'll fix it. Also, the version I generated the patch against is stale (QuorumPeer.java has changed), so I'll have to regenerate the patch anyway. :-(
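
The combined structure discussed in point 2 might look roughly like the following. This is a sketch of the alternative being weighed, not of what the patch does (the comment above indicates the patch takes a different approach); PeerAddresses and the map layout are hypothetical, and the ports reuse the example values from earlier in this thread.

```java
import java.net.InetSocketAddress;
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder for both addresses of one quorum peer.
class PeerAddresses {
    final InetSocketAddress quorumAddr;
    final InetSocketAddress electionAddr;

    PeerAddresses(String host, int quorumPort, int electionPort) {
        // Unresolved addresses keep the sketch free of DNS lookups.
        this.quorumAddr = InetSocketAddress.createUnresolved(host, quorumPort);
        this.electionAddr = InetSocketAddress.createUnresolved(host, electionPort);
    }

    // Server id (a long, as in QuorumPeer) mapped to that server's addresses.
    static Map<Long, PeerAddresses> exampleView() {
        Map<Long, PeerAddresses> peers = new HashMap<Long, PeerAddresses>();
        peers.put(1L, new PeerAddresses("10.20.9.254", 2881, 3881));
        peers.put(2L, new PeerAddresses("10.20.9.9", 2882, 3882));
        peers.put(3L, new PeerAddresses("10.20.9.254", 2883, 3883));
        return peers;
    }
}
```

A single map keeps the two addresses of a server from drifting apart, at the cost of one more type; two separate maps avoid the new type but must be kept in sync.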





[jira] Updated: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flavio Paiva Junqueira updated ZOOKEEPER-127:
---------------------------------------------

    Status: Patch Available  (was: Open)



[jira] Updated: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flavio Paiva Junqueira updated ZOOKEEPER-127:
---------------------------------------------

    Priority: Critical  (was: Minor)



[jira] Commented: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12636633#action_12636633 ] 

Patrick Hunt commented on ZOOKEEPER-127:
----------------------------------------

Are there tests for these cases?

host:port and host:port:port? We should verify both cases (even if one is illegal, verify that it is handled appropriately).



[jira] Commented: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Mark Harwood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626939#action_12626939 ] 

Mark Harwood commented on ZOOKEEPER-127:
----------------------------------------

Did you see my comment about RecvWorker in my last post?
It relies solely on InetAddress to identify the caller. This is stored in the Message class and subsequently used in toSend messages generated by WorkerReceiver when sending back notifications.
Unfortunately RecvWorker currently is not in a position to uniquely identify which QuorumPeer is accessing it via the channel passed in its constructor. It knows which *machine* is accessing it but has no idea which QuorumPeer of potentially several running on that machine, and therefore which port that particular peer has chosen to use for its election port.
At least that's how it looks to me from a quick scan of the code....



[jira] Commented: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12636510#action_12636510 ] 

Stu Hood commented on ZOOKEEPER-127:
------------------------------------

This patch has been committed, but the changes are not documented at all. Servers will fail to start (with an NPE) if using the old config format with only 1 port per server.
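
A startup check along these lines could turn that NPE into a readable error. It is only a sketch: ServerConfigCheck is a hypothetical helper, not part of ZooKeeper, and it assumes the two-port server.N=addr:quorum_port:le_port format proposed earlier in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical pre-flight check for old-style "addr:port" server entries.
class ServerConfigCheck {
    // Returns the server.N keys whose value does not have exactly three ':'-separated fields.
    static List<String> keysMissingElectionPort(Properties cfg) {
        List<String> bad = new ArrayList<String>();
        // TreeSet gives a stable, sorted order for the report.
        for (String key : new TreeSet<String>(cfg.stringPropertyNames())) {
            if (key.startsWith("server.") && cfg.getProperty(key).split(":").length != 3) {
                bad.add(key);
            }
        }
        return bad;
    }
}
```

A caller could refuse to start and print the offending keys when the returned list is not empty.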



[jira] Commented: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626935#action_12626935 ] 

Flavio Paiva Junqueira commented on ZOOKEEPER-127:
--------------------------------------------------

Mark, It is excellent that you're working on this patch. Many thanks!

Now, with respect to your comment, it seems to me that you just have to modify the calls to SocketChannel.open(), which appear in only two places: receiveConnection and toSend. More concretely, you have to replace the variable port in the constructor of InetSocketAddress with the port value that you read from the configuration.

What I had in mind is fairly simple. When we parse the configuration file, we could create another hashmap that maps server id to port, and we could simply replace the port variable in the constructor of InetSocketAddress with a call to get() on the hashmap.
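
The hashmap idea above can be sketched as follows. The names ElectionPortLookup, register and targetFor are hypothetical and only illustrate the lookup; how the real code obtains the server id at the call site is not shown here.

```java
import java.net.InetSocketAddress;
import java.util.HashMap;
import java.util.Map;

// Hypothetical lookup table filled while parsing the configuration file.
class ElectionPortLookup {
    private final Map<Long, Integer> electionPorts = new HashMap<Long, Integer>();

    void register(long serverId, int port) {
        electionPorts.put(serverId, port);
    }

    // Builds the connect target from the REMOTE server's configured port.
    InetSocketAddress targetFor(long serverId, String remoteAddr) {
        Integer port = electionPorts.get(serverId);
        if (port == null) {
            throw new IllegalArgumentException(
                    "no election port configured for server " + serverId);
        }
        return InetSocketAddress.createUnresolved(remoteAddr, port);
    }
}
```

The explicit null check is a design choice in this sketch: unboxing a missing entry directly would raise a NullPointerException with no hint about which server id was absent.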



[jira] Commented: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627075#action_12627075 ] 

Flavio Paiva Junqueira commented on ZOOKEEPER-127:
--------------------------------------------------

No worries, I'll work on it, but it'd be great to have your feedback once I have a patch.



[jira] Commented: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Mark Harwood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626139#action_12626139 ] 

Mark Harwood commented on ZOOKEEPER-127:
----------------------------------------

Thanks for the quick response, Flavio.

>>If we allow a different port for each peer, then we would need a line in the configuration file for each peer for leader election, right?

I currently do that. The config file allows you to specify an "electionPort" property which is read by QuorumPeerConfig. 
The javadocs for QuorumPeer don't document this but my assumption was that this was a legitimate setting and I could choose different values for each server as long as they all tied up OK.
This worked out OK for me using the sourceforge 2.2.1 version.

The main reason for me wanting to mess around with ports in this way was to allow me to test Zookeeper services out on my single development machine where I could fire up and kill individual processes. I imagine it may not be uncommon for people starting with Zookeeper to want to do this so I would suggest doing one of the following:

1) Documenting clearly that electionPorts must be the same on all machines (but this would disallow my scenario of single-machine testing of multiple QuorumPeerMain processes)
2) Fixing the code to allow multiple electionPorts to be used.

Am I understanding this correctly?
Thanks,
Mark








[jira] Updated: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flavio Paiva Junqueira updated ZOOKEEPER-127:
---------------------------------------------

    Attachment: ZOOKEEPER-127.patch

There is a compilation problem with the previous patch I submitted, so consider this new one instead.



[jira] Commented: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12636325#action_12636325 ] 

Hudson commented on ZOOKEEPER-127:
----------------------------------

Integrated in ZooKeeper-trunk #101 (See [http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/101/])
    

> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Critical
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch
>
>
> In QuorumCnxManager.toSend there is a call to create a connection as follows:
>     channel = SocketChannel.open(new InetSocketAddress(addr, port));
> Unfortunately "addr" is the ip address of a remote server while "port" is the electionPort of *this* server.
> As an example, given this configuration (taken from my zoo.cfg)
>   server.1=10.20.9.254:2881
>   server.2=10.20.9.9:2882
>   server.3=10.20.9.254:2883
> Server 3 was observed trying to make a connection to host 10.20.9.9 on port 2883 and obviously failing.
> In tests where all machines use the same electionPort this bug would not manifest itself.



[jira] Updated: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Benjamin Reed (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Reed updated ZOOKEEPER-127:
------------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Committed revision 700714.

> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Critical
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch
>
>
> In QuorumCnxManager.toSend there is a call to create a connection as follows:
>     channel = SocketChannel.open(new InetSocketAddress(addr, port));
> Unfortunately "addr" is the ip address of a remote server while "port" is the electionPort of *this* server.
> As an example, given this configuration (taken from my zoo.cfg)
>   server.1=10.20.9.254:2881
>   server.2=10.20.9.9:2882
>   server.3=10.20.9.254:2883
> Server 3 was observed trying to make a connection to host 10.20.9.9 on port 2883 and obviously failing.
> In tests where all machines use the same electionPort this bug would not manifest itself.



[jira] Updated: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flavio Paiva Junqueira updated ZOOKEEPER-127:
---------------------------------------------

    Status: Patch Available  (was: Open)

> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch
>
>
> In QuorumCnxManager.toSend there is a call to create a connection as follows:
>     channel = SocketChannel.open(new InetSocketAddress(addr, port));
> Unfortunately "addr" is the ip address of a remote server while "port" is the electionPort of *this* server.
> As an example, given this configuration (taken from my zoo.cfg)
>   server.1=10.20.9.254:2881
>   server.2=10.20.9.9:2882
>   server.3=10.20.9.254:2883
> Server 3 was observed trying to make a connection to host 10.20.9.9 on port 2883 and obviously failing.
> In tests where all machines use the same electionPort this bug would not manifest itself.



[jira] Issue Comment Edited: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Austin Shoemaker (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632111#action_12632111 ] 

austin edited comment on ZOOKEEPER-127 at 9/17/08 11:36 PM:
----------------------------------------------------------------------

After several more runs of our unit test using the patched algorithm 3, the test hangs as the service repeatedly tries to reelect the killed leader. This behavior is similar to ZOOKEEPER-131 which we had experienced using algorithms 0 and 1.

Server 10 is 10.50.65.40 and has been explicitly killed. The following log is from server 5, which mirrors logs on all the other servers.

Any idea what's happening here?

2008-09-18 00:28:20,029 - INFO  [QuorumPeer:QuorumPeer@394] - LOOKING
2008-09-18 00:28:20,029 - WARN  [QuorumPeer:ZooKeeperServer@198] - unable to parse zxid string into long: txt
2008-09-18 00:28:20,029 - WARN  [QuorumPeer:FastLeaderElection@493] - New election: 8589935405
2008-09-18 00:28:20,031 - WARN  [WorkerSender Thread:QuorumCnxManager@381] - Cannot open channel to 10( java.net.ConnectException: Connection refused)
2008-09-18 00:28:20,031 - INFO  [QuorumPeer:QuorumPeer@403] - FOLLOWING
2008-09-18 00:28:20,031 - INFO  [QuorumPeer:ZooKeeperServer@166] - Created server with dataDir:/zookeeper_data/5_data dataLogDir:/zookeeper_data/5_data tickTime:2000
2008-09-18 00:28:20,031 - INFO  [QuorumPeer:Follower@128] - Following /10.50.65.40:2888

[[[ exception below repeats 5 times ]]]

2008-09-18 00:28:20,032 - WARN  [QuorumPeer:Follower@145] - Unexpected exception
java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
        at java.net.Socket.connect(Socket.java:519)
        at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:137)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:405)

[[[ then the follower is restarted ]]]

2008-09-18 00:28:24,049 - ERROR [QuorumPeer:Follower@370] - FIXMSG
java.lang.Exception: shutdown Follower
        at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:370)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:409)

[[[ at this point the log repeats from the beginning ]]]


      was (Author: austin):
    After about 6 runs of our unit test the test hangs as the service repeatedly tries to reelect the killed leader (similar to ZOOKEEPER-131 with algorithms 0 and 1). 



  
> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch
>
>
> In QuorumCnxManager.toSend there is a call to create a connection as follows:
>     channel = SocketChannel.open(new InetSocketAddress(addr, port));
> Unfortunately "addr" is the ip address of a remote server while "port" is the electionPort of *this* server.
> As an example, given this configuration (taken from my zoo.cfg)
>   server.1=10.20.9.254:2881
>   server.2=10.20.9.9:2882
>   server.3=10.20.9.254:2883
> Server 3 was observed trying to make a connection to host 10.20.9.9 on port 2883 and obviously failing.
> In tests where all machines use the same electionPort this bug would not manifest itself.



[jira] Updated: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flavio Paiva Junqueira updated ZOOKEEPER-127:
---------------------------------------------

    Attachment: ZOOKEEPER-127.patch

This patch removes the restriction of a single port for leader election. It changes the configuration file format so that a server is specified as follows:


server.1=myserver:11111:3182

Note that this patch changes how QuorumPeerConfig.java parses the configuration file.

In QuorumCnxManager and FastLeaderElection/AuthFastLeaderElection, this patch replaces all references to InetAddress objects with server id values. That is, we now identify peers by their server id.
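
For readers following the thread, here is a minimal sketch of parsing an entry in the new format. It is not the code from the patch: the class name ServerSpec and its fields are hypothetical, and the reading of the two ports as quorum port followed by leader-election port is an assumption taken from the server.X=addr:quorum_port:le_port proposal earlier in this thread.

```java
import java.net.InetSocketAddress;

// Hypothetical holder for one "server.X=host:quorumPort:electionPort" entry.
// Illustration only; not the QuorumPeerConfig code from the patch.
final class ServerSpec {
    final long id;
    final InetSocketAddress quorumAddr;
    final InetSocketAddress electionAddr;

    ServerSpec(long id, InetSocketAddress quorumAddr, InetSocketAddress electionAddr) {
        this.id = id;
        this.quorumAddr = quorumAddr;
        this.electionAddr = electionAddr;
    }

    static ServerSpec parse(String line) {
        int eq = line.indexOf('=');
        if (eq < 0 || !line.startsWith("server.")) {
            throw new IllegalArgumentException("not a server entry: " + line);
        }
        // The text between "server." and "=" is the server id.
        long id = Long.parseLong(line.substring("server.".length(), eq).trim());
        String[] parts = line.substring(eq + 1).trim().split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                "expected host:quorumPort:electionPort, got: " + line);
        }
        int quorumPort = Integer.parseInt(parts[1]);   // assumed: first port is the quorum port
        int electionPort = Integer.parseInt(parts[2]); // assumed: second port is the election port
        // createUnresolved avoids a DNS lookup while parsing.
        return new ServerSpec(id,
            InetSocketAddress.createUnresolved(parts[0], quorumPort),
            InetSocketAddress.createUnresolved(parts[0], electionPort));
    }
}
```

With this sketch, ServerSpec.parse("server.1=myserver:11111:3182") yields server id 1 with two addresses on the same host, one per port. An entry in the old two-field form is rejected with an IllegalArgumentException instead of being silently accepted.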


> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch
>
>
> In QuorumCnxManager.toSend there is a call to create a connection as follows:
>     channel = SocketChannel.open(new InetSocketAddress(addr, port));
> Unfortunately "addr" is the ip address of a remote server while "port" is the electionPort of *this* server.
> As an example, given this configuration (taken from my zoo.cfg)
>   server.1=10.20.9.254:2881
>   server.2=10.20.9.9:2882
>   server.3=10.20.9.254:2883
> Server 3 was observed trying to make a connection to host 10.20.9.9 on port 2883 and obviously failing.
> In tests where all machines use the same electionPort this bug would not manifest itself.



[jira] Updated: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flavio Paiva Junqueira updated ZOOKEEPER-127:
---------------------------------------------

    Attachment: ZOOKEEPER-127.patch

This new patch also addresses a race condition in FastLeaderElection. Without this patch, it is possible that one of the worker threads reads the value of either proposedLeader, proposedZxid, or logicalclock in an inconsistent fashion. I have created two synchronized methods and added one synchronized block so that threads read consistent values.

I have also addressed the points that Austin raised. Thanks, Austin!
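
To illustrate the kind of race described above, here is a minimal sketch, assuming the three fields are guarded by a single monitor. The class ProposalState and its method names are hypothetical and are not the methods added by the patch. Only the field names proposedLeader, proposedZxid, and logicalclock come from the comment.

```java
// Hypothetical illustration of reading related election fields consistently.
// Not the FastLeaderElection code from the patch.
final class ProposalState {
    private long proposedLeader;
    private long proposedZxid;
    private long logicalclock;

    // Immutable copy of all three fields taken under one lock.
    static final class Snapshot {
        final long leader;
        final long zxid;
        final long clock;

        Snapshot(long leader, long zxid, long clock) {
            this.leader = leader;
            this.zxid = zxid;
            this.clock = clock;
        }
    }

    // Writers update leader and zxid together, so a reader never sees
    // a new leader paired with an old zxid.
    synchronized void updateProposal(long leader, long zxid) {
        proposedLeader = leader;
        proposedZxid = zxid;
    }

    // Starting a new round bumps the clock and sets the proposal atomically.
    synchronized void startNewRound(long leader, long zxid) {
        logicalclock++;
        proposedLeader = leader;
        proposedZxid = zxid;
    }

    // Worker threads take one snapshot instead of reading each field separately.
    synchronized Snapshot snapshot() {
        return new Snapshot(proposedLeader, proposedZxid, logicalclock);
    }
}
```

The design choice in this sketch is that a worker thread calls snapshot() once and builds its message from the returned copy, rather than reading the three fields one at a time while another thread may be updating them.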
 

> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch
>
>
> In QuorumCnxManager.toSend there is a call to create a connection as follows:
>     channel = SocketChannel.open(new InetSocketAddress(addr, port));
> Unfortunately "addr" is the ip address of a remote server while "port" is the electionPort of *this* server.
> As an example, given this configuration (taken from my zoo.cfg)
>   server.1=10.20.9.254:2881
>   server.2=10.20.9.9:2882
>   server.3=10.20.9.254:2883
> Server 3 was observed trying to make a connection to host 10.20.9.9 on port 2883 and obviously failing.
> In tests where all machines use the same electionPort this bug would not manifest itself.



[jira] Commented: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626975#action_12626975 ] 

Flavio Paiva Junqueira commented on ZOOKEEPER-127:
--------------------------------------------------

You're right, we need more changes. Here is what I think.

We'd have to change the type of +addr+ in +QuorumCnxManager.Message+ to +InetSocketAddress+, and update all the address comparisons that we make across +QuorumCnxManager+. For example, in +QuorumCnxManager.toSend+, we use an +InetAddress+ object to determine if there is a connection to a peer. Alternatively, we could try to map server id to connection, which sounds like a cleaner way of doing it. We would also have to change the type of +addr+ in +FastLeaderElection.Notification+ to +InetSocketAddress+. That's all I could pinpoint so far. 
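
The "map server id to connection" alternative could be sketched as follows. This is an assumption-laden illustration, not the QuorumCnxManager code: the class PeerConnections and its methods are hypothetical. The point it shows is that the connection is opened to the election address configured for the remote server id, never to the remote host combined with the local port.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SocketChannel;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: connections and addresses are keyed by server id.
final class PeerConnections {
    private final Map<Long, InetSocketAddress> electionAddrs;
    private final Map<Long, SocketChannel> channels = new ConcurrentHashMap<>();

    PeerConnections(Map<Long, InetSocketAddress> electionAddrs) {
        this.electionAddrs = new ConcurrentHashMap<>(electionAddrs);
    }

    // Each peer has its own election address (host and port).
    InetSocketAddress electionAddressOf(long sid) {
        InetSocketAddress addr = electionAddrs.get(sid);
        if (addr == null) {
            throw new IllegalArgumentException("unknown server id: " + sid);
        }
        return addr;
    }

    // "Is there a connection to this peer?" becomes a lookup by server id.
    boolean isConnected(long sid) {
        SocketChannel ch = channels.get(sid);
        return ch != null && ch.isConnected();
    }

    // Opens a channel to the remote peer's own address and port if none exists.
    synchronized SocketChannel connect(long sid) throws IOException {
        SocketChannel existing = channels.get(sid);
        if (existing != null && existing.isConnected()) {
            return existing;
        }
        SocketChannel ch = SocketChannel.open(electionAddressOf(sid));
        channels.put(sid, ch);
        return ch;
    }
}
```

With the configuration from the issue description, looking up server id 2 gives 10.20.9.9:2882, so server 3 would no longer try port 2883 on that host.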

> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Priority: Minor
>
> In QuorumCnxManager.toSend there is a call to create a connection as follows:
>     channel = SocketChannel.open(new InetSocketAddress(addr, port));
> Unfortunately "addr" is the ip address of a remote server while "port" is the electionPort of *this* server.
> As an example, given this configuration (taken from my zoo.cfg)
>   server.1=10.20.9.254:2881
>   server.2=10.20.9.9:2882
>   server.3=10.20.9.254:2883
> Server 3 was observed trying to make a connection to host 10.20.9.9 on port 2883 and obviously failing.
> In tests where all machines use the same electionPort this bug would not manifest itself.



[jira] Commented: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632171#action_12632171 ] 

Flavio Paiva Junqueira commented on ZOOKEEPER-127:
--------------------------------------------------

It typically happens that a follower starts up its follower code before the leader is available as a leader, so we get this connection refused exception. However, getting it 5 times means that there have been 5 attempts and no success, so the follower gave up and started a new leader election.  

By design, the situation that happens with LE 0 should not happen with LEs 1, 2, or 3, since we use a logical clock to prevent a peer from using stale information on potential leaders.  In more detail, if a peer p1 starts a new leader election, and asks a peer p2 that believes that 10 is still the leader, then peer p1 shouldn't take p2's vote because it corresponds to the value decided in a previous election round. Moreover, because 10 is dead, we know that it can't vote, so the vote must come from another peer that believes that 10 is the leader. 

There must be a bug in the leader election code then, so I'll check the code. 
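
As a rough illustration of the logical-clock idea described above, here is a minimal sketch. It is not the FastLeaderElection code: the class ElectionRound, the Vote type, and the method names are hypothetical, and the rule of adopting a later round when one is seen is an assumption of this sketch.

```java
// Hypothetical sketch of filtering votes by election round (logical clock).
final class ElectionRound {
    // A vote as received from a peer: proposed leader, zxid, and sender's round.
    static final class Vote {
        final long leader;
        final long zxid;
        final long round;

        Vote(long leader, long zxid, long round) {
            this.leader = leader;
            this.zxid = zxid;
            this.round = round;
        }
    }

    private long logicalclock;

    // Each new election started by this peer advances the clock.
    synchronized long startNewElection() {
        return ++logicalclock;
    }

    synchronized long currentRound() {
        return logicalclock;
    }

    // Votes from an earlier round are stale and must not be counted.
    synchronized boolean accept(Vote v) {
        if (v.round < logicalclock) {
            return false;
        }
        if (v.round > logicalclock) {
            // Assumption of this sketch: catch up to the later round.
            logicalclock = v.round;
        }
        return true;
    }
}
```

In the scenario from the log, p1 has started a new election, so a vote for server 10 that p2 formed in the previous round carries an older round number and is ignored under this rule.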

> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch
>
>
> In QuorumCnxManager.toSend there is a call to create a connection as follows:
>     channel = SocketChannel.open(new InetSocketAddress(addr, port));
> Unfortunately "addr" is the ip address of a remote server while "port" is the electionPort of *this* server.
> As an example, given this configuration (taken from my zoo.cfg)
>   server.1=10.20.9.254:2881
>   server.2=10.20.9.9:2882
>   server.3=10.20.9.254:2883
> Server 3 was observed trying to make a connection to host 10.20.9.9 on port 2883 and obviously failing.
> In tests where all machines use the same electionPort this bug would not manifest itself.



[jira] Updated: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-127:
-----------------------------------

    Fix Version/s: 3.0.0

> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch
>
>
> In QuorumCnxManager.toSend there is a call to create a connection as follows:
>     channel = SocketChannel.open(new InetSocketAddress(addr, port));
> Unfortunately "addr" is the ip address of a remote server while "port" is the electionPort of *this* server.
> As an example, given this configuration (taken from my zoo.cfg)
>   server.1=10.20.9.254:2881
>   server.2=10.20.9.9:2882
>   server.3=10.20.9.254:2883
> Server 3 was observed trying to make a connection to host 10.20.9.9 on port 2883 and obviously failing.
> In tests where all machines use the same electionPort this bug would not manifest itself.



[jira] Updated: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Benjamin Reed (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Reed updated ZOOKEEPER-127:
------------------------------------

    Hadoop Flags: [Reviewed]

+1 looks good

> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Critical
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch
>
>
> In QuorumCnxManager.toSend there is a call to create a connection as follows:
>     channel = SocketChannel.open(new InetSocketAddress(addr, port));
> Unfortunately "addr" is the ip address of a remote server while "port" is the electionPort of *this* server.
> As an example, given this configuration (taken from my zoo.cfg)
>   server.1=10.20.9.254:2881
>   server.2=10.20.9.9:2882
>   server.3=10.20.9.254:2883
> Server 3 was observed trying to make a connection to host 10.20.9.9 on port 2883 and obviously failing.
> In tests where all machines use the same electionPort this bug would not manifest itself.



[jira] Updated: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-127:
------------------------------------

    Status: Open  (was: Patch Available)

cancelling the patch for addressing review comments... 

> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch
>
>
> In QuorumCnxManager.toSend there is a call to create a connection as follows:
>     channel = SocketChannel.open(new InetSocketAddress(addr, port));
> Unfortunately "addr" is the ip address of a remote server while "port" is the electionPort of *this* server.
> As an example, given this configuration (taken from my zoo.cfg)
>   server.1=10.20.9.254:2881
>   server.2=10.20.9.9:2882
>   server.3=10.20.9.254:2883
> Server 3 was observed trying to make a connection to host 10.20.9.9 on port 2883 and obviously failing.
> In tests where all machines use the same electionPort this bug would not manifest itself.



[jira] Commented: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12636512#action_12636512 ] 

Mahadev konar commented on ZOOKEEPER-127:
-----------------------------------------

though failing with an NPE is obviously bad :) ... please create a jira for it 


> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Critical
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch
>
>
> In QuorumCnxManager.toSend there is a call to create a connection as follows:
>     channel = SocketChannel.open(new InetSocketAddress(addr, port));
> Unfortunately "addr" is the ip address of a remote server while "port" is the electionPort of *this* server.
> As an example, given this configuration (taken from my zoo.cfg)
>   server.1=10.20.9.254:2881
>   server.2=10.20.9.9:2882
>   server.3=10.20.9.254:2883
> Server 3 was observed trying to make a connection to host 10.20.9.9 on port 2883 and obviously failing.
> In tests where all machines use the same electionPort this bug would not manifest itself.



[jira] Commented: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12636632#action_12636632 ] 

Patrick Hunt commented on ZOOKEEPER-127:
----------------------------------------

Re Stu's NPE comment, is that true? I thought I had seen some code in the config processing that indicated that we allowed both host:port and host:port:port. Does it NPE if host:port is used?

Flavio, please look into this and also ensure that the documentation is updated to reflect the new format.

> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Critical
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch
>
>
> In QuorumCnxManager.toSend there is a call to create a connection as follows:
>     channel = SocketChannel.open(new InetSocketAddress(addr, port));
> Unfortunately "addr" is the ip address of a remote server while "port" is the electionPort of *this* server.
> As an example, given this configuration (taken from my zoo.cfg)
>   server.1=10.20.9.254:2881
>   server.2=10.20.9.9:2882
>   server.3=10.20.9.254:2883
> Server 3 was observed trying to make a connection to host 10.20.9.9 on port 2883 and obviously failing.
> In tests where all machines use the same electionPort this bug would not manifest itself.



[jira] Updated: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flavio Paiva Junqueira updated ZOOKEEPER-127:
---------------------------------------------

    Attachment: ZOOKEEPER-127.patch

Two minor changes. I have replaced an import in FLETest.java and LETest.java to comply with the changes of JIRA 21. I have asked Pat to do it here, otherwise it would be more work to regenerate the patch. 

> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Critical
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch
>
>
> In QuorumCnxManager.toSend there is a call to create a connection as follows:
>     channel = SocketChannel.open(new InetSocketAddress(addr, port));
> Unfortunately "addr" is the ip address of a remote server while "port" is the electionPort of *this* server.
> As an example, given this configuration (taken from my zoo.cfg)
>   server.1=10.20.9.254:2881
>   server.2=10.20.9.9:2882
>   server.3=10.20.9.254:2883
> Server 3 was observed trying to make a connection to host 10.20.9.9 on port 2883 and obviously failing.
> In tests where all machines use the same electionPort this bug would not manifest itself.



[jira] Commented: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626107#action_12626107 ] 

Flavio Paiva Junqueira commented on ZOOKEEPER-127:
--------------------------------------------------

Thanks for pointing it out, Mark. If I recall correctly, electionPort is the port that all peers use for leader election, so in the current implementation "electionPort" doesn't mean the port of this peer, but rather the single port shared by all peers. If we allow a different port for each peer, then we would need a line in the configuration file for each peer for leader election, right?

> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Priority: Minor


[jira] Commented: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Mark Harwood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626635#action_12626635 ] 

Mark Harwood commented on ZOOKEEPER-127:
----------------------------------------

Not good news on this I think.
I got as far as changing the QuorumPeerConfig parsing code and replacing the "addr" property in QuorumServer with electionAddress and quorumAddress for use around the place.

The main issue I ran into is that the code in QuorumCnxManager seems to rely solely on IP address (without port) in many cases. As initially pointed out here, QuorumCnxManager uses its *own* choice of election port and combines it with the IP address to talk to remote servers (a problem if they happen to use different election ports, e.g. as required when running several servers on the same machine).
To address this I started off with the assumption that all existing references to InetAddress in this class should probably be replaced with InetSocketAddress to more correctly identify each service.

While I was able to change most code to use InetSocketAddress in place of InetAddress (e.g. in senderWorkerMap), I found that this was not possible in RecvWorker. Its constructor relies on the IP address of the connecting client to identify the caller, and is therefore unaware of the electionPort that might be needed to differentiate between more than one QuorumPeer running on the same machine when calling back. To get around this, the server id (from myid) probably needs to be passed as part of the communication protocol in order to authenticate different callers.

Ick!
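
A rough illustration of the "pass the server id in the protocol" idea, in case it helps the discussion. This is only a sketch under assumptions: the class and method names (SidHandshake, encode, decode) are hypothetical and not taken from QuorumCnxManager, and the 8-byte big-endian layout is just one possible wire format. The connecting peer would write these bytes first, so the receiver can identify the caller without looking at its IP address or port.

```java
import java.nio.ByteBuffer;

/** Hypothetical handshake helper; not the actual QuorumCnxManager code. */
final class SidHandshake {

    /** Encodes the sender's server id (the value from myid) as 8 big-endian bytes. */
    static byte[] encode(long sid) {
        return ByteBuffer.allocate(Long.BYTES).putLong(sid).array();
    }

    /** Decodes the server id that a connecting peer sent as its first 8 bytes. */
    static long decode(byte[] firstBytes) {
        if (firstBytes.length < Long.BYTES) {
            throw new IllegalArgumentException("handshake too short: " + firstBytes.length);
        }
        return ByteBuffer.wrap(firstBytes).getLong();
    }
}
```

With something like this, two QuorumPeers on the same host would still be distinguishable, because the key for the connection map becomes the decoded id rather than the remote address.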

> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Priority: Minor


[jira] Updated: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flavio Paiva Junqueira updated ZOOKEEPER-127:
---------------------------------------------

    Attachment: ZOOKEEPER-127.patch

Because 131 was committed, I had to regenerate this patch. Also, it gave me time to write a unit test for FastLeaderElection. I used the one Ben wrote for the old leader election as a basis. 

This patch:
- Fixes issues with QuorumCnxManager, in particular, servers can have different ports for leader election, and QuorumCnxManager uses server id to identify peers instead of IP address;
- Removes the challenge used in QuorumCnxManager. QuorumCnxManager uses a randomly generated challenge to decide whether or not to keep a connection. This patch eliminates the challenge and uses the server id instead. As the server id is supposed to be unique, it serves the same purpose while being simpler and more reliable;
- Does a general cleanup of QuorumCnxManager. I have tried to address the comments of Austin;
- Changes the format of the ZooKeeper configuration file. Now we specify the leader election port as the third parameter of a server specification, as described in the initial description of this jira;
- Fixes a bug in FastLeaderElection. Because it uses multiple threads, there is currently the possibility that a peer sends inconsistent information to other peers;
- Adds a unit test to FastLeaderElection based on the one Ben wrote;
- Finally, because of the patch of jira 131, it touches some other parts, like LETest.java, to comply with my changes.
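
To make the configuration format change in the list above concrete, here is a minimal sketch of parsing a line of the form server.X=addr:quorum_port:le_port. It is not the code in the patch: the ServerSpec class, its field names, and the quorum port 2888 used in the usage note are assumptions for illustration only.

```java
import java.net.InetSocketAddress;

/** Hypothetical holder for one "server.X=addr:quorum_port:le_port" entry. */
final class ServerSpec {
    final long id;
    final InetSocketAddress quorumAddr;
    final InetSocketAddress electionAddr;

    private ServerSpec(long id, InetSocketAddress quorumAddr, InetSocketAddress electionAddr) {
        this.id = id;
        this.quorumAddr = quorumAddr;
        this.electionAddr = electionAddr;
    }

    /** Parses a key such as "server.2" and a value such as "10.20.9.9:2888:2882". */
    static ServerSpec parse(String key, String value) {
        String prefix = "server.";
        if (!key.startsWith(prefix)) {
            throw new IllegalArgumentException("not a server entry: " + key);
        }
        long id = Long.parseLong(key.substring(prefix.length()));
        String[] parts = value.split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected addr:quorum_port:le_port, got: " + value);
        }
        int quorumPort = Integer.parseInt(parts[1]);
        int electionPort = Integer.parseInt(parts[2]);
        // Unresolved addresses keep the sketch free of DNS lookups.
        return new ServerSpec(id,
                InetSocketAddress.createUnresolved(parts[0], quorumPort),
                InetSocketAddress.createUnresolved(parts[0], electionPort));
    }
}
```

For example, ServerSpec.parse("server.2", "10.20.9.9:2888:2882") would give a peer with id 2 whose election address carries its own port 2882, so a remote server no longer has to guess it from its local electionPort.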

 

> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch


[jira] Commented: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632118#action_12632118 ] 

Mahadev konar commented on ZOOKEEPER-127:
-----------------------------------------

Flavio, just a reminder: the patch should be generated against trunk, and ^M characters (Windows line delimiters) should not appear in the patch... :)

> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch


[jira] Updated: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-127:
-----------------------------------

    Status: Open  (was: Patch Available)

Both the FLE and LE tests are failing on my machine:

    [junit] Running org.apache.zookeeper.test.FLETest
    [junit] Tests run: 1, Failures: 1, Errors: 0, Time elapsed: 62.084 sec
    [junit] Test org.apache.zookeeper.test.FLETest FAILED
    [junit] Running org.apache.zookeeper.test.LETest
    [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0.397 sec
    [junit] Test org.apache.zookeeper.test.LETest FAILED

------------- Standard Error -----------------
Exception in thread "Thread-23" java.lang.NullPointerException
	at org.apache.zookeeper.test.FLETest$LEThread.run(FLETest.java:55)
Exception in thread "Thread-1" java.lang.NullPointerException
	at org.apache.zookeeper.test.FLETest$LEThread.run(FLETest.java:55)
Exception in thread "Thread-13" java.lang.NullPointerException
	at org.apache.zookeeper.test.FLETest$LEThread.run(FLETest.java:55)
Exception in thread "Thread-3" java.lang.NullPointerException
	at org.apache.zookeeper.test.FLETest$LEThread.run(FLETest.java:55)
------------- ---------------- ---------------

Testcase: testLE took 62.064 sec
	FAILED
Threads didn't join
junit.framework.AssertionFailedError: Threads didn't join
	at org.apache.zookeeper.test.FLETest.testLE(FLETest.java:121)




LETEST

Testcase: testLE took 0.377 sec
	Caused an ERROR
Address already in use
java.net.BindException: Address already in use
	at sun.nio.ch.Net.bind(Native Method)
	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
	at org.apache.zookeeper.server.NIOServerCnxn$Factory.<init>(NIOServerCnxn.java:89)
	at org.apache.zookeeper.server.quorum.QuorumPeer.<init>(QuorumPeer.java:327)
	at org.apache.zookeeper.test.LETest.testLE(LETest.java:105)


> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch


[jira] Commented: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12636511#action_12636511 ] 

Mahadev konar commented on ZOOKEEPER-127:
-----------------------------------------

ZOOKEEPER-151 has been created to fix that :)... we should change our example config file format as well.

> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Critical
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch


[jira] Issue Comment Edited: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635022#action_12635022 ] 

fpj edited comment on ZOOKEEPER-127 at 9/26/08 2:21 PM:
---------------------------------------------------------------------------

Two minor changes. I have replaced an import on FLETest.java and LETest.java to comply with changes of JIRA 21. I have asked Pat to do it here, otherwise it would be more work to regenerate the patch. 

======
 
I have actually removed this patch because I realized that the import is necessary in this patch, and the two unit tests do not compile without the imports.

> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Critical
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch


[jira] Commented: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Austin Shoemaker (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632099#action_12632099 ] 

Austin Shoemaker commented on ZOOKEEPER-127:
--------------------------------------------

Applying the patch (from 9/17) to the latest trunk (r696563) now passes our leader election unit tests using algorithm 3. This is great.

Two minor issues I noticed:

1. The default constructor for QuorumPeer should call setStatsProvider, rather than the attribute-passing constructor. Since QuorumPeerMain calls the default constructor, echo stat | nc ... requests are returning invalid data because no provider is set.

2. In QuorumPeerConfig.java:105 where parts.length is checked the operator should be && instead of ||.
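
To spell out why the operator matters in point 2, here is a small sketch. I am assuming the check has the shape "parts.length != A || parts.length != B"; the constants 2 and 3 below and the class name PartsCheck are hypothetical and not copied from QuorumPeerConfig.java.

```java
/** Hypothetical illustration of the length check; not the actual QuorumPeerConfig code. */
final class PartsCheck {

    /** With ||, at least one inequality is always true, so every length is rejected. */
    static boolean rejectsWithOr(int length) {
        return length != 2 || length != 3;
    }

    /** With &&, only lengths that are neither 2 nor 3 are rejected. */
    static boolean rejectsWithAnd(int length) {
        return length != 2 && length != 3;
    }
}
```

Under that assumption the || form flags even well-formed server lines as errors, while the && form accepts the two allowed lengths and rejects everything else.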


> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch


[jira] Commented: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632211#action_12632211 ] 

Patrick Hunt commented on ZOOKEEPER-127:
----------------------------------------

Austin, when you say "our leader election unit tests" is that something you could contribute back to the Apache ZK community? If so please create a new jira and attach a patch because I'd love to include them. Thanks.


> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch


[jira] Commented: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12631182#action_12631182 ] 

Mahadev konar commented on ZOOKEEPER-127:
-----------------------------------------

Just reviewed the code; here are my comments:

1) Why is the server id a Long and not an Integer?

2) We now have two ways of specifying the quorum list: one is the quorumserver arraylist and the other is the hashcode list. Can we combine these two and just keep the hash array? Do we require these two to be different?

3) I do not see any changes to the QuorumPeerMain class. Shouldn't this be changed now to create a new constructor for QuorumPeer with the hashmap?




> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch


[jira] Updated: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flavio Paiva Junqueira updated ZOOKEEPER-127:
---------------------------------------------

    Status: Patch Available  (was: Open)

> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch


[jira] Updated: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flavio Paiva Junqueira updated ZOOKEEPER-127:
---------------------------------------------

    Attachment: ZOOKEEPER-127.patch

This last patch improves the FLE unit test, and deals with a couple of corner cases in the FastLeaderElection implementation. 

> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch


[jira] Assigned: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flavio Paiva Junqueira reassigned ZOOKEEPER-127:
------------------------------------------------

    Assignee: Flavio Paiva Junqueira

> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch


[jira] Commented: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12631718#action_12631718 ] 

Flavio Paiva Junqueira commented on ZOOKEEPER-127:
--------------------------------------------------

Clean up code like this to use logging instead of printStackTrace.

         try {
             mySocket = new DatagramSocket(port);
             // mySocket.setSoTimeout(20000);
         } catch (SocketException e1) {
             e1.printStackTrace();
             throw new Runtime....

(Thanks for pointing it out, Pat)
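
For reference, the cleaned-up version could look roughly like the sketch below. Assumptions: I use java.util.logging only to keep the example self-contained (the project may well prefer a different logging library), the class name ElectionSocket and method open are hypothetical, and IllegalStateException stands in for whatever runtime exception the real code throws.

```java
import java.net.DatagramSocket;
import java.net.SocketException;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical helper showing logging in place of printStackTrace. */
final class ElectionSocket {
    private static final Logger LOG = Logger.getLogger(ElectionSocket.class.getName());

    /** Opens a UDP socket on the given port; port 0 lets the OS choose a free one. */
    static DatagramSocket open(int port) {
        try {
            return new DatagramSocket(port);
        } catch (SocketException e) {
            // Log with the exception attached so the stack trace goes to the log, not stderr.
            LOG.log(Level.SEVERE, "Unable to bind election socket on port " + port, e);
            // Assumption: an unchecked exception is rethrown, as in the original snippet.
            throw new IllegalStateException("Unable to bind election socket on port " + port, e);
        }
    }
}
```

Passing the exception as the cause keeps the original stack trace available to callers as well as to the log.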

> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch


[jira] Commented: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Benjamin Reed (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626251#action_12626251 ] 

Benjamin Reed commented on ZOOKEEPER-127:
-----------------------------------------

Actually, Pat showed me that the tests are passing because we don't use configuration files and instead instantiate QuorumPeers directly. It would probably be good to change the tests to use configuration files so that they are more end-to-end.

> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Priority: Minor



[jira] Commented: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12631356#action_12631356 ] 

Flavio Paiva Junqueira commented on ZOOKEEPER-127:
--------------------------------------------------

OK, I reviewed the code, and I think your point 2 meant that I could just add another field to QuorumServer. I'll do it, but it means changes across more files. Hopefully this is OK.
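
Roughly, the idea looks like the sketch below. PeerInfo is a hypothetical stand-in for QuorumServer, and the field names, the helper class, and the quorum port 2888 are assumptions made for illustration; only the addresses and election ports come from the report.

```java
import java.net.InetSocketAddress;
import java.util.Map;

// Hypothetical stand-in for QuorumServer; names are assumptions.
class PeerInfo {
    final long id;
    final InetSocketAddress quorumAddr;   // address used for quorum traffic
    final InetSocketAddress electionAddr; // the added field: where this peer listens for leader election

    PeerInfo(long id, String host, int quorumPort, int electionPort) {
        this.id = id;
        // Unresolved addresses avoid DNS lookups in this illustration.
        this.quorumAddr = InetSocketAddress.createUnresolved(host, quorumPort);
        this.electionAddr = InetSocketAddress.createUnresolved(host, electionPort);
    }
}

class ElectionTargets {
    /** Returns the address to dial for leader election: the remote peer's port, not the caller's. */
    static InetSocketAddress electionTarget(Map<Long, PeerInfo> peers, long remoteId) {
        PeerInfo remote = peers.get(remoteId);
        if (remote == null) {
            throw new IllegalArgumentException("Unknown server id " + remoteId);
        }
        return remote.electionAddr;
    }
}
```

With the three servers from the report, a lookup for server 2 returns 10.20.9.9 with port 2882, whichever server is asking.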

> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch



[jira] Updated: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flavio Paiva Junqueira updated ZOOKEEPER-127:
---------------------------------------------

    Attachment: ZOOKEEPER-127.patch

This patch changes the way the port number used for leader election is passed in the configuration file. It also changes the references to servers in QuorumCnxManager from InetAddress to their server ids. I have then changed the challenge mechanism to use the server id instead, which simplifies it.

I have also implemented some of the modifications proposed in JIRA 140.
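
For illustration only, here is a sketch of reading per-server election ports and keying them by server id. It assumes the line format proposed earlier in this thread (server.X=addr:quorum_port:le_port); the class and method names are hypothetical, this is not the code in the patch, and IPv6 literals are not handled.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper; not part of the attached patch.
class ServerLineParser {
    private static final String PREFIX = "server.";

    /** Maps server id to election address from "server.X=addr:quorum_port:le_port" entries. */
    static Map<Long, InetSocketAddress> electionAddrs(String cfg) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(cfg));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<Long, InetSocketAddress> result = new TreeMap<>();
        for (String key : props.stringPropertyNames()) {
            if (!key.startsWith(PREFIX)) {
                continue; // ignore unrelated settings
            }
            long id = Long.parseLong(key.substring(PREFIX.length()));
            String[] parts = props.getProperty(key).trim().split(":");
            if (parts.length != 3) {
                throw new IllegalArgumentException("Expected addr:quorum_port:le_port for " + key);
            }
            // parts[2] is the election port this server listens on.
            result.put(id, InetSocketAddress.createUnresolved(parts[0], Integer.parseInt(parts[2])));
        }
        return result;
    }
}
```

Keying the map by server id rather than by InetAddress keeps two servers on the same host distinct, as with servers 1 and 3 in the reported configuration.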

> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch



[jira] Updated: (ZOOKEEPER-127) Use of non-standard election ports in config breaks services

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flavio Paiva Junqueira updated ZOOKEEPER-127:
---------------------------------------------

    Attachment:     (was: ZOOKEEPER-127.patch)

> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Critical
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch, ZOOKEEPER-127.patch
