Posted to dev@activemq.apache.org by "st.h (JIRA)" <ji...@apache.org> on 2012/10/22 18:08:12 UTC

[jira] [Created] (AMQ-4122) Lease Database Locker failover broken

st.h created AMQ-4122:
-------------------------

             Summary: Lease Database Locker failover broken
                 Key: AMQ-4122
                 URL: https://issues.apache.org/jira/browse/AMQ-4122
             Project: ActiveMQ
          Issue Type: Bug
    Affects Versions: 5.7.0
            Reporter: st.h


We are using ActiveMQ 5.7.0 with a MySQL database and could not observe correct failover behavior with the lease database locker.
There seems to be a race condition that prevents the correct failover procedure.
We noticed that when starting up two instances, both instances become master.

We ran several tests, including the following, and could not observe the intended behavior:
- shut down all instances
- manipulate the database lock so that one node holds the lock, and set the expiry time in the distant future
- start up both instances. Neither instance can acquire the lock, as the lock hasn't expired, which is the correct behavior.
- update the expiry time in the database so that the lock is expired.
- the first instance notices the expired lock and becomes master
- when the second instance checks for the lock, it also updates the database and becomes master.

To my understanding, the second instance should not be able to update the lock, as it is held by the first instance, and so it should not be able to become master.
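
For illustration, the behavior expected in the steps above can be sketched as an atomic conditional update: a broker may take the lease only if it is unowned, expired, or already its own. This is a minimal in-memory sketch under those assumptions, not the ActiveMQ LeaseDatabaseLocker code. The names LeaseRow, LeaseLockSketch and tryAcquire are hypothetical, and the SQL in the comment is an assumed shape rather than the statement ActiveMQ issues.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical in-memory stand-in for the single lock row in the database.
final class LeaseRow {
    final String owner;    // broker currently holding the lease, or null
    final long expiresAt;  // lease expiry time in milliseconds

    LeaseRow(String owner, long expiresAt) {
        this.owner = owner;
        this.expiresAt = expiresAt;
    }
}

final class LeaseLockSketch {
    private final AtomicReference<LeaseRow> row;

    LeaseLockSketch(LeaseRow initial) {
        this.row = new AtomicReference<>(initial);
    }

    // Mimics an atomic statement of the assumed form
    //   UPDATE lock SET owner = ?, expiry = ? WHERE expiry <= ? OR owner = ?
    // The compare-and-set plays the role of the database's row-level atomicity.
    boolean tryAcquire(String broker, long now, long leaseMillis) {
        LeaseRow current = row.get();
        boolean available = current.owner == null
                || current.expiresAt <= now
                || broker.equals(current.owner);
        if (!available) {
            return false; // lease is held by another broker and has not expired
        }
        // Fails if another broker changed the row between the read and the write.
        return row.compareAndSet(current, new LeaseRow(broker, now + leaseMillis));
    }

    String owner() {
        return row.get().owner;
    }

    public static void main(String[] args) {
        // The row starts out expired, as in the manual test described above.
        LeaseLockSketch lock = new LeaseLockSketch(new LeaseRow("stale-node", 500L));
        System.out.println("broker-a master: " + lock.tryAcquire("broker-a", 1_000L, 5_000L));
        System.out.println("broker-b master: " + lock.tryAcquire("broker-b", 1_100L, 5_000L));
    }
}
```

With a check of this kind, the second instance's update matches no row while the first instance's lease is still valid, so only one broker can become master.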

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (AMQ-4122) Lease Database Locker failover broken

Posted by "Gary Tully (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AMQ-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483533#comment-13483533 ] 

Gary Tully commented on AMQ-4122:
---------------------------------

Thanks for the config, Christoph. I went ahead and built a simple unit test based on your description, and it works fine. Maybe I need to add more contention, i.e. more locks, to try and replicate a race condition. The locker currently verifies that it succeeded in acquiring the lock by doing a lease extend immediately afterwards, so it should be safe, but there may be a need for a random pause in there.
Can you take a look at the test case, check whether it reflects your use case, and see whether it is possible to modify it so that it breaks?
http://svn.apache.org/viewvc/activemq/trunk/activemq-core/src/test/java/org/apache/activemq/store/jdbc/LeaseDatabaseLockerTest.java?view=diff&r1=1401848&r2=1401849&pathrev=1401849
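
The acquire-then-verify idea described in this comment can be sketched roughly as follows. This is an assumption-laden illustration, not the code in LeaseDatabaseLocker or the linked test: the class VerifiedAcquireSketch, its methods, and the in-memory fields standing in for the lock row are all hypothetical, and the random pause is the possible addition mentioned above rather than existing behavior.

```java
import java.util.concurrent.ThreadLocalRandom;

final class VerifiedAcquireSketch {
    // Hypothetical stand-in for the lock row: current owner and lease expiry.
    private String owner;
    private long expiresAt;

    // Conditional take-over: succeeds only if the lease is unowned or expired.
    synchronized boolean acquire(String broker, long now, long leaseMillis) {
        if (owner != null && expiresAt > now) {
            return false;
        }
        owner = broker;
        expiresAt = now + leaseMillis;
        return true;
    }

    // Lease extend: succeeds only while the row still names this broker.
    synchronized boolean extend(String broker, long now, long leaseMillis) {
        if (!broker.equals(owner)) {
            return false;
        }
        expiresAt = now + leaseMillis;
        return true;
    }

    // Acquire, wait a random interval (assumed jitter), then confirm ownership
    // by extending the lease. Only a successful extend counts as holding the lock.
    boolean acquireAndVerify(String broker, long leaseMillis, long maxPauseMillis) {
        if (!acquire(broker, System.currentTimeMillis(), leaseMillis)) {
            return false;
        }
        try {
            Thread.sleep(ThreadLocalRandom.current().nextLong(maxPauseMillis + 1));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return extend(broker, System.currentTimeMillis(), leaseMillis);
    }

    public static void main(String[] args) {
        VerifiedAcquireSketch lock = new VerifiedAcquireSketch();
        System.out.println("broker-a: " + lock.acquireAndVerify("broker-a", 60_000L, 10L));
        System.out.println("broker-b: " + lock.acquireAndVerify("broker-b", 60_000L, 10L));
    }
}
```

The pause gives a competing broker time to overwrite the row before the confirmation step, so a broker that lost the race finds that its extend fails instead of assuming it is master.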
                
> Lease Database Locker failover broken
> -------------------------------------
>
>                 Key: AMQ-4122
>                 URL: https://issues.apache.org/jira/browse/AMQ-4122
>             Project: ActiveMQ
>          Issue Type: Bug
>    Affects Versions: 5.7.0
>         Environment: Java 7u9, SUSE 11, Mysql
>            Reporter: st.h
>            Assignee: Gary Tully
>         Attachments: activemq.xml

[jira] [Resolved] (AMQ-4122) Lease Database Locker failover broken

Posted by "Gary Tully (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AMQ-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary Tully resolved AMQ-4122.
-----------------------------

       Resolution: Fixed
    Fix Version/s: 5.8.0

Closing this off. I did the final bit of tidy-up in: http://svn.apache.org/viewvc?rev=1407614&view=rev

It would be great if you could validate a 5.8-SNAPSHOT at some stage.
                

[jira] [Comment Edited] (AMQ-4122) Lease Database Locker failover broken

Posted by "Justin Field (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AMQ-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13485200#comment-13485200 ] 

Justin Field edited comment on AMQ-4122 at 10/26/12 9:13 PM:
-------------------------------------------------------------

I can also confirm that I am having the same issue.

I have two brokers running on 2 different machines, and when I kill the MySQL DB
and it comes back online, both brokers become the master.

I also attached my ActiveMQ config
                
      was (Author: fieldju):
    I can also confirm that I am having the same issue.

I have two brokers running on 2 different machines, and when I kill the MySQL DB,
the 2 brokers both become the master.

I also attached my ActiveMQ config
                  

[jira] [Updated] (AMQ-4122) Lease Database Locker failover broken

Posted by "Kyle Miller (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AMQ-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kyle Miller updated AMQ-4122:
-----------------------------

    Attachment: activemq-kyle.xml
    

[jira] [Updated] (AMQ-4122) Lease Database Locker failover broken

Posted by "Justin Field (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AMQ-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Justin Field updated AMQ-4122:
------------------------------

    Attachment: activemq.xml

ActiveMQ config
                

[jira] [Updated] (AMQ-4122) Lease Database Locker failover broken

Posted by "Christoph Seyffer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AMQ-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christoph Seyffer updated AMQ-4122:
-----------------------------------

    Attachment: activemq.xml

Hi Gary, this is our current config, which was active on two nodes (master/slave). As you can see, the broker attribute useLocalHostBrokerName is set to true, so each broker registers with its unique hostname.
                

[jira] [Comment Edited] (AMQ-4122) Lease Database Locker failover broken

Posted by "Justin Field (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AMQ-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13493329#comment-13493329 ] 

Justin Field edited comment on AMQ-4122 at 11/8/12 6:41 PM:
------------------------------------------------------------

I was never able to resolve the issue, so I made the following workaround:

ActiveMQ Broker Monitor
https://gist.github.com/4040309

ActiveMQ bean definitions / configuration
https://gist.github.com/4040646

Parallel Ingestion ActiveMQ Manager
https://gist.github.com/4fc5669d41f25072d2f4

Broker Factory (see bean definition in ActiveMQ bean definitions)
https://gist.github.com/8ad9cf6ace9245c63f41
                
      was (Author: fieldju):
    I was never able to resolve the issue, so I made the following workaround:

https://gist.github.com/4040309
                  

[jira] [Comment Edited] (AMQ-4122) Lease Database Locker failover broken

Posted by "Justin Field (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AMQ-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13485200#comment-13485200 ] 

Justin Field edited comment on AMQ-4122 at 10/26/12 9:20 PM:
-------------------------------------------------------------

I can also confirm that I am having the same issue.

I have two brokers running on 2 different machines, and when I kill the MySQL DB
and it comes back online, both brokers become the master.

I also attached my ActiveMQ config,
and here is a screenshot of the log:
http://i.imgur.com/ZS4Er.jpg
                
      was (Author: fieldju):
    I can also confirm that I am having the same issue.

I have two brokers running on 2 different machines, and when I kill the MySQL DB
and it comes back online, both brokers become the master.

I also attached my ActiveMQ config
                  

[jira] [Commented] (AMQ-4122) Lease Database Locker failover broken

Posted by "Gary Tully (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AMQ-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13494058#comment-13494058 ] 

Gary Tully commented on AMQ-4122:
---------------------------------

The duplicate brokerService variable in JDBCPersistenceAdapter, which hid the inherited one, was resolved by https://issues.apache.org/jira/browse/AMQ-4108, but there is still a need to tidy up the inheritance structure.
                

[jira] [Assigned] (AMQ-4122) Lease Database Locker failover broken

Posted by "Gary Tully (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AMQ-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary Tully reassigned AMQ-4122:
-------------------------------

    Assignee: Gary Tully
    

[jira] [Updated] (AMQ-4122) Lease Database Locker failover broken

Posted by "Justin Field (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AMQ-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Justin Field updated AMQ-4122:
------------------------------

    Comment: was deleted

(was: ActiveMQ config)
    

[jira] [Comment Edited] (AMQ-4122) Lease Database Locker failover broken

Posted by "Justin Field (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AMQ-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13493329#comment-13493329 ] 

Justin Field edited comment on AMQ-4122 at 11/8/12 6:45 PM:
------------------------------------------------------------

I was never able to resolve the issue, so I made the following workaround.
Basically, each broker queries the DB to find out which broker the current master should be (based on the broker's hostname),
compares that to its own hostname, and checks its isSlave value. If there is a discrepancy, I shut the broker down and create a new one.

ActiveMQ Broker Monitor
https://gist.github.com/4040309

ActiveMQ bean definitions / configuration
https://gist.github.com/4040646

Parallel Ingestion ActiveMQ Manager
https://gist.github.com/4fc5669d41f25072d2f4

Broker Factory (see bean definition in ActiveMQ bean definitions)
https://gist.github.com/8ad9cf6ace9245c63f41
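
The decision at the heart of this workaround can be sketched as below. This is not the code from the linked gists; it is a minimal illustration assuming the monitor already knows the hostname recorded as master in the database, the local hostname, and the broker's isSlave value. The class BrokerMonitorSketch, the Action enum and the check method are hypothetical names.

```java
final class BrokerMonitorSketch {
    enum Action { KEEP_RUNNING, RESTART_BROKER }

    // dbMasterHost: hostname recorded in the database as the current master
    //               (assumed to come from a query the monitor runs periodically).
    // localHost:    hostname of the machine this broker runs on.
    // localIsSlave: the isSlave value reported by the local broker.
    static Action check(String dbMasterHost, String localHost, boolean localIsSlave) {
        boolean shouldBeMaster = localHost.equals(dbMasterHost);
        boolean actsAsMaster = !localIsSlave;
        // A discrepancy means either two masters or a master that never took over;
        // in both cases the workaround shuts the broker down and creates a new one.
        return shouldBeMaster == actsAsMaster ? Action.KEEP_RUNNING : Action.RESTART_BROKER;
    }

    public static void main(String[] args) {
        // host-b believes it is master, but the database names host-a.
        System.out.println(check("host-a", "host-b", false));
    }
}
```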
                
      was (Author: fieldju):
    I was never able to resolve the issue, so I made the following workaround:

ActiveMQ Broker Monitor
https://gist.github.com/4040309

ActiveMQ bean definitions / configuration
https://gist.github.com/4040646

Parallel Ingestion ActiveMQ Manager
https://gist.github.com/4fc5669d41f25072d2f4

Broker Factory (see bean definition in ActiveMQ bean definitions)
https://gist.github.com/8ad9cf6ace9245c63f41
                  

[jira] [Commented] (AMQ-4122) Lease Database Locker failover broken

Posted by "Justin Field (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AMQ-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13493329#comment-13493329 ] 

Justin Field commented on AMQ-4122:
-----------------------------------

I was never able to resolve the issue, so I made the following workaround:

https://gist.github.com/4040309
                
> Lease Database Locker failover broken
> -------------------------------------
>
>                 Key: AMQ-4122
>                 URL: https://issues.apache.org/jira/browse/AMQ-4122
>             Project: ActiveMQ
>          Issue Type: Bug
>    Affects Versions: 5.7.0
>         Environment: Java 7u9, SUSE 11, Mysql
>            Reporter: st.h
>            Assignee: Gary Tully
>         Attachments: activemq.xml, activemq.xml
>
>
> We are using ActiveMQ 5.7.0 together with a mysql database and could not observe correct failover behavior with lease database locker.
> It seems that there is a race condition, which prevents the correct failover procedure.
> We noticed that when starting up two instances, both instance are becoming master.
> We did several test, including the following and could not observe intended functionality:
> - shutdown all instances
> - manipulate database lock that one node has lock and set expiry time in distance future
> - start up both instances. both instances are unable to acquire lock, as the lock hasn't expired, which should be correct behavior.
> - update the expiry time in database, so that the lock is expired.
> - first instance notices expired lock and becomes master
> - when second instance checks for lock, it also updates the database and becomes master.
> To my understanding the second instance should not be able to update the lock, as it is held by the first instance and should not be able to become master.

[jira] [Commented] (AMQ-4122) Lease Database Locker failover broken

Posted by "Kyle Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AMQ-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13493324#comment-13493324 ] 

Kyle Miller commented on AMQ-4122:
----------------------------------

We are seeing a similar issue.  After debugging, I've found some odd behavior. 

When the LockableServiceSupport class gets a "false" back from the LeaseDatabaseLocker.keepAlive() method, it calls LockableServiceSupport.stopBroker().

On line 132 of LockableServiceSupport:

LOG.info(brokerService.getBrokerName() + ", no longer able to keep the exclusive lock so giving up being a master");

This fails for me with a NullPointerException, which kills the thread, but does not stop the broker.

It turns out, there is an org.apache.activemq.broker.BrokerService variable (brokerService) that is null.  However, there is also a org.apache.activemq.xbean.XBeanBrokerService variable (brokerService) that is not null.  This is odd.

I'm guessing that I have a problem with my configuration.  I will be posting mine as well.
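
The symptom described above (an exception ends the keep-alive work while the broker keeps running) can be reproduced in isolation. This is a minimal sketch, assuming the keep-alive check runs as a periodic task on a java.util.concurrent scheduled executor, which this thread does not confirm. KeepAliveFailureDemo and onLockLost are hypothetical names, not ActiveMQ code. The JDK documents that a periodic task which throws has its later executions suppressed, so the stop step after the failing log line never runs.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a broker's keep-alive loop; not ActiveMQ code.
final class KeepAliveFailureDemo {
    final AtomicInteger runs = new AtomicInteger();
    final AtomicBoolean brokerStopped = new AtomicBoolean();

    // The log message dereferences the broker name BEFORE the stop step,
    // so a null name throws and the stop step is never reached.
    void onLockLost(String brokerName) {
        runs.incrementAndGet();
        String message = brokerName.trim() + ", no longer able to keep the exclusive lock";
        System.out.println(message);
        brokerStopped.set(true);
    }

    void runWithNullName() {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        try {
            ScheduledFuture<?> task =
                scheduler.scheduleAtFixedRate(() -> onLockLost(null), 0, 10, TimeUnit.MILLISECONDS);
            try {
                task.get(); // a periodic task only completes by failing or being cancelled
            } catch (ExecutionException e) {
                System.out.println("keep-alive task died: " + e.getCause());
            }
            Thread.sleep(100); // leave time for further runs, which should not happen
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        KeepAliveFailureDemo demo = new KeepAliveFailureDemo();
        demo.runWithNullName();
        System.out.println("runs=" + demo.runs.get() + ", brokerStopped=" + demo.brokerStopped.get());
    }
}
```

In this sketch the task runs once, fails, and is never rescheduled, while the "broker" flag stays unset. Guarding the log line against a null reference, or performing the stop before logging, would avoid that outcome.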
                
[jira] [Comment Edited] (AMQ-4122) Lease Database Locker failover broken

Posted by "Justin Field (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AMQ-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13493329#comment-13493329 ] 

Justin Field edited comment on AMQ-4122 at 11/8/12 5:41 PM:
------------------------------------------------------------

I was never able to resolve the issue, so I made the following workaround:

https://gist.github.com/4040309
                
      was (Author: fieldju):
    I was never able to resolve the issue so i mad the following work around

https://gist.github.com/4040309
                  
[jira] [Commented] (AMQ-4122) Lease Database Locker failover broken

Posted by "Kyle Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AMQ-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13493533#comment-13493533 ] 

Kyle Miller commented on AMQ-4122:
----------------------------------

I looked at things a bit closer and realized that there is a bug that is contributing to this behavior.

org.apache.activemq.broker.LockableServiceSupport has a private BrokerService variable.  It implements BrokerServiceAware and has a method setBrokerService(...).

org.apache.activemq.store.jdbc.JDBCPersistenceAdapter (which extends LockableServiceSupport) ALSO has a private BrokerService variable.  It ALSO implements BrokerServiceAware and has a method setBrokerService(...).  Because it extends LockableServiceSupport, its setter overrides the inherited one, and consequently the private BrokerService variable in LockableServiceSupport will NEVER be set.

I think that the JDBCPersistenceAdapter class should get rid of its private BrokerService variable (and setter).  This will solve the issue that I was seeing with 2 active master brokers.
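
The shadowing described above can be shown with a few stand-in classes. This is only a sketch: Broker, LockableSupport, ShadowingAdapter and FixedAdapter are hypothetical names chosen for illustration, not the real ActiveMQ classes, and the message text is abbreviated. It shows that a subclass which overrides the setter and stores the value in its own private field leaves the parent's field null, so parent code that reads that field fails.

```java
// Hypothetical stand-ins for illustration; not the real ActiveMQ classes.
final class ShadowedSetterDemo {
    static class Broker {
        String getBrokerName() { return "broker-a"; }
    }

    interface BrokerAware {
        void setBroker(Broker broker);
    }

    // Plays the role of the base class that logs when the lock is lost.
    static class LockableSupport implements BrokerAware {
        private Broker broker; // read by describeOnLockLoss()

        @Override
        public void setBroker(Broker broker) { this.broker = broker; }

        String describeOnLockLoss() {
            // Throws NullPointerException if the field above was never set.
            return broker.getBrokerName() + ", no longer able to keep the exclusive lock";
        }
    }

    // Overrides the setter and keeps the value in its OWN private field,
    // without calling super: the parent's field stays null.
    static class ShadowingAdapter extends LockableSupport {
        private Broker broker;

        @Override
        public void setBroker(Broker broker) { this.broker = broker; }
    }

    // The proposed fix in spirit: no second field, no overriding setter.
    static class FixedAdapter extends LockableSupport { }

    // Returns true if reading the broker in the parent class fails after the setter was called.
    static boolean throwsNpe(LockableSupport support) {
        support.setBroker(new Broker());
        try {
            support.describeOnLockLoss();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("shadowing adapter fails: " + throwsNpe(new ShadowingAdapter()));
        System.out.println("fixed adapter fails: " + throwsNpe(new FixedAdapter()));
    }
}
```

An alternative to removing the subclass field would be for the overriding setter to also call super.setBroker(broker); either way the parent's field ends up populated.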
                
> Lease Database Locker failover broken
> -------------------------------------
>
>                 Key: AMQ-4122
>                 URL: https://issues.apache.org/jira/browse/AMQ-4122
>             Project: ActiveMQ
>          Issue Type: Bug
>    Affects Versions: 5.7.0
>         Environment: Java 7u9, SUSE 11, Mysql
>            Reporter: st.h
>            Assignee: Gary Tully
>         Attachments: activemq-kyle.xml, activemq.xml, activemq.xml
>
>
> We are using ActiveMQ 5.7.0 together with a mysql database and could not observe correct failover behavior with lease database locker.
> It seems that there is a race condition, which prevents the correct failover procedure.
> We noticed that when starting up two instances, both instance are becoming master.
> We did several test, including the following and could not observe intended functionality:
> - shutdown all instances
> - manipulate database lock that one node has lock and set expiry time in distance future
> - start up both instances. both instances are unable to acquire lock, as the lock hasn't expired, which should be correct behavior.
> - update the expiry time in database, so that the lock is expired.
> - first instance notices expired lock and becomes master
> - when second instance checks for lock, it also updates the database and becomes master.
> To my understanding the second instance should not be able to update the lock, as it is held by the first instance and should not be able to become master.

[jira] [Commented] (AMQ-4122) Lease Database Locker failover broken

Posted by "Justin Field (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AMQ-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13485247#comment-13485247 ] 

Justin Field commented on AMQ-4122:
-----------------------------------

Looks like I was using the deprecated setDatabaseLocker when I should have been using setLocker.

                
[jira] [Updated] (AMQ-4122) Lease Database Locker failover broken

Posted by "st.h (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AMQ-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

st.h updated AMQ-4122:
----------------------

    Environment: Java 7u9, SUSE 11, Mysql
    
[jira] [Commented] (AMQ-4122) Lease Database Locker failover broken

Posted by "Gary Tully (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AMQ-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482310#comment-13482310 ] 

Gary Tully commented on AMQ-4122:
---------------------------------

Are the broker names unique for master and slave? If not, then you need to provide unique names to the locker via setLeaseHolderId in the XML config.

If possible, could you try and make a variant of org.apache.activemq.store.jdbc.LeaseDatabaseLockerTest (from activemq-core) that demonstrates the problem you are seeing.
That test uses an embedded derby instance, but you can fire off sql to that instance to simulate whatever changes you want.
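
To see why unique holder names matter, here is a simplified in-memory model of a lease row. It is a sketch under assumptions: LeaseModel, tryAcquire and keepAlive are hypothetical names, and the rule that renewal is matched only on the holder name is an assumption about how such a locker behaves, not a copy of its SQL. Under that assumption, two brokers sharing one name can both renew the same lease and both believe they are master.

```java
// Simplified, hypothetical model of a single lease row; not ActiveMQ code.
final class LeaseModel {
    private String holder;   // null means nobody holds the lease
    private long expiresAt;  // epoch milliseconds

    // Take the lease only if it is free or expired.
    synchronized boolean tryAcquire(String holderId, long now, long durationMs) {
        if (holder == null || expiresAt < now) {
            holder = holderId;
            expiresAt = now + durationMs;
            return true;
        }
        return false;
    }

    // Renew the lease; ASSUMPTION: matched on the holder name only.
    synchronized boolean keepAlive(String holderId, long now, long durationMs) {
        if (holderId.equals(holder)) {
            expiresAt = now + durationMs;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        LeaseModel unique = new LeaseModel();
        System.out.println("broker-a acquires: " + unique.tryAcquire("broker-a", 0, 1000));
        System.out.println("broker-b acquires: " + unique.tryAcquire("broker-b", 1, 1000));
        System.out.println("broker-b renews:   " + unique.keepAlive("broker-b", 2, 1000));

        LeaseModel shared = new LeaseModel();
        System.out.println("first 'broker' acquires: " + shared.tryAcquire("broker", 0, 1000));
        // A second instance configured with the same name passes the renewal check.
        System.out.println("second 'broker' renews:  " + shared.keepAlive("broker", 1, 1000));
    }
}
```

With distinct names the second broker is refused on both calls; with a shared name the second instance's renewal succeeds, which is the situation the question above is trying to rule out.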
                
[jira] [Commented] (AMQ-4122) Lease Database Locker failover broken

Posted by "Justin Field (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AMQ-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13485200#comment-13485200 ] 

Justin Field commented on AMQ-4122:
-----------------------------------

I can also confirm that I am having the same issue.

I have two brokers running on two different machines, and when I kill the MySQL database
both brokers become the master.

I have also attached my ActiveMQ config.
                
[jira] [Comment Edited] (AMQ-4122) Lease Database Locker failover broken

Posted by "Christoph Seyffer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AMQ-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483260#comment-13483260 ] 

Christoph Seyffer edited comment on AMQ-4122 at 10/24/12 2:15 PM:
------------------------------------------------------------------

Hi Gary, I attached our current activemq.xml which was active on two nodes (master/slave). As you can see the broker attribute useLocalHostBrokerName is set to true. The broker registers with its unique hostname.
                
      was (Author: seyffchr):
    Hi Gary, this is our current config which was active on two nodes (master/slave). As you can see the broker attribute useLocalHostBrokerName is set to true. The broker registers with its unique hostname.
                  
[jira] [Updated] (AMQ-4122) Lease Database Locker failover broken

Posted by "Justin Field (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AMQ-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Justin Field updated AMQ-4122:
------------------------------

    Attachment:     (was: activemq.xml)
    
[jira] [Updated] (AMQ-4122) Lease Database Locker failover broken

Posted by "Justin Field (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AMQ-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Justin Field updated AMQ-4122:
------------------------------

    Attachment: activemq.xml
    