Posted to users@activemq.apache.org by seleshmaster <se...@gmail.com> on 2009/03/28 12:29:34 UTC

Activemq master/slave : slave broker causing concurrency lock problem

Hello,

We are using ActiveMQ 4.1.1 with failover. The two brokers run on separate
machines as master/slave and share the same data source, an Oracle 10g
database, for message persistence. Everything was working as expected until
I got a call from the DBA saying that the slave broker was causing a
concurrency lock problem. Because of it, the wait time on the Oracle
database was increasing constantly, which was not acceptable since,
according to the DBA, it could cause a database crash. To figure out why
this was happening, I downloaded the source code for ActiveMQ 4.1.1. The
"concurrency lock problem" comes from the start() method in the
DefaultDatabaseLocker.java class:

public void start() throws Exception {
        stopping = false;
        connection = dataSource.getConnection();
        connection.setAutoCommit(false);

        PreparedStatement statement = connection.prepareStatement(statements.getLockCreateStatement());
        while (true) {
            try {
                log.info("Attempting to acquire the exclusive lock to become the Master broker");
                boolean answer = statement.execute();
                if (answer) {
                    break;
                }
            }
            catch (Exception e) {
                if (stopping) {
                    throw new Exception("Cannot start broker as being asked to shut down. Interupted attempt to acquire lock: " + e, e);
                }
                log.error("Failed to acquire lock: " + e, e);
            }
            log.debug("Sleeping for " + sleepTime + " milli(s) before trying again to get the lock...");
            Thread.sleep(sleepTime);
        }

        log.info("Becoming the master on dataSource: " + dataSource);
    }


What happens is that when the SQL statement is executed at boolean answer =
statement.execute(), the call never returns. It simply blocks until the
master broker releases the lock. The SQL statement that gets executed is
defined in the getLockCreateStatement() method of the Statements.java class,
as follows:

public String getLockCreateStatement() {
        if (lockCreateStatement == null) {
            lockCreateStatement = "SELECT * FROM " + getFullLockTableName();
            if (useLockCreateWhereClause) {
                lockCreateStatement += " WHERE ID = 1";
            }
            lockCreateStatement += " FOR UPDATE";
        }
        return lockCreateStatement;
    } 

To fix this problem, I modified the SQL statement by adding "NOWAIT" at the
end. That seems to fix it: the slave broker no longer hangs. Instead, it
checks the lock every second, as specified by sleepTime.
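
For anyone who wants to see the change concretely, here is a minimal
standalone sketch of the statement builder with the "NOWAIT" suffix. It is
not the actual ActiveMQ Statements class: the class name
LockStatementSketch, the method buildLockStatement, the nowait flag and the
table name ACTIVEMQ_LOCK are all assumptions made for illustration.

```java
// Hypothetical standalone sketch; NOT the real ActiveMQ Statements class.
class LockStatementSketch {

    /**
     * Builds the lock query. With nowait = true, Oracle reports an error
     * immediately when the row is already locked instead of blocking.
     */
    static String buildLockStatement(String lockTable, boolean useWhereClause, boolean nowait) {
        StringBuilder sql = new StringBuilder("SELECT * FROM ").append(lockTable);
        if (useWhereClause) {
            sql.append(" WHERE ID = 1");
        }
        sql.append(" FOR UPDATE");
        if (nowait) {
            // The only change relative to the original statement.
            sql.append(" NOWAIT");
        }
        return sql.toString();
    }

    public static void main(String[] args) {
        // Table name is an assumed example value.
        System.out.println(buildLockStatement("ACTIVEMQ_LOCK", true, true));
    }
}
```

Keeping NOWAIT behind a flag would leave the original blocking behaviour
available for databases that do not support the clause.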


However, adding "NOWAIT" at the end of the SQL causes Oracle to throw an
error on each attempt, because the master broker holds the exclusive lock.
The catch block in the start() method catches every exception and checks
whether the broker is stopping. If it is stopping, it throws an exception;
otherwise, it logs the exception and carries on. Since we did not want the
error printed by log.error("Failed to acquire lock: " + e, e); to show up on
the screen every second, I changed it to log.debug("Failed to acquire lock:
" + e, e);
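
A possible refinement, rather than demoting every failure to debug, is to
treat only the expected "lock is busy" error quietly and keep real failures
at error level. The sketch below is self-contained and does not touch a
database. The LockAttempt interface and the class and method names are
hypothetical, and it assumes the driver reports ORA-00054 (resource busy
with NOWAIT specified) as vendor error code 54 via
SQLException.getErrorCode(). It uses a lambda, so it needs a newer Java than
the one ActiveMQ 4.1.1 targets.

```java
import java.sql.SQLException;

// Hypothetical sketch of a polling loop; NOT the real DefaultDatabaseLocker.
class LockRetrySketch {

    /** Assumed vendor code for ORA-00054 (resource busy, NOWAIT specified). */
    static final int ORA_RESOURCE_BUSY = 54;

    /** Hypothetical stand-in for one "SELECT ... FOR UPDATE NOWAIT" attempt. */
    interface LockAttempt {
        boolean tryLock() throws SQLException;
    }

    /** True when the failure only means that another broker holds the lock. */
    static boolean isLockBusy(SQLException e) {
        return e.getErrorCode() == ORA_RESOURCE_BUSY;
    }

    /** Polls until the lock is acquired and returns the number of attempts made. */
    static int acquire(LockAttempt attempt, long sleepMillis) throws InterruptedException {
        int attempts = 0;
        while (true) {
            attempts++;
            try {
                if (attempt.tryLock()) {
                    return attempts;
                }
            } catch (SQLException e) {
                if (isLockBusy(e)) {
                    // Expected while the master is alive: keep it quiet.
                    System.out.println("DEBUG lock is held by the master, will retry");
                } else {
                    // Anything else is a real problem and stays visible.
                    System.err.println("ERROR Failed to acquire lock: " + e);
                }
            }
            Thread.sleep(sleepMillis);
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated attempt: busy twice, then the lock becomes available.
        int n = acquire(() -> {
            if (++calls[0] < 3) {
                throw new SQLException("ORA-00054: resource busy", null, ORA_RESOURCE_BUSY);
            }
            return true;
        }, 10);
        System.out.println("acquired after " + n + " attempts");
    }
}
```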





I noticed a couple of things after adding "NOWAIT". First, the "Attempting
to acquire the exclusive lock to become the Master broker" message is now
printed to the screen every second. Second, before the fix we were having
problems running the shutdown script under the activemq/bin directory: it
would just hang, and we were forced to use kill -9 pid, because even a plain
kill pid did not kill the process. After the change I made, the shutdown
script works as it is supposed to.




So I am not sure whether this was done intentionally. If it is a bug and
there is already a fix for it, could someone please post the patch id or a
pointer to it? (I am asking because I could not find any.) Otherwise, if
there is a good reason why it was implemented the way it is now, can someone
please explain it?


thx
Seleshmaster

