You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@activemq.apache.org by Gary Tully <ga...@gmail.com> on 2012/06/15 12:37:51 UTC

Re: change strategy for determining failure of primary in JBDC-backed setup

just to close the loop on this. I implemented a lease based database
locker that can be tolerant to temporary database outage, like a
failover migration in a cluster or maintenance restart. It may provide
an alternative solution. It is available in a current 5.7-SNAPSHOT

Some  more detail in this jira comment:
https://issues.apache.org/jira/browse/AMQ-3654?focusedCommentId=13294679&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13294679


On 28 May 2012 11:45, Alex Hooper <ah...@bmjgroup.com> wrote:
> Gary Tully uttered:
>
>> There should be a single statements element with attributes, so something
>> like:
>
>
> *ahem*... To much staring and not enough looking, or something. Thanks, that
> has done the trick and now all is much better (except for my feeling like a
> moron).
>
> Many thanks,
>
> Alex.
>
>
>
>>
>>        <persistenceAdapter>
>>            <jdbcPersistenceAdapter dataDirectory=".."
>> lockKeepAlivePeriod="..">
>>                <statements>
>>                    <statements
>>                            lockCreateStatement="SELECT * FROM
>> ACTIVEMQ_LOCK FOR UPDATE WAIT 300"
>>                            stringIdDataType=".." >
>>                    </statements>
>>                </statements>
>>            </jdbcPersistenceAdapter>
>>        </persistenceAdapter>
>>
>> On 28 May 2012 07:15, Alex Hooper <ah...@bmjgroup.com> wrote:
>>>
>>> Gary Tully uttered:
>>>
>>>> you can set a specific statement string via the setter on statements
>>>> element.
>>>>
>>>> eg: peek for lockCreateStatement in the schema
>>>>
>>>> and have a look at
>>>> http://fusesource.com/docs/broker/5.4/persistence/JDBC-Customize.html
>>>>
>>> Hm, yes -- that does look spot on. However, when I try
>>>
>>> <persistenceAdapter>
>>>
>>>     <jdbcPersistenceAdapter brokerName="prod-s01"
>>>     dataDirectory="${activemq.base}/data" dataSource="#oracle-ds"
>>>     useDatabaseLock="true" lockKeepAlivePeriod="10">
>>>       <statements>
>>>         <statements stringIdDataType="VARCHAR(128)" />
>>>         <statements lockCreateStatement="SELECT * FROM ACTIVEMQ_LOCK FOR
>>> UPDATE WAIT 300" />
>>>       </statements>
>>>     </jdbcPersistenceAdapter>
>>>   </persistenceAdapter>
>>>
>>>
>>> Then activemq won't start up and the log stops at:
>>>
>>> 2012-05-28 05:55:45,554 | DEBUG | Found XML schema
>>> [http://www.springframework.org/schema/beans/spring-beans-2.0.xsd] in
>>> classpath: org/springframework/beans/factory/xml/spring-beans-2.0.xsd |
>>> org.springframework.beans.factory.xml.PluggableSchemaResolver | main
>>> 2012-05-28 05:55:45,629 | TRACE | Trying to resolve XML entity with
>>> public
>>> id [null] and system id
>>> [http://activemq.apache.org/schema/core/activemq-core.xsd] |
>>> org.springframework.beans.factory.xml.PluggableSchemaResolver | main
>>> 2012-05-28 05:55:45,645 | DEBUG | Found XML schema
>>> [http://activemq.apache.org/schema/core/activemq-core.xsd] in classpath:
>>> activemq.xsd |
>>> org.springframework.beans.factory.xml.PluggableSchemaResolver
>>> | main
>>>
>>> Which leaves me at a bit of a loss. The config looks right according to
>>> the
>>> schema, and I can't find any examples.
>>>
>>> For info, if I remove the lockCreateStatement line, it starts up
>>> normally:
>>>
>>> 2012-05-25 15:16:33,498 | DEBUG | Could not create JDBC tables; The
>>> message
>>> tabl
>>> e already existed. Failure was: ALTER TABLE ACTIVEMQ_MSGS ADD PRIORITY
>>> NUMBER Me
>>> ssage: ORA-01430: column being added already exists in table
>>>  SQLState: 72000 Vendor code: 1430 |
>>> org.apache.activemq.store.jdbc.adapter.Defa
>>> ultJDBCAdapter | main
>>> 2012-05-25 15:16:33,498 | DEBUG | Executing SQL: CREATE INDEX
>>> ACTIVEMQ_MSGS_PIDX ON ACTIVEMQ_MSGS (PRIORITY) |
>>> org.apache.activemq.store.jdbc.adapter.DefaultJDBCAdapter | main
>>> 2012-05-25 15:16:33,528 | DEBUG | Could not create JDBC tables; The
>>> message
>>> table already existed. Failure was: CREATE INDEX ACTIVEMQ_MSGS_PIDX ON
>>> ACTIVEMQ_MSGS (PRIORITY) Message: ORA-00955: name is already used by an
>>> existing object
>>>  SQLState: 42000 Vendor code: 955 |
>>> org.apache.activemq.store.jdbc.adapter.DefaultJDBCAdapter | main
>>> 2012-05-25 15:16:33,528 | DEBUG | Executing SQL: ALTER TABLE
>>> ACTIVEMQ_ACKS
>>> ADD PRIORITY NUMBER DEFAULT 5 NOT NULL |
>>> org.apache.activemq.store.jdbc.adapter.DefaultJDBCAdapter | main
>>> 2012-05-25 15:16:33,546 | DEBUG | Could not create JDBC tables; The
>>> message
>>> table already existed. Failure was: ALTER TABLE ACTIVEMQ_ACKS ADD
>>> PRIORITY
>>> NUMBER DEFAULT 5 NOT NULL Message: ORA-01430: column being added already
>>> exists in table
>>>  SQLState: 72000 Vendor code: 1430 |
>>> org.apache.activemq.store.jdbc.adapter.DefaultJDBCAdapter | main
>>> 2012-05-25 15:16:33,546 | DEBUG | Executing SQL: ALTER TABLE
>>> ACTIVEMQ_ACKS
>>> DROP PRIMARY KEY |
>>> org.apache.activemq.store.jdbc.adapter.DefaultJDBCAdapter
>>> | main
>>> 2012-05-25 15:16:33,694 | DEBUG | Executing SQL: ALTER TABLE
>>> ACTIVEMQ_ACKS
>>> ADD PRIMARY KEY (CONTAINER, CLIENT_ID, SUB_NAME, PRIORITY) |
>>> org.apache.activemq.store.jdbc.adapter.DefaultJDBCAdapter | main
>>> 2012-05-25 15:16:33,782 | INFO  | Database lock driver override not found
>>> for : [oracle_jdbc_driver].  Will use default implementation. |
>>> org.apache.activemq.store.jdbc.JDBCPersistenceAdapter | main
>>> 2012-05-25 15:16:33,784 | DEBUG | Using default JDBC Locker:
>>> org.apache.activemq.store.jdbc.DefaultDatabaseLocker@2d35da43 |
>>> org.apache.activemq.store.jdbc.JDBCPersistenceAdapter | main
>>> 2012-05-25 15:16:33,784 | INFO  | Attempting to acquire the exclusive
>>> lock
>>> to become the Master broker |
>>> org.apache.activemq.store.jdbc.DefaultDatabaseLocker | main
>>> 2012-05-25 15:16:33,784 | DEBUG | Locking Query is SELECT * FROM
>>> ACTIVEMQ_LOCK FOR UPDATE |
>>> org.apache.activemq.store.jdbc.DefaultDatabaseLocker | main
>>>
>>> Cheers,
>>>
>>>
>>> Alex.
>>>
>>>
>>>
>>>
>>>> On 25 May 2012 15:17, Alex Hooper <ah...@bmjgroup.com> wrote:
>>>>>
>>>>> Gary Tully uttered:
>>>>>
>>>>>> the keepAlive kicks in after the start() has successfully obtained the
>>>>>> lock, so a slave should just block, but a master should check the lock
>>>>>> status every period. By default it does an update using the connection
>>>>>> that has a pending transaction.
>>>>>
>>>>>
>>>>> Ah, right, the keepAlive is for extant locks, not for keeping-alive
>>>>> connections that are waiting for a lock. Which is, in hindsight,
>>>>> exactly
>>>>> what the name suggests.
>>>>>
>>>>>
>>>>>> It may be that that update has no need to hit the server till a
>>>>>> commit... not sure. May depend on the driver. But it should be
>>>>>> sufficient to validate the jdbc connection.
>>>>>
>>>>>
>>>>> I cannot see a way to validate the jdbc connection while the slave is
>>>>> blocked waiting for a response to its SELECT FOR UPDATE as the
>>>>> validation
>>>>> can only be done on idle connections and the connection is not idle, it
>>>>> is
>>>>> actively waiting for a response to its query. Even removeAbandoned
>>>>> won't
>>>>> touch it, as . . . it hasn't been abandoned.
>>>>>
>>>>>
>>>>>> Have peek at the source:
>>>>>>
>>>>>>
>>>>>>
>>>>>> http://svn.apache.org/viewvc/activemq/trunk/activemq-core/src/main/java/org/apache/activemq/store/jdbc/DefaultDatabaseLocker.java?view=markup
>>>>>>
>>>>> Yes, sorry -- I should have done this earlier. But it's so long since
>>>>> I've
>>>>> coded Java that I assumed finding the right bit would take me a
>>>>> geological
>>>>> age.
>>>>>
>>>>> Looking at that nice while loop, it occurs to me that another approach
>>>>> that
>>>>> would work, would be to alter the SQL used to grab the lock to so that
>>>>> it
>>>>> won't wait indefinitely, eg:
>>>>>
>>>>>  SELECT * FROM ACTIVEMQ_LOCK FOR UPDATE WAIT 300
>>>>>
>>>>> From looking at
>>>>> http://activemq.apache.org/schema/core/activemq-core-5.5.0.xsd, I
>>>>> cannot
>>>>> see
>>>>> a way to supply this SQL in config. Do you happen to know whether this
>>>>> is
>>>>> possible?
>>>>>
>>>>> Meanwhile, I shall get onto our hosting company about the half-open
>>>>> connection.
>>>>>
>>>>> Thanks again fro your help,
>>>>>
>>>>> Alex.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> On 25 May 2012 13:36, Alex Hooper <ah...@bmjgroup.com> wrote:
>>>>>>>
>>>>>>> Gary Tully uttered:
>>>>>>> [snip]
>>>>>>>
>>>>>>>> In your setup, it is odd that the dropped connection does not cause
>>>>>>>> the lock keepAlive to fail and the broker to terminate. It should,
>>>>>>>> unless there are tcp level options that need to kick in to see the
>>>>>>>> half close. Or some connection pool config that can pick up on the
>>>>>>>> failure, there are some validate options on commons jdbc pool that
>>>>>>>> could help there.
>>>>>>>>
>>>>>>> [snip]
>>>>>>>
>>>>>>>> Hopefully the above will help, but start with determining why in
>>>>>>>> your
>>>>>>>> current setup, the lock keepalive is not triggering for you when the
>>>>>>>> connection is dropped because that is a little odd. unless you have
>>>>>>>> the
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> org.apache.activemq.store.jdbc.JDBCPersistenceAdapter#setLockKeepAlivePeriod
>>>>>>>> = 0.
>>>>>>>>
>>>>>>> How exactly does the lock keepalive mechanism work? I'm explicitly
>>>>>>> set
>>>>>>> it
>>>>>>> in
>>>>>>> the xml config now:
>>>>>>>
>>>>>>> <jdbcPersistenceAdapter brokerName="prod-s01"
>>>>>>>   dataDirectory="${activemq.base}/data" dataSource="#oracle-ds"
>>>>>>>   useDatabaseLock="true" lockKeepAlivePeriod="10">
>>>>>>>  <statements>
>>>>>>>   <statements stringIdDataType="VARCHAR(128)" />
>>>>>>>  </statements>
>>>>>>>  </jdbcPersistenceAdapter>
>>>>>>>
>>>>>>> But once the instance has started and issued its initial
>>>>>>> lock-requesting
>>>>>>> query, there is no further TCP activity at all. Maybe I've
>>>>>>> misunderstood
>>>>>>> the
>>>>>>> intent of this function; that's far from unlikely.
>>>>>>>
>>>>>>> Alex.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> On 24 May 2012 11:45, Alex Hooper <ah...@bmjgroup.com> wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> We are running activemq 5.5.1 in an active/passive failover
>>>>>>>>> configuration
>>>>>>>>> with JDBC Persistence to an Oracle backend. The default strategy
>>>>>>>>> for
>>>>>>>>> determining whether the current master has failed is for the
>>>>>>>>> secondary
>>>>>>>>> server to attempt to get a lock on the database table, waiting
>>>>>>>>> indefinitely
>>>>>>>>> for the lock to be granted.
>>>>>>>>>
>>>>>>>>> This is not working (at least in our context) as, after a
>>>>>>>>> relatively
>>>>>>>>> short
>>>>>>>>> time in operation (a handful of hours at most) the connection to
>>>>>>>>> Oracle
>>>>>>>>> is
>>>>>>>>> dropped. Activemq doesn't notice this, so the secondary sits there
>>>>>>>>> happily
>>>>>>>>> waiting for a lock it can now never get and, in the event of a
>>>>>>>>> failure,
>>>>>>>>> won't serve any clients as it is not a master.
>>>>>>>>>
>>>>>>>>> Is there some way to change the decision mechanism to, eg, a
>>>>>>>>> polling
>>>>>>>>> strategy? Or can anyone suggest another resolution to this problem?
>>>>>>>>>
>>>>>>>>> Alex.
>>>>>>>>> --
>>>>>>>>> Alex Hooper
>>>>>>>>> Operations Team Leader, BMJ Group, BMA House, London WC1H 9JR
>>>>>>>>> Tel: +44 (0) 20 7383 6049
>>>>>>>>> http://group.bmj.com/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> _______________________________________________________________________
>>>>>>>>> The BMJ Group is one of the world's most trusted providers of
>>>>>>>>> medical
>>>>>>>>> information for doctors, researchers, health care workers and
>>>>>>>>> patients
>>>>>>>>> group.bmj.com.  This email and any attachments are confidential.
>>>>>>>>>  If
>>>>>>>>> you
>>>>>>>>> have received this email in error, please delete it and kindly
>>>>>>>>> notify
>>>>>>>>> us.
>>>>>>>>>  If the email contains personal views then the BMJ Group accepts no
>>>>>>>>> responsibility for these statements.  The recipient should check
>>>>>>>>> this
>>>>>>>>> email
>>>>>>>>> and attachments for viruses because the BMJ Group accepts no
>>>>>>>>> liability
>>>>>>>>> for
>>>>>>>>> any damage caused by viruses.  Emails sent or received by the BMJ
>>>>>>>>> Group
>>>>>>>>> may
>>>>>>>>> be monitored for size, traffic, distribution and content.  BMJ
>>>>>>>>> Publishing
>>>>>>>>> Group Limited trading as BMJ Group.  A private limited company,
>>>>>>>>> registered
>>>>>>>>> in England and Wales under registration number 03102371.
>>>>>>>>>  Registered
>>>>>>>>> office:
>>>>>>>>> BMA House, Tavistock Square, London WC1H 9JR, UK.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> _______________________________________________________________________
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>> --
>>>>>>> Alex Hooper
>>>>>>> Operations Team Leader, BMJ Group, BMA House, London WC1H 9JR
>>>>>>> Tel: +44 (0) 20 7383 6049
>>>>>>> http://group.bmj.com/
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________________________________
>>>>>>> The BMJ Group is one of the world's most trusted providers of medical
>>>>>>> information for doctors, researchers, health care workers and
>>>>>>> patients
>>>>>>> group.bmj.com.  This email and any attachments are confidential.  If
>>>>>>> you
>>>>>>> have received this email in error, please delete it and kindly notify
>>>>>>> us.
>>>>>>>  If the email contains personal views then the BMJ Group accepts no
>>>>>>> responsibility for these statements.  The recipient should check this
>>>>>>> email
>>>>>>> and attachments for viruses because the BMJ Group accepts no
>>>>>>> liability
>>>>>>> for
>>>>>>> any damage caused by viruses.  Emails sent or received by the BMJ
>>>>>>> Group
>>>>>>> may
>>>>>>> be monitored for size, traffic, distribution and content.  BMJ
>>>>>>> Publishing
>>>>>>> Group Limited trading as BMJ Group.  A private limited company,
>>>>>>> registered
>>>>>>> in England and Wales under registration number 03102371.  Registered
>>>>>>> office:
>>>>>>> BMA House, Tavistock Square, London WC1H 9JR, UK.
>>>>>>>
>>>>>>> _______________________________________________________________________
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>> --
>>>>> Alex Hooper
>>>>> Operations Team Leader, BMJ Group, BMA House, London WC1H 9JR
>>>>> Tel: +44 (0) 20 7383 6049
>>>>> http://group.bmj.com/
>>>>>
>>>>> _______________________________________________________________________
>>>>> The BMJ Group is one of the world's most trusted providers of medical
>>>>> information for doctors, researchers, health care workers and patients
>>>>> group.bmj.com.  This email and any attachments are confidential.  If
>>>>> you
>>>>> have received this email in error, please delete it and kindly notify
>>>>> us.
>>>>>  If the email contains personal views then the BMJ Group accepts no
>>>>> responsibility for these statements.  The recipient should check this
>>>>> email
>>>>> and attachments for viruses because the BMJ Group accepts no liability
>>>>> for
>>>>> any damage caused by viruses.  Emails sent or received by the BMJ Group
>>>>> may
>>>>> be monitored for size, traffic, distribution and content.  BMJ
>>>>> Publishing
>>>>> Group Limited trading as BMJ Group.  A private limited company,
>>>>> registered
>>>>> in England and Wales under registration number 03102371.  Registered
>>>>> office:
>>>>> BMA House, Tavistock Square, London WC1H 9JR, UK.
>>>>> _______________________________________________________________________
>>>>
>>>>
>>>>
>>>>
>>>
>>> --
>>> Alex Hooper
>>> Operations Team Leader, BMJ Group, BMA House, London WC1H 9JR
>>> Tel: +44 (0) 20 7383 6049
>>> http://group.bmj.com/
>>>
>>> _______________________________________________________________________
>>> The BMJ Group is one of the world's most trusted providers of medical
>>> information for doctors, researchers, health care workers and patients
>>> group.bmj.com.  This email and any attachments are confidential.  If you
>>> have received this email in error, please delete it and kindly notify us.
>>>  If the email contains personal views then the BMJ Group accepts no
>>> responsibility for these statements.  The recipient should check this
>>> email
>>> and attachments for viruses because the BMJ Group accepts no liability
>>> for
>>> any damage caused by viruses.  Emails sent or received by the BMJ Group
>>> may
>>> be monitored for size, traffic, distribution and content.  BMJ Publishing
>>> Group Limited trading as BMJ Group.  A private limited company,
>>> registered
>>> in England and Wales under registration number 03102371.  Registered
>>> office:
>>> BMA House, Tavistock Square, London WC1H 9JR, UK.
>>> _______________________________________________________________________
>>
>>
>>
>>
>
>
> --
> Alex Hooper
> Operations Team Leader, BMJ Group, BMA House, London WC1H 9JR
> Tel: +44 (0) 20 7383 6049
> http://group.bmj.com/
>
> _______________________________________________________________________
> The BMJ Group is one of the world's most trusted providers of medical
> information for doctors, researchers, health care workers and patients
> group.bmj.com.  This email and any attachments are confidential.  If you
> have received this email in error, please delete it and kindly notify us.
>  If the email contains personal views then the BMJ Group accepts no
> responsibility for these statements.  The recipient should check this email
> and attachments for viruses because the BMJ Group accepts no liability for
> any damage caused by viruses.  Emails sent or received by the BMJ Group may
> be monitored for size, traffic, distribution and content.  BMJ Publishing
> Group Limited trading as BMJ Group.  A private limited company, registered
> in England and Wales under registration number 03102371.  Registered office:
> BMA House, Tavistock Square, London WC1H 9JR, UK.
> _______________________________________________________________________



-- 
http://fusesource.com
http://blog.garytully.com