You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@activemq.apache.org by "Roelof Naude (JIRA)" <ji...@apache.org> on 2010/10/06 14:03:40 UTC

[jira] Created: (AMQ-2962) Hanging at startup

Hanging at startup
------------------

                 Key: AMQ-2962
                 URL: https://issues.apache.org/activemq/browse/AMQ-2962
             Project: ActiveMQ
          Issue Type: Bug
          Components: Message Store
    Affects Versions: 5.4.1
         Environment: Fedora Core 12
            Reporter: Roelof Naude
            Priority: Minor


Noticed today the activemq was hanging during startup. Investigation revealed that it is tied to expiration of messages at startup. The only way to recover from this was to kill -9 the VM and delete the kahadb message store.

Further testing was done whereby message expiry time was set to longer period. AMQ started up just fine. Another test was performed whereby:
1. subscribe for messages
2. send a few messages
3. disconnect consumer
4. send more messages (1 or 2 will do)
5. shutdown AMQ
6. wait for a time period exceeded the expiration time of the messages
7. start AMQ. at this time it should be hanging

Logs and thread dumps attached.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (AMQ-2962) Hanging at startup

Posted by "Roelof Naude (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/AMQ-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=62715#action_62715 ] 

Roelof Naude commented on AMQ-2962:
-----------------------------------

any feedback on this? we've rolled-out on production and hit this issue during a maintenance restart.

> Hanging at startup
> ------------------
>
>                 Key: AMQ-2962
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2962
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.4.1
>         Environment: Fedora Core 12
>            Reporter: Roelof Naude
>            Priority: Minor
>         Attachments: amq.ra.threads.txt, debug_traces_hanging.log, debug_traces_startup.log, thread_dump_during_hang.txt
>
>
> Noticed today the activemq was hanging during startup. Investigation revealed that it is tied to expiration of messages at startup. The only way to recover from this was to kill -9 the VM and delete the kahadb message store.
> Further testing was done whereby message expiry time was set to longer period. AMQ started up just fine. Another test was performed whereby:
> 1. subscribe for messages
> 2. send a few messages
> 3. disconnect consumer
> 4. send more messages (1 or 2 will do)
> 5. shutdown AMQ
> 6. wait for a time period exceeded the expiration time of the messages
> 7. start AMQ. at this time it should be hanging
> Logs and thread dumps attached.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (AMQ-2962) Hanging at startup

Posted by "Roelof Naude (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/AMQ-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=62612#action_62612 ] 

Roelof Naude commented on AMQ-2962:
-----------------------------------

i've found something of interest. the write lock hangs are due to the read lock being taken. i also added LOG.debug statement before each lock and after each unlock in KahaDBStore.java as well as those places in MessageDatabase.java as indicated in the attached thread dumps.

2010-10-15 15:27:16,082 INFO  [org.apache.activemq.store.kahadb.MessageDatabase] KahaDB is version 2
2010-10-15 15:27:16,172 INFO  [org.apache.activemq.store.kahadb.MessageDatabase] Recovering from the journal ...
2010-10-15 15:27:16,180 DEBUG [org.apache.activemq.store.kahadb.MessageDatabase] process remove - lock[java.util.concurrent.locks.ReentrantReadWriteLock@30de3c87[Write locks = 2, Read locks = 0]]
2010-10-15 15:27:16,183 DEBUG [org.apache.activemq.store.kahadb.MessageDatabase] process remove - unlock[java.util.concurrent.locks.ReentrantReadWriteLock@30de3c87[Write locks = 2, Read locks = 0]]
2010-10-15 15:27:16,183 INFO  [org.apache.activemq.store.kahadb.MessageDatabase] Recovery replayed 3 operations from the journal in 0.081 seconds.
2010-10-15 15:27:16,198 INFO  [org.apache.activemq.broker.BrokerService] ActiveMQ 5.4.1 JMS Message Broker (localhost) is starting
2010-10-15 15:27:16,200 INFO  [org.apache.activemq.broker.BrokerService] For help or more information please see: http://activemq.apache.org/
2010-10-15 15:27:16,358 DEBUG [org.apache.activemq.store.kahadb.KahaDBStore] getDestinations - lock[java.util.concurrent.locks.ReentrantReadWriteLock@30de3c87[Write locks = 0, Read locks = 0]]
2010-10-15 15:27:16,360 DEBUG [org.apache.activemq.store.kahadb.KahaDBStore] getDestinations - unlock[java.util.concurrent.locks.ReentrantReadWriteLock@30de3c87[Write locks = 0, Read locks = 0]]
2010-10-15 15:27:16,362 DEBUG [org.apache.activemq.broker.region.AbstractRegion] localhost adding destination: queue://Consumer.superuser.VirtualTopic.ALARM
2010-10-15 15:27:16,394 DEBUG [org.apache.activemq.store.kahadb.KahaDBStore] getMessageCount - lock[java.util.concurrent.locks.ReentrantReadWriteLock@30de3c87[Write locks = 0, Read locks = 0]]
2010-10-15 15:27:16,395 DEBUG [org.apache.activemq.store.kahadb.KahaDBStore] getMessageCount - unlock[java.util.concurrent.locks.ReentrantReadWriteLock@30de3c87[Write locks = 0, Read locks = 0]]
2010-10-15 15:27:16,395 DEBUG [org.apache.activemq.store.kahadb.KahaDBStore] recover - lock[java.util.concurrent.locks.ReentrantReadWriteLock@30de3c87[Write locks = 0, Read locks = 0]]
2010-10-15 15:27:16,421 DEBUG [org.apache.activemq.broker.region.Queue] message expired: Message ID:segfault-43426-1287148162735-2:8:1:1:1 dropped=false acked=false locked=false
2010-10-15 15:27:16,421 DEBUG [org.apache.activemq.broker.region.RegionBroker] Message expired Message ID:segfault-43426-1287148162735-2:8:1:1:1 dropped=false acked=false locked=false
2010-10-15 15:27:16,425 DEBUG [org.apache.activemq.broker.region.AbstractRegion] localhost adding destination: topic://ActiveMQ.Advisory.MessageDLQd.Topic.VirtualTopic.ALARM
2010-10-15 15:27:16,448 DEBUG [org.apache.activemq.broker.region.AbstractRegion] localhost adding destination: topic://ActiveMQ.Advisory.Expired.Topic.VirtualTopic.ALARM
2010-10-15 15:27:16,464 DEBUG [org.apache.activemq.store.kahadb.MessageDatabase] process remove - lock[java.util.concurrent.locks.ReentrantReadWriteLock@30de3c87[Write locks = 0, Read locks = 1]]

take note of:
1. recover - lock (no unlock is printed)
2. MessageDatabase] process remove

(1) takes a read lock in KahaDBStore::recover. when (2) MessageDatabase::process (final KahaRemoveMessageCommand) comes along it blocks. ditto for MessageDatabase::checkpointCleanup

i guess the next step would be to figure out why KahaDBStore::recover is not releasing the lock?

> Hanging at startup
> ------------------
>
>                 Key: AMQ-2962
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2962
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.4.1
>         Environment: Fedora Core 12
>            Reporter: Roelof Naude
>            Priority: Minor
>         Attachments: amq.ra.threads.txt, debug_traces_hanging.log, debug_traces_startup.log, thread_dump_during_hang.txt
>
>
> Noticed today the activemq was hanging during startup. Investigation revealed that it is tied to expiration of messages at startup. The only way to recover from this was to kill -9 the VM and delete the kahadb message store.
> Further testing was done whereby message expiry time was set to longer period. AMQ started up just fine. Another test was performed whereby:
> 1. subscribe for messages
> 2. send a few messages
> 3. disconnect consumer
> 4. send more messages (1 or 2 will do)
> 5. shutdown AMQ
> 6. wait for a time period exceeded the expiration time of the messages
> 7. start AMQ. at this time it should be hanging
> Logs and thread dumps attached.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (AMQ-2962) Hanging at startup

Posted by "Gary Tully (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/AMQ-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=62584#action_62584 ] 

Gary Tully commented on AMQ-2962:
---------------------------------

so it looks like it is hanging in the same way as before. can you generate another thread dump of the hung broker with the sun jdk?

The connection errors, it looks like the vm transport is being used and each connection is trying to create another broker. 
Try using the tcp transport in the brokerURL in the ra.xml or jboss ra config or use waitForStart=20000 on the vm url. check out the vm transport reference for more detail

> Hanging at startup
> ------------------
>
>                 Key: AMQ-2962
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2962
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.4.1
>         Environment: Fedora Core 12
>            Reporter: Roelof Naude
>            Priority: Minor
>         Attachments: debug_traces_hanging.log, debug_traces_startup.log, thread_dump_during_hang.txt
>
>
> Noticed today the activemq was hanging during startup. Investigation revealed that it is tied to expiration of messages at startup. The only way to recover from this was to kill -9 the VM and delete the kahadb message store.
> Further testing was done whereby message expiry time was set to longer period. AMQ started up just fine. Another test was performed whereby:
> 1. subscribe for messages
> 2. send a few messages
> 3. disconnect consumer
> 4. send more messages (1 or 2 will do)
> 5. shutdown AMQ
> 6. wait for a time period exceeded the expiration time of the messages
> 7. start AMQ. at this time it should be hanging
> Logs and thread dumps attached.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (AMQ-2962) Hanging at startup

Posted by "Roelof Naude (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/AMQ-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=62577#action_62577 ] 

Roelof Naude commented on AMQ-2962:
-----------------------------------

have tried the latest sun jdk:
java version "1.6.0_21"
Java(TM) SE Runtime Environment (build 1.6.0_21-b06)
Java HotSpot(TM) 64-Bit Server VM (build 17.0-b16, mixed mode)


the "hang" is not happening after changing to the AMQ resource adapter as mentioned the the previous comment. BUT: the broker does not start when expiring messages during startup. The following can be seen in the logs:
2010-10-14 15:23:55,531 INFO  [org.apache.activemq.store.kahadb.MessageDatabase] KahaDB is version 2
2010-10-14 15:23:55,573 INFO  [org.apache.activemq.store.kahadb.MessageDatabase] Recovering from the journal ...
2010-10-14 15:23:55,573 INFO  [org.apache.activemq.store.kahadb.MessageDatabase] Recovery replayed 1 operations from the journal in 0.019 seconds.
2010-10-14 15:23:55,592 INFO  [org.apache.activemq.broker.BrokerService] ActiveMQ 5.4.1 JMS Message Broker (localhost) is starting
2010-10-14 15:23:55,592 INFO  [org.apache.activemq.broker.BrokerService] For help or more information please see: http://activemq.apache.org/
2010-10-14 15:23:55,754 DEBUG [org.apache.activemq.broker.region.AbstractRegion] localhost adding destination: queue://Consumer.superuser.VirtualTopic.ALARM
2010-10-14 15:23:55,806 DEBUG [org.apache.activemq.broker.region.Queue] message expired: Message ID:segfault-48347-1287061987558-2:4:1:1:1 dropped=false acked=false locked=false
2010-10-14 15:23:55,806 DEBUG [org.apache.activemq.broker.region.RegionBroker] Message expired Message ID:segfault-48347-1287061987558-2:4:1:1:1 dropped=false acked=false locked=false
2010-10-14 15:23:55,809 DEBUG [org.apache.activemq.broker.region.AbstractRegion] localhost adding destination: topic://ActiveMQ.Advisory.MessageDLQd.Topic.VirtualTopic.ALARM
2010-10-14 15:23:55,835 DEBUG [org.apache.activemq.broker.region.AbstractRegion] localhost adding destination: topic://ActiveMQ.Advisory.Expired.Topic.VirtualTopic.ALARM

note the "message expired:"

thereafter the broker never started properly. every component trying to establish a connection to the broker results in:
2010-10-14 15:24:12,614 DEBUG [org.apache.activemq.ra.ActiveMQEndpointWorker] Failed to connect: Could not create Transport. Reason: javax.management.InstanceAlreadyExistsException: org.apache.activemq:BrokerName=localhost,Type=Broker already registered.
javax.jms.JMSException: Could not create Transport. Reason: javax.management.InstanceAlreadyExistsException: org.apache.activemq:BrokerName=localhost,Type=Broker already registered.
        at org.apache.activemq.util.JMSExceptionSupport.create(JMSExceptionSupport.java:35)
        at org.apache.activemq.ActiveMQConnectionFactory.createTransport(ActiveMQConnectionFactory.java:245)
        at org.apache.activemq.ActiveMQConnectionFactory.createActiveMQConnection(ActiveMQConnectionFactory.java:258)
        at org.apache.activemq.ActiveMQConnectionFactory.createConnection(ActiveMQConnectionFactory.java:185)
        at org.apache.activemq.ra.ActiveMQResourceAdapter.makeConnection(ActiveMQResourceAdapter.java:133)
        at org.apache.activemq.ra.ActiveMQEndpointWorker$1.run(ActiveMQEndpointWorker.java:106)
        at org.jboss.resource.work.WorkWrapper.execute(WorkWrapper.java:204)
        at org.jboss.util.threadpool.BasicTaskWrapper.run(BasicTaskWrapper.java:275)
        at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(PooledExecutor.java:756)
        at java.lang.Thread.run(Thread.java:619)

the broker is indeed already registered with the mbean server, but it is not really functioning.

anything else that i can try from my side?

> Hanging at startup
> ------------------
>
>                 Key: AMQ-2962
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2962
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.4.1
>         Environment: Fedora Core 12
>            Reporter: Roelof Naude
>            Priority: Minor
>         Attachments: debug_traces_hanging.log, debug_traces_startup.log, thread_dump_during_hang.txt
>
>
> Noticed today the activemq was hanging during startup. Investigation revealed that it is tied to expiration of messages at startup. The only way to recover from this was to kill -9 the VM and delete the kahadb message store.
> Further testing was done whereby message expiry time was set to longer period. AMQ started up just fine. Another test was performed whereby:
> 1. subscribe for messages
> 2. send a few messages
> 3. disconnect consumer
> 4. send more messages (1 or 2 will do)
> 5. shutdown AMQ
> 6. wait for a time period exceeded the expiration time of the messages
> 7. start AMQ. at this time it should be hanging
> Logs and thread dumps attached.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (AMQ-2962) Hanging at startup

Posted by "Roelof Naude (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/activemq/browse/AMQ-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roelof Naude updated AMQ-2962:
------------------------------

    Attachment: amq.ra.threads.txt

Broker hang during startup (using resource adapter)

> Hanging at startup
> ------------------
>
>                 Key: AMQ-2962
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2962
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.4.1
>         Environment: Fedora Core 12
>            Reporter: Roelof Naude
>            Priority: Minor
>         Attachments: amq.ra.threads.txt, debug_traces_hanging.log, debug_traces_startup.log, thread_dump_during_hang.txt
>
>
> Noticed today the activemq was hanging during startup. Investigation revealed that it is tied to expiration of messages at startup. The only way to recover from this was to kill -9 the VM and delete the kahadb message store.
> Further testing was done whereby message expiry time was set to longer period. AMQ started up just fine. Another test was performed whereby:
> 1. subscribe for messages
> 2. send a few messages
> 3. disconnect consumer
> 4. send more messages (1 or 2 will do)
> 5. shutdown AMQ
> 6. wait for a time period exceeded the expiration time of the messages
> 7. start AMQ. at this time it should be hanging
> Logs and thread dumps attached.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (AMQ-2962) Hanging at startup

Posted by "Roelof Naude (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/AMQ-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=62402#action_62402 ] 

Roelof Naude commented on AMQ-2962:
-----------------------------------

initially thought that the bug may be related to:
http://bugs.sun.com/view_bug.do?bug_id=6822370

adding -XX:+UseMembar to the VM startup arguments did not help. guess something else may be cause...

> Hanging at startup
> ------------------
>
>                 Key: AMQ-2962
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2962
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.4.1
>         Environment: Fedora Core 12
>            Reporter: Roelof Naude
>            Priority: Minor
>         Attachments: debug_traces_hanging.log, debug_traces_startup.log, thread_dump_during_hang.txt
>
>
> Noticed today the activemq was hanging during startup. Investigation revealed that it is tied to expiration of messages at startup. The only way to recover from this was to kill -9 the VM and delete the kahadb message store.
> Further testing was done whereby message expiry time was set to longer period. AMQ started up just fine. Another test was performed whereby:
> 1. subscribe for messages
> 2. send a few messages
> 3. disconnect consumer
> 4. send more messages (1 or 2 will do)
> 5. shutdown AMQ
> 6. wait for a time period exceeded the expiration time of the messages
> 7. start AMQ. at this time it should be hanging
> Logs and thread dumps attached.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (AMQ-2962) Hanging at startup

Posted by "Roelof Naude (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/AMQ-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=62607#action_62607 ] 

Roelof Naude commented on AMQ-2962:
-----------------------------------

have also added the -XX:+UseMembar vm option. this had no effect.

> Hanging at startup
> ------------------
>
>                 Key: AMQ-2962
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2962
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.4.1
>         Environment: Fedora Core 12
>            Reporter: Roelof Naude
>            Priority: Minor
>         Attachments: amq.ra.threads.txt, debug_traces_hanging.log, debug_traces_startup.log, thread_dump_during_hang.txt
>
>
> Noticed today the activemq was hanging during startup. Investigation revealed that it is tied to expiration of messages at startup. The only way to recover from this was to kill -9 the VM and delete the kahadb message store.
> Further testing was done whereby message expiry time was set to longer period. AMQ started up just fine. Another test was performed whereby:
> 1. subscribe for messages
> 2. send a few messages
> 3. disconnect consumer
> 4. send more messages (1 or 2 will do)
> 5. shutdown AMQ
> 6. wait for a time period exceeded the expiration time of the messages
> 7. start AMQ. at this time it should be hanging
> Logs and thread dumps attached.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (AMQ-2962) Hanging at startup

Posted by "Roelof Naude (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/AMQ-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=62526#action_62526 ] 

Roelof Naude commented on AMQ-2962:
-----------------------------------

Have started to make use of the activemq resource adapter. 

The same scenario still results in issues. Broker fails to startup. JBoss will try to restart broker. This results in "instance already registered" errors. Looks like the Broker mbean was registered and this fails on subsequent restarts of the Broker.

> Hanging at startup
> ------------------
>
>                 Key: AMQ-2962
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2962
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.4.1
>         Environment: Fedora Core 12
>            Reporter: Roelof Naude
>            Priority: Minor
>         Attachments: debug_traces_hanging.log, debug_traces_startup.log, thread_dump_during_hang.txt
>
>
> Noticed today the activemq was hanging during startup. Investigation revealed that it is tied to expiration of messages at startup. The only way to recover from this was to kill -9 the VM and delete the kahadb message store.
> Further testing was done whereby message expiry time was set to longer period. AMQ started up just fine. Another test was performed whereby:
> 1. subscribe for messages
> 2. send a few messages
> 3. disconnect consumer
> 4. send more messages (1 or 2 will do)
> 5. shutdown AMQ
> 6. wait for a time period exceeded the expiration time of the messages
> 7. start AMQ. at this time it should be hanging
> Logs and thread dumps attached.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (AMQ-2962) Hanging at startup

Posted by "Roelof Naude (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/activemq/browse/AMQ-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roelof Naude updated AMQ-2962:
------------------------------

       Priority: Major  (was: Minor)
    Environment: 
Fedora Core 12
OracleSun java 1.6:
java version "1.6.0_21"
Java(TM) SE Runtime Environment (build 1.6.0_21-b06)
Java HotSpot(TM) 64-Bit Server VM (build 17.0-b16, mixed mode)

JBoss 4.2.3 using AMQ resource adapter.

  was:Fedora Core 12

     Regression: [Regression]

> Hanging at startup
> ------------------
>
>                 Key: AMQ-2962
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2962
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.4.1
>         Environment: Fedora Core 12
> OracleSun java 1.6:
> java version "1.6.0_21"
> Java(TM) SE Runtime Environment (build 1.6.0_21-b06)
> Java HotSpot(TM) 64-Bit Server VM (build 17.0-b16, mixed mode)
> JBoss 4.2.3 using AMQ resource adapter.
>            Reporter: Roelof Naude
>         Attachments: amq.ra.threads.txt, debug_traces_hanging.log, debug_traces_startup.log, thread_dump_during_hang.txt
>
>
> Noticed today the activemq was hanging during startup. Investigation revealed that it is tied to expiration of messages at startup. The only way to recover from this was to kill -9 the VM and delete the kahadb message store.
> Further testing was done whereby message expiry time was set to longer period. AMQ started up just fine. Another test was performed whereby:
> 1. subscribe for messages
> 2. send a few messages
> 3. disconnect consumer
> 4. send more messages (1 or 2 will do)
> 5. shutdown AMQ
> 6. wait for a time period exceeded the expiration time of the messages
> 7. start AMQ. at this time it should be hanging
> Logs and thread dumps attached.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (AMQ-2962) Hanging at startup

Posted by "Roelof Naude (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/AMQ-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=62613#action_62613 ] 

Roelof Naude commented on AMQ-2962:
-----------------------------------

KahaDBStore::recover locks itself:

2010-10-15 16:02:33,580 DEBUG [org.apache.activemq.store.kahadb.KahaDBStore] recover - lock[java.util.concurrent.locks.ReentrantReadWriteLock@26e7c832[Write locks = 0, Read locks = 0]]
2010-10-15 16:02:33,620 DEBUG [org.apache.activemq.broker.region.Queue] message expired: Message ID:segfault-52137-1287150445562-2:6:1:1:1 dropped=false acked=false locked=false
2010-10-15 16:02:33,620 DEBUG [org.apache.activemq.broker.region.RegionBroker] Message expired Message ID:segfault-52137-1287150445562-2:6:1:1:1 dropped=false acked=false locked=false
2010-10-15 16:02:33,624 DEBUG [org.apache.activemq.broker.region.AbstractRegion] localhost adding destination: topic://ActiveMQ.Advisory.MessageDLQd.Topic.VirtualTopic.ALARM
2010-10-15 16:02:33,648 DEBUG [org.apache.activemq.broker.region.AbstractRegion] localhost adding destination: topic://ActiveMQ.Advisory.Expired.Topic.VirtualTopic.ALARM
2010-10-15 16:02:33,664 DEBUG [org.apache.activemq.store.kahadb.MessageDatabase] process remove - lock[java.util.concurrent.locks.ReentrantReadWriteLock@26e7c832[Write locks = 0, Read locks = 1]]
java.lang.Exception: STACKTRACE
        at org.apache.activemq.store.kahadb.MessageDatabase.process(MessageDatabase.java:902)
        at org.apache.activemq.store.kahadb.MessageDatabase$13.visit(MessageDatabase.java:847)
        at org.apache.activemq.store.kahadb.data.KahaRemoveMessageCommand.visit(KahaRemoveMessageCommand.java:220)
        at org.apache.activemq.store.kahadb.MessageDatabase.process(MessageDatabase.java:839)
        at org.apache.activemq.store.kahadb.MessageDatabase.store(MessageDatabase.java:763)
        at org.apache.activemq.store.kahadb.KahaDBStore$KahaDBMessageStore.removeMessage(KahaDBStore.java:378)
        at org.apache.activemq.store.kahadb.KahaDBStore$KahaDBMessageStore.removeAsyncMessage(KahaDBStore.java:350)
        at org.apache.activemq.store.kahadb.KahaDBTransactionStore.removeAsyncMessage(KahaDBTransactionStore.java:457)
        at org.apache.activemq.store.kahadb.KahaDBTransactionStore$1.removeAsyncMessage(KahaDBTransactionStore.java:171)
        at org.apache.activemq.broker.region.Queue.acknowledge(Queue.java:803)
        at org.apache.activemq.broker.region.Queue.removeMessage(Queue.java:1534)
        at org.apache.activemq.broker.region.Queue.removeMessage(Queue.java:1526)
        at org.apache.activemq.broker.region.Queue.messageExpired(Queue.java:1591)
        at org.apache.activemq.broker.region.Queue.messageExpired(Queue.java:1581)
        at org.apache.activemq.broker.region.Queue$5.recoverMessage(Queue.java:292)
        at org.apache.activemq.store.kahadb.KahaDBStore$KahaDBMessageStore$4.execute(KahaDBStore.java:479)
        at org.apache.kahadb.page.Transaction.execute(Transaction.java:728)
        at org.apache.activemq.store.kahadb.KahaDBStore$KahaDBMessageStore.recover(KahaDBStore.java:471)


> Hanging at startup
> ------------------
>
>                 Key: AMQ-2962
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2962
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.4.1
>         Environment: Fedora Core 12
>            Reporter: Roelof Naude
>            Priority: Minor
>         Attachments: amq.ra.threads.txt, debug_traces_hanging.log, debug_traces_startup.log, thread_dump_during_hang.txt
>
>
> Noticed today the activemq was hanging during startup. Investigation revealed that it is tied to expiration of messages at startup. The only way to recover from this was to kill -9 the VM and delete the kahadb message store.
> Further testing was done whereby message expiry time was set to longer period. AMQ started up just fine. Another test was performed whereby:
> 1. subscribe for messages
> 2. send a few messages
> 3. disconnect consumer
> 4. send more messages (1 or 2 will do)
> 5. shutdown AMQ
> 6. wait for a time period exceeded the expiration time of the messages
> 7. start AMQ. at this time it should be hanging
> Logs and thread dumps attached.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (AMQ-2962) Hanging at startup

Posted by "Roelof Naude (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/AMQ-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=62606#action_62606 ] 

Roelof Naude commented on AMQ-2962:
-----------------------------------

amq.ra.threads.txt contains the new thread stack dump. the root cause seems pretty much the same as before:

"Starting ActiveMQ Broker" daemon prio=10 tid=0x00007f43cce39000 nid=0x16ce waiting on condition [0x00007f43eceb8000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00007f441b27ae08> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:842)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1178)
        at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:807)
        at org.apache.activemq.store.kahadb.MessageDatabase.process(MessageDatabase.java:882)

"ActiveMQ Journal Checkpoint Worker" daemon prio=10 tid=0x00007f43f815c800 nid=0x16d2 waiting on condition [0x00007f43ec792000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00007f441b27ae08> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:842)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1178)
        at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:807)
        at org.apache.activemq.store.kahadb.MessageDatabase.checkpointCleanup(MessageDatabase.java:686)


> Hanging at startup
> ------------------
>
>                 Key: AMQ-2962
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2962
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.4.1
>         Environment: Fedora Core 12
>            Reporter: Roelof Naude
>            Priority: Minor
>         Attachments: amq.ra.threads.txt, debug_traces_hanging.log, debug_traces_startup.log, thread_dump_during_hang.txt
>
>
> Noticed today the activemq was hanging during startup. Investigation revealed that it is tied to expiration of messages at startup. The only way to recover from this was to kill -9 the VM and delete the kahadb message store.
> Further testing was done whereby message expiry time was set to longer period. AMQ started up just fine. Another test was performed whereby:
> 1. subscribe for messages
> 2. send a few messages
> 3. disconnect consumer
> 4. send more messages (1 or 2 will do)
> 5. shutdown AMQ
> 6. wait for a time period exceeded the expiration time of the messages
> 7. start AMQ. at this time it should be hanging
> Logs and thread dumps attached.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (AMQ-2962) Hanging at startup

Posted by "Gary Tully (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/AMQ-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=62529#action_62529 ] 

Gary Tully commented on AMQ-2962:
---------------------------------

The stack trace, (in the hung thread dump) is odd, seems to be stuck on a reentrant writelock lock call, when it already has the writeLock, so it is as if the reentrant bit is not working!
Can you try the sun jdk?

> Hanging at startup
> ------------------
>
>                 Key: AMQ-2962
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2962
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.4.1
>         Environment: Fedora Core 12
>            Reporter: Roelof Naude
>            Priority: Minor
>         Attachments: debug_traces_hanging.log, debug_traces_startup.log, thread_dump_during_hang.txt
>
>
> Noticed today the activemq was hanging during startup. Investigation revealed that it is tied to expiration of messages at startup. The only way to recover from this was to kill -9 the VM and delete the kahadb message store.
> Further testing was done whereby message expiry time was set to longer period. AMQ started up just fine. Another test was performed whereby:
> 1. subscribe for messages
> 2. send a few messages
> 3. disconnect consumer
> 4. send more messages (1 or 2 will do)
> 5. shutdown AMQ
> 6. wait for a time period exceeded the expiration time of the messages
> 7. start AMQ. at this time it should be hanging
> Logs and thread dumps attached.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (AMQ-2962) Hanging at startup

Posted by "Roelof Naude (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/AMQ-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=62388#action_62388 ] 

Roelof Naude commented on AMQ-2962:
-----------------------------------

Anything else that can be tried? Can i provide any other information?

> Hanging at startup
> ------------------
>
>                 Key: AMQ-2962
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2962
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.4.1
>         Environment: Fedora Core 12
>            Reporter: Roelof Naude
>            Priority: Minor
>         Attachments: debug_traces_hanging.log, debug_traces_startup.log, thread_dump_during_hang.txt
>
>
> Noticed today the activemq was hanging during startup. Investigation revealed that it is tied to expiration of messages at startup. The only way to recover from this was to kill -9 the VM and delete the kahadb message store.
> Further testing was done whereby message expiry time was set to longer period. AMQ started up just fine. Another test was performed whereby:
> 1. subscribe for messages
> 2. send a few messages
> 3. disconnect consumer
> 4. send more messages (1 or 2 will do)
> 5. shutdown AMQ
> 6. wait for a time period exceeded the expiration time of the messages
> 7. start AMQ. at this time it should be hanging
> Logs and thread dumps attached.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (AMQ-2962) Hanging at startup

Posted by "Roelof Naude (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/activemq/browse/AMQ-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roelof Naude updated AMQ-2962:
------------------------------

    Attachment: thread_dump_during_hang.txt
                debug_traces_startup.log
                debug_traces_hanging.log

debug_traces_hanging -> log file entries during hang
debug_traces_startup -> debug traces during startup with non-expired, persisted messages
thread_dump_during_hang -> thread stack dump during hang

> Hanging at startup
> ------------------
>
>                 Key: AMQ-2962
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2962
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.4.1
>         Environment: Fedora Core 12
>            Reporter: Roelof Naude
>            Priority: Minor
>         Attachments: debug_traces_hanging.log, debug_traces_startup.log, thread_dump_during_hang.txt
>
>
> Noticed today the activemq was hanging during startup. Investigation revealed that it is tied to expiration of messages at startup. The only way to recover from this was to kill -9 the VM and delete the kahadb message store.
> Further testing was done whereby message expiry time was set to longer period. AMQ started up just fine. Another test was performed whereby:
> 1. subscribe for messages
> 2. send a few messages
> 3. disconnect consumer
> 4. send more messages (1 or 2 will do)
> 5. shutdown AMQ
> 6. wait for a time period exceeded the expiration time of the messages
> 7. start AMQ. at this time it should be hanging
> Logs and thread dumps attached.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (AMQ-2962) Hanging at startup

Posted by "Roelof Naude (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/activemq/browse/AMQ-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roelof Naude updated AMQ-2962:
------------------------------

    Attachment: AMQ2962.patch

Patch that resolve the issue for us. This patch is ugly. Someone more knowledgeable on the order of locks may provide more insight.

> Hanging at startup
> ------------------
>
>                 Key: AMQ-2962
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2962
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.4.1
>         Environment: Fedora Core 12
> OracleSun java 1.6:
> java version "1.6.0_21"
> Java(TM) SE Runtime Environment (build 1.6.0_21-b06)
> Java HotSpot(TM) 64-Bit Server VM (build 17.0-b16, mixed mode)
> JBoss 4.2.3 using AMQ resource adapter.
>            Reporter: Roelof Naude
>         Attachments: amq.ra.threads.txt, AMQ2962.patch, debug_traces_hanging.log, debug_traces_startup.log, thread_dump_during_hang.txt
>
>
> Noticed today the activemq was hanging during startup. Investigation revealed that it is tied to expiration of messages at startup. The only way to recover from this was to kill -9 the VM and delete the kahadb message store.
> Further testing was done whereby message expiry time was set to longer period. AMQ started up just fine. Another test was performed whereby:
> 1. subscribe for messages
> 2. send a few messages
> 3. disconnect consumer
> 4. send more messages (1 or 2 will do)
> 5. shutdown AMQ
> 6. wait for a time period exceeded the expiration time of the messages
> 7. start AMQ. at this time it should be hanging
> Logs and thread dumps attached.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.