Posted to users@activemq.apache.org by ragul rangarajan <ra...@gmail.com> on 2022/03/24 13:55:10 UTC

ActiveMQ does not recover automatically from the error "Persistent store is Full" when the system becomes normal.

Hi Team,

In my setup, we have an ActiveMQ service that picks up files located on a
remote server via NFS and publishes notifications for them to a queue.

Due to an issue, the NFS server became unreachable from our server, which in
turn affected ActiveMQ. The NFS server recovered within a few seconds.


>  kernel: nfs: server 10.**.**.** not responding, still trying
>  kernel: nfs: server 10.**.**.** OK


During this NFS fluctuation, the queue filled up and consumed the
persistent store, which reached 100% usage. The issue was resolved only after
the application was restarted.


WARN  | Usage(default:store) percentUsage=100%, usage=1073764014,
> limit=1073741824, percentUsageMinDelta=1%: Persistent store is Full, 100%
> of 1073741824. Stopping producer (ID:SERVER-1641938964016-1:1:3:1) to
> prevent flooding queue://filenotify. See
> http://activemq.apache.org/producer-flow-control.html for more info
> (blocking for: 72667s) | org.apache.activemq.broker.region.Queue | ActiveMQ
> Transport: tcp:///10.**.**.**:59904@61616
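As a quick sanity check on the numbers in that warning (plain arithmetic, figures copied from the log line above):

```python
# Figures taken verbatim from the WARN line: usage, limit, and blocking time.
usage = 1_073_764_014        # bytes reported as used
limit = 1_073_741_824        # configured store limit: exactly 1 GiB
percent = usage / limit * 100
print(f"store usage: {percent:.3f}% of limit")  # just over 100%, so producers block

blocking_seconds = 72667     # "blocking for" value from the log
print(f"producer blocked for roughly {blocking_seconds / 3600:.1f} hours")
```

So by the time the warning was logged, the producer had already been blocked for about 20 hours, well past the few-seconds NFS outage.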


We would like to know whether ActiveMQ can recover from this automatically in
any way, or whether a restart is required to recover from this issue.
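For reference, the blocking behavior in the warning is driven by the broker's systemUsage limits in activemq.xml. A sketch for ActiveMQ 5.x follows; the 2 GB figure is only an illustration (the warning above shows a 1 GiB limit), and sendFailIfNoSpaceAfterTimeout makes a blocked send fail with an exception after the timeout instead of blocking indefinitely:

```xml
<!-- activemq.xml sketch (ActiveMQ 5.x): raise the store limit and/or
     fail blocked sends after 10 s rather than blocking forever -->
<systemUsage>
  <systemUsage sendFailIfNoSpaceAfterTimeout="10000">
    <storeUsage>
      <storeUsage limit="2 gb"/>
    </storeUsage>
  </systemUsage>
</systemUsage>
```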

Thanks and Regards,
Ragul R

Re: ActiveMQ does not recover automatically from the error "Persistent store is Full" when the system becomes normal.

Posted by Tim Bain <tb...@alumni.duke.edu>.
Can you give more details about the timeline and observed behavior? Did the
broker declare the store to be full while the NFS server was offline or
after it came back? If after, how long after? How much data was in the
persistent store before the NFS server dropped? What's the approximate rate
of persistent messages into the broker and the rate at which they're
typically consumed?

What do the broker's logs say during and immediately after the NFS outage?
I'm not sure what I'd expect a short NFS outage to look like from the
broker's perspective (would the O/S just hang on disk writes or would
exceptions be thrown, for example), so I'm not sure how the broker would
react, but understanding the logs might help us know how the O/S manifested
this to the broker.

Tim


Re: ActiveMQ does not recover automatically from the error "Persistent store is Full" when the system becomes normal.

Posted by ragul rangarajan <ra...@gmail.com>.
Hi all,

I found a debugging tool, GitHub - Hill30/amq-kahadb-tool
<https://github.com/Hill30/amq-kahadb-tool>, for viewing the KahaDB data logs,
and enabling TRACE-level logging for KahaDB gave enough insight into the issue
to sort it out.
Referred:
https://activemq.apache.org/why-do-kahadb-log-files-remain-after-cleanup
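For the record, TRACE logging for KahaDB's MessageDatabase can be enabled with a log4j.properties fragment along the lines of the one on that page; the file path below assumes the stock ${activemq.base} layout:

```properties
# Route KahaDB MessageDatabase TRACE output to its own rolling log file
log4j.appender.kahadb=org.apache.log4j.RollingFileAppender
log4j.appender.kahadb.file=${activemq.base}/data/kahadb.log
log4j.appender.kahadb.maxFileSize=1024KB
log4j.appender.kahadb.maxBackupIndex=5
log4j.appender.kahadb.append=true
log4j.appender.kahadb.layout=org.apache.log4j.PatternLayout
log4j.appender.kahadb.layout.ConversionPattern=%d [%-15.15t] %-5p %-30.30c{1} - %m%n
log4j.logger.org.apache.activemq.store.kahadb.MessageDatabase=TRACE, kahadb
```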

Thanks & Regards,
Ragul R


On Wed, Apr 27, 2022 at 9:25 PM ragul rangarajan <ra...@gmail.com>
wrote:


Re: ActiveMQ does not recover automatically from the error "Persistent store is Full" when the system becomes normal.

Posted by ragul rangarajan <ra...@gmail.com>.
Hi Tim,

Sorry for the delayed response.
I no longer have the logs from the NFS outage, and I have been unable to
reproduce the issue with another NFS outage.

However, I have seen the same scenario on other occasions. In both cases,
the persistent store (KahaDB) kept growing over time until a restart, with
only a few KahaDB-specific log entries during this period.

Is there any way to inspect the KahaDB log files to debug this issue, or any
other way to check what is in the persistent store?

2022-04-22 19:17:06,818 | INFO  | Apache ActiveMQ 5.16.1 (messagebus,
> ID:server-36914-1650526716088-0:1) is shutting down |
> org.apache.activemq.broker.BrokerService | Thread-43
> 2022-04-22 19:17:06,819 | INFO  | Connector openwire stopped |
> org.apache.activemq.broker.TransportConnector | Thread-43
> 2022-04-22 19:17:06,824 | INFO  | PListStore:[/var/opt/na/messagebus/tmp]
> stopped | org.apache.activemq.store.kahadb.plist.PListStoreImpl | Thread-43
> 2022-04-22 19:17:06,824 | INFO  | Stopping async queue tasks |
> org.apache.activemq.store.kahadb.KahaDBStore | Thread-43
> 2022-04-22 19:17:06,824 | INFO  | Stopping async topic tasks |
> org.apache.activemq.store.kahadb.KahaDBStore | Thread-43
> 2022-04-22 19:17:06,824 | INFO  | Stopped KahaDB |
> org.apache.activemq.store.kahadb.KahaDBStore | Thread-43
> 2022-04-22 19:17:06,836 | INFO  | Apache ActiveMQ 5.16.1 (messagebus,
> ID:server-36914-1650526716088-0:1) uptime 1 day 6 hours |
> org.apache.activemq.broker.BrokerService | Thread-43
> 2022-04-22 19:17:06,836 | INFO  | Apache ActiveMQ 5.16.1 (messagebus,
> ID:server-36914-1650526716088-0:1) is shutdown |
> org.apache.activemq.broker.BrokerService | Thread-43
> 2022-04-22 19:17:25,022 | INFO  | DaemonStatusService listening on
> http://127.0.0.1:14061/status |
> com.alcatel.access.commons.util.spring.status.DaemonStatusService | main
> 2022-04-22 19:17:25,263 | INFO  | Using Persistence Adapter:
> KahaDBPersistenceAdapter[/var/opt/na/messagebus/kahadb] |
> org.apache.activemq.broker.BrokerService | main
> 2022-04-22 19:17:25,370 | INFO  | KahaDB is version 7 |
> org.apache.activemq.store.kahadb.MessageDatabase | main
> 2022-04-22 19:17:25,651 | INFO  | PListStore:[/var/opt/na/messagebus/tmp]
> started | org.apache.activemq.store.kahadb.plist.PListStoreImpl | main
> 2022-04-22 19:17:25,787 | INFO  | Apache ActiveMQ 5.16.1 (messagebus,
> ID:server-42901-1650635245663-0:1) is starting |
> org.apache.activemq.broker.BrokerService | main
> 2022-04-22 19:17:25,967 | INFO  | Listening for connections at:
> tcp://localhost:61616 |
> org.apache.activemq.transport.TransportServerThreadSupport | main
> 2022-04-22 19:17:25,967 | INFO  | Connector openwire started |
> org.apache.activemq.broker.TransportConnector | main
> 2022-04-22 19:17:25,968 | INFO  | Apache ActiveMQ 5.16.1 (na-messagebus,
> ID:motiveche028.in.alcatel-lucent.com-42901-1650635245663-0:1) started |
> org.apache.activemq.broker.BrokerService | main
> 2022-04-22 19:17:25,968 | INFO  | For help or more information please see:
> http://activemq.apache.org | org.apache.activemq.broker.BrokerService |
> main
> 2022-04-22 19:17:27,331 | INFO  | Starting munin node server on
> 127.0.0.1:13001 | org.munin4j.core.Server | muninThread
> 2022-04-22 19:17:27,454 | INFO  | PrometheusHttpRequestHandler listening
> on http://127.0.0.1:15060/prometheus |
> access.commons.prometheus.client.webserver.PrometheusWebServer | main
> 2022-04-22 19:17:27,580 | INFO  | process id is : 8095 |
> com.alcatel.access.commons.util.spring.status.DaemonStatusService | main


I was able to capture the same growth in a Munin graph.
[image: image.png]

Regards,
Ragul
