Posted to users@activemq.apache.org by Robert Davies <ra...@gmail.com> on 2013/05/28 23:18:34 UTC
Re: ActiveMQ crashes frequently
Hi Mark,
Could you produce a test case for your problem? It would help us identify the problem a lot quicker.
thanks,
Rob
On 30 Apr 2013, at 16:40, fenbers <Ma...@noaa.gov> wrote:
> Zagan wrote
>> Can you please check if your .log files in the /data directory are cleaned
>> up? On basis of the information I suppose this behaviour is due to a
>> misconfiguration of your clients.
>> If this is the case often broken log file cleanup is a symptom.
>
> I get the same error as brought up in this thread (KahaDB failed to store to
> Journal). And yes, I also have a problem with the numbered .log files not
> all getting cleaned up (most files are removed appropriately). I have
> suspected a client configuration problem for a long time, but can't figure
> out what's wrong -- even with TRACE logging turned on. In the meantime, I
> have to cope with ActiveMQ crashing (i.e., shutting itself down) about every
> two days. The logs point to a disk storage problem, but I have plenty of
> space, so that's not the issue! I've tried a couple of different Linux
> boxes and both local and NFS mounts, and this issue occurs on both of them.
>
> I'm at a loss!! I'm running 5.8.0...
>
> Mark
>
>
>
> --
> View this message in context: http://activemq.2283324.n4.nabble.com/ActiveMQ-crashes-frequently-tp4305407p4666469.html
> Sent from the ActiveMQ - User mailing list archive at Nabble.com.
Re: ActiveMQ crashes frequently
Posted by Matt Pavlovich <ma...@gmail.com>.
> often taking 10 seconds to list just 5 files in an NFS-mounted directory.
That screams network issue.
+ Check your network interface stats for errors (ifconfig)
+ Check your duplex setting -- many switches do not really do "auto" well. Hard-coding to full-duplex helps, but if you have a managed switch, you need to make sure the switch ports are set to FD as well.
+ Swap network cables (I know, I know... but it's happened)
+ Are you using any centralized user id store for user accounts? A long 'ls' could also be a slow libnss query trying to resolve uid -> user name
+ Check syslog and 'dmesg' for kernel errors, as Johan mentioned
The error occurring with local disk is puzzling-- are you using any VM technology?
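The slow listing quoted above can be measured rather than eyeballed. The following is a minimal JDK-only sketch that times a directory listing, so the same check can be run against the NFS mount and against a local directory and the numbers compared. The class name `ListTimer` and its `Result` type are made up for illustration; nothing here comes from ActiveMQ.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical helper: times how long it takes to list one directory.
final class ListTimer {

    /** Outcome of one timed listing. */
    static final class Result {
        final long entries; // number of directory entries seen
        final long millis;  // wall-clock time spent listing them

        Result(long entries, long millis) {
            this.entries = entries;
            this.millis = millis;
        }
    }

    static Result timeListing(Path dir) {
        long start = System.nanoTime();
        long count;
        // Files.list returns a lazy stream that holds the directory open,
        // so it is closed via try-with-resources.
        try (Stream<Path> entries = Files.list(dir)) {
            count = entries.count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        return new Result(count, elapsedMs);
    }

    public static void main(String[] args) {
        // Directory to test; defaults to the current directory.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        Result r = timeListing(dir);
        System.out.println(dir + ": " + r.entries + " entries in " + r.millis + " ms");
    }
}
```

Running it once with the NFS-mounted directory as the argument and once with a local directory gives two figures to compare; a large gap between them would point toward the network or NFS layer rather than ActiveMQ.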
Re: ActiveMQ crashes frequently
Posted by Johan Edstrom <se...@gmail.com>.
It sounds like your switch fabric might be the issue?
Those types of hangs should show pretty frequent kernel alarms.
On Jun 2, 2013, at 21:10, Christian Posta <ch...@gmail.com> wrote:
> You should check out the failover transport to handle reconnecting.
Re: ActiveMQ crashes frequently
Posted by Christian Posta <ch...@gmail.com>.
You should check out the failover transport to handle reconnecting.
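For readers who have not used it: the failover transport is selected through the broker URL that the client connects with, and the client library then retries the connection on its own. Below is a minimal sketch of assembling such a URL. The helper class `FailoverUri` is hypothetical, and the host and port are placeholders. `initialReconnectDelay` and `maxReconnectAttempts` are option names from the failover transport reference as far as I know, with -1 documented as "retry forever"; check them against the reference for the ActiveMQ version in use.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that assembles an ActiveMQ failover transport URL.
final class FailoverUri {

    static String build(List<String> brokers, long initialReconnectDelayMs, int maxReconnectAttempts) {
        if (brokers.isEmpty()) {
            throw new IllegalArgumentException("at least one broker URI is required");
        }
        // failover:(uri1,uri2,...)?option=value&option=value
        return "failover:(" + String.join(",", brokers) + ")"
                + "?initialReconnectDelay=" + initialReconnectDelayMs
                + "&maxReconnectAttempts=" + maxReconnectAttempts;
    }

    public static void main(String[] args) {
        // Placeholder broker address; replace with the real one.
        String url = build(Arrays.asList("tcp://localhost:61616"), 100, -1);
        System.out.println(url);
        // With activemq-client on the classpath, the URL is what gets passed
        // to the connection factory, for example:
        //   new org.apache.activemq.ActiveMQConnectionFactory(url);
    }
}
```

The same URL is used by producers and consumers, so neither side should need to be relaunched after a broker restart if the transport behaves as described.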
On Sunday, June 2, 2013, fenbers wrote:
>
> I don't know how to determine the NFS version but we are running on
> RHEL 5.5.
>
> I have not checked the syslog. Thanks for the tip. I will do that
> after our morning Operations.
>
> We are also very inclined to believe this is an NFS issue, based on
> behaviors network-wide which have nothing to do with ActiveMQ, e.g.,
> often taking 10 seconds to list just 5 files in an NFS-mounted
> directory.
>
> So, we are creating an action plan this weekend to eliminate as many
> NFS mount points as possible, and seeing how that helps the
> situation. The plan needs approval/buy-in from key people to be
> implemented, so it may be a couple of weeks to implement the plan.
> In the meantime, ActiveMQ either shuts itself down or behaves in
> rather despondent ways, so we find we are having to restart ActiveMQ
> every 3 or 4 hours (and this frequency is slowly increasing).
>
> Once ActiveMQ is rebooted, we find that our producers and our
> consumers have to be shut down and relaunched in order to
> reestablish the connection with ActiveMQ. This is a royal pain!
> However, a producer will throw an exception whenever it tries to
> send a message through a lost connection, and so I catch the
> exception where I close the connection and reopen it. Thus, my
> producers are able to reconnect automatically in the event ActiveMQ
> is restarted.
>
> But with the consumers, no exception is thrown as it waits for
> notifications. It simply waits for a notification that never
> happens after the connection with ActiveMQ is lost. So what is your
> recommended method for a consumer to check for a disconnection??
> (Maybe I should post this question as a separate thread...)
>
> Mark
--
*Christian Posta*
http://www.christianposta.com/blog
twitter: @christianposta
Re: ActiveMQ crashes frequently
Posted by fenbers <Ma...@noaa.gov>.
I don't know how to determine the NFS version but we are running on
RHEL 5.5.
I have not checked the syslog. Thanks for the tip. I will do that
after our morning Operations.
We are also very inclined to believe this is an NFS issue, based on
behaviors network-wide which have nothing to do with ActiveMQ, e.g.,
often taking 10 seconds to list just 5 files in an NFS-mounted
directory.
So, we are creating an action plan this weekend to eliminate as many
NFS mount points as possible, and seeing how that helps the
situation. The plan needs approval/buy-in from key people to be
implemented, so it may be a couple of weeks to implement the plan.
In the meantime, ActiveMQ either shuts itself down or behaves in
rather despondent ways, so we find we are having to restart ActiveMQ
every 3 or 4 hours (and this frequency is slowly increasing).
Once ActiveMQ is rebooted, we find that our producers and our
consumers have to be shut down and relaunched in order to
reestablish the connection with ActiveMQ. This is a royal pain!
However, a producer will throw an exception whenever it tries to
send a message through a lost connection, and so I catch the
exception where I close the connection and reopen it. Thus, my
producers are able to reconnect automatically in the event ActiveMQ
is restarted.
But with the consumers, no exception is thrown as it waits for
notifications. It simply waits for a notification that never
happens after the connection with ActiveMQ is lost. So what is your
recommended method for a consumer to check for a disconnection??
(Maybe I should post this question as a separate thread...)
Mark
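On the consumer question: JMS defines `Connection.setExceptionListener(...)`, which lets a client be told asynchronously when the provider detects a connection problem. Whether that callback fires for the broker shutdowns seen in this thread is not confirmed anywhere here. Whatever triggers the reconnect, the producer-side pattern described above (catch, close, reopen) amounts to a retry loop. The following JDK-only sketch shows that loop with exponential backoff; `Reconnect.withBackoff` is a hypothetical helper, and the `Callable` stands in for "close the old JMS connection and open a new one".

```java
import java.util.concurrent.Callable;

// Hypothetical helper: retries a connect action with exponential backoff
// until it succeeds or the attempts are used up.
final class Reconnect {

    static <T> T withBackoff(Callable<T> connect, int maxAttempts, long initialDelayMs, long maxDelayMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMs;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return connect.call();
            } catch (Exception e) {
                // Remember the failure; it is rethrown if no attempt succeeds.
                last = new IllegalStateException("attempt " + attempt + " failed", e);
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting to reconnect", ie);
                }
                // Double the wait each time, capped at maxDelayMs.
                delay = Math.min(delay * 2, maxDelayMs);
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated connect action that fails twice before succeeding.
        String result = withBackoff(() -> {
            if (++calls[0] < 3) {
                throw new java.io.IOException("broker not reachable yet");
            }
            return "connected";
        }, 5, 10, 100);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In a consumer, the same helper could be called from the exception listener's callback, with the `Callable` recreating the connection, session, and message listener.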
>> On Tue, May 28, 2013 at 2:42 PM, fenbers <[hidden email]> wrote:
>>
>> > I would LOVE to help you help me! But I have no idea how to go
>> > about making a test case. If you could drop some hints in this
>> > regard, I might be able to produce one.
>> >
>> > My ActiveMQ issues seem to be related to network slowness, which we
>> > are diagnosing separately. Or maybe it is the other way around,
>> > where ActiveMQ problems are causing network sluggishness. Either
>> > way, there seems to be a correlation, except that when network
>> > responsiveness improves, ActiveMQ does not.
>> >
>> > The problem I'm having with AMQ is progressive, which is even more
>> > puzzling, because we are not adding to the number of messages that
>> > AMQ has to handle. Today, we were up to 191 undeleted db-NNN.log
>> > files in the database directory before I stopped AMQ and deleted
>> > them. NNN was up to 451, so 260 files had been cleaned up by AMQ's
>> > automatic processes...
>> >
>> > Will log files assist you in helping me? I have TRACE level
>> > messages turned on, so they are quite large.
>> >
>> > Mark
Re: ActiveMQ crashes frequently
Posted by Robert Davies <ra...@gmail.com>.
Ultimately I'm pretty confident this problem is an NFS problem - and as Johan has already let the cat out of the bag ;) - let me ask the following:
Which version of NFS 4 are you using and which environment?
Have you checked the system logs for NFS errors on all the machines running ActiveMQ brokers?
thanks,
Rob
On 29 May 2013, at 00:46, Christian Posta <ch...@gmail.com> wrote:
> I can make two recommendations.
>
> #1, being the preferred, create a test case that shows this... that will
> give us the best chance of finding out what's going on... take a look at
> the following test cases in the activemq source code to give you an idea
> about how to go about doing it...
>
> http://svn.apache.org/viewvc/activemq/trunk/activemq-unit-tests/src/test/java/org/apache/activemq/usecases/
>
> http://svn.apache.org/viewvc/activemq/trunk/activemq-unit-tests/src/test/java/org/apache/activemq/bugs/
>
> http://svn.apache.org/viewvc/activemq/trunk/activemq-unit-tests/src/test/java/org/apache/activemq/test/JmsTopicSendReceiveTest.java?view=markup
>
>
> #2, if creating a test case doesn't sound like something you want to get
> into.. i guess, give us the exact configs of broker, clients, number of
> consumers, number of topics, message sizes, etc, etc all details and if one
> of us gets the urge we can try it out on our boxes. this will not be nearly
> as good as #1, and will provide a higher barrier to entry because we spend
> our spare time doing this and like to spend that time debugging and fixing,
> and not setting up environments and usecases which may not even show a bug
> :)
>
>
>
>
> On Tue, May 28, 2013 at 4:34 PM, fenbers <Ma...@noaa.gov> wrote:
>
>>
>> I'm getting the Sync exception on both local and NFS. Originally,
>> I was only using a local disk, but there wasn't much disk space for
>> the ever-growing list of 33MB enumerated .log files that weren't
>> cleaned up. So I reconfigured ActiveMQ to put these db files on an
>> NFS mount. But the sync exceptions occurred either way.
>>
>> I've changed *all* my consumers to AUTO_ACKNOWLEDGE, thinking that
>> maybe an ACKNOWLEDGEment leak was causing the undeleted files. That
>> didn't help... The TRACE level logging points to only two of my 5
>> topics that accumulate these undeleted db files. So I've
>> concentrated my scrutiny on consumers of these two topics, but
>> have not found anything out of the ordinary.
>>
>> What still puzzles me is that the frequency of the log file
>> build-up and the frequency of exceptions continue to increase even
>> though the number of messages sent per day by the producers remains
>> nearly constant...
>> Mark
>
>
>
> --
> *Christian Posta*
> http://www.christianposta.com/blog
> twitter: @christianposta
Re: ActiveMQ crashes frequently
Posted by Christian Posta <ch...@gmail.com>.
I can make two recommendations.
#1, the preferred option: create a test case that shows this. That will
give us the best chance of finding out what's going on. Take a look at
the following test cases in the ActiveMQ source code to get an idea
of how to go about it:
http://svn.apache.org/viewvc/activemq/trunk/activemq-unit-tests/src/test/java/org/apache/activemq/usecases/
http://svn.apache.org/viewvc/activemq/trunk/activemq-unit-tests/src/test/java/org/apache/activemq/bugs/
http://svn.apache.org/viewvc/activemq/trunk/activemq-unit-tests/src/test/java/org/apache/activemq/test/JmsTopicSendReceiveTest.java?view=markup
#2, if creating a test case doesn't sound like something you want to get
into: give us the exact configs of the broker and clients, the number of
consumers, number of topics, message sizes, and any other details, and if one
of us gets the urge we can try it out on our boxes. This will not be nearly
as good as #1, and it raises the barrier to entry, because we spend
our spare time doing this and would rather spend it debugging and fixing
than setting up environments and use cases which may not even show a bug
:)
On Tue, May 28, 2013 at 4:34 PM, fenbers <Ma...@noaa.gov> wrote:
>
> I'm getting the Sync exception on both local and NFS. Originally,
> I was only using a local disk, but there wasn't much disk space for
> the ever-growing list of 33MB enumerated .log files that weren't
> cleaned up. So I reconfigured ActiveMQ to put these db files on an
> NFS mount. But the sync exceptions occurred either way.
>
> I've changed *all* my consumers to AUTO_ACKNOWLEDGE, thinking that
> maybe an ACKNOWLEDGEment leak was causing the undeleted files. That
> didn't help... The TRACE level logging points to only two of my 5
> topics that accumulate these undeleted db files. So I've
> concentrated my scrutiny on consumers of these two topics, but
> have not found anything out of the ordinary.
>
> What still puzzles me is that the frequency of the log file
> build-up and the frequency of exceptions continue to increase even
> though the number of messages sent per day by the producers remains
> nearly constant...
> Mark
--
*Christian Posta*
http://www.christianposta.com/blog
twitter: @christianposta
Re: ActiveMQ crashes frequently
Posted by fenbers <Ma...@noaa.gov>.
I'm getting the Sync exception on both local and NFS. Originally,
I was only using a local disk, but there wasn't much disk space for
the ever-growing list of 33MB enumerated .log files that weren't
cleaned up. So I reconfigured ActiveMQ to put these db files on an
NFS mount. But the sync exceptions occurred either way.
I've changed *all* my consumers to AUTO_ACKNOWLEDGE, thinking that
maybe an ACKNOWLEDGEment leak was causing the undeleted files. That
didn't help... The TRACE level logging points to only two of my 5
topics that accumulate these undeleted db files. So I've
concentrated my scrutiny on consumers of these two topics, but
have not found anything out of the ordinary.
What still puzzles me is that the frequency of the log file
build-up and the frequency of exceptions continue to increase even
though the number of messages sent per day by the producers remains
nearly constant...
Mark
On 5/28/2013 6:06 PM, ceposta [via ActiveMQ] wrote:
> Sounds like there are multiple issues...
> Your journal files aren't being cleaned up, AND you're getting the Sync
> exception?
> You get the sync exception on local disk mount? Or just NFS?
> If the journals aren't being cleaned up, are your consumers properly
> ack'ing messages?
--
View this message in context: http://activemq.2283324.n4.nabble.com/ActiveMQ-crashes-frequently-tp4305407p4667583.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.
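Since the Sync exception is reported on both the local disk and the NFS mount, one way to compare the two is to time raw fsync calls in each directory. The following is only a rough sketch: the function name fsync_latencies, the sample count, and the 4 KB payload are assumptions made for illustration, and it does not reproduce what KahaDB does internally. It only measures how long the operating system takes to flush a small write on the given mount.

```python
import os
import tempfile
import time


def fsync_latencies(directory, samples=20, payload=b"x" * 4096):
    """Write and fsync a scratch file in `directory`; return seconds per fsync."""
    latencies = []
    # The scratch file is created inside the directory under test so that
    # the measurement hits the same filesystem (local disk or NFS mount).
    fd, path = tempfile.mkstemp(dir=directory, prefix="fsync-probe-")
    try:
        for _ in range(samples):
            os.write(fd, payload)
            start = time.perf_counter()
            os.fsync(fd)  # force the write out to stable storage
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.remove(path)  # leave the directory as we found it
    return latencies


if __name__ == "__main__":
    import sys

    results = fsync_latencies(sys.argv[1])
    print(f"max {max(results) * 1000:.1f} ms, "
          f"mean {sum(results) / len(results) * 1000:.1f} ms")
```

Running it once against the local data directory and once against the NFS-mounted one gives two sets of numbers to compare. Occasional very slow syncs on either mount would be worth mentioning alongside the broker logs.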
Re: ActiveMQ crashes frequently
Posted by Christian Posta <ch...@gmail.com>.
Sounds like there are multiple issues...
Your journal files aren't being cleaned up, AND you're getting the Sync
exception?
You get the sync exception on local disk mount? Or just NFS?
If the journals aren't being cleaned up, are your consumers properly
ack'ing messages?
On Tue, May 28, 2013 at 2:42 PM, fenbers <Ma...@noaa.gov> wrote:
>
>
>
>
>
> I would LOVE to help you help me! But I have no idea how to go
> about making a test case. If you could drop some hints in this
> regard, I might be able to produce one.
>
> My ActiveMQ issues seem to be related to network slowness, which we
> are diagnosing separately. Or maybe it is the other way around,
> where ActiveMQ problems are causing network sluggishness. Either
> way, there seems to be a correlation, except that when network
> responsiveness improves, ActiveMQ does not.
>
> The problem I'm having with AMQ is progressive, which is even more
> puzzling, because we are not adding to the number of messages that
> AMQ has to handle. Today, we were up to 191 undeleted db-NNN.log
> files in the database directory before I stopped AMQ and deleted
> them. NNN was up to 451, so 260 files had been cleaned up
> by AMQ's
> automatic processes...
>
> Will log files assist you in helping me? I have TRACE level
> messages turned on, so they are quite large.
>
> Mark
>
--
*Christian Posta*
http://www.christianposta.com/blog
twitter: @christianposta
Re: ActiveMQ crashes frequently
Posted by fenbers <Ma...@noaa.gov>.
I would LOVE to help you help me! But I have no idea how to go
about making a test case. If you could drop some hints in this
regard, I might be able to produce one.
My ActiveMQ issues seem to be related to network slowness, which we
are diagnosing separately. Or maybe it is the other way around,
where ActiveMQ problems are causing network sluggishness. Either
way, there seems to be a correlation, except that when network
responsiveness improves, ActiveMQ does not.
The problem I'm having with AMQ is progressive, which is even more
puzzling, because we are not adding to the number of messages that
AMQ has to handle. Today, we were up to 191 undeleted db-NNN.log
files in the database directory before I stopped AMQ and deleted
them. NNN was up to 451, so 260 files had been cleaned up by AMQ's
automatic processes...
Will log files assist you in helping me? I have TRACE level
messages turned on, so they are quite large.
Mark
On 5/28/2013 5:22 PM, rajdavies [via ActiveMQ] wrote:
> Hi Mark,
>
> could you produce a test case for your problem - it would help us
> identify the problem a lot quicker
>
> thanks,
> Rob
--
View this message in context: http://activemq.2283324.n4.nabble.com/ActiveMQ-crashes-frequently-tp4305407p4667574.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.
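The arithmetic in the message above (highest journal number minus files still present gives the number already cleaned up) can be tracked over time with a small script instead of counting by hand. This is a minimal sketch, not an ActiveMQ tool: the function name journal_stats is made up for illustration, and it assumes that journal numbering starts at 1 and never skips, which is what the 451 - 191 = 260 calculation in the message also assumes.

```python
import re
from pathlib import Path

# Journal files are named db-<number>.log, as described in the thread.
JOURNAL_NAME = re.compile(r"^db-(\d+)\.log$")


def journal_stats(directory):
    """Return (files_present, highest_number, estimated_files_removed)."""
    numbers = []
    for path in Path(directory).iterdir():
        match = JOURNAL_NAME.match(path.name)
        if match:  # skip anything that is not a numbered journal file
            numbers.append(int(match.group(1)))
    if not numbers:
        return 0, 0, 0
    highest = max(numbers)
    # Assumption: numbering starts at 1 with no gaps, so every number at or
    # below the highest that is missing is a file the broker has removed.
    return len(numbers), highest, highest - len(numbers)


if __name__ == "__main__":
    import sys

    present, highest, removed = journal_stats(sys.argv[1])
    print(f"{present} journal files present, highest is db-{highest}.log, "
          f"about {removed} already removed")
```

Run from cron against the database directory, the output would show whether the backlog of undeleted files is growing steadily or in bursts, which may help correlate the build-up with the exceptions.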
Re: ActiveMQ crashes frequently
Posted by Johan Edstrom <se...@gmail.com>.
I smelled the word NFS too.....
On May 28, 2013, at 3:18 PM, Robert Davies <ra...@gmail.com> wrote:
> Hi Mark,
>
> could you produce a test case for your problem - it would help us identify the problem a lot quicker
>
> thanks,
>
> Rob
> On 30 Apr 2013, at 16:40, fenbers <Ma...@noaa.gov> wrote:
>
>> Zagan wrote
>>> Can you please check if your .log files in the /data directory are cleaned
>>> up? On basis of the information I suppose this behaviour is due to a
>>> misconfiguration of your clients.
>>> If this is the case often broken log file cleanup is a symptom.
>>
>> I get the same error as brought up in this thread (KahaDB failed to store to
>> Journal). And yes, I also have a problem with the numbered .log files not
>> all getting cleaned up (most files are removed appropriately). I have
>> suspected a client configuration problem for a long time, but can't figure
>> out what's wrong -- even with TRACE logging turned on. In the meantime, I
>> have to cope with ActiveMQ crashing (i.e., shutting itself down) about every
>> two days. The logs point to a disk storage problem, but I have plenty of
>> space, so that's not the issue! I've tried a couple of different Linux
>> boxes and both local and NFS mounts, and this issue occurs on both of them.
>>
>> I'm at a loss!! I'm running 5.8.0...
>>
>> Mark
>>
>>
>>
>> --
>> View this message in context: http://activemq.2283324.n4.nabble.com/ActiveMQ-crashes-frequently-tp4305407p4666469.html
>> Sent from the ActiveMQ - User mailing list archive at Nabble.com.
>