You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@activemq.apache.org by Gilles Harloux <gi...@gmail.com> on 2012/10/23 10:15:57 UTC

KahaDB: No messages, but log files not reclaimed and StorePercentUsage above 100

Hi,

I have a system embedding a broker with KahaDB as a store. I am trying
to get a feel about disaster recovery behavior. So what I am basically
doing is randomly kill & restart the process. I see a condition (after
killing & restarting multiple times) where the messages get consumed,
but kahaDB journal files (db-*.log) don't get reclaimed. As I set up a
storeUsage limit, it ends up blocking the system.

I say the messages get consumed, because I can see my application
logging what messages it gets to handle; Also, JMX tells me that the
broker's TotalMesageCount is zero, while the StorePercentUsage is
above 100 (depending on parameters such as message size & rate, I saw
anything from 112 to 293 percent usage).

So, two questions:
 - How is it possible (for any reasons) to get a usage percentage
above 100? (In other situations, I saw it happen with memory too).
 - Why is it I can't get kahaDB to reclaim seemingly unused journal files?

TIA,

Re: KahaDB: No messages, but log files not reclaimed and StorePercentUsage above 100

Posted by Gilles Harloux <gi...@gmail.com>.
filed as https://issues.apache.org/jira/browse/AMQ-4128. I agree that
AMQ-3866 doesn't say a word about mKahaDB. It just caught my eye
because of the symtoms, and the logs look alike; Then, I saw you asked
if there was XA involved, hence my reaction: I am not using XA
explicitly, my best guess is that its use comes from mKahaDB /
cross-store transactions.

Anyway, thanks for your help!

On Tue, Oct 23, 2012 at 4:13 PM, Gary Tully <ga...@gmail.com> wrote:
> ok, maybe open a  new issue, and attach your logs and configure a data
> small journal data file so that it can easily be attached once you
> find the block in gc.
>
> If recovery of local xa multi store transactions is the problem, I
> would expect some log information from:
> org.apache.activemq.store.kahadb.MultiKahaDBTransactionStore
>
> use setJournalMaxFileLength=24k (it needs to be > 8k) on the
> persistence adapter.
>
> It may require killing the broker in the debugger to get to the bottom
> of this or using a system property that can kill the broker to force
> recovery in this case.
>
> An issue/jira ticket will put it on the radar in any event. I don't
> see that AMQ-3866 is using mkahadb.
>
> On 23 October 2012 14:47, Gilles Harloux <gi...@gmail.com> wrote:
>> I doubt I can reproduce it as a unit test: What I see happens
>> (sometimes, sometimes not) because I sigkill the process. If I replace
>> the kill with an orderly shutdown, everything works out ok.
>>
>> On Tue, Oct 23, 2012 at 3:38 PM, Gary Tully <ga...@gmail.com> wrote:
>>> do you think you could come up with a unit test that shows growth. So
>>> something that uses mkahadb and produces or consumed in a transaction
>>> from destinations that span the stores.
>>>
>>> You could configure the size of journal data files to 16k or
>>> something, such that any issue shows up very quickly.
>>>
>>> A test case like that would help clarify the use case and if it can
>>> reproduce, would be a great start to getting a quick resolution.
>>>
>>> There is a test that does transacted send received across stores, but
>>> does not check for cleanup. It may be easy to extend it in that way.
>>>
>>> in activemq-core
>>> org.apache.activemq.store.StorePerDestinationTest#testTransactedSendReceiveAcrossStores
>>>
>>>
>>> On 23 October 2012 14:20, Gilles Harloux <gi...@gmail.com> wrote:
>>>> I found that link while you were answering, it seems <g>. I also found
>>>> https://issues.apache.org/jira/browse/AMQ-3866 and it looks a lot like
>>>> what I experience.
>>>>
>>>> I tried to delete the index, start a broker and debug as explained in
>>>> that ticket. Breakpointing at MessageDatabase#process, line 904
>>>> (version 5.7.0) tells me it seems there is XA in the mix (data is a
>>>> KahaAddMessageCommand instance, with
>>>> data.f_transactionInfo.f_localTransactionId == null and
>>>> data.f_transactionInfo.f_xaTransactionId != null). I guess it's due to
>>>> the use of mKahaDB, where I am transacting across both stores.
>>>>
>>>> In that ticket, you asked for the relevant data file. I can share mine
>>>> if there's any interest -- I just have to find a convenient way to
>>>> provide you with a 32 Mb file.
>>>>
>>>> On Tue, Oct 23, 2012 at 12:24 PM, Gary Tully <ga...@gmail.com> wrote:
>>>>> have a read of http://activemq.apache.org/why-do-kahadb-log-files-remain-after-cleanup.html
>>>>>
>>>>> The logging referenced there will show you what destinations are
>>>>> holding on to references to the journal data files.
>>>>>
>>>>> w.r.t the usage %, the journal size increases in chunks of data file
>>>>> size, so a new journal data file can push the usage over the limit.
>>>>>
>>>>> On 23 October 2012 09:15, Gilles Harloux <gi...@gmail.com> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I have a system embedding a broker with KahaDB as a store. I am trying
>>>>>> to get a feel about disaster recovery behavior. So what I am basically
>>>>>> doing is randomly kill & restart the process. I see a condition (after
>>>>>> killing & restarting multiple times) where the messages get consumed,
>>>>>> but kahaDB journal files (db-*.log) don't get reclaimed. As I set up a
>>>>>> storeUsage limit, it ends up blocking the system.
>>>>>>
>>>>>> I say the messages get consumed, because I can see my application
>>>>>> logging what messages it gets to handle; Also, JMX tells me that the
>>>>>> broker's TotalMesageCount is zero, while the StorePercentUsage is
>>>>>> above 100 (depending on parameters such as message size & rate, I saw
>>>>>> anything from 112 to 293 percent usage).
>>>>>>
>>>>>> So, two questions:
>>>>>>  - How is it possible (for any reasons) to get a usage percentage
>>>>>> above 100? (In other situations, I saw it happen with memory too).
>>>>>>  - Why is it I can't get kahaDB to reclaim seemingly unused journal files?
>>>>>>
>>>>>> TIA,
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> http://redhat.com
>>>>> http://blog.garytully.com
>>>
>>>
>>>
>>> --
>>> http://redhat.com
>>> http://blog.garytully.com
>
>
>
> --
> http://redhat.com
> http://blog.garytully.com

Re: KahaDB: No messages, but log files not reclaimed and StorePercentUsage above 100

Posted by Gary Tully <ga...@gmail.com>.
ok, maybe open a  new issue, and attach your logs and configure a data
small journal data file so that it can easily be attached once you
find the block in gc.

If recovery of local xa multi store transactions is the problem, I
would expect some log information from:
org.apache.activemq.store.kahadb.MultiKahaDBTransactionStore

use setJournalMaxFileLength=24k (it needs to be > 8k) on the
persistence adapter.

It may require killing the broker in the debugger to get to the bottom
of this or using a system property that can kill the broker to force
recovery in this case.

An issue/jira ticket will put it on the radar in any event. I don't
see that AMQ-3866 is using mkahadb.

On 23 October 2012 14:47, Gilles Harloux <gi...@gmail.com> wrote:
> I doubt I can reproduce it as a unit test: What I see happens
> (sometimes, sometimes not) because I sigkill the process. If I replace
> the kill with an orderly shutdown, everything works out ok.
>
> On Tue, Oct 23, 2012 at 3:38 PM, Gary Tully <ga...@gmail.com> wrote:
>> do you think you could come up with a unit test that shows growth. So
>> something that uses mkahadb and produces or consumed in a transaction
>> from destinations that span the stores.
>>
>> You could configure the size of journal data files to 16k or
>> something, such that any issue shows up very quickly.
>>
>> A test case like that would help clarify the use case and if it can
>> reproduce, would be a great start to getting a quick resolution.
>>
>> There is a test that does transacted send received across stores, but
>> does not check for cleanup. It may be easy to extend it in that way.
>>
>> in activemq-core
>> org.apache.activemq.store.StorePerDestinationTest#testTransactedSendReceiveAcrossStores
>>
>>
>> On 23 October 2012 14:20, Gilles Harloux <gi...@gmail.com> wrote:
>>> I found that link while you were answering, it seems <g>. I also found
>>> https://issues.apache.org/jira/browse/AMQ-3866 and it looks a lot like
>>> what I experience.
>>>
>>> I tried to delete the index, start a broker and debug as explained in
>>> that ticket. Breakpointing at MessageDatabase#process, line 904
>>> (version 5.7.0) tells me it seems there is XA in the mix (data is a
>>> KahaAddMessageCommand instance, with
>>> data.f_transactionInfo.f_localTransactionId == null and
>>> data.f_transactionInfo.f_xaTransactionId != null). I guess it's due to
>>> the use of mKahaDB, where I am transacting across both stores.
>>>
>>> In that ticket, you asked for the relevant data file. I can share mine
>>> if there's any interest -- I just have to find a convenient way to
>>> provide you with a 32 Mb file.
>>>
>>> On Tue, Oct 23, 2012 at 12:24 PM, Gary Tully <ga...@gmail.com> wrote:
>>>> have a read of http://activemq.apache.org/why-do-kahadb-log-files-remain-after-cleanup.html
>>>>
>>>> The logging referenced there will show you what destinations are
>>>> holding on to references to the journal data files.
>>>>
>>>> w.r.t the usage %, the journal size increases in chunks of data file
>>>> size, so a new journal data file can push the usage over the limit.
>>>>
>>>> On 23 October 2012 09:15, Gilles Harloux <gi...@gmail.com> wrote:
>>>>> Hi,
>>>>>
>>>>> I have a system embedding a broker with KahaDB as a store. I am trying
>>>>> to get a feel about disaster recovery behavior. So what I am basically
>>>>> doing is randomly kill & restart the process. I see a condition (after
>>>>> killing & restarting multiple times) where the messages get consumed,
>>>>> but kahaDB journal files (db-*.log) don't get reclaimed. As I set up a
>>>>> storeUsage limit, it ends up blocking the system.
>>>>>
>>>>> I say the messages get consumed, because I can see my application
>>>>> logging what messages it gets to handle; Also, JMX tells me that the
>>>>> broker's TotalMesageCount is zero, while the StorePercentUsage is
>>>>> above 100 (depending on parameters such as message size & rate, I saw
>>>>> anything from 112 to 293 percent usage).
>>>>>
>>>>> So, two questions:
>>>>>  - How is it possible (for any reasons) to get a usage percentage
>>>>> above 100? (In other situations, I saw it happen with memory too).
>>>>>  - Why is it I can't get kahaDB to reclaim seemingly unused journal files?
>>>>>
>>>>> TIA,
>>>>
>>>>
>>>>
>>>> --
>>>> http://redhat.com
>>>> http://blog.garytully.com
>>
>>
>>
>> --
>> http://redhat.com
>> http://blog.garytully.com



-- 
http://redhat.com
http://blog.garytully.com

Re: KahaDB: No messages, but log files not reclaimed and StorePercentUsage above 100

Posted by Gilles Harloux <gi...@gmail.com>.
I doubt I can reproduce it as a unit test: What I see happens
(sometimes, sometimes not) because I sigkill the process. If I replace
the kill with an orderly shutdown, everything works out ok.

On Tue, Oct 23, 2012 at 3:38 PM, Gary Tully <ga...@gmail.com> wrote:
> do you think you could come up with a unit test that shows growth. So
> something that uses mkahadb and produces or consumed in a transaction
> from destinations that span the stores.
>
> You could configure the size of journal data files to 16k or
> something, such that any issue shows up very quickly.
>
> A test case like that would help clarify the use case and if it can
> reproduce, would be a great start to getting a quick resolution.
>
> There is a test that does transacted send received across stores, but
> does not check for cleanup. It may be easy to extend it in that way.
>
> in activemq-core
> org.apache.activemq.store.StorePerDestinationTest#testTransactedSendReceiveAcrossStores
>
>
> On 23 October 2012 14:20, Gilles Harloux <gi...@gmail.com> wrote:
>> I found that link while you were answering, it seems <g>. I also found
>> https://issues.apache.org/jira/browse/AMQ-3866 and it looks a lot like
>> what I experience.
>>
>> I tried to delete the index, start a broker and debug as explained in
>> that ticket. Breakpointing at MessageDatabase#process, line 904
>> (version 5.7.0) tells me it seems there is XA in the mix (data is a
>> KahaAddMessageCommand instance, with
>> data.f_transactionInfo.f_localTransactionId == null and
>> data.f_transactionInfo.f_xaTransactionId != null). I guess it's due to
>> the use of mKahaDB, where I am transacting across both stores.
>>
>> In that ticket, you asked for the relevant data file. I can share mine
>> if there's any interest -- I just have to find a convenient way to
>> provide you with a 32 Mb file.
>>
>> On Tue, Oct 23, 2012 at 12:24 PM, Gary Tully <ga...@gmail.com> wrote:
>>> have a read of http://activemq.apache.org/why-do-kahadb-log-files-remain-after-cleanup.html
>>>
>>> The logging referenced there will show you what destinations are
>>> holding on to references to the journal data files.
>>>
>>> w.r.t the usage %, the journal size increases in chunks of data file
>>> size, so a new journal data file can push the usage over the limit.
>>>
>>> On 23 October 2012 09:15, Gilles Harloux <gi...@gmail.com> wrote:
>>>> Hi,
>>>>
>>>> I have a system embedding a broker with KahaDB as a store. I am trying
>>>> to get a feel about disaster recovery behavior. So what I am basically
>>>> doing is randomly kill & restart the process. I see a condition (after
>>>> killing & restarting multiple times) where the messages get consumed,
>>>> but kahaDB journal files (db-*.log) don't get reclaimed. As I set up a
>>>> storeUsage limit, it ends up blocking the system.
>>>>
>>>> I say the messages get consumed, because I can see my application
>>>> logging what messages it gets to handle; Also, JMX tells me that the
>>>> broker's TotalMesageCount is zero, while the StorePercentUsage is
>>>> above 100 (depending on parameters such as message size & rate, I saw
>>>> anything from 112 to 293 percent usage).
>>>>
>>>> So, two questions:
>>>>  - How is it possible (for any reasons) to get a usage percentage
>>>> above 100? (In other situations, I saw it happen with memory too).
>>>>  - Why is it I can't get kahaDB to reclaim seemingly unused journal files?
>>>>
>>>> TIA,
>>>
>>>
>>>
>>> --
>>> http://redhat.com
>>> http://blog.garytully.com
>
>
>
> --
> http://redhat.com
> http://blog.garytully.com

Re: KahaDB: No messages, but log files not reclaimed and StorePercentUsage above 100

Posted by Gary Tully <ga...@gmail.com>.
do you think you could come up with a unit test that shows growth. So
something that uses mkahadb and produces or consumed in a transaction
from destinations that span the stores.

You could configure the size of journal data files to 16k or
something, such that any issue shows up very quickly.

A test case like that would help clarify the use case and if it can
reproduce, would be a great start to getting a quick resolution.

There is a test that does transacted send received across stores, but
does not check for cleanup. It may be easy to extend it in that way.

in activemq-core
org.apache.activemq.store.StorePerDestinationTest#testTransactedSendReceiveAcrossStores


On 23 October 2012 14:20, Gilles Harloux <gi...@gmail.com> wrote:
> I found that link while you were answering, it seems <g>. I also found
> https://issues.apache.org/jira/browse/AMQ-3866 and it looks a lot like
> what I experience.
>
> I tried to delete the index, start a broker and debug as explained in
> that ticket. Breakpointing at MessageDatabase#process, line 904
> (version 5.7.0) tells me it seems there is XA in the mix (data is a
> KahaAddMessageCommand instance, with
> data.f_transactionInfo.f_localTransactionId == null and
> data.f_transactionInfo.f_xaTransactionId != null). I guess it's due to
> the use of mKahaDB, where I am transacting across both stores.
>
> In that ticket, you asked for the relevant data file. I can share mine
> if there's any interest -- I just have to find a convenient way to
> provide you with a 32 Mb file.
>
> On Tue, Oct 23, 2012 at 12:24 PM, Gary Tully <ga...@gmail.com> wrote:
>> have a read of http://activemq.apache.org/why-do-kahadb-log-files-remain-after-cleanup.html
>>
>> The logging referenced there will show you what destinations are
>> holding on to references to the journal data files.
>>
>> w.r.t the usage %, the journal size increases in chunks of data file
>> size, so a new journal data file can push the usage over the limit.
>>
>> On 23 October 2012 09:15, Gilles Harloux <gi...@gmail.com> wrote:
>>> Hi,
>>>
>>> I have a system embedding a broker with KahaDB as a store. I am trying
>>> to get a feel about disaster recovery behavior. So what I am basically
>>> doing is randomly kill & restart the process. I see a condition (after
>>> killing & restarting multiple times) where the messages get consumed,
>>> but kahaDB journal files (db-*.log) don't get reclaimed. As I set up a
>>> storeUsage limit, it ends up blocking the system.
>>>
>>> I say the messages get consumed, because I can see my application
>>> logging what messages it gets to handle; Also, JMX tells me that the
>>> broker's TotalMesageCount is zero, while the StorePercentUsage is
>>> above 100 (depending on parameters such as message size & rate, I saw
>>> anything from 112 to 293 percent usage).
>>>
>>> So, two questions:
>>>  - How is it possible (for any reasons) to get a usage percentage
>>> above 100? (In other situations, I saw it happen with memory too).
>>>  - Why is it I can't get kahaDB to reclaim seemingly unused journal files?
>>>
>>> TIA,
>>
>>
>>
>> --
>> http://redhat.com
>> http://blog.garytully.com



-- 
http://redhat.com
http://blog.garytully.com

Re: KahaDB: No messages, but log files not reclaimed and StorePercentUsage above 100

Posted by Gilles Harloux <gi...@gmail.com>.
I found that link while you were answering, it seems <g>. I also found
https://issues.apache.org/jira/browse/AMQ-3866 and it looks a lot like
what I experience.

I tried to delete the index, start a broker and debug as explained in
that ticket. Breakpointing at MessageDatabase#process, line 904
(version 5.7.0) tells me it seems there is XA in the mix (data is a
KahaAddMessageCommand instance, with
data.f_transactionInfo.f_localTransactionId == null and
data.f_transactionInfo.f_xaTransactionId != null). I guess it's due to
the use of mKahaDB, where I am transacting across both stores.

In that ticket, you asked for the relevant data file. I can share mine
if there's any interest -- I just have to find a convenient way to
provide you with a 32 Mb file.

On Tue, Oct 23, 2012 at 12:24 PM, Gary Tully <ga...@gmail.com> wrote:
> have a read of http://activemq.apache.org/why-do-kahadb-log-files-remain-after-cleanup.html
>
> The logging referenced there will show you what destinations are
> holding on to references to the journal data files.
>
> w.r.t the usage %, the journal size increases in chunks of data file
> size, so a new journal data file can push the usage over the limit.
>
> On 23 October 2012 09:15, Gilles Harloux <gi...@gmail.com> wrote:
>> Hi,
>>
>> I have a system embedding a broker with KahaDB as a store. I am trying
>> to get a feel about disaster recovery behavior. So what I am basically
>> doing is randomly kill & restart the process. I see a condition (after
>> killing & restarting multiple times) where the messages get consumed,
>> but kahaDB journal files (db-*.log) don't get reclaimed. As I set up a
>> storeUsage limit, it ends up blocking the system.
>>
>> I say the messages get consumed, because I can see my application
>> logging what messages it gets to handle; Also, JMX tells me that the
>> broker's TotalMesageCount is zero, while the StorePercentUsage is
>> above 100 (depending on parameters such as message size & rate, I saw
>> anything from 112 to 293 percent usage).
>>
>> So, two questions:
>>  - How is it possible (for any reasons) to get a usage percentage
>> above 100? (In other situations, I saw it happen with memory too).
>>  - Why is it I can't get kahaDB to reclaim seemingly unused journal files?
>>
>> TIA,
>
>
>
> --
> http://redhat.com
> http://blog.garytully.com

Re: KahaDB: No messages, but log files not reclaimed and StorePercentUsage above 100

Posted by Gary Tully <ga...@gmail.com>.
have a read of http://activemq.apache.org/why-do-kahadb-log-files-remain-after-cleanup.html

The logging referenced there will show you what destinations are
holding on to references to the journal data files.

w.r.t the usage %, the journal size increases in chunks of data file
size, so a new journal data file can push the usage over the limit.

On 23 October 2012 09:15, Gilles Harloux <gi...@gmail.com> wrote:
> Hi,
>
> I have a system embedding a broker with KahaDB as a store. I am trying
> to get a feel about disaster recovery behavior. So what I am basically
> doing is randomly kill & restart the process. I see a condition (after
> killing & restarting multiple times) where the messages get consumed,
> but kahaDB journal files (db-*.log) don't get reclaimed. As I set up a
> storeUsage limit, it ends up blocking the system.
>
> I say the messages get consumed, because I can see my application
> logging what messages it gets to handle; Also, JMX tells me that the
> broker's TotalMesageCount is zero, while the StorePercentUsage is
> above 100 (depending on parameters such as message size & rate, I saw
> anything from 112 to 293 percent usage).
>
> So, two questions:
>  - How is it possible (for any reasons) to get a usage percentage
> above 100? (In other situations, I saw it happen with memory too).
>  - Why is it I can't get kahaDB to reclaim seemingly unused journal files?
>
> TIA,



-- 
http://redhat.com
http://blog.garytully.com