Posted to server-dev@james.apache.org by Norman Maurer <no...@apache.org> on 2010/10/11 19:28:07 UTC

IMAP Fetch OOM [WAS: Re: ActiveMQ the cause?]

Hi there,

I tried to get my head around the cause of the OOM and I think I found
a good solution. Let me outline the idea...

The MessageManager interface exposes many methods which return some kind
of Iterator<...>. So the idea is to build a "batch retrieve Iterator"
which retrieves the needed data in batches, so the GC has a chance to
kick in and free up resources between batches. Here is an example..

When MessageManager.getMessages(...) (which returns
Iterator<MessageResult>) is called with a MessageRange of 1:1000, we
would fetch the MessageResults 1:100 as a starting point. Once
Iterator.hasNext() or Iterator.next() is called and we have no
MessageResult left, we would fetch 101:200, and so on.

I think this should work and will be much more efficient in terms of
memory. All this could be done in the "store" implementations and
could be configurable: use 100 as the default batch size and allow
changing it via a setter.
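
To make the idea a bit more concrete, here is a minimal sketch of such a
batch iterator. BatchingIterator, RangeFetcher and the constructor
parameters are made-up names for illustration only, not existing James
API; a real version would live in the store implementations and hand out
MessageResult objects.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical sketch: walks an inclusive range, loading one batch at a time. */
final class BatchingIterator<T> implements Iterator<T> {

    /** Hypothetical stand-in for the store query; both bounds are inclusive. */
    interface RangeFetcher<T> {
        List<T> fetch(long from, long to);
    }

    private final RangeFetcher<T> fetcher;
    private final long last;        // inclusive upper bound of the whole range
    private final int batchSize;
    private long nextStart;         // first position not yet requested
    private boolean exhausted;      // true once the final batch was requested
    private Iterator<T> current = Collections.emptyIterator();

    BatchingIterator(RangeFetcher<T> fetcher, long first, long last, int batchSize) {
        if (first < 0) {
            throw new IllegalArgumentException("first must not be negative");
        }
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.fetcher = fetcher;
        this.last = last;
        this.batchSize = batchSize;
        this.nextStart = first;
        this.exhausted = first > last;
    }

    @Override
    public boolean hasNext() {
        // Load the next batch only when the current one is used up. The loop
        // also skips batches that come back empty (gaps in the range).
        while (!current.hasNext() && !exhausted) {
            long end = (last - nextStart < batchSize) ? last : nextStart + batchSize - 1;
            // Replacing 'current' drops the reference to the previous batch,
            // so the GC is free to reclaim it.
            current = fetcher.fetch(nextStart, end).iterator();
            if (end == last) {
                exhausted = true;
            } else {
                nextStart = end + 1;
            }
        }
        return current.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }
}
```

With a range of 1:1000 and a batch size of 100, nothing is fetched until
the first hasNext()/next() call, and at most one batch is held by the
iterator at any time. Whether that is enough depends on the caller not
keeping the returned objects around itself.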

WDYT ?

Bye,
Norman


2010/10/11 Norman Maurer <no...@apache.org>:
> Hi Eric,
>
> Did you also have a look with wireshark what the exact command and
> argument was which triggered the OOM?
>
> Thx
> Norman
>
> 2010/10/11, Eric Charles <er...@apache.org>:
>> Hi Norman,
>>
>> There were 2 main problems:
>> 1. The amq one, which is now resolved tks to your last commit
>> 2. James no longer responding on imap, which is always caused by OOM (I
>> missed some logs the first time).
>>
>> For the second one, analysis of the memory dump shows the OOM comes from
>> huge memory usage due to loading of messages, headers,... (in the case
>> of a 10.000 message fetch, for example).
>> I don't benefit from Lob streaming on the derby database, but it wouldn't
>> help much because jpaheader, for example, also takes a lot of memory.
>>
>> Tks,
>>
>> Eric
>>
>> On 11/10/2010 13:10, Norman Maurer wrote:
>>> Ok 4/5 is fixed now... Just to keep you updated..
>>>
>>> Bye.
>>> Norman
>>>
>>> 2010/10/11 Norman Maurer<no...@apache.org>:
>>>> Ok at least you can reproduce it, that's good ;) Did you take a thread
>>>> dump ?
>>>>
>>>> Bye,
>>>> Norman
>>>>
>>>>
>>>> 2010/10/11 Eric Charles<er...@apache.org>:
>>>>> It's the same with the latest thunderbird.
>>>>> I restarted after disabling 'Check for new messages on startup' on all
>>>>> my accounts.
>>>>> If I go quickly from one folder to another, I fall back into the endless
>>>>> 'downloading'/'indexing'...
>>>>> However, if I quietly click on 'Get Mail' folder per folder, it's ok.
>>>>>
>>>>> I think we are still with Bug 1 (Bug 2 and 3 should be resolved if 1 is
>>>>> resolved) for IMAP, fetching simultaneously some folders.
>>>>> Bug 4 is for amq.
>>>>>
>>>>> Tks,
>>>>>
>>>>> Eric
>>>>>
>>>>>
>>>>> On 10/10/2010 20:03, Eric Charles wrote:
>>>>>> I tried to resync thunderbird without clicking on any folder.
>>>>>> Still the same behaviour : "downloading xxx on yyy", www on zzz,...
>>>>>>
>>>>>> Wireshark tells me more: I never saw such red/black lines in the tcp
>>>>>> stream (one red/black on every 5/10 tcp packet: "segment lost").
>>>>>> 1783    8.626604    91.183.38.48    192.168.1.12    IMAP    [TCP Previous segment lost] Response: ss.properties?rev=1005079&r1=1005078&r2=1005079&view=diff
>>>>>>
>>>>>> I was wondering if my cable was right:
>>>>>> - tested plain http via cable: wireshark is green.
>>>>>> - tested thunderbird/james via wifi : same black/red lines in
>>>>>> wireshark.
>>>>>>
>>>>>> I have saved the dump and will analyze further tomorrow, but a tcp
>>>>>> conversation selected from a "segment lost" seems ok.
>>>>>>
>>>>>> So for now (this may change), I think we have:
>>>>>>
>>>>>> 1. A client is in a stage that causes the "segment lost" tcp errors ==>
>>>>>> Bug 1
>>>>>> 2. Client/server conversation loops endless ==>  Bug 2
>>>>>> 3.1. James finally hangs ==>  Bug 3
>>>>>> 3.2. James finally gets oom ==>  Bug 3
>>>>>> 4. Manual stop is needed.
>>>>>> 5. After manual stop in state 3.1 or 3.2, there's an activemq
>>>>>> java.io.EOFException: Chunk stream does not exist at page: 0 ==>  Bug 4
>>>>>>
>>>>>> So 4 bugs ?
>>>>>> I will upgrade my thunderbird 3.0.3 on linux to the latest version and
>>>>>> see whether bug 1 is resolved.
>>>>>> Bug 4 may be resolved with 5.4.1 and latest commits for the james stop
>>>>>> procedure.
>>>>>>
>>>>>> Tks,
>>>>>>
>>>>>> Eric
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 10/10/2010 18:31, Eric Charles wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> On James 3 (trunk of 2 weeks ago) I have my INBOX with 10 subfolders,
>>>>>>> some of these subfolders having more than 10.000 mails.
>>>>>>> I mainly use one PC, so the IMAP sync is done regularly throughout the
>>>>>>> day.
>>>>>>>
>>>>>>> I also have another PC I synchronize once a week.
>>>>>>> During the IMAP sync of that PC, I randomly selected some subfolders
>>>>>>> and saw (this occurred twice, but not always...):
>>>>>>> - Thunderbird syncs well for some minutes (10?)
>>>>>>> - After that, Thunderbird begins to say "downloading xx of yy mails"...
>>>>>>> When yy is reached, it says "downloading ww of zz" where zz is a little
>>>>>>> greater than yy.
>>>>>>> - I wait, wait, and finally get a timeout, and the mails are no longer
>>>>>>> viewable in thunderbird.
>>>>>>>
>>>>>>> James is stuck.
>>>>>>> The first time I had an OOM (I think); today I had no OOM, but James
>>>>>>> was no longer reachable via IMAP, though still accepting mails via SMTP.
>>>>>>>
>>>>>>> I stopped, and when restarting, I had the following exception (James
>>>>>>> was not usable anymore):
>>>>>>> INFO  18:16:37,646 | org.apache.activemq.store.kahadb.plist.PListStore | PListStore:activemq-data/localhost/tmp_storage started
>>>>>>> INFO  18:16:37,648 | org.apache.activemq.broker.BrokerService | Using Persistence Adapter: KahaDBPersistenceAdapter[activemq-data/localhost/KahaDB]
>>>>>>> INFO  18:16:38,248 | org.apache.activemq.store.kahadb.plist.PListStore | PListStore:../data/localhost/tmp_storage started
>>>>>>> ERROR 18:16:38,301 | org.apache.activemq.broker.BrokerService | Failed to start ActiveMQ JMS Message Broker. Reason: java.io.EOFException: Chunk stream does not exist at page: 0
>>>>>>> java.io.EOFException: Chunk stream does not exist at page: 0
>>>>>>>         at org.apache.kahadb.page.Transaction$2.readPage(Transaction.java:454)
>>>>>>>         at org.apache.kahadb.page.Transaction$2.<init>(Transaction.java:431)
>>>>>>>         at org.apache.kahadb.page.Transaction.openInputStream(Transaction.java:428)
>>>>>>>         at org.apache.kahadb.page.Transaction.load(Transaction.java:404)
>>>>>>>         at org.apache.kahadb.page.Transaction.load(Transaction.java:361)
>>>>>>>         at org.apache.activemq.store.kahadb.MessageDatabase$1.execute(MessageDatabase.java:243)
>>>>>>>         at org.apache.kahadb.page.Transaction.execute(Transaction.java:728)
>>>>>>>         at org.apache.activemq.store.kahadb.MessageDatabase.loadPageFile(MessageDatabase.java:230)
>>>>>>>         at org.apache.activemq.store.kahadb.MessageDatabase.open(MessageDatabase.java:309)
>>>>>>>         at org.apache.activemq.store.kahadb.MessageDatabase.load(MessageDatabase.java:353)
>>>>>>>         at org.apache.activemq.store.kahadb.MessageDatabase.doStart(MessageDatabase.java:217)
>>>>>>>         at org.apache.activemq.store.kahadb.KahaDBStore.doStart(KahaDBStore.java:178)
>>>>>>>
>>>>>>> Sounds like https://issues.apache.org/activemq/browse/AMQ-2935.
>>>>>>>
>>>>>>> To solve it, I had to remove the activemq-data directory (btw, 2 weeks
>>>>>>> ago was activemq 5.4.0 with 2 brokers started and activemq-data in bin
>>>>>>> directory).
>>>>>>>
>>>>>>> As a test, I set up my account in thunderbird again from scratch, and
>>>>>>> it was OK.
>>>>>>>
>>>>>>> Is it because it does an incremental sync and I select different
>>>>>>> folders (just to make things complicated :) ) during the download ?
>>>>>>>
>>>>>>> Anyway, it is not easy to reproduce.
>>>>>>> Activemq 5.4.1 may be worth a try, but I'm not sure it is the
>>>>>>> cause...
>>>>>>>
>>>>>>> Tks,
>>>>>>>
>>>>>>> Eric
>>>>>>>
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
>>>>>>> For additional commands, e-mail: server-dev-help@james.apache.org
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>
>>
>>
>>
>>
>
