You are viewing a plain text version of this content. The canonical link for it is here.
Posted to server-dev@james.apache.org by Matthieu Baechler <ma...@apache.org> on 2021/01/04 13:42:29 UTC

Re: The case of javax.mail MimeMessage CopyOnWrite optimization

Hi there,

Thank you for bringing this topic to the mailing-list.

To me safety and correctness is much more important than raw
performance so I would like the always-copy implementation to replace
the COW version.

However, keep in mind that the JMH benchmark figures did not told the
full story about the consequences of this change and be ready to
experience slower real-world performances.

Cheers,

-- Matthieu Baechler

On Tue, 2020-12-29 at 12:54 +0700, Tellier Benoit wrote:
> Hello there!
> 
> We had been discussing on GitHub recently about an optimization in
> james
> core around the usage of MimeMessage.
> 
> Javax.mail MimeMessage is currently used to represent a message of an
> email as part of the mail processing in James. It is part of the Mail
> interface (mailet-api).
> 
> As Mail envelope is composed of several recipients, mail related
> operations are performed once for all these recipients (we enqueue
> the
> mail one time, we strip bcc one time etc...). Troubles arise when we
> need different behaviors as part of mail processing across recipients
> (think remote recipients, that needs there mail to be relayed, versus
> local recipients that needs to be locally delivered). The email get's
> duplicated (in MatcherSplitter) and the processing will then be
> distinct
> for both entities. The underlying MimeMessage may - or may not be
> modified.
> 
> In order to prevent MimeMessage duplication in the event the
> underlying
> MimeMessage is not modified, a Copy On Write mechanism was introduced
> (I
> guess... Sorry, I was not there yet).
> 
> Upon his CI effort, Jean Helou with the help of Matthieu Baechler
> made
> he unpleasant finding that this was not thread safe, that was leading
> to
> build instabilities. The mailet processing happens in Camel, which is
> multi-threaded. Concurrency issues arised between modifications, and
> message disposal, when a same MimeMessage instance was shared. [1]
> 
> A first effort was to try to achieve thread-safety, which leaded to a
> brittle double reetrant read-write locks in order to govern data
> access.
> However, another performance enhancement bypassed these lock
> mechanism
> (MimeMessageWrapper allows accessing the data as an InputStream
> instead
> of requiring to copy it). The effort seemed overwhelming, not to
> metion
> possible risks of dead-locks. [2]
> 
> We then came up with an always copy implementation [3]. Simpler,
> safer... The underlying logic is to avoid trying being smarter than
> mutability, and leverage immutability to achieve thread safety, which
> is
> a classic functional programming idiom.
> 
> JMH benchmarks were conducted. We highlighter little performance
> difference for small messages, in the percent realm for both memory
> allocation and compute time. Differences are however higher for
> bigger
> messages (~10%) for both metrics.
> 
> Please note that above 100KB the MimeMessage would be stored on disk,
> thus limiting memory impact (see MimeMessageInputStreamSource). Maybe
> we
> should make the threshold configurable, via a system property for
> instance?
> 
> I just want to further mentioned I encountered that very issue on a
> production instance: the underlying email had been corrupted by the
> above mentioned COW bug and kept throwing NullPointerExceptions every
> time the content was accessed. This resulted (on top of
> distributed-james) in a RabbitMQ nack of the message, that ended up
> in a
> dead-letter queue. Replaying its processing required admin
> intervention
> and had been interpreted by the user as an email loss...
> 
> To conclude this effort we (Jean an I) would like to merge the
> "Always
> copy" pull request.
> 
> Also, would it be beneficial to write an ADR about this topic?
> 
> Thoughts?
> 
> Cheers,
> 
> Benoit
> 
> [1] The unfamous COW bug:
> https://github.com/apache/james-project/pull/280/commits/09b5554bbcbbb98757910d59bac54f97ee1f8b4f
> 
> [2] The double nested reetrant read-write lock fix attempt:
> https://github.com/apache/james-project/pull/280
> [3] The always copy fix:
> https://github.com/apache/james-project/pull/282
> [4] Benchmarks:
> https://github.com/apache/james-project/pull/280#issuecomment-745211736
> &
> https://github.com/apache/james-project/pull/280#issuecomment-745701937
> [5] The JIRA ticket: https://issues.apache.org/jira/browse/JAMES-3477
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
> For additional commands, e-mail: server-dev-help@james.apache.org
> 



---------------------------------------------------------------------
To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
For additional commands, e-mail: server-dev-help@james.apache.org


Re: The case of javax.mail MimeMessage CopyOnWrite optimization

Posted by Tellier Benoit <bt...@apache.org>.
Hello Jean,

I do support merging https://github.com/apache/james-project/pull/282
along with https://github.com/apache/james-project/pull/286 as soon as
we get a green build.

I did finally succeed to run an SMTP benchmark against a simple workload
mixing small and big messages, the outcome was good, eventually
answering people wanting Gatling performance tests to complete JMH
benchmarks:
https://github.com/apache/james-project/pull/286#issuecomment-780221455

Cheers,

Benoit

Le 15/02/2021 à 14:58, Jean Helou a écrit :
> Hi everyone,
> 
> In light of benoit's latest comments on the related PRs, I want to report
> that I spent some time setting up an infra to do benchmarks against a full
> server (mx only) I'm not an ops at core, I don't know james all that well
> and spare time is limited and I got sidetracked quite a few times, so I
> still haven't run the benchmarks but I'm getting there.
> 
> I am strongly in favor of integrating at least #282.
> 
> #285 is nice to have for me but I won't have time to investigate the config
> file option rapidly.
> 
> Jean
> 
> Le jeu. 7 janv. 2021 à 18:04, Raphaël Ouazana <ro...@linagora.com> a
> écrit :
> 
>> Hi,
>>
>> I completely agree with what Matthieu said.
>>
>> Correctness is needed, so we have to fix this issue.
>>
>> But before merging on master and/or deploying this version I would
>> expect some Gatling runs to show which potential performance decrease we
>> can expect on real use cases.
>>
>> Thanks,
>>
>> Raphaël.
>>
>> Le 04/01/2021 à 14:42, Matthieu Baechler a écrit :
>>> Hi there,
>>>
>>> Thank you for bringing this topic to the mailing-list.
>>>
>>> To me safety and correctness is much more important than raw
>>> performance so I would like the always-copy implementation to replace
>>> the COW version.
>>>
>>> However, keep in mind that the JMH benchmark figures did not told the
>>> full story about the consequences of this change and be ready to
>>> experience slower real-world performances.
>>>
>>> Cheers,
>>>
>>> -- Matthieu Baechler
>>>
>>> On Tue, 2020-12-29 at 12:54 +0700, Tellier Benoit wrote:
>>>> Hello there!
>>>>
>>>> We had been discussing on GitHub recently about an optimization in
>>>> james
>>>> core around the usage of MimeMessage.
>>>>
>>>> Javax.mail MimeMessage is currently used to represent a message of an
>>>> email as part of the mail processing in James. It is part of the Mail
>>>> interface (mailet-api).
>>>>
>>>> As Mail envelope is composed of several recipients, mail related
>>>> operations are performed once for all these recipients (we enqueue
>>>> the
>>>> mail one time, we strip bcc one time etc...). Troubles arise when we
>>>> need different behaviors as part of mail processing across recipients
>>>> (think remote recipients, that needs there mail to be relayed, versus
>>>> local recipients that needs to be locally delivered). The email get's
>>>> duplicated (in MatcherSplitter) and the processing will then be
>>>> distinct
>>>> for both entities. The underlying MimeMessage may - or may not be
>>>> modified.
>>>>
>>>> In order to prevent MimeMessage duplication in the event the
>>>> underlying
>>>> MimeMessage is not modified, a Copy On Write mechanism was introduced
>>>> (I
>>>> guess... Sorry, I was not there yet).
>>>>
>>>> Upon his CI effort, Jean Helou with the help of Matthieu Baechler
>>>> made
>>>> he unpleasant finding that this was not thread safe, that was leading
>>>> to
>>>> build instabilities. The mailet processing happens in Camel, which is
>>>> multi-threaded. Concurrency issues arised between modifications, and
>>>> message disposal, when a same MimeMessage instance was shared. [1]
>>>>
>>>> A first effort was to try to achieve thread-safety, which leaded to a
>>>> brittle double reetrant read-write locks in order to govern data
>>>> access.
>>>> However, another performance enhancement bypassed these lock
>>>> mechanism
>>>> (MimeMessageWrapper allows accessing the data as an InputStream
>>>> instead
>>>> of requiring to copy it). The effort seemed overwhelming, not to
>>>> metion
>>>> possible risks of dead-locks. [2]
>>>>
>>>> We then came up with an always copy implementation [3]. Simpler,
>>>> safer... The underlying logic is to avoid trying being smarter than
>>>> mutability, and leverage immutability to achieve thread safety, which
>>>> is
>>>> a classic functional programming idiom.
>>>>
>>>> JMH benchmarks were conducted. We highlighter little performance
>>>> difference for small messages, in the percent realm for both memory
>>>> allocation and compute time. Differences are however higher for
>>>> bigger
>>>> messages (~10%) for both metrics.
>>>>
>>>> Please note that above 100KB the MimeMessage would be stored on disk,
>>>> thus limiting memory impact (see MimeMessageInputStreamSource). Maybe
>>>> we
>>>> should make the threshold configurable, via a system property for
>>>> instance?
>>>>
>>>> I just want to further mentioned I encountered that very issue on a
>>>> production instance: the underlying email had been corrupted by the
>>>> above mentioned COW bug and kept throwing NullPointerExceptions every
>>>> time the content was accessed. This resulted (on top of
>>>> distributed-james) in a RabbitMQ nack of the message, that ended up
>>>> in a
>>>> dead-letter queue. Replaying its processing required admin
>>>> intervention
>>>> and had been interpreted by the user as an email loss...
>>>>
>>>> To conclude this effort we (Jean an I) would like to merge the
>>>> "Always
>>>> copy" pull request.
>>>>
>>>> Also, would it be beneficial to write an ADR about this topic?
>>>>
>>>> Thoughts?
>>>>
>>>> Cheers,
>>>>
>>>> Benoit
>>>>
>>>> [1] The unfamous COW bug:
>>>>
>> https://github.com/apache/james-project/pull/280/commits/09b5554bbcbbb98757910d59bac54f97ee1f8b4f
>>>>
>>>> [2] The double nested reetrant read-write lock fix attempt:
>>>> https://github.com/apache/james-project/pull/280
>>>> [3] The always copy fix:
>>>> https://github.com/apache/james-project/pull/282
>>>> [4] Benchmarks:
>>>> https://github.com/apache/james-project/pull/280#issuecomment-745211736
>>>> &
>>>> https://github.com/apache/james-project/pull/280#issuecomment-745701937
>>>> [5] The JIRA ticket: https://issues.apache.org/jira/browse/JAMES-3477
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
>>>> For additional commands, e-mail: server-dev-help@james.apache.org
>>>>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
>>> For additional commands, e-mail: server-dev-help@james.apache.org
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
>> For additional commands, e-mail: server-dev-help@james.apache.org
>>
>>
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
For additional commands, e-mail: server-dev-help@james.apache.org


Re: The case of javax.mail MimeMessage CopyOnWrite optimization

Posted by Jean Helou <je...@gmail.com>.
Hi everyone,

In light of benoit's latest comments on the related PRs, I want to report
that I spent some time setting up an infra to do benchmarks against a full
server (mx only) I'm not an ops at core, I don't know james all that well
and spare time is limited and I got sidetracked quite a few times, so I
still haven't run the benchmarks but I'm getting there.

I am strongly in favor of integrating at least #282.

#285 is nice to have for me but I won't have time to investigate the config
file option rapidly.

Jean

Le jeu. 7 janv. 2021 à 18:04, Raphaël Ouazana <ro...@linagora.com> a
écrit :

> Hi,
>
> I completely agree with what Matthieu said.
>
> Correctness is needed, so we have to fix this issue.
>
> But before merging on master and/or deploying this version I would
> expect some Gatling runs to show which potential performance decrease we
> can expect on real use cases.
>
> Thanks,
>
> Raphaël.
>
> Le 04/01/2021 à 14:42, Matthieu Baechler a écrit :
> > Hi there,
> >
> > Thank you for bringing this topic to the mailing-list.
> >
> > To me safety and correctness is much more important than raw
> > performance so I would like the always-copy implementation to replace
> > the COW version.
> >
> > However, keep in mind that the JMH benchmark figures did not told the
> > full story about the consequences of this change and be ready to
> > experience slower real-world performances.
> >
> > Cheers,
> >
> > -- Matthieu Baechler
> >
> > On Tue, 2020-12-29 at 12:54 +0700, Tellier Benoit wrote:
> >> Hello there!
> >>
> >> We had been discussing on GitHub recently about an optimization in
> >> james
> >> core around the usage of MimeMessage.
> >>
> >> Javax.mail MimeMessage is currently used to represent a message of an
> >> email as part of the mail processing in James. It is part of the Mail
> >> interface (mailet-api).
> >>
> >> As Mail envelope is composed of several recipients, mail related
> >> operations are performed once for all these recipients (we enqueue
> >> the
> >> mail one time, we strip bcc one time etc...). Troubles arise when we
> >> need different behaviors as part of mail processing across recipients
> >> (think remote recipients, that needs there mail to be relayed, versus
> >> local recipients that needs to be locally delivered). The email get's
> >> duplicated (in MatcherSplitter) and the processing will then be
> >> distinct
> >> for both entities. The underlying MimeMessage may - or may not be
> >> modified.
> >>
> >> In order to prevent MimeMessage duplication in the event the
> >> underlying
> >> MimeMessage is not modified, a Copy On Write mechanism was introduced
> >> (I
> >> guess... Sorry, I was not there yet).
> >>
> >> Upon his CI effort, Jean Helou with the help of Matthieu Baechler
> >> made
> >> he unpleasant finding that this was not thread safe, that was leading
> >> to
> >> build instabilities. The mailet processing happens in Camel, which is
> >> multi-threaded. Concurrency issues arised between modifications, and
> >> message disposal, when a same MimeMessage instance was shared. [1]
> >>
> >> A first effort was to try to achieve thread-safety, which leaded to a
> >> brittle double reetrant read-write locks in order to govern data
> >> access.
> >> However, another performance enhancement bypassed these lock
> >> mechanism
> >> (MimeMessageWrapper allows accessing the data as an InputStream
> >> instead
> >> of requiring to copy it). The effort seemed overwhelming, not to
> >> metion
> >> possible risks of dead-locks. [2]
> >>
> >> We then came up with an always copy implementation [3]. Simpler,
> >> safer... The underlying logic is to avoid trying being smarter than
> >> mutability, and leverage immutability to achieve thread safety, which
> >> is
> >> a classic functional programming idiom.
> >>
> >> JMH benchmarks were conducted. We highlighter little performance
> >> difference for small messages, in the percent realm for both memory
> >> allocation and compute time. Differences are however higher for
> >> bigger
> >> messages (~10%) for both metrics.
> >>
> >> Please note that above 100KB the MimeMessage would be stored on disk,
> >> thus limiting memory impact (see MimeMessageInputStreamSource). Maybe
> >> we
> >> should make the threshold configurable, via a system property for
> >> instance?
> >>
> >> I just want to further mentioned I encountered that very issue on a
> >> production instance: the underlying email had been corrupted by the
> >> above mentioned COW bug and kept throwing NullPointerExceptions every
> >> time the content was accessed. This resulted (on top of
> >> distributed-james) in a RabbitMQ nack of the message, that ended up
> >> in a
> >> dead-letter queue. Replaying its processing required admin
> >> intervention
> >> and had been interpreted by the user as an email loss...
> >>
> >> To conclude this effort we (Jean an I) would like to merge the
> >> "Always
> >> copy" pull request.
> >>
> >> Also, would it be beneficial to write an ADR about this topic?
> >>
> >> Thoughts?
> >>
> >> Cheers,
> >>
> >> Benoit
> >>
> >> [1] The unfamous COW bug:
> >>
> https://github.com/apache/james-project/pull/280/commits/09b5554bbcbbb98757910d59bac54f97ee1f8b4f
> >>
> >> [2] The double nested reetrant read-write lock fix attempt:
> >> https://github.com/apache/james-project/pull/280
> >> [3] The always copy fix:
> >> https://github.com/apache/james-project/pull/282
> >> [4] Benchmarks:
> >> https://github.com/apache/james-project/pull/280#issuecomment-745211736
> >> &
> >> https://github.com/apache/james-project/pull/280#issuecomment-745701937
> >> [5] The JIRA ticket: https://issues.apache.org/jira/browse/JAMES-3477
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
> >> For additional commands, e-mail: server-dev-help@james.apache.org
> >>
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
> > For additional commands, e-mail: server-dev-help@james.apache.org
> >
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
> For additional commands, e-mail: server-dev-help@james.apache.org
>
>

Re: The case of javax.mail MimeMessage CopyOnWrite optimization

Posted by Raphaël Ouazana <ro...@linagora.com>.
Hi,

I completely agree with what Matthieu said.

Correctness is needed, so we have to fix this issue.

But before merging on master and/or deploying this version I would 
expect some Gatling runs to show which potential performance decrease we 
can expect on real use cases.

Thanks,

Raphaël.

Le 04/01/2021 à 14:42, Matthieu Baechler a écrit :
> Hi there,
>
> Thank you for bringing this topic to the mailing-list.
>
> To me safety and correctness is much more important than raw
> performance so I would like the always-copy implementation to replace
> the COW version.
>
> However, keep in mind that the JMH benchmark figures did not told the
> full story about the consequences of this change and be ready to
> experience slower real-world performances.
>
> Cheers,
>
> -- Matthieu Baechler
>
> On Tue, 2020-12-29 at 12:54 +0700, Tellier Benoit wrote:
>> Hello there!
>>
>> We had been discussing on GitHub recently about an optimization in
>> james
>> core around the usage of MimeMessage.
>>
>> Javax.mail MimeMessage is currently used to represent a message of an
>> email as part of the mail processing in James. It is part of the Mail
>> interface (mailet-api).
>>
>> As Mail envelope is composed of several recipients, mail related
>> operations are performed once for all these recipients (we enqueue
>> the
>> mail one time, we strip bcc one time etc...). Troubles arise when we
>> need different behaviors as part of mail processing across recipients
>> (think remote recipients, that needs there mail to be relayed, versus
>> local recipients that needs to be locally delivered). The email get's
>> duplicated (in MatcherSplitter) and the processing will then be
>> distinct
>> for both entities. The underlying MimeMessage may - or may not be
>> modified.
>>
>> In order to prevent MimeMessage duplication in the event the
>> underlying
>> MimeMessage is not modified, a Copy On Write mechanism was introduced
>> (I
>> guess... Sorry, I was not there yet).
>>
>> Upon his CI effort, Jean Helou with the help of Matthieu Baechler
>> made
>> he unpleasant finding that this was not thread safe, that was leading
>> to
>> build instabilities. The mailet processing happens in Camel, which is
>> multi-threaded. Concurrency issues arised between modifications, and
>> message disposal, when a same MimeMessage instance was shared. [1]
>>
>> A first effort was to try to achieve thread-safety, which leaded to a
>> brittle double reetrant read-write locks in order to govern data
>> access.
>> However, another performance enhancement bypassed these lock
>> mechanism
>> (MimeMessageWrapper allows accessing the data as an InputStream
>> instead
>> of requiring to copy it). The effort seemed overwhelming, not to
>> metion
>> possible risks of dead-locks. [2]
>>
>> We then came up with an always copy implementation [3]. Simpler,
>> safer... The underlying logic is to avoid trying being smarter than
>> mutability, and leverage immutability to achieve thread safety, which
>> is
>> a classic functional programming idiom.
>>
>> JMH benchmarks were conducted. We highlighter little performance
>> difference for small messages, in the percent realm for both memory
>> allocation and compute time. Differences are however higher for
>> bigger
>> messages (~10%) for both metrics.
>>
>> Please note that above 100KB the MimeMessage would be stored on disk,
>> thus limiting memory impact (see MimeMessageInputStreamSource). Maybe
>> we
>> should make the threshold configurable, via a system property for
>> instance?
>>
>> I just want to further mentioned I encountered that very issue on a
>> production instance: the underlying email had been corrupted by the
>> above mentioned COW bug and kept throwing NullPointerExceptions every
>> time the content was accessed. This resulted (on top of
>> distributed-james) in a RabbitMQ nack of the message, that ended up
>> in a
>> dead-letter queue. Replaying its processing required admin
>> intervention
>> and had been interpreted by the user as an email loss...
>>
>> To conclude this effort we (Jean an I) would like to merge the
>> "Always
>> copy" pull request.
>>
>> Also, would it be beneficial to write an ADR about this topic?
>>
>> Thoughts?
>>
>> Cheers,
>>
>> Benoit
>>
>> [1] The unfamous COW bug:
>> https://github.com/apache/james-project/pull/280/commits/09b5554bbcbbb98757910d59bac54f97ee1f8b4f
>>
>> [2] The double nested reetrant read-write lock fix attempt:
>> https://github.com/apache/james-project/pull/280
>> [3] The always copy fix:
>> https://github.com/apache/james-project/pull/282
>> [4] Benchmarks:
>> https://github.com/apache/james-project/pull/280#issuecomment-745211736
>> &
>> https://github.com/apache/james-project/pull/280#issuecomment-745701937
>> [5] The JIRA ticket: https://issues.apache.org/jira/browse/JAMES-3477
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
>> For additional commands, e-mail: server-dev-help@james.apache.org
>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
> For additional commands, e-mail: server-dev-help@james.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
For additional commands, e-mail: server-dev-help@james.apache.org