You are viewing a plain text version of this content. The canonical link for it is here.
Posted to server-dev@james.apache.org by Andreas Joseph Krogh <an...@visena.com> on 2021/09/21 09:49:30 UTC

Proposed change to LenientAddressParser.createMailbox

Hi, we're seeing that this method in mime4j's LenientAddressParser: 

private Mailbox createMailbox(final String localPart) { if (localPart != null 
&& localPart.length() >0) { return new Mailbox(null, null, localPart, null); } 
else{ return null; } } 

Creates invalid addresses when the localPart is encoded, which it is in some 
emails, so I'm suggesting to change it to this: 

private Mailbox createMailbox(final String localPart) { if (localPart != null 
&& localPart.length() >0) { return new Mailbox(null, null, DecoderUtil
.decodeEncodedWords(localPart,this.monitor), null); } else { return null; } } 

We've used a custom build with this change for years now with no problems. 

What do you think? 



-- 
Andreas Joseph Krogh 
CTO / Partner - Visena AS 
Mobile: +47 909 56 963 
andreas@visena.com <ma...@visena.com> 
www.visena.com <https://www.visena.com> 
 <https://www.visena.com> 

Re: Proposed change to LenientAddressParser.createMailbox

Posted by "btellier@apache.org" <bt...@apache.org>.
Hello Andreas,

+1 to address this need.

The fact that MIME4J entities do not decode encoded word is very tricky
to users (eg [1] ), API discoverability is poor, and relying on the end
user to do the decoding is error prone. I personally ended up
introducing several encoding bugs without noticing it because of this.

We generally not only test encoding but also unfolding, which is also
not necessarily done.

To sum up, I think most of the MIME4J DOM header entities needs to offer
a way to access decoded values.



That being said, the API stability for MIME4J seems paramount to me, I
would lean on the side of duplicated methods to handle decoding and the
like (createDecodedMailbox etc...)

If we decide to port the DOM API to "decoded by default" then we need to
ensure decoding is idempotent (decoding twice yields the same result) as
we can legitimately assume some users won't remove their decoding logic,
and might not test their encoding logic.



[1] https://github.com/apache/james-mime4j/pull/55 proposes also to
handle decoding but for Content-Disposition field.

Benoit

On 21/09/2021 16:49, Andreas Joseph Krogh wrote:
> Hi, we're seeing that this method in mime4j's LenientAddressParser:
>  
> private Mailbox createMailbox(final String localPart) {
>     if (localPart != null && localPart.length() > 0) {
>         return new Mailbox(null, null, localPart, null);
>     } else {
>         return null;
>     }
> }
>  
> Creates invalid addresses when the localPart is encoded, which it is
> in some emails, so I'm suggesting to change it to this:
>  
> private Mailbox createMailbox(final String localPart) {
>     if (localPart != null && localPart.length() > 0) {
>         return new Mailbox(null, null,
>             DecoderUtil.decodeEncodedWords(localPart, this.monitor), null);
>     } else {
>         return null;
>     }
> }
>  
> We've used a custom build with this change for years now with no problems.
>  
> What do you think?
>  
> --
> *Andreas Joseph Krogh*
> CTO / Partner - Visena AS
> Mobile: +47 909 56 963
> andreas@visena.com <ma...@visena.com>
> www.visena.com <https://www.visena.com>
> <https://www.visena.com>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
> For additional commands, e-mail: server-dev-help@james.apache.org