Posted to server-dev@james.apache.org by Norman Maurer <no...@googlemail.com> on 2010/01/11 16:02:28 UTC

Re: Enhancement to Mailet API by the inclusion of Composite pattern matchers for And, Or, Not and Xor

Hi Ralph,

This really sounds like a nice enhancement. Please open a JIRA
issue and attach the diff there; I think we will be happy to include your
changes :)

Bye,
Norman

2010/1/11 Ralph B Holland <ra...@arising.com.au>:
> Hi James developers,
>
> I enclose a design that uses the composite pattern in the James mail-server
> to permit declaration of complex matchers in the james-config.xml file
> (deployed as config.xml).
>
> A summary of the Mailet API changes for consideration and voting is
> enclosed:
>
> • Matchers can be pre-declared before use in a Mailet through a <matcher>
> element declaration which must precede the first use in a Mailet.
> • The Mailet refers to the pre-declared matchers via the supplied name
> attribute, the name being an alias to the composite class instance.
> • The Matchers are loaded and initialized via the JamesMatcherLoader derived from
> the MatcherLoader interface, which has been modified to include an additional
> signature accepting the alias name.
> • A Not matcher has been proposed to negate another matcher's result to
> provide negated logic construction - it mimics the implementation of the Not
> functionality performed in processor recipient handling.
> • And there are three initially proposed composites: And, Or and Xor.
> • The And produces the intersection of two or more child-matcher recipient
> results.
> • The Or produces the union of two or more child-matcher recipient results.
> • The Xor produces the exclusive or (non-identity) composition of two or
> more child-matcher recipient results.
> • All operations are commutative and are applied to each child matcher
> recipient collection in order, but in certain cases evaluation is
> short-circuited to optimise performance (assuming that matchers do not have
> side effects).
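
The set semantics in the list above can be pictured with plain java.util sets. This is only a minimal sketch: RecipientSetOps and its method names are hypothetical, recipients are modelled as strings rather than MailAddress objects, and Xor is shown for exactly two results because the message does not spell out the n-ary case.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper: recipient results modelled as sets of address strings. */
final class RecipientSetOps {
    private RecipientSetOps() {}

    /** And: recipients present in both results (intersection). */
    static Set<String> and(Set<String> a, Set<String> b) {
        Set<String> out = new LinkedHashSet<>(a);
        out.retainAll(b);
        return out;
    }

    /** Or: recipients present in either result (union). */
    static Set<String> or(Set<String> a, Set<String> b) {
        Set<String> out = new LinkedHashSet<>(a);
        out.addAll(b);
        return out;
    }

    /** Xor: recipients present in exactly one of the two results. */
    static Set<String> xor(Set<String> a, Set<String> b) {
        Set<String> out = or(a, b);
        out.removeAll(and(a, b));
        return out;
    }

    /** Not: recipients of the mail that the child matcher did not return. */
    static Set<String> not(Set<String> allRecipients, Set<String> matched) {
        Set<String> out = new LinkedHashSet<>(allRecipients);
        out.removeAll(matched);
        return out;
    }
}
```

Swapping the two arguments of and, or and xor gives the same set, which is the commutativity the list refers to; not is the odd one out because it needs the mail's full recipient list as its reference.
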
>
> The following composite pattern declaration is typical of what can be
> achieved, and is an extract from my James server config.xml file:
>
>        <!-- this isn't a good spam-check but it illustrates what you can do -->
>        <matcher name="spam-check" match="Or">
>                <matcher match="HasRecipientsInDomainNotMatchingRegex=arising.com.au,.*(ralph|pocketfms|angelflight|vk1brh|accounts|cuteftp|ralph\.holland|resume|trx)@arising.com.au.*"/>
>                <matcher match="And">
>                        <matcher match="Not">
>                                <matcher match="HostIs=65.55.116.84"/>
>                        </matcher>
>                        <matcher match="HasHeaderWithRegex=X-Verify-SMTP,Host(.*)sending to us was not listening"/>
>                </matcher>
>                <matcher match="HasHeaderWithRegex=X-DNS-Paranoid,(.*)"/>
>                <matcher match="HasHeaderWithRegex=Subject,(.*)([Vv][Ii][Aa][Gg][Rr][Aa]|[Cc][Ii][Aa][Ll][Ii][Ss]|[Vv][Ii][Cc][Oo][Dd][Ii][Nn])(.*)"/>
>                <matcher match="HasHeaderWithRegex=From,(.*)([Vv][Ii][Aa][Gg][Rr][Aa]|[Cc][Ii][Aa][Ll][Ii][Ss]|[Vv][Ii][Cc][Oo][Dd][Ii][Nn])(.*)"/>
>                <matcher match="HasHeaderWithRegex=Subject,.*Download Adobe PDF Reader For Windows.*"/>
>                <matcher match="HasHeaderWithRegex=From,(.*)([Ee][Nn][Ll][Aa][Rr][Gg][Ee][Mm][Ee][Nn][Tt])(.*)([Pp][Ii][Ll][Ll][Ss])(.*)"/>
>                <matcher match="HasHeaderWithRegex=Subject,(.*)([Ee][Nn][Ll][Aa][Rr][Gg][Ee][Mm][Ee][Nn][Tt])(.*)([Pp][Ii][Ll][Ll][Ss])(.*)"/>
>                <matcher match="InSpammerBlacklist=zen.spamhaus.org"/>
>                <matcher match="SenderIsRegex=(.*)([Vv][Ii][Aa][Gg][Rr][Aa]|[Cc][Ii][Aa][Ll][Ii][Ss]|[Vv][Ii][Cc][Oo][Dd][Ii][Nn])(.*)"/>
>        </matcher>
>
>        <mailet match="spam-check" class="ToProcessor">
>                <processor>spam</processor>
>        </mailet>
>
> The affected code is:
>
> MatcherLoader and the JamesMatcherLoader derivative with the new included
> signature:
>
>         /**
>          * @param matchName is the regular className with optional condition expression
>          * @param alias is the alias or name attribute
>          */
>         public Matcher getMatcher(String matchName, String alias) throws MessagingException;
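
One way to picture how a two-argument lookup lets a mailet refer to a pre-declared matcher by name is a small alias table. This is a sketch under stated assumptions, not the JamesMatcherLoader code: LoadedMatcher, AliasAwareLoader and resolve are hypothetical names, and the duplicate-name check is an illustrative choice the message does not mention.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a loaded matcher; not the Mailet API Matcher. */
final class LoadedMatcher {
    final String matchName; // e.g. "HostIs=65.55.116.84" or "Or"
    final String alias;     // the name attribute, or null when none was declared

    LoadedMatcher(String matchName, String alias) {
        this.matchName = matchName;
        this.alias = alias;
    }
}

/** Hypothetical loader that remembers named matchers so mailets can refer to them. */
final class AliasAwareLoader {
    private final Map<String, LoadedMatcher> byAlias = new HashMap<>();

    /** Loads a matcher and, when an alias is given, registers it under that alias. */
    LoadedMatcher getMatcher(String matchName, String alias) {
        if (alias != null && byAlias.containsKey(alias)) {
            throw new IllegalArgumentException("Duplicate matcher name: " + alias);
        }
        LoadedMatcher matcher = new LoadedMatcher(matchName, alias);
        if (alias != null) {
            byAlias.put(alias, matcher);
        }
        return matcher;
    }

    /** Resolves a mailet's match attribute: a declared alias wins, otherwise load by class name. */
    LoadedMatcher resolve(String match) {
        LoadedMatcher declared = byAlias.get(match);
        return declared != null ? declared : getMatcher(match, null);
    }
}
```

Because the alias has to be registered before resolve can find it, this arrangement also shows why a declaration would need to precede its first use in a mailet.
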
>
> New org.apache.mailet.CompositeMatcher:
>
> /**
>  * A CompositeMatcher contains child matchers that are invoked in turn and their
>  * recipient results are composed using the composite class operation. (See And, Or, Xor and Not.)
>  * One or more children may be supplied to a composite via declarations inside a <processor> element
>  * in the james-config.xml file. When the composite is the outer-level declaration it must be named
>  * as in the example below.
>  * The composite matcher is referenced by name in the match attribute of a subsequent mailet.
>  * It may be referenced any number of times in this way. Any matcher may be included as a child of a
>  * composite matcher, including another composite matcher or Not.
>  * As a consequence, the class names: And, Or, Not and Xor are permanently reserved.
>  * <pre>
>  *   <matcher name="a-composite" match="Or">
>  *       <matcher match="And">
>  *           <matcher match="Not">
>  *               <matcher match="HostIs=65.55.116.84"/>
>  *           </matcher>
>  *           <matcher match="HasHeaderWithRegex=X-Verify-SMTP,Host(.*)sending to us was not listening"/>
>  *       </matcher>
>  *       <matcher match="HasHeaderWithRegex=X-DNS-Paranoid,(.*)"/>
>  *   </matcher>
>  *   <mailet match="a-composite" class="ToProcessor">
>  *       <processor>spam</processor>
>  *   </mailet>
>  * </pre>
>  * @author Ralph Holland
>  *
>  */
> public interface CompositeMatcher extends Matcher
> {
>
>    /**
>     * @return iterator to children matchers
>     */
>    public Iterator iterator();
>
>    /**
>     * Add a child matcher to this composite matcher. This is called by SpoolManager.setupMatcher()
>     * @param matcher
>     */
>    public void add(Matcher matcher);
>
> }
>
> New org.apache.mailet.CompositeMatcherBase:
>
> public abstract class CompositeMatcherBase extends GenericMatcher implements CompositeMatcher
> {
>    /**
>     * This lets the configurator build up the composition (which might be
>     * composed of other composites).
>     * @param matcher the child matcher to add
>     */
>    public void add(Matcher matcher)
>    {
>        matchers.add(matcher);
>    }
>
>    /**
>     * @return iterator to child-matchers
>     */
>    public Iterator iterator()
>    {
>        return matchers.iterator();
>    }
>
>
>    private Collection matchers = new ArrayList();
>
> }
>
> New org.apache.james.transport.matchers.And:
>
> public class And extends CompositeMatcherBase
> {
>    /**
>     * This is the And CompositeMatcher - consider it to be an intersection of the results.
>     * If any match returns an empty recipient result the matching is short-circuited.
>     * @return the And composition of the nested matchers.
>     */
>    public Collection match(Mail mail) throws MessagingException
> ...
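
The body of And.match is elided in the message, so the following is not the author's implementation. It is a minimal sketch of the behaviour the comment describes: intersect the child results and stop as soon as the intersection can only be empty. SimpleMatcher and AndSketch are hypothetical, and the matcher takes a collection of recipient strings instead of a Mail to keep the block self-contained.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical simplification of a matcher: takes the mail's recipients, returns those that match. */
interface SimpleMatcher {
    Collection<String> match(Collection<String> recipients);
}

/** Sketch of an And composite: intersection of the child results with short-circuiting. */
final class AndSketch implements SimpleMatcher {
    private final List<SimpleMatcher> children = new ArrayList<>();

    void add(SimpleMatcher child) {
        children.add(child);
    }

    @Override
    public Collection<String> match(Collection<String> recipients) {
        Set<String> result = null;
        for (SimpleMatcher child : children) {
            Collection<String> matched = child.match(recipients);
            if (matched == null || matched.isEmpty()) {
                // Short-circuit: an empty child result makes the intersection empty.
                return new ArrayList<String>();
            }
            if (result == null) {
                result = new LinkedHashSet<>(matched);
            } else {
                result.retainAll(matched);
                if (result.isEmpty()) {
                    // Nothing left in common, so later children cannot add anything.
                    return new ArrayList<String>();
                }
            }
        }
        List<String> out = new ArrayList<>();
        if (result != null) {
            out.addAll(result);
        }
        return out;
    }
}
```

Skipping the remaining children is only safe when they have no side effects, which is the assumption the proposal states.
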
>
> New org.apache.james.transport.matchers.Or:
>
> public class Or extends CompositeMatcherBase {
>
>
>    /**
>     * This is the Or CompositeMatcher - consider it to be a union of the results.
>     * If the results so far already include every recipient the matching is short-circuited.
>     * @return the Or composition of the nested matchers.
>     */
>    public Collection match(Mail mail) throws MessagingException
> ...
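
The body of Or.match is likewise elided, so this is only a sketch. A union cannot shrink, so it assumes the short-circuit for Or fires once every recipient of the mail has already matched, rather than on an empty child result. SimpleMatcher and OrSketch are hypothetical names, and recipients are modelled as strings.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical simplification of a matcher: takes the mail's recipients, returns those that match. */
interface SimpleMatcher {
    Collection<String> match(Collection<String> recipients);
}

/** Sketch of an Or composite: union of the child results with short-circuiting. */
final class OrSketch implements SimpleMatcher {
    private final List<SimpleMatcher> children = new ArrayList<>();

    void add(SimpleMatcher child) {
        children.add(child);
    }

    @Override
    public Collection<String> match(Collection<String> recipients) {
        Set<String> result = new LinkedHashSet<>();
        for (SimpleMatcher child : children) {
            Collection<String> matched = child.match(recipients);
            if (matched != null) {
                result.addAll(matched);
            }
            if (result.containsAll(recipients)) {
                // Short-circuit: every recipient already matched, the union cannot grow further.
                break;
            }
        }
        return new ArrayList<>(result);
    }
}
```
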
>
> New org.apache.james.transport.matchers.Xor:
> public class Xor extends CompositeMatcherBase {
>
>    /**
>     * This is the Xor CompositeMatcher - consider it to be the inequality operator for recipients.
>     * If any recipients match in all the matcher results then the result does not include that recipient.
>     * @return the Xor composition of the composed matchers.
>     */
>    public Collection match(Mail mail) throws MessagingException
> ...
>
> New org.apache.james.transport.matchers.Not:
> public class Not extends CompositeMatcherBase {
>
>    /**
>     * This is the Not CompositeMatcher - consider what wasn't in the result set of each matcher.
>     * Of course it is easier to understand if it only includes one matcher.
>     * @return the Negated composition of the nested matchers.
>     */
>    public Collection match(Mail mail) throws MessagingException
>    ...
>
> Modifications to SpoolManager.initialize() to handle loading and
> initialization of the composites.
>
>
> Various unit tests have been constructed to test the composite operations,
> though there are no unit tests yet to ensure that the actual JamesLoader
> loads any matcher correctly. I think such a test should exist rather than
> mocking the class loading.
>
> If you like this design and vote for it, I will provide the complete source
> code for evaluation and re-distribution in James, with the only caveat being
> attribution acknowledgement.
>
> I am also open to renaming the composite classes, because in terms of the
> recipient result sets And is Intersection, Or is Union, and Xor is NotEqual.
> A result set is true if at least one recipient is returned, as per the James
> processing chain. If not all recipients are returned, a clone of the mail is
> sent with the Not of the recipients to the next processor in the processing
> chain until the mail is acquitted.
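
The partial-match rule described above amounts to splitting the mail's recipients in two. The sketch below illustrates only that split; RecipientSplit and needsClone are hypothetical names, recipients are modelled as strings, and nothing here is taken from the James spool code.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

/** Hypothetical holder for the two halves of a partial match; not a James class. */
final class RecipientSplit {
    final List<String> matched;   // recipients handled by the mailet guarded by the matcher
    final List<String> unmatched; // recipients carried on by a copy of the mail

    private RecipientSplit(List<String> matched, List<String> unmatched) {
        this.matched = matched;
        this.unmatched = unmatched;
    }

    /** Splits all recipients according to the collection returned by a matcher. */
    static RecipientSplit of(Collection<String> allRecipients, Collection<String> matcherResult) {
        List<String> matched = new ArrayList<>();
        List<String> unmatched = new ArrayList<>();
        for (String recipient : allRecipients) {
            if (matcherResult != null && matcherResult.contains(recipient)) {
                matched.add(recipient);
            } else {
                unmatched.add(recipient);
            }
        }
        return new RecipientSplit(matched, unmatched);
    }

    /** True when the match was partial, i.e. some recipients matched and some did not. */
    boolean needsClone() {
        return !matched.isEmpty() && !unmatched.isEmpty();
    }
}
```
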
>
>
> Regards,
> Ralph Holland
> Managing Director
> www.arising.com.au
> BH:    61 2 61271265
> AH:    61 2 62312869
> Fax:    61 2 62312768
> Mob:  0417 312869 (AH/weekends only)
> www.arising.com.au/aviation
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
> For additional commands, e-mail: server-dev-help@james.apache.org
>
>



RE: Enhancement to Mailet API by the inclusion of Composite pattern matchers for And, Or, Not and Xor

Posted by Ralph B Holland <ra...@arising.com.au>.
Stefano and Norman,

No problems about removing attribution from the code - the Jira entry
history will be fine.

Ralph

-----Original Message-----
From: Norman Maurer [mailto:norman.maurer@googlemail.com] 
Sent: Tuesday, 12 January 2010 02:02
To: James Developers List
Cc: ralph@arisng.com.au
Subject: Re: Enhancement to Mailet API by the inclusion of Composite pattern
matchers for And, Or, Not and Xor

Hi Ralph,

This really sounds like a nice enhancement. Please open a JIRA
issue and attach the diff there; I think we will be happy to include your
changes :)

Bye,
Norman

