Posted to mailet-api@james.apache.org by Alan Williamson <al...@blog-city.com> on 2006/11/12 09:19:43 UTC

Fast-Fail Operations in the Mailet API?

Looks like we are going to disagree on this issue, Noel.

Noel J. Bergman wrote:
> Well, that is because you're conflating two different issues: Matchers,
> which deal with selecting messages in a message pipeline, and in-protocol
> handlers that act on events as they happen during protocol transfer.

The message pipeline, as you call it, is in this instance SMTP.  The 
pipeline *is* by definition a transfer: moving data from one place to 
another.  It doesn't really matter HOW that happens; that's protocol 
specific.  By your own logic, the fast-fail operations belong in the 
Mailet API.

But let's wind back here, because I think you're being maybe a little too 
pragmatic.

    "The Mailet API is an API designed to facilitate the development
    and deployment of a configurable email processing applications"

That is lifted straight from the Mailet WIKI.

There is nothing in that statement that suggests putting in the 
hooks/interfaces I have suggested would violate it; in fact, 
one could argue it strengthens it.

> As I said, more and more we're moving to handle filtering in the protocol
> handlers, and have just "application" type code in the pipeline, e.g.,
> mailing list managers, delivery support, etc.

But that is JAMES specific.  I am not interested in JAMES.  If JAMES is 
the only container implementing the Mailet API then this whole 
discussion is somewhat moot and academic.   How you guys handle it 
doesn't really matter, since no one else but JAMES is using it.

If, however, the group is keen for the Mailet API to grow beyond JAMES, 
then one has to look at where the vast majority of Mailet development is 
going to be done by Java developers: listening to emails coming in from 
SMTP.  Can we at least agree on that?  Everything else, JAMES will 
probably do for them anyway.

This sort of dogma bogged down the early days of the Servlet API; many 
argued that servlets were far more than HTTP.  They were right; however 
99.999999% of all servlets are attached to the HTTP protocol.  Therefore 
it quickly made sense to push HTTP functionality into the specification.

all fun fun on the frontline of API design!

:)

Re: Fast-Fail Operations in the Mailet API?

Posted by Danny Angus <da...@gmail.com>.

or perhaps more accurately

org.apache.mailapi.mailet
org.apache.mailapi.handler
org.apache.mailapi.service

RE: Fast-Fail Operations in the Mailet API?

Posted by "Noel J. Bergman" <no...@devtech.com>.

Danny Angus wrote:

> Noel J. Bergman wrote:
> >   - Mailet API
> >   - Handler API
> >   - Common Services

> I believe that there is a place for all three of these things, and
> that the first two both depend upon the third one, but that neither
> of the first two need to be a mandatory requirement of an API which
> specifies the other one.

Exactly.  :-)

> I think we're really close to agreeing that the scope of an API
> project can and possibly should include all three (I think we're
> just getting hung up on names)

Hence my repeated reference to nomenclature, and illustration of the
differences in the component lifecycles.

> I do still maintain that there are use-cases which do not include
> a requirement for implementing protocols.
> [... use cases ...]

Total agreement.

> I would be 100% in favour of developing the three API's under the
> mailet api banner, as long as it was done in a manner which clearly
> signposted the options which implementors have

Exactly.  We just need to be careful to not conflate the requirements for
the different container types.

Of course, if a developer wants to intentionally embed a synchronous mailet
container as an onMessage handler, that's fine.  At least two of us have
illustrated that usage in the past, and there are some use cases.

	--- Noel

Re: Fast-Fail Operations in the Mailet API?

Posted by Danny Angus <da...@gmail.com>.

On 11/12/06, Noel J. Bergman <no...@devtech.com> wrote:

>   - Mailet API
>   - Handler API
>   - Common Services

I believe that there is a place for all three of these things, and
that the first two both depend upon the third one, but that neither of
the first two need to be a mandatory requirement of an API which
specifies the other one.

Let's consider the J2ME JSR standards: there is a core and extensions.
Or Portals and CMS: there are a number of standards which can be
combined or used in isolation.

I think we're really close to agreeing that the scope of an API
project can and possibly should include all three (I think we're just
getting hung up on names), so that implementors can optionally
implement any sensible combination to build any sensible container,
but I do still maintain that there are use-cases which do not include
a requirement for implementing protocols.

One of these is a mail processor which has access to a spool which is
filled by something else. This may or may not be a thing which
implements the handler api.

Another is a piece of client software which uses JavaMail and POP3 or
IMAP to aggregate mail from a number of internet accounts.

I would be 100% in favour of developing the three API's under the
mailet api banner, as long as it was done in a manner which clearly
signposted the options which implementors have, and to my mind that
has to include separate names and distinct packages.

org.apache.mail.mailet
org.apache.mail.handlers
org.apache.mail.services

d.

Re: Fast-Fail Operations in the Mailet API?

Posted by Stefano Bagnara <ap...@bago.org>.

Noel J. Bergman wrote:
> The lifecycles of Servlets, HTTP Servlets, JSPs, even Portlets and SIP
> Servlets are similar.  But let's take a look at these others ... consider:
> 
>   Matcher: asynch stateless message router based on recipient lists
>            match(Mail) method

In the current APIs the matcher is not a router, only a matcher/selector.

And I don't understand the "asynch" part of this definition. You call 
the service(Mail) method and it processes the Mail synchronously. What is 
asynchronous?

>   Mailet:  asynch stateless message transformer/broker -- "the app"
>            service(Mail) method

Again I don't see how it is "asynch": the only asynch part is how we 
decided to implement it in james using the JamesSpoolManager.

>   In-protocol Handler API:
>     Stateful and synchronous

I don't think we want to put the current handler api as part of the 
mailet apis, but I think it is worth discussing this 
stateless/stateful synch/asynch topic anyway.

The current handlers are not stateful, they are stateless like the 
mailets. We should also make sure they are threadsafe like mailets 
because they could be used for multiple connections at the same time.

And they are synchronous as the Mailets are: you call a method and wait 
for it to finish: no callback here, no callback in the mailets.

>     Arranged in a chain similar to a Filter Chain
>     onConnect    -- connection from remote client
>     on<Command>  -- receipt of a specific command
>     onMessage    -- receipt of a completed message

onMessage is really similar to what we do in the Mailet.service method.
on<Command> is what we should avoid in the Mailet APIs, but I think that 
an "onEnvelope" receiving only the Mail part (without the message) would 
already give us the opportunity to refactor many mailets to use this 
method instead of the whole "service", and would let us use it early in 
the processing, or in processing that does not care about the included 
MimeMessage.

Please note that this is only an example and something to be analyzed 
and discussed much more. But I think we should not exclude this 
possibility now.
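
A minimal sketch of the "onEnvelope" idea, to make it concrete. All names 
here (Envelope, EnvelopeMatcher, SenderDomainIs) are hypothetical and not 
part of the current Mailet API, and addresses are plain strings only to 
keep the example self-contained:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for the envelope part of a Mail:
// sender and recipients, but no MimeMessage.
final class Envelope {
    final String sender;
    final List<String> recipients;

    Envelope(String sender, List<String> recipients) {
        this.sender = sender;
        this.recipients = Collections.unmodifiableList(new ArrayList<>(recipients));
    }
}

// Hypothetical interface: same idea as a matcher, but it only
// ever receives the envelope, so it can run before the body exists.
interface EnvelopeMatcher {
    List<String> onEnvelope(Envelope envelope);
}

// Example: select every recipient when the sender is in a given domain,
// otherwise select none.
final class SenderDomainIs implements EnvelopeMatcher {
    private final String domain;

    SenderDomainIs(String domain) {
        this.domain = domain.toLowerCase(Locale.ROOT);
    }

    @Override
    public List<String> onEnvelope(Envelope envelope) {
        String sender = envelope.sender.toLowerCase(Locale.ROOT);
        return sender.endsWith("@" + domain)
                ? envelope.recipients
                : Collections.<String>emptyList();
    }
}
```

A container that only has the envelope (for example during the SMTP 
conversation) could call onEnvelope, while one that has a full Mail could 
build the Envelope from it and call the same method.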

> They have in common the thing that you don't like: the
> "onMessage/match/service" method, which takes a completed message.  The SMTP
> protocol handler is one source for such messages to the Mailet container.
> So is FetchMail.  So is the NNTP protocol handler.  So could be IMAP or
> Jabber.  Etc.
> 
> The Mailet pipeline deals with messages and routing, and is independent of
> protocol.  It could not care less how messages are inserted into the spool.
> As you noted, a Mailet Container is protocol independent.  There is a
> specific contract that the container has with its components.  The contract
> that exists with protocol handling containers and *their* components
> (in-protocol handlers) is different.  It would be entirely unreasonable (and
> quite poor design) to insist that all containers implement both contracts.

Can you summarize the 2 contracts?
I don't agree on the stateful/stateless and the asynch/synch differences 
you reported, but maybe there are other critical differences. We should 
really create a list of the peculiarities of the 2 scenarios.

>>> As I said, more and more we're moving to handle filtering in the protocol
>>> handlers, and have just "application" type code in the pipeline, e.g.,
>>> mailing list managers, delivery support, etc.
> 
>> But that is JAMES specific.  I am not interested in JAMES.
> 
> We have already said that if there is interest in the in-protocol handler
> API, that we'd be interested in exploring it, too.  But we would be talking
> about three things:
> 
>   - Mailet API
>   - Handler API
>   - Common Services
> 
> I'm curious to see if Common Services turns out to be the most contentious
> part.

Well, maybe we can split the whole thing into the 3 "packages" you define.
For me it means renaming "Mailet API" to "Common Services" in my mind 
and continuing to talk about how to provide common services that could be 
defined once and be used in the protocol, in the spoolmanager, or 
for the email client inbox rules.

Stefano

Re: Fast-Fail Operations in the Mailet API?

Posted by Stefano Bagnara <ap...@bago.org>.

Alan Williamson wrote:
> Danny Angus wrote:
>>> But to say it is outside of the current Mailet API, is wrong IMO.  The
>>> Mailet API is for the routing and handling of emails -- all we are
>>> talking about is allowing the developer to make this decision much
>>> faster/earlier.
>>
>> Actually the mailet API implies a particular process which starts with
>> the full content of the message having already entered the system
> 
> okay i can buy that.  As with the ServletAPI, i think we should consider 
> the possibility of both.  For example, the ServletAPI allows a developer to 
> process the incoming stream themselves; if they want to kill the 
> connection at any point, they can.

This is the very example I had in my mind :-)

> BTW - I don't suggest for a minute we have a "getMailetInputStream()" 
> method, as that is way too ugly for even consideration!  :)

I agree!

>> FastFail implies that an interest in the conversation carried out
>> during the protocol can result in decisions being taken *before* the
>> message has entered the system.
> 
> Well before the complete message has arrived.  But surely we are already 
> protocol dependent.  Think about it.
> 
> We have data that arrives on "MAIL FROM" and "RCPT TO" and that data is 
> presented in the Mailet API (aka the Matcher).
> 
> HOWEVER ... this information is not necessarily in the actual "DATA" 
> part.  Therefore, if one was to dump the mail packet to the file system 
> (using whatever) then a lot of this data is lost already.
> 
> So to say the Mailet API has nothing to do with protocol is 
> contradicting the current API and specification.  Is it not?
> 
> So my thought process is not to actually change anything, but to make 
> the Matcher stuff be called as soon as any email data arrives; not just 
> when they are all in.

You could create a partial Mail object as soon as the DATA command is 
issued. This would give you everything but the body. If the 
matcher/mailet tries to read the body, it will be forced to wait for the 
data to be streamed; otherwise it can do its own matching/processing 
right away. The main problem is that we can't do this, because until the 
data is completed we don't know whether that message has really arrived 
or not. We can't start routing a message until we are sure we "received" 
it (until we have received the CRLF.CRLF terminator and replied that we 
accepted it).
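
A minimal sketch of such a partial Mail, assuming a hypothetical 
PartialMail class (nothing like it exists in the current API) whose body 
is backed by a CompletableFuture that the protocol layer completes when 
the data has been fully received:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical partial mail: the envelope is known when DATA is issued,
// the body only becomes available once the transfer has finished.
final class PartialMail {
    final String sender;
    final List<String> recipients;
    private final CompletableFuture<String> body;

    PartialMail(String sender, List<String> recipients, CompletableFuture<String> body) {
        this.sender = sender;
        this.recipients = Collections.unmodifiableList(new ArrayList<>(recipients));
        this.body = body;
    }

    // True only once the whole message has been received successfully.
    boolean isComplete() {
        return body.isDone() && !body.isCompletedExceptionally();
    }

    // Blocks the caller until the body has been received.
    // Throws an unchecked exception if the transfer failed or was cancelled.
    String body() {
        return body.join();
    }
}
```

This only covers the "wait if you read the body" half. It does not solve 
the problem above: a container still must not act on routing decisions 
until isComplete() is true and the message has been accepted.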

What I think is that we should add the missing pieces to make this 
"behaviour" effective and working.

Stefano

Re: Fast-Fail Operations in the Mailet API?

Posted by Danny Angus <da...@gmail.com>.

On 11/13/06, Alan Williamson <al...@blog-city.com> wrote:

> We have data that arrives on "MAIL FROM" and "RCPT TO" and that data is
> presented in the Mailet API (aka the Matcher).

No, we have data which *may* have arrived at the boundary of our
system in that way but equally may not. It may have arrived thanks to
POP3's RETR, conceivably as a "Message/rfc822" attachment, or even
directly injected into the spool generated by another system.

It is not relevant to either thing that there is a shared concept of
"invoke decision making"; what is relevant is that both invocation
frameworks can share the same investment in decision making logic.
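
As a rough sketch of that sharing (every name below is hypothetical, and
plain strings stand in for addresses and SMTP replies), the decision
logic lives in one place and each invocation framework wraps it in its
own way:

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical common service: pure decision logic that knows nothing
// about who invokes it or when.
final class SenderBlocklist {
    private final Set<String> blockedDomains = new HashSet<>();

    SenderBlocklist block(String domain) {
        blockedDomains.add(domain.toLowerCase(Locale.ROOT));
        return this;
    }

    boolean isBlocked(String senderAddress) {
        int at = senderAddress.lastIndexOf('@');
        if (at < 0) {
            return false;
        }
        String domain = senderAddress.substring(at + 1).toLowerCase(Locale.ROOT);
        return blockedDomains.contains(domain);
    }
}

// In-protocol use: decide during the conversation, before any data arrives.
final class MailFromCheck {
    private final SenderBlocklist blocklist;

    MailFromCheck(SenderBlocklist blocklist) {
        this.blocklist = blocklist;
    }

    String onMailFrom(String sender) {
        return blocklist.isBlocked(sender) ? "550 sender rejected" : "250 OK";
    }
}

// Pipeline use: the same decision applied to a message already in the
// spool, however it got there.
final class BlockedSenderMatcher {
    private final SenderBlocklist blocklist;

    BlockedSenderMatcher(SenderBlocklist blocklist) {
        this.blocklist = blocklist;
    }

    boolean matches(String sender) {
        return blocklist.isBlocked(sender);
    }
}
```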

d.

Re: Fast-Fail Operations in the Mailet API?

Posted by Alan Williamson <al...@blog-city.com>.

Danny Angus wrote:
>> But to say it is outside of the current Mailet API, is wrong IMO.  The
>> Mailet API is for the routing and handling of emails -- all we are
>> talking about is allowing the developer to make this decision much
>> faster/earlier.
> 
> Actually the mailet API implies a particular process which starts with
> the full content of the message having already entered the system

okay i can buy that.  As with the ServletAPI, i think we should consider 
the possibility of both.  For example, the ServletAPI allows a developer to 
process the incoming stream themselves; if they want to kill the 
connection at any point, they can.

BTW - I don't suggest for a minute we have a "getMailetInputStream()" 
method, as that is way too ugly for even consideration!  :)

> FastFail implies that an interest in the conversation carried out
> during the protocol can result in decisions being taken *before* the
> message has entered the system.

Well before the complete message has arrived.  But surely we are already 
protocol dependent.  Think about it.

We have data that arrives on "MAIL FROM" and "RCPT TO" and that data is 
presented in the Mailet API (aka the Matcher).

HOWEVER ... this information is not necessarily in the actual "DATA" 
part.  Therefore, if one was to dump the mail packet to the file system 
(using whatever) then a lot of this data is lost already.

So to say the Mailet API has nothing to do with protocol is 
contradicting the current API and specification.  Is it not?

So my thought process is not to actually change anything, but to make 
the Matcher stuff be called as soon as any email data arrives; not just 
when they are all in.

Re: Fast-Fail Operations in the Mailet API?

Posted by Danny Angus <da...@gmail.com>.

> Add a means for a Matcher to specify that it doesn't care about the
> message but only about the envelope.

+1

> This would allow us to use it in systems where
> the envelope does not exist. At the same time add a means to specify
> that the matcher/mailet does not care about the envelope but only about
> the mimemessage (or only about the mime headers): This would be useful to
> write an "incoming mail rule engine based on mailet apis" for a java based
> email client.

+1

>
> Another improvement to the api would be to give access to the message as
> a stream or as a mime-tree of streams:

+10,000

> Adding "Fastfail" to the Mailet APIs means (to me) adding some more
> specific object/method/interface to allow processing of a partial (or
> better "specific") subset of the current Mail object, and not publishing
> the JAMES Server (incomplete) handlerapis, which are instead a command
> pattern we used to keep our smtpserver modular and easier to maintain.

I think the discussion is about how to invoke the processing, and that
requires us to think about how you would invoke bespoke processing
during the protocol handling. That's where the handler api comes in.


> WDYT? Is this something in the middle we could agree upon?

I think we already agree what the components are, we're just
disagreeing with Alan about whether or not the fast fail stuff should
be in the mailet API or a separate fast-fail/handler API.

d.

Re: Fast-Fail Operations in the Mailet API?

Posted by Stefano Bagnara <ap...@bago.org>.

Someone (with James Server knowledge) probably misunderstood my proposal 
to support fastfail operations in Mailet API.

I think that publishing what we are currently defining "handlerapi" in 
james server as a standard API is a bad idea.

At the same time I think that the Mailet APIs must provide a means to let 
"mailets" (or something else) better interact with scenarios where we 
don't have the full message.

And here are some examples of what I think we should/could add to the api.

Add a means for a Matcher to specify that it doesn't care about the 
message but only about the envelope. This would allow us to use it in 
systems where the envelope does not exist. At the same time add a means to 
specify that the matcher/mailet does not care about the envelope but only 
about the mimemessage (or only about the mime headers): This would be 
useful to write an "incoming mail rule engine based on mailet apis" for a 
java based email client.

Another improvement to the api would be to give access to the message as 
a stream or as a mime-tree of streams: this would allow us to write a 
more effective spam tool based on the mailet api. (I'm referring to 
something like the SpamAssassin rules and the way SA manages the targets 
of the checks.)

This would give us a lot of potential: we could parse the message 
once, in the container, and let multiple mailets use the parsed data; 
we could avoid even caring about the message if none of our mailets 
read it; and so on.
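
A minimal sketch of how a component might declare which parts it cares 
about, and how a container could use that to decide what to parse. The 
names (MailPart, PartInterest, PartPlanner) are hypothetical and only 
illustrate the idea:

```java
import java.util.Collection;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical breakdown of a Mail into independently available parts.
enum MailPart { ENVELOPE, MIME_HEADERS, MIME_BODY }

// Hypothetical declaration a matcher or mailet could implement.
interface PartInterest {
    Set<MailPart> requiredParts();
}

// Hypothetical container-side helper.
final class PartPlanner {
    private PartPlanner() {
    }

    // Union of the parts needed by all configured components.
    static Set<MailPart> needed(Collection<? extends PartInterest> components) {
        EnumSet<MailPart> needed = EnumSet.noneOf(MailPart.class);
        for (PartInterest component : components) {
            needed.addAll(component.requiredParts());
        }
        return needed;
    }

    // The body only has to be parsed if at least one component asks for it.
    static boolean mustParseBody(Collection<? extends PartInterest> components) {
        return needed(components).contains(MailPart.MIME_BODY);
    }
}
```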

Summary:

Adding "Fastfail" to the Mailet APIs means (to me) adding some more 
specific object/method/interface to allow processing of a partial (or 
better "specific") subset of the current Mail object, and not publishing 
the JAMES Server (incomplete) handlerapis, which are instead a command 
pattern we used to keep our smtpserver modular and easier to maintain.


WDYT? Is this something in the middle we could agree upon?

Stefano

Re: Fast-Fail Operations in the Mailet API?

Posted by Danny Angus <da...@gmail.com>.

On 11/13/06, Alan Williamson <al...@blog-city.com> wrote:

> But to say it is outside of the current Mailet API, is wrong IMO.  The
> Mailet API is for the routing and handling of emails -- all we are
> talking about is allowing the developer to make this decision much
> faster/earlier.

Actually the mailet API implies a particular process which starts with
the full content of the message having already entered the system

FastFail implies that an interest in the conversation carried out
during the protocol can result in decisions being taken *before* the
message has entered the system.

The decision making process may or may not be the same, and the
services used in the decision making are quite likely to be the same.

But crucially the initial assumptions are different, the scenarios are
different, the process to which the decision making is attached is
different and the possible outcomes of the decision making have a
significantly smaller scope.

d.

Re: Fast-Fail Operations in the Mailet API?

Posted by Stefano Bagnara <ap...@bago.org>.

Alan Williamson wrote:
> Noel i am going to ignore the majority of your email, because in essence 
> it boils down to your statement here:
> 
>> The Mailet pipeline deals with messages and routing, and is independent of
>> protocol.  It could not care less how messages are inserted into the spool.
> 
> I fully agree with this statement.  The key phrase here is "routing".

I agree also, but I want to make sure everyone is aware of the fact that 
the current API does not route anything: it matches a partial list of 
recipients (from 0 to all recipients).

We can evolve it to return a destination for each recipient, but this is 
not there: so it does not route anything. It simply matches now.
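
To illustrate "matches, does not route": a minimal sketch of what a 
container might do with a matcher result. RecipientSplit is a 
hypothetical helper and recipients are plain strings; the point is only 
that the result is a subset of the recipients, with no destination 
attached to any of them:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical container-side helper: partitions the recipients of a mail
// according to the subset a matcher returned.
final class RecipientSplit {
    final List<String> matched = new ArrayList<>();
    final List<String> unmatched = new ArrayList<>();

    private RecipientSplit() {
    }

    // A null or empty matcher result means "no recipient matched".
    static RecipientSplit of(Collection<String> recipients, Collection<String> matcherResult) {
        RecipientSplit split = new RecipientSplit();
        for (String recipient : recipients) {
            if (matcherResult != null && matcherResult.contains(recipient)) {
                split.matched.add(recipient);
            } else {
                split.unmatched.add(recipient);
            }
        }
        return split;
    }
}
```

Where the matched part goes next is decided by the container 
configuration, not by the matcher itself.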

Stefano

Re: Fast-Fail Operations in the Mailet API?

Posted by Alan Williamson <al...@blog-city.com>.

Noel i am going to ignore the majority of your email, because in essence 
it boils down to your statement here:

> The Mailet pipeline deals with messages and routing, and is independent of
> protocol.  It could not care less how messages are inserted into the spool.

I fully agree with this statement.  The key phrase here is "routing".

This isn't about conflating anything, this is about giving the necessary 
hooks within the API to intelligently decide on whether or not we wish 
to "route" this email any further through the system.

So forget about protocols, forget about where the email is coming from, 
it boils down to the simple question:

	"Do i want to accept this email?"

Now, ironically enough the Mailet API already answers this question in 
the Matcher classes.  So I am not sure where your arguments are coming from.

We need to move this processing/decision making further up the chain. 
We need to provide developers with the ability to get a peek at the 
message as soon as possible.

Yes, the examples you give FetchMail, IMAP (does JAMES even support this 
yet?), NNTP (which to be honest with you, anyone using Mailets/JavaMail 
to work with an NNTP server is just darn right crazy; there is one 
protocol that should never have been in the JavaMail API. We can duke 
that one out later!) will all still work.  If they do not wish to 
override/implement the interface then they don't.

But to say it is outside of the current Mailet API, is wrong IMO.  The 
Mailet API is for the routing and handling of emails -- all we are 
talking about is allowing the developer to make this decision much 
faster/earlier.

As for the container implementation: how the container wants to 
handle the non-SMTP cases is up to it.  For my own part, I am only 
interested in the 99.9999% of cases when emails originate via SMTP.

RE: Fast-Fail Operations in the Mailet API?

Posted by "Noel J. Bergman" <no...@devtech.com>.

Alan Williamson wrote:

> Looks like we are going to disagree on this issue

That depends on whether you want to agree on standardizing on an API for
in-protocol handlers or insist on conflating the requirements of two
distinct and separate containers.

> This sort of dogma bogged down the early days of the Servlet API; many
> argued that servlets were far more than HTTP.  They were right; however
> 99.999999% of all servlets are attached to the HTTP protocol.  Therefore
> it quickly made sense to push HTTP functionality into the specification.

The lifecycles of Servlets, HTTP Servlets, JSPs, even Portlets and SIP
Servlets are similar.  But let's take a look at these others ... consider:

  Matcher: asynch stateless message router based on recipient lists
           match(Mail) method
  Mailet:  asynch stateless message transformer/broker -- "the app"
           service(Mail) method

  In-protocol Handler API:
    Stateful and synchronous
    Arranged in a chain similar to a Filter Chain
    onConnect    -- connection from remote client
    on<Command>  -- receipt of a specific command
    onMessage    -- receipt of a completed message

They have in common the thing that you don't like: the
"onMessage/match/service" method, which takes a completed message.  The SMTP
protocol handler is one source for such messages to the Mailet container.
So is FetchMail.  So is the NNTP protocol handler.  So could be IMAP or
Jabber.  Etc.

The Mailet pipeline deals with messages and routing, and is independent of
protocol.  It could not care less how messages are inserted into the spool.
As you noted, a Mailet Container is protocol independent.  There is a
specific contract that the container has with its components.  The contract
that exists with protocol handling containers and *their* components
(in-protocol handlers) is different.  It would be entirely unreasonable (and
quite poor design) to insist that all containers implement both contracts.
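
To make the in-protocol contract concrete, here is a minimal sketch of a
stateful, synchronous handler chain along the lines listed above.  It is
not the existing JAMES handler code: Session, ProtocolHandler and
HandlerChain are hypothetical names, and only command dispatch is shown.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical per-connection state, shared by all handlers in the chain.
final class Session {
    final Map<String, Object> state = new HashMap<>();
    private String rejection; // null while the conversation is still accepted

    void reject(String reply) {
        rejection = reply;
    }

    boolean rejected() {
        return rejection != null;
    }

    String rejection() {
        return rejection;
    }
}

// Hypothetical handler contract; a handler overrides only what it needs.
interface ProtocolHandler {
    default void onConnect(Session session, String remoteAddress) {
    }

    default void onCommand(Session session, String verb, String argument) {
    }

    default void onMessage(Session session, String message) {
    }
}

// Hypothetical chain, similar in spirit to a filter chain.
final class HandlerChain {
    private final List<ProtocolHandler> handlers = new ArrayList<>();

    HandlerChain add(ProtocolHandler handler) {
        handlers.add(handler);
        return this;
    }

    // Calls each handler in order on the caller's thread and stops as soon
    // as the session has been rejected (the "fast fail").
    void command(Session session, String verb, String argument) {
        for (ProtocolHandler handler : handlers) {
            if (session.rejected()) {
                return;
            }
            handler.onCommand(session, verb, argument);
        }
    }
}
```

Note how much of this (the session, the per-command callbacks, the
rejection reply) has no counterpart in match(Mail) or service(Mail).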

> > As I said, more and more we're moving to handle filtering in the protocol
> > handlers, and have just "application" type code in the pipeline, e.g.,
> > mailing list managers, delivery support, etc.

> But that is JAMES specific.  I am not interested in JAMES.

We have already said that if there is interest in the in-protocol handler
API, that we'd be interested in exploring it, too.  But we would be talking
about three things:

  - Mailet API
  - Handler API
  - Common Services

I'm curious to see if Common Services turns out to be the most contentious
part.

	--- Noel