Posted to dev@httpd.apache.org by Jem Berkes <jb...@users.pc9.org> on 2005/07/16 23:58:23 UTC

mod_smtpd design - protocol

I want to focus a bit on mod_smtpd design, in particular the protocol 
module (which accepts connections and does the E/SMTP talking). I've seen 
various ideas thrown around on what exactly the module should do. It would 
be nice if we could come up with at least the high level design specs for 
this, just so we're all on the same page about what the module will do and 
what facilities it will offer to external module users.

Talking with bfrance on IRC gave me a better sense of what other developers 
are hoping the smtp module can provide. The competing technology seems to 
be Sendmail's milter interface which allows developers to hook their custom 
filters into various stages of SMTP transactions. Anyone using mod_smtpd 
for filtering purposes will want to hook into various stages of the SMTP 
transaction - anywhere between start and end, or specific middle commands.

So let me throw this out there as a starting point, because I don't think 
this has been documented yet.

The mod_smtpd protocol module accepts client connections and speaks E/SMTP, 
with default processing for all commands. For example, it has a default 
greeting upon connection and a default response to EHLO/HELO; by default it 
accepts any envelope sender in MAIL FROM and rejects any recipient in 
RCPT TO, to prevent an open relay configuration.

However the module should provide several hooks to allow another module to 
use smtp. Off the top of my head, we need at least these hooks:

- upon connection from some client
	User might introduce delay, lookup IP for RBL, customize greeting
- upon receiving HELO/EHLO from client
- upon receiving MAIL FROM
- upon receiving RCPT TO
etc
- upon receiving another command such as VRFY, RSET, NOOP
- upon receiving invalid command
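
A minimal sketch of how the hook points listed above could sit on top of the
default processing, in plain C with no httpd or APR dependencies. Every name
here (smtp_hook_point, smtp_hooks, smtp_run_hook, example_rcpt_hook) is
hypothetical and not taken from any mod_smtpd code. The reply codes are
assumed defaults, and lines are assumed to arrive with CRLF already stripped.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hook points, one per stage in the list above. */
typedef enum {
    SMTP_HOOK_CONNECT,
    SMTP_HOOK_HELO,     /* HELO or EHLO */
    SMTP_HOOK_MAIL,     /* MAIL FROM */
    SMTP_HOOK_RCPT,     /* RCPT TO */
    SMTP_HOOK_OTHER,    /* VRFY, RSET, NOOP */
    SMTP_HOOK_INVALID,
    SMTP_HOOK_COUNT
} smtp_hook_point;

/* A hook returns an SMTP reply code, or 0 to fall back to the default. */
typedef int (*smtp_hook_fn)(const char *line);

static smtp_hook_fn smtp_hooks[SMTP_HOOK_COUNT];

/* Case-insensitive check that line starts with verb (given in upper case)
 * and that the verb is followed by end of line, a space, or a colon. */
static int starts_with_verb(const char *line, const char *verb)
{
    size_t i;

    for (i = 0; verb[i] != '\0'; i++) {
        if (toupper((unsigned char)line[i]) != (unsigned char)verb[i])
            return 0;
    }
    return line[i] == '\0' || line[i] == ' ' || line[i] == ':';
}

/* Map one command line to the hook point that should run for it. */
static smtp_hook_point smtp_classify(const char *line)
{
    assert(line != NULL);
    if (starts_with_verb(line, "HELO") || starts_with_verb(line, "EHLO"))
        return SMTP_HOOK_HELO;
    if (starts_with_verb(line, "MAIL FROM"))
        return SMTP_HOOK_MAIL;
    if (starts_with_verb(line, "RCPT TO"))
        return SMTP_HOOK_RCPT;
    if (starts_with_verb(line, "VRFY") || starts_with_verb(line, "RSET")
        || starts_with_verb(line, "NOOP"))
        return SMTP_HOOK_OTHER;
    return SMTP_HOOK_INVALID;
}

/* Assumed defaults, used when no external module overrides a stage. */
static int smtp_default_reply(smtp_hook_point point)
{
    switch (point) {
    case SMTP_HOOK_CONNECT: return 220;  /* default greeting */
    case SMTP_HOOK_HELO:    return 250;
    case SMTP_HOOK_MAIL:    return 250;  /* accept any envelope sender */
    case SMTP_HOOK_RCPT:    return 550;  /* reject every recipient: no open relay */
    case SMTP_HOOK_OTHER:   return 250;  /* placeholder; real replies differ per command */
    default:                return 500;  /* unrecognized command */
    }
}

/* Run the registered hook for a stage, falling back to the default reply. */
static int smtp_run_hook(smtp_hook_point point, const char *line)
{
    int code = 0;

    assert(point < SMTP_HOOK_COUNT);
    if (smtp_hooks[point] != NULL)
        code = smtp_hooks[point](line);
    return code != 0 ? code : smtp_default_reply(point);
}

/* Example external module: accept mail for one made-up local domain. */
static int example_rcpt_hook(const char *line)
{
    return strstr(line, "@example.org>") != NULL ? 250 : 0;
}
```

With no hook registered, every RCPT TO gets the rejecting default; setting
smtp_hooks[SMTP_HOOK_RCPT] = example_rcpt_hook lets that one domain through.
DATA is left out of the sketch on purpose.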

I think this granularity is required, but I'm not sure how the DATA hook 
would work. To the two people who already have some code for smtp: are you 
coding something along these lines?

Re: mod_smtpd design - protocol

Posted by Joe Schaefer <jo...@sunstarsys.com>.

Joe Schaefer <jo...@sunstarsys.com> writes:

> "Jem Berkes" <jb...@users.pc9.org> writes:
>
>> I think this granularity is required. But I'm not
>> sure about how the DATA hook would work? 
>
> Have you considered using libapreq2 for parsing
> the mime headers in there?  The header parser
> should be really convenient for that, you could
> even introduce a post-header-parser hook that
> runs when the parser finishes.

The reason I bring it up is that libapreq2's
parsers aren't tied to a request_rec, they
include a hook API, and libapreq2 even
has a multipart parser as well. It seems
like all the important DATA events can 
be mapped directly to apreq parsers and
hooks.

-- 
Joe Schaefer

Re: mod_smtpd design - protocol

Posted by Joe Schaefer <jo...@sunstarsys.com>.

"Jem Berkes" <jb...@users.pc9.org> writes:

>> Have you considered using libapreq2 for parsing
>> the mime headers in there?  The header parser
>> should be really convenient for that, you could
>> even introduce a post-header-parser hook that
>> runs when the parser finishes.
>
> My own suggestion is that we don't touch or try to interpret MIME.
> Parsing the message headers into a table is straightforward but
> once you get into recognizing MIME you're moving out of the protocol
> realm and into the message format realm - and you start having to
> worry about messages within messages, boundaries, corrupt structures,
> and other things that I think are not mod_smtpd's problem.
>

Right, the message format realm is where a request_rec
(and associated input filters) might have a role.  It's 
also where you can change from reading single lines to 
reading entire blocks, thus cutting down on round trips
through the input filters.

-- 
Joe Schaefer

Re: mod_smtpd design - protocol

Posted by Jem Berkes <jb...@users.pc9.org>.

> Have you considered using libapreq2 for parsing
> the mime headers in there?  The header parser
> should be really convenient for that, you could
> even introduce a post-header-parser hook that
> runs when the parser finishes.

My own suggestion is that we don't touch or try to interpret MIME. Parsing 
the message headers into a table is straightforward but once you get into 
recognizing MIME you're moving out of the protocol realm and into the 
message format realm - and you start having to worry about messages within 
messages, boundaries, corrupt structures, and other things that I think are 
not mod_smtpd's problem.

I don't know if the other smtp project people share my opinion here.
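
To illustrate the "headers into a table" step mentioned above, here is a
minimal sketch in plain C. It handles only "Name: value" lines and folded
continuation lines, and deliberately knows nothing about MIME. The types and
functions (hdr_table, hdr_entry, hdr_feed) are hypothetical, the fixed sizes
are arbitrary, and lines are assumed to arrive with CRLF already stripped.
A real module would more likely use an APR table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HDR_MAX       16
#define HDR_NAME_LEN  64
#define HDR_VALUE_LEN 256

typedef struct {
    char name[HDR_NAME_LEN];
    char value[HDR_VALUE_LEN];
} hdr_entry;

typedef struct {
    hdr_entry e[HDR_MAX];
    size_t count;
} hdr_table;

/* Append n bytes of src to the string in dst (capacity cap), truncating
 * if needed. dst is always left NUL-terminated. */
static void append_bounded(char *dst, size_t cap, const char *src, size_t n)
{
    size_t used = strlen(dst);
    size_t room = cap - 1 - used;

    if (n > room)
        n = room;
    memcpy(dst + used, src, n);
    dst[used + n] = '\0';
}

/* Feed one header line. Returns 1 to keep going, 0 on the blank line that
 * ends the headers, and -1 on a malformed line or a full table. */
static int hdr_feed(hdr_table *t, const char *line)
{
    const char *colon;
    hdr_entry *h;

    assert(t != NULL && line != NULL);
    if (line[0] == '\0')
        return 0;

    if (line[0] == ' ' || line[0] == '\t') {
        /* Folded continuation: join onto the previous value. Leading
         * whitespace is collapsed to one space (a simplification). */
        if (t->count == 0)
            return -1;
        h = &t->e[t->count - 1];
        while (*line == ' ' || *line == '\t')
            line++;
        append_bounded(h->value, sizeof h->value, " ", 1);
        append_bounded(h->value, sizeof h->value, line, strlen(line));
        return 1;
    }

    colon = strchr(line, ':');
    if (colon == NULL || colon == line || t->count == HDR_MAX)
        return -1;

    h = &t->e[t->count++];
    h->name[0] = '\0';
    h->value[0] = '\0';
    append_bounded(h->name, sizeof h->name, line, (size_t)(colon - line));
    colon++;
    while (*colon == ' ' || *colon == '\t')
        colon++;
    append_bounded(h->value, sizeof h->value, colon, strlen(colon));
    return 1;
}
```

Everything after the blank line would be handed on untouched, which keeps
message-format concerns out of the protocol module.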

Re: mod_smtpd design - protocol

Posted by Joe Schaefer <jo...@sunstarsys.com>.

"Jem Berkes" <jb...@users.pc9.org> writes:

> I think this granularity is required. But I'm not
> sure about how the DATA hook would work? 

Have you considered using libapreq2 for parsing
the mime headers in there?  The header parser
should be really convenient for that, you could
even introduce a post-header-parser hook that
runs when the parser finishes.

-- 
Joe Schaefer

Re: mod_smtpd design - protocol

Posted by Joe Schaefer <jo...@sunstarsys.com>.

"Jem Berkes" <jb...@users.pc9.org> writes:

> The competing technology seems to be Sendmail's milter
> interface which allows developers to hook their custom
> filters into various stages of SMTP transactions. 

If you follow the design of httpd, then you want to create
an smtp_in filter that removes any extra '.'s.  Upon seeing
DATA, create a request_rec and parse the headers into 
r->headers_in.  Then add the smtp_in filter and add
request filters just like ap_invoke_handler does it.
Then process the rest of the input stream using AP_MODE_READBYTES,
letting smtp_in translate the final ".CRLF" into an eos
bucket.

That should give you enough flexibility that someone
else could write a mod_milter and provide most (if not
all) of the milter api.
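
A minimal sketch of the dot handling such an smtp_in filter would need,
written as plain C over already-split lines rather than as a real httpd
filter over bucket brigades. The names (smtp_unstuff_line, smtp_collect_body)
are hypothetical, and the saw_end flag only stands in for inserting an eos
bucket. It follows the SMTP transparency rule: a line consisting of a single
"." ends the data, and any other line that starts with "." has that first dot
removed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { SMTP_DATA_LINE, SMTP_DATA_END } smtp_data_status;

/* line is one DATA line with its CRLF stripped. On SMTP_DATA_LINE, *out
 * points at the payload with any stuffed leading dot removed. */
static smtp_data_status smtp_unstuff_line(const char *line, const char **out)
{
    assert(line != NULL && out != NULL);
    if (line[0] == '.') {
        if (line[1] == '\0') {
            *out = NULL;
            return SMTP_DATA_END;      /* the final "." line */
        }
        *out = line + 1;               /* drop the extra dot the client added */
        return SMTP_DATA_LINE;
    }
    *out = line;
    return SMTP_DATA_LINE;
}

/* Copy unstuffed lines into buf, re-adding CRLF after each one, until the
 * end marker, the end of the input, or a full buffer. Returns the number of
 * bytes written (excluding the NUL) and sets *saw_end if "." was seen. */
static size_t smtp_collect_body(const char *const *lines, size_t nlines,
                                char *buf, size_t cap, int *saw_end)
{
    size_t used = 0;
    size_t i;

    assert(lines != NULL && buf != NULL && saw_end != NULL && cap > 0);
    *saw_end = 0;
    buf[0] = '\0';
    for (i = 0; i < nlines; i++) {
        const char *payload;
        size_t n;

        if (smtp_unstuff_line(lines[i], &payload) == SMTP_DATA_END) {
            *saw_end = 1;
            break;
        }
        n = strlen(payload);
        if (used + n + 2 >= cap)       /* need room for CRLF and the NUL */
            break;
        memcpy(buf + used, payload, n);
        memcpy(buf + used + n, "\r\n", 2);
        used += n + 2;
        buf[used] = '\0';
    }
    return used;
}
```

A real filter would also have to cope with lines split across reads, which
this sketch sidesteps by taking whole lines as input.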

-- 
Joe Schaefer

Re: mod_smtpd design - protocol

Posted by Nick Kew <ni...@webthing.com>.

On Sat, 16 Jul 2005, Jem Berkes wrote:

> I want to focus a bit on mod_smtpd design, in particular the protocol
> module (which accepts connections and does the E/SMTP talking). I've seen
> various ideas thrown around on what exactly the module should do. It would
> be nice if we could come up with at least the high level design specs for
> this, just so we're all on the same page about what the module will do and
> what facilities it will offer to external module users.

Indeed, thanks for raising the subject.  Sorry I'm not being more
responsive, but I'm at ApacheCon and on the road with limited
time and connectivity for another week.

> So let me throw this out there as a starting point because I don't think
> this has been documented yet?
>
> The mod_smtpd protocol module accepts client connections and speaks E/SMTP,
> with default processing for all commands. e.g. it has a default greeting
> upon connection, a default response to EHLO/HELO, default accepts any
> envelope sender in MAIL FROM, default rejects any recipient in RCPT TO to
> prevent open relay configuration, etc.
>
> However the module should provide several hooks to allow another module to
> use smtp. Off the top of my head, we need at least these hooks:
>
> - upon connection from some client
> 	User might introduce delay, lookup IP for RBL, customize greeting
> - upon receiving HELO/EHLO from client
> - upon receiving MAIL FROM
> - upon receiving RCPT TO
> etc
> - upon receiving other command like VRFY, RSET, NOOP
> - upon receiving invalid command

Hmmmm ...

I had envisaged just an ap_hook_smtp_envelope rather than one for every
individual command.  But on reflection, since it has to respond to each
command individually, it needs to run a hook for each line.
Do you think it should have a bunch of different hooks, or a single
hook that lets modules simply return DECLINED on commands they're not
interested in?
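
For comparison, a minimal sketch of the single-hook variant in plain C. One
handler chain sees every command line, each module returns a reply code for
the commands it cares about and declines the rest, and the first handler that
does not decline wins. All names (SMTPD_DECLINED, smtpd_register,
smtpd_run_command, the two example handlers) are hypothetical; this only
mimics the idea of httpd's DECLINED convention and does not use the real hook
macros.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define SMTPD_DECLINED     (-1)  /* hypothetical stand-in for httpd's DECLINED */
#define SMTPD_MAX_HANDLERS 8

/* A handler returns an SMTP reply code, or SMTPD_DECLINED to pass. */
typedef int (*smtpd_cmd_handler)(const char *line);

static smtpd_cmd_handler smtpd_handlers[SMTPD_MAX_HANDLERS];
static size_t smtpd_handler_count;

/* Add a handler to the end of the chain. Returns 0, or -1 if full. */
static int smtpd_register(smtpd_cmd_handler fn)
{
    assert(fn != NULL);
    if (smtpd_handler_count == SMTPD_MAX_HANDLERS)
        return -1;
    smtpd_handlers[smtpd_handler_count++] = fn;
    return 0;
}

/* Offer the line to each handler in order; the first reply wins. */
static int smtpd_run_command(const char *line)
{
    size_t i;

    assert(line != NULL);
    for (i = 0; i < smtpd_handler_count; i++) {
        int code = smtpd_handlers[i](line);
        if (code != SMTPD_DECLINED)
            return code;
    }
    return 500;  /* stand-in for the protocol module's default processing */
}

/* Case-insensitive prefix test; prefix must be given in upper case. */
static int has_prefix_ci(const char *line, const char *prefix)
{
    size_t i;

    for (i = 0; prefix[i] != '\0'; i++) {
        if (toupper((unsigned char)line[i]) != (unsigned char)prefix[i])
            return 0;
    }
    return 1;
}

/* Example module that only cares about NOOP. */
static int noop_handler(const char *line)
{
    return has_prefix_ci(line, "NOOP") ? 250 : SMTPD_DECLINED;
}

/* Example module that only cares about RCPT TO and rejects it. */
static int rcpt_handler(const char *line)
{
    return has_prefix_ci(line, "RCPT TO:") ? 550 : SMTPD_DECLINED;
}
```

The trade-off is that every module has to parse the command line itself,
whereas per-command hooks let the protocol module do that once.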

> I think this granularity is required. But I'm not sure about how the DATA
> hook would work? Among the two people who already have some code for smtp,
> are you coding something along these lines?

After DATA it's basically got headers and contents, and falls into
the HTTP processing path.  That's the beauty of doing this: we get
to reuse the existing architecture.  I guess it needs additional
spooling stuff as previously discussed, though.

-- 
Nick Kew