Posted to dev@httpd.apache.org by Rian Hunter <ri...@MIT.EDU> on 2005/07/18 06:44:57 UTC

Initial mod_smtpd code.

Hi,

This is my first attempt at writing an experimental version of  
mod_smtpd. I don't have svn access yet, so this code can be
downloaded from http://rian.merseine.nu/mod_smtpd-0.1.tar.gz.

This implementation shows my vision for mod_smtpd and it isn't  
perfect, so please test/look at the code and think about where its  
method of extensibility works and doesn't work. This implementation  
is different from Jem's view of hooks for each smtp command (I do
something similar but not so drastic as hooks for each smtp command:
I have an apr_hash_t with function pointers for each different smtp
command).
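
A minimal sketch of that table-of-function-pointers idea, for readers without the tarball at hand. It uses a plain array in place of the apr_hash_t so it stands alone, and the handler names, the handler signature, and smtp_dispatch() are hypothetical, not taken from mod_smtpd-0.1:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical handler signature: receives the text after the command
 * verb and returns an SMTP reply code. */
typedef int (*smtp_handler_fn)(const char *args);

static int handle_helo(const char *args) { (void)args; return 250; }
static int handle_mail(const char *args) { (void)args; return 250; }
static int handle_quit(const char *args) { (void)args; return 221; }

/* Stand-in for the apr_hash_t: maps a lowercased verb to its handler. */
struct smtp_command {
    const char *verb;
    smtp_handler_fn handler;
};

static const struct smtp_command smtp_commands[] = {
    { "helo", handle_helo },
    { "mail", handle_mail },
    { "quit", handle_quit },
};

/* Look up the verb and run its handler; 500 is the SMTP reply for an
 * unrecognized command. */
int smtp_dispatch(const char *verb, const char *args)
{
    size_t i;

    for (i = 0; i < sizeof(smtp_commands) / sizeof(smtp_commands[0]); i++) {
        if (strcmp(smtp_commands[i].verb, verb) == 0) {
            return smtp_commands[i].handler(args);
        }
    }
    return 500;
}
```

Adding a command in this layout means adding one table entry, which is the extensibility trade-off compared with one hook per command.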

Jem/Paul/Nick: I'm especially interested in what you think about the  
design I've laid out in this implementation.
-rian

Re: Initial mod_smtpd code.

Posted by Jem Berkes <jb...@users.pc9.org>.
> Jem/Paul/Nick: I'm especially interested in what you think about the
> design I've laid out in this implementation.

I'll try this out today and send my feedback.

With respect to hooking every command, the reason I suggested that is to 
offer some useful facilities to those writing filter modules. It may work 
with the way you've laid it out too; I'll check it from that angle.



Re: Initial mod_smtpd code.

Posted by Joe Schaefer <jo...@sunstarsys.com>.
Joe Schaefer <jo...@sunstarsys.com> writes:

> Rian Hunter <ri...@MIT.EDU> writes:
>
>> I think this requires some more thought considering different
>> smtp connections and server requirements. The main drawback to
>> sub-requesting each rcpt to is that we have two different
>> handlers trying to read data from the socket. Is this problem
>> solved by spooling the data, and letting the two separate
>> requests read from the spool bucket?
>
> Hmm, what would the smtp return status for DATA be,
> if only some of the RCPT_TO addresses are handled
> successfully?
>
> I've been assuming the http analog of "RCPT_TO: <fo...@bar>"
> was "POST: /foo\nHost: bar"  but I now think that's wrong
> from a resource identifier standpoint.

OTOH, maybe we should just return success in this case,
and only retry/bounce the failed subrequests later on.

-- 
Joe Schaefer


Re: Initial mod_smtpd code.

Posted by Joe Schaefer <jo...@sunstarsys.com>.
Rian Hunter <ri...@MIT.EDU> writes:

> I think this requires some more thought considering different
> smtp connections and server requirements. The main drawback to
> sub-requesting each rcpt to is that we have two different
> handlers trying to read data from the socket. Is this problem
> solved by spooling the data, and letting the two separate
> requests read from the spool bucket?

Hmm, what would the smtp return status for DATA be,
if only some of the RCPT_TO addresses are handled
successfully?

I've been assuming the http analog of "RCPT_TO: <fo...@bar>"
was "POST: /foo\nHost: bar"  but I now think that's wrong
from a resource identifier standpoint.

-- 
Joe Schaefer


Re: Initial mod_smtpd code.

Posted by Matthieu Estrade <me...@apache.org>.
Rian Hunter wrote:

>
> On Jul 19, 2005, at 6:51 AM, Nick Kew wrote:
>
>>> the problem i found when i did my poc is when there is in the command,
>>> different destination email. It's difficult here to keep the
>>> virtualHost scheme.
>>> It would be nice to keep a conf file like
>>>
>>> <virtualHost>
>>>     ServerName mail.bla.com
>>>     SmtpUserMap mail.bla.com-user.map
>>>     or SmtpRelay host
>>>     ....
>>> </virtualHost>
>>>
>>
>> Yes.  I think there is a logical difficulty here.   smtpd_create_request
>> is run before process_smtp_connection_internal, and the design doesn't
>> allow for multiple recipients to be processed differently.
>>
>> My own feeling was that multiple recipients require a request each.
>> Maybe we could get that in your design by creating a subrequest
>> for each RCPT TO?
>
>
> This is possible but I'm not sure what the advantage is. Would you  
> mind setting up a hypothetical situation where this is necessary? I  
> figure that ultimately the handler will be responsible for dealing  
> with each rcpt to differently.
>
> Early on I wanted configuration possibility similar to:
> #########
> Listen 21
> <VirtualHost *:21>
>     # mod_smtpd conf
>       SmtpProtocol On
>     SetHandler unix-module
>
>     # mod_smtpd_unix
>     AcceptDomains thelaststop.net www.thelaststop.net
>     Relay On
>
>     # mod_smtpd_easyfilter
>     <Filter>
>         # matches against email in MAIL TO: smtp command
>         RegexMailTo "/thelaststop.net$/"
>         SetHandler maildir-module
>
>         # mod_smtpd_maildir conf
>         MaildirBase "/usr/local/virtual"
>         MailboxLimit 50M
>     </Filter>
>
>     # Simple spam filter using mod_smtpd_easyfilter
>     # Default handler does nothing with mail message
>     <Filter>
>         RegexHeader "Subject" "/cialis/"
>         SetHandler none
>     </Filter>
> </VirtualHost>
> ##########
>
imho, your conf is too complex.
first, it would be good to create some <VirtualDomain bla.org> or 
<VirtualDomain *>
and in each tag, put some process and filter setup.

<VirtualDomain apache.org>
ServerAlias prc.apache.org apr.apache.org

SmtpUser pam (or ldap or mysql etc like auth in httpd)
SmtpSpool /var/spool/mail (directive from mod smtp core)
or
SmtpRelay 1.2.3.4
SmtpRelay 1.3.4.5 (With something

(then here directive for other smtp modules

</VirtualDomain>

if we consider the SmtpUser as a function hooked in something created 
specially, like ap_hook_smtp_user, we run this function to find the user 
inside the local/ldap/pam or whatever is chosen. maybe a user profile 
provider.
We could also have a hook to register new commands, like we could 
register new methods in http.

it could be:

command 1 -> processed by function for this command
command 2 -> processed by function for this command
rcpt to: toto@tata.com -> get_domain then get_user(domain) then register 
user and action to do (maybe in an array) (SmtpSpool or SmtpRelay 
function, maybe an ap_hook_smtp_action)
....
command y -> processed by function for this command
end command

DATA

Here we execute actions (registered with users and domain before, as 
filter or as hook)
user 1 -> Relay -> execute relay
user 2 -> Local Delivery -> put mail inside Spool
user 3 -> Relay etc...

imho, delivering, relaying etc could be done sequentially

Maybe we could have inside the user array, something ordered per vdomain 
because they have the same action.
if user1@domain-local user2@domain-to-relay user3@domain-relay
We could directly relay user2 and user3 in the same action and deliver 
locally for user1 inside domain-local.
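
A rough sketch of the register-then-execute flow described above, under stated assumptions: the struct and function names are hypothetical, the policy is reduced to "one local domain, everything else is relayed", and the domain comparison is a plain strcmp although real domain names compare case-insensitively:

```c
#include <stddef.h>
#include <string.h>

enum smtp_action { ACTION_LOCAL, ACTION_RELAY };

/* One entry per accepted RCPT TO, filled in before DATA arrives. */
struct smtp_rcpt {
    const char *address;
    enum smtp_action action;
};

/* Return the domain part of an address, or NULL if there is no '@'. */
const char *get_domain(const char *address)
{
    const char *at = strrchr(address, '@');
    return at != NULL ? at + 1 : NULL;
}

/* RCPT TO step: record the recipient and the action to run later.
 * Returns 0 on success, -1 for a bad address or a full list. */
int register_rcpt(struct smtp_rcpt *list, size_t *count, size_t max,
                  const char *address, const char *local_domain)
{
    const char *domain = get_domain(address);

    if (domain == NULL || *count >= max) {
        return -1;
    }
    list[*count].address = address;
    list[*count].action = strcmp(domain, local_domain) == 0
                              ? ACTION_LOCAL : ACTION_RELAY;
    (*count)++;
    return 0;
}

/* DATA step helper: count the recipients sharing one action, so they
 * can be handled together in a single pass. */
size_t count_action(const struct smtp_rcpt *list, size_t count,
                    enum smtp_action action)
{
    size_t i, n = 0;

    for (i = 0; i < count; i++) {
        if (list[i].action == action) {
            n++;
        }
    }
    return n;
}
```

With the three example recipients above, user2 and user3 end up in the relay group and user1 in the local group.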

Cheers,

Matthieu

> In the case of this httpd.conf embedded filtering mechanism, I  
> figured the handlers could be changed based on the certain RegExs  
> right before ap_run_handler() was called. Maybe ap_run_fixups() could  
> be called, and my hypothetical mod_smtpd_easyfilter would have a  
> fixups hook where it accomplished something similar to this  
> situation. Although after thinking about it I realize now that  
> mod_smtpd_easyfilter couldn't set different handlers for different  
> rcpt tos. Is this what you meant?
>
> I think this requires some more thought considering different smtp  
> connections and server requirements. The main drawback to
> sub-requesting each rcpt to is that we have two different handlers trying
> to read data from the socket. Is this problem solved by spooling the  
> data, and letting the two separate requests read from the spool bucket?
> -rian
>


Re: Initial mod_smtpd code.

Posted by Rian Hunter <ri...@MIT.EDU>.
On Jul 19, 2005, at 6:51 AM, Nick Kew wrote:
>> the problem i found when i did my poc is when there is in the command,
>> different destination email. It's difficult here to keep the
>> virtualHost scheme.
>> It would be nice to keep a conf file like
>>
>> <virtualHost>
>>     ServerName mail.bla.com
>>     SmtpUserMap mail.bla.com-user.map
>>     or SmtpRelay host
>>     ....
>> </virtualHost>
>>
>
> Yes.  I think there is a logical difficulty here.  smtpd_create_request
> is run before process_smtp_connection_internal, and the design doesn't
> allow for multiple recipients to be processed differently.
>
> My own feeling was that multiple recipients require a request each.
> Maybe we could get that in your design by creating a subrequest
> for each RCPT TO?

This is possible but I'm not sure what the advantage is. Would you  
mind setting up a hypothetical situation where this is necessary? I  
figure that ultimately the handler will be responsible for dealing  
with each rcpt to differently.

Early on I wanted configuration possibility similar to:
#########
Listen 21
<VirtualHost *:21>
     # mod_smtpd conf
       SmtpProtocol On
     SetHandler unix-module

     # mod_smtpd_unix
     AcceptDomains thelaststop.net www.thelaststop.net
     Relay On

     # mod_smtpd_easyfilter
     <Filter>
         # matches against email in MAIL TO: smtp command
         RegexMailTo "/thelaststop.net$/"
         SetHandler maildir-module

         # mod_smtpd_maildir conf
         MaildirBase "/usr/local/virtual"
         MailboxLimit 50M
     </Filter>

     # Simple spam filter using mod_smtpd_easyfilter
     # Default handler does nothing with mail message
     <Filter>
         RegexHeader "Subject" "/cialis/"
         SetHandler none
     </Filter>
</VirtualHost>
##########

In the case of this httpd.conf embedded filtering mechanism, I  
figured the handlers could be changed based on the certain RegExs  
right before ap_run_handler() was called. Maybe ap_run_fixups() could  
be called, and my hypothetical mod_smtpd_easyfilter would have a  
fixups hook where it accomplished something similar to this  
situation. Although after thinking about it I realize now that  
mod_smtpd_easyfilter couldn't set different handlers for different  
rcpt tos. Is this what you meant?

I think this requires some more thought considering different smtp  
connections and server requirements. The main drawback to
sub-requesting each rcpt to is that we have two different handlers trying
to read data from the socket. Is this problem solved by spooling the  
data, and letting the two separate requests read from the spool bucket?
-rian

Re: Initial mod_smtpd code.

Posted by Nick Kew <ni...@webthing.com>.
> Hi Rian,

Useful start: you seem to have dealt with the core SMTP stuff:-)

> I like how the code is done. I am not sure a hook for each smtp command
> is the good solution. Adding a new command here is very simple and quick.

That's my feeling too.  But I'm happy with the way Rian has done it -
and Jem was IIRC in favour of the same approach.

> the problem i found when i did my poc is when there is in the command,
> different destination email. It's difficult here to keep the virtualHost
> scheme.
> It would be nice to keep a conf file like
>
> <virtualHost>
>     ServerName mail.bla.com
>     SmtpUserMap mail.bla.com-user.map
>     or SmtpRelay host
>     ....
> </virtualHost>

Yes.  I think there is a logical difficulty here.  smtpd_create_request
is run before process_smtp_connection_internal, and the design doesn't
allow for multiple recipients to be processed differently.

My own feeling was that multiple recipients require a request each.
Maybe we could get that in your design by creating a subrequest
for each RCPT TO?

We should also deal with spooling.  Maybe a protocol-level input
filter can do that.  I still like the idea of a spool bucket
(being a long-life file bucket with crash/recovery capabilities
and reference counting), in which case the input filter would
simply convert everything to that.  That would then run before
creating any subrequests.

-- 
Nick Kew


Re: Initial mod_smtpd code.

Posted by Matthieu Estrade <me...@apache.org>.
Hi Rian,

I like how the code is done. I am not sure a hook for each smtp command 
is a good solution. Adding a new command here is very simple and quick.
The problem I found when I did my proof of concept is when the commands 
name different destination emails. It's difficult then to keep the 
virtualHost scheme.
It would be nice to keep a conf file like

<virtualHost>
    ServerName mail.bla.com
    SmtpUserMap mail.bla.com-user.map
    or SmtpRelay host
    ....
</virtualHost>

Matthieu


Rian Hunter wrote:

> Hi,
>
> This is my first attempt at writing an experimental version of  
> mod_smtpd. I don't have svn access yet, so this code can be
> downloaded from http://rian.merseine.nu/mod_smtpd-0.1.tar.gz.
>
> This implementation shows my vision for mod_smtpd and it isn't  
> perfect, so please test/look at the code and think about where its  
> method of extensibility works and doesn't work. This implementation  
> is different from Jem's view of hooks for each smtp command (i do  
> something similar but not so drastic as hooks for each smtp command,  
> i have a apr_hash_t with function pointers for each different smtp  
> command).
>
> Jem/Paul/Nick: I'm especially interested in what you think about the  
> design I've laid out in this implementation.
> -rian
>


Re: Initial mod_smtpd code.

Posted by Jem Berkes <jb...@users.pc9.org>.
> But is anyone dealing with outgoing SMTP via a proxy_smtp in the
> mod_proxy framework?  I think you were discussing that a short while
> ago, weren't you?  I think that might be higher priority.

I hesitated on that because I did not understand at all how mod_proxy fits 
into this. i.e. I don't see how the proxy mechanism helps in relaying out 
mail to other SMTP servers.

Here's an idea on how we can start on the outbound SMTP side of things:

I can start work on a "mod_smtp_relay" which takes an email and attempts to 
relay it via the appropriate MX relay. This involves some DNS queries for 
MX records, and making new TCP connections to another SMTP server. I 
recommend that mod_smtp_relay does not itself do any spooling or queueing, 
to isolate these tasks. i.e. some other delivery/scheduler will handle 
spooling and retries etc, and occasionally pass an email to mod_smtp_relay

So given an input message, mod_smtp_relay would make an immediate relay 
attempt and then return success (sent) or an error describing where along 
the lines things went wrong -- it could be DNS, TCP connect failure, or 
SMTP error dictating permanent failure or temporary failure, needing retry.
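
The result reporting described above could be sketched roughly as follows. The enum and function names are hypothetical, and treating DNS and connect failures as retryable is a policy assumption, not something decided in this thread. The reply-code split follows the SMTP convention that 2xx means success, 4xx a transient failure, and 5xx a permanent one:

```c
/* Where along the line a single relay attempt ended. */
enum relay_status {
    RELAY_SENT,
    RELAY_DNS_FAILURE,
    RELAY_CONNECT_FAILURE,
    RELAY_SMTP_TEMPORARY,
    RELAY_SMTP_PERMANENT
};

/* Classify the remote server's final reply code. Anything that is
 * neither 2xx nor 4xx is treated as permanent, a simplification. */
enum relay_status relay_status_from_reply(int code)
{
    if (code >= 200 && code < 300) {
        return RELAY_SENT;
    }
    if (code >= 400 && code < 500) {
        return RELAY_SMTP_TEMPORARY;
    }
    return RELAY_SMTP_PERMANENT;
}

/* Tell the (separate) scheduler whether queueing a retry makes sense. */
int relay_should_retry(enum relay_status status)
{
    return status == RELAY_DNS_FAILURE ||
           status == RELAY_CONNECT_FAILURE ||
           status == RELAY_SMTP_TEMPORARY;
}
```

Keeping the retry decision in a small helper like this leaves the spooling and scheduling entirely outside the relay module.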



Re: Initial mod_smtpd code.

Posted by Jem Berkes <jb...@users.pc9.org>.
>     Overall blacklists aren't that effective and cause a lot of false
> positives.  They may make sense in the case of something like
> SpamAssassin which uses a blacklist in conjunction with other false
> positives,  but by themselves they really aren't a responsible way of
> dealing with the spam problem.  I think it's better to discourage "worst
> practices" than to succumb to plugin mania.

Blocklists aren't fundamentally broken, they are a tool which can be used 
properly or misused (just like many other tools).

Many admins choose to maintain their own DNSBLs for one reason or another. 
It may be a way to control relay access based on their own subscriber IP 
addresses. At my site we keep a record of IPs that have persistently 
abused our site over the past few days.

i.e. DNSBL != (SPEWS or MAPS or whatever)



Re: Initial mod_smtpd code.

Posted by Nick Kew <ni...@webthing.com>.
On Wed, 20 Jul 2005, Paul A Houle wrote:

> Jem Berkes wrote:
>
> >
> >I could also start work on a mod_smtpd_dnsbl if the mentors feel that is
> >worthwhile? This would look up a connecting IP address against a blacklist
> >and return a descriptive string to mod_smtpd if the client should be
> >rejected with an error: "550 5.7.1 Email rejected because 127.0.0.2 is
> >listed by sbl-xbl.spamhaus.org"
> >
> >I'd also like to include support for RHSBL, a newer type of listing by
> >domain names from the envelope sender address. That's used by a growing
> >number of projects.
> >
> >
>     Overall blacklists aren't that effective and cause a lot of false
> positives.

The issue is about developing enabling technology.  Jem's proposals
are for a couple of modules we'd like to see developed.  A spamassassin
filter is also on the wishlist, but availability of solutions that avoid
the formidable spamassassin overhead seems to me a Good Thing.

Perhaps the answer is for Jem's modules to have options either to
reject outright or to accept but increment a score that can then
be aggregated with results from other modules.
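
One way the reject-or-score option might look, as a sketch only. The struct, the mode names, and the threshold parameter are all hypothetical; nothing like this exists in the posted code:

```c
#include <stddef.h>

enum check_mode {
    CHECK_REJECT,  /* a hit rejects the message outright */
    CHECK_SCORE    /* a hit only adds weight to a running score */
};

/* Result reported by one checking module (DNSBL, RHSBL, ...). */
struct check_result {
    int hit;               /* non-zero if the module matched */
    enum check_mode mode;  /* how the admin configured this module */
    int weight;            /* score added on a hit in CHECK_SCORE mode */
};

/* Returns 1 if the message should be rejected, 0 if accepted. */
int should_reject(const struct check_result *results, size_t count,
                  int threshold)
{
    size_t i;
    int score = 0;

    for (i = 0; i < count; i++) {
        if (!results[i].hit) {
            continue;
        }
        if (results[i].mode == CHECK_REJECT) {
            return 1;
        }
        score += results[i].weight;
    }
    return score >= threshold;
}
```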

-- 
Nick Kew


Re: Initial mod_smtpd code.

Posted by Paul A Houle <ph...@cornell.edu>.
Jem Berkes wrote:

>
>I could also start work on a mod_smtpd_dnsbl if the mentors feel that is 
>worthwhile? This would look up a connecting IP address against a blacklist 
>and return a descriptive string to mod_smtpd if the client should be 
>rejected with an error: "550 5.7.1 Email rejected because 127.0.0.2 is 
>listed by sbl-xbl.spamhaus.org"
>
>I'd also like to include support for RHSBL, a newer type of listing by 
>domain names from the envelope sender address. That's used by a growing 
>number of projects.
>  
>
    Overall blacklists aren't that effective and cause a lot of false 
positives.  They may make sense in the case of something like 
SpamAssassin which uses a blacklist in conjunction with other false 
positives,  but by themselves they really aren't a responsible way of 
dealing with the spam problem.  I think it's better to discourage "worst 
practices" than to succumb to plugin mania.

Re: Initial mod_smtpd code.

Posted by Nick Kew <ni...@webthing.com>.
On Tue, 19 Jul 2005, Jem Berkes wrote:

> > Hmm. That sounds like a good idea, maybe there already is a hook
> > defined that could deal with this, I'll look into it.
>
> I could also start work on a mod_smtpd_dnsbl if the mentors feel that is
> worthwhile? This would look up a connecting IP address against a blacklist
> and return a descriptive string to mod_smtpd if the client should be
> rejected with an error: "550 5.7.1 Email rejected because 127.0.0.2 is
> listed by sbl-xbl.spamhaus.org"
>
> I'd also like to include support for RHSBL, a newer type of listing by
> domain names from the envelope sender address. That's used by a growing
> number of projects.

Happy to see you do either/both of those: it's all part of what
we want to see.

But is anyone dealing with outgoing SMTP via a proxy_smtp in the
mod_proxy framework?  I think you were discussing that a short while
ago, weren't you?  I think that might be higher priority.

-- 
Nick Kew


Re: Initial mod_smtpd code.

Posted by Jem Berkes <jb...@users.pc9.org>.
> Hmm. That sounds like a good idea, maybe there already is a hook
> defined that could deal with this, I'll look into it.

I could also start work on a mod_smtpd_dnsbl if the mentors feel that is 
worthwhile? This would look up a connecting IP address against a blacklist 
and return a descriptive string to mod_smtpd if the client should be 
rejected with an error: "550 5.7.1 Email rejected because 127.0.0.2 is 
listed by sbl-xbl.spamhaus.org"
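
For reference, a DNSBL lookup for an IPv4 client works by reversing the address octets and appending the list's zone, then querying that name for an A record. A small sketch of building the query name follows; the function name is hypothetical and the DNS query itself is left out:

```c
#include <stdio.h>

/* Build the DNSBL query name for a dotted-quad IPv4 address:
 * reverse the octets and append the zone.
 * Returns 0 on success, -1 if ip is not a dotted quad or out is
 * too small to hold the result. */
int dnsbl_query_name(const char *ip, const char *zone,
                     char *out, size_t outlen)
{
    unsigned int a, b, c, d;
    char trailing;
    int n;

    /* The trailing %c catches junk after the fourth octet. */
    if (sscanf(ip, "%u.%u.%u.%u%c", &a, &b, &c, &d, &trailing) != 4) {
        return -1;
    }
    if (a > 255 || b > 255 || c > 255 || d > 255) {
        return -1;
    }
    n = snprintf(out, outlen, "%u.%u.%u.%u.%s", d, c, b, a, zone);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

If the resulting name resolves, the client is listed and the module can hand the rejection string back to mod_smtpd.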

I'd also like to include support for RHSBL, a newer type of listing by 
domain names from the envelope sender address. That's used by a growing 
number of projects.



Re: Initial mod_smtpd code.

Posted by Rian Hunter <ri...@MIT.EDU>.
> Nifty! I had some compilation problems involving regex, so in the  
> attached patch I use ap_regex.h and change some defines. Hope this  
> doesn't break anything.

that was a good idea, ap_regex.h was implicitly getting included for me.

>
> The other bug I partially fixed was, strstr in smtp_protocol.c only  
> does exact matches so uppercase commands like MAIL FROM would fail.  
> I added support for the upper case, but this needs to be improved  
> still because mixed case doesn't work. Is there an APR function  
> like stristr?

agh i forgot about an upper case FROM: or TO:.  
process_smtp_connection_internal() already lowercases the actual  
command, but not the rest of the params. Either MAIL from: or RCPT  
to: would work, but not FROM or TO. Anyway, stristr can be easily  
hacked by calling strstr with two lowercased strings.
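
A minimal sketch of that hack, assuming plain malloc for the temporary copies (inside a module the connection or request pool would be the natural choice). The name smtpd_stristr is hypothetical:

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated lowercased copy of s, or NULL on
 * allocation failure. The caller frees it. */
static char *lower_copy(const char *s)
{
    size_t i, len = strlen(s);
    char *out = malloc(len + 1);

    if (out == NULL) {
        return NULL;
    }
    for (i = 0; i < len; i++) {
        out[i] = (char)tolower((unsigned char)s[i]);
    }
    out[len] = '\0';
    return out;
}

/* Case-insensitive strstr: lowercase both strings, run strstr on the
 * copies, then map the hit back into the original haystack.
 * Returns NULL when there is no match or when allocation fails. */
const char *smtpd_stristr(const char *haystack, const char *needle)
{
    char *h = lower_copy(haystack);
    char *n = lower_copy(needle);
    const char *result = NULL;

    if (h != NULL && n != NULL) {
        const char *hit = strstr(h, n);
        if (hit != NULL) {
            result = haystack + (hit - h);
        }
    }
    free(h);
    free(n);
    return result;
}
```

This handles FROM:, from:, and mixed case such as FrOm: in the same way.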

> The overall structure and the approach you took is very nice, easy  
> to understand. I would recommend adding a hook immediately upon the  
> client connection, because an external module (maybe for DNSBLs, or  
> some rate limiting control) might not even want us to return a  
> greeting at all -- i.e. close with "554 Service unavailable" right  
> away.

Hmm. That sounds like a good idea, maybe there already is a hook  
defined that could deal with this, I'll look into it.

> But I like what you have, would be happy to keep working around  
> this design.

Thanks! Do you now have svn access? After I apply your patches I'll  
see about checking this in so more of us can deal with this code. It  
sucks that most are busy with ApacheCon right now.
-rian

Re: Initial mod_smtpd code.

Posted by Jem Berkes <jb...@users.pc9.org>.
> This is my first attempt at writing an experimental version of mod_smtpd. I 
> don't have svn access yet, so this code can be downloaded from 
> http://rian.merseine.nu/mod_smtpd-0.1.tar.gz.

Nifty! I had some compilation problems involving regex, so in the attached 
patch I use ap_regex.h and change some defines. Hope this doesn't break 
anything.

The other bug I partially fixed was, strstr in smtp_protocol.c only does 
exact matches so uppercase commands like MAIL FROM would fail. I added 
support for the upper case, but this needs to be improved still because 
mixed case doesn't work. Is there an APR function like stristr?

The overall structure and the approach you took is very nice, easy to 
understand. I would recommend adding a hook immediately upon the client 
connection, because an external module (maybe for DNSBLs, or some rate 
limiting control) might not even want us to return a greeting at all -- 
i.e. close with "554 Service unavailable" right away.

But I like what you have, would be happy to keep working around this 
design.