Posted to server-dev@james.apache.org by "Noel J. Bergman" <no...@devtech.com> on 2003/06/10 02:38:43 UTC

James as a Mailing List Delivery Agent

I am moving this discussion to the James Developer List, since none of it is
considered sensitive.  Messages ordered from bottom (oldest) to top.  Mark
Imel, Jason Webb, Craig Mattson, and others interested in mailing list
management with James please take note.

Feedback, corrections, etc. are all actively solicited.

Oh, but guys ... please do NOT reply with this message embedded.  It is long
enough as it is.  :-)  Just trim quotes, if any are needed at all.

----------------------

> Any realistic performance test has to account for the fact that
> SMTP deliveries take anywhere from seconds to minutes depending
> on network connections; the average for a fresh message seems to
> be about 5 seconds, based on the fact that on daedalus qmail's
> max concurrency is 509 and when delivering to a fresh list you
> see 80-100 deliveries per second.  That actually matters more
> than any internal performance in terms of adding and removing
> messages from the queue.

I agree, which is why I wanted to clarify the situation.  I am sorry for the
initial confusion.  FWIW, my normal load test generates about 1/2 the normal
ASF load in individual outgoing messages, one per connection, although that
is on a LAN.  My tests don't yet simulate the mailing list connection
situation, but I'll put together a suitable test as James gets
closer to being able to deliver on your needs.
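
For reference, the "about 5 seconds" figure quoted above is what Little's law
gives (average time in flight = concurrent deliveries / deliveries per
second).  A minimal sketch of that arithmetic, assuming all 509 qmail slots
are busy; the class and method names are made up for illustration:

```java
final class DeliveryLatencyEstimate {

    /** Little's law: average time in system = items in flight / completion rate. */
    static double averageSeconds(int concurrency, double deliveriesPerSecond) {
        if (deliveriesPerSecond <= 0) {
            throw new IllegalArgumentException("rate must be positive");
        }
        return concurrency / deliveriesPerSecond;
    }

    public static void main(String[] args) {
        // 509 concurrent deliveries at the two observed rates.
        System.out.printf("at  80/s: %.1f s%n", averageSeconds(509, 80));
        System.out.printf("at 100/s: %.1f s%n", averageSeconds(509, 100));
    }
}
```

With the quoted numbers that is 509 / 100 = 5.09 s and 509 / 80 = 6.3625 s,
so the estimate is only as good as the assumption of full concurrency.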

> Also, is the queue database transactional?  Are we sure if the system
> goes down that we won't lose mail?

We may not be able to guarantee that the message won't be sent twice, if the
system happens to go down at the wrong time, but the message is not removed
from the queue until after it is confirmed to have been sent.
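
The ordering described above (send first, remove afterwards) can be sketched
as follows.  This is an in-memory stand-in, not the actual James spool code;
the class name, the String message ids and the Predicate-based sender are all
assumptions made for illustration:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;

final class AtLeastOnceQueue {
    private final Deque<String> pending = new ArrayDeque<>();

    void enqueue(String messageId) {
        pending.addLast(messageId);
    }

    int size() {
        return pending.size();
    }

    /**
     * Attempts delivery of the head entry.  The entry is removed only after
     * the sender reports success, so a failure leaves it queued for a retry.
     * A crash between the send and the removal would cause a second send on
     * restart, but never a lost message.
     */
    boolean deliverNext(Predicate<String> sender) {
        String head = pending.peekFirst();   // look, but do not remove yet
        if (head == null) {
            return false;                    // nothing to deliver
        }
        if (!sender.test(head)) {
            return false;                    // failed: entry stays queued
        }
        pending.removeFirst();               // confirmed sent: now remove
        return true;
    }
}
```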

> delivery can't be sequential to a list; when you're given a list of 100
> recipients, the mail server should be sending to those recipients in
> parallel, not waiting for the first recipient delivery to finish before
> beginning the next.  If this is already addressed, great.

We can tell James how many delivery threads to use.  Yes, multiple threads
would be used except where a message has multiple recipients at the same
domain.  The current code separates the recipients by domain, and spools a
separate outgoing message per domain.
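
The separation by domain can be sketched roughly like this.  It is an
illustration only, not the code in James; the class name is hypothetical and
addresses are plain strings rather than a real address type:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class RecipientsByDomain {

    /** Groups recipient addresses by their (lower-cased) domain part. */
    static Map<String, List<String>> group(List<String> recipients) {
        Map<String, List<String>> byDomain = new LinkedHashMap<>();
        for (String rcpt : recipients) {
            int at = rcpt.lastIndexOf('@');
            if (at < 0 || at == rcpt.length() - 1) {
                throw new IllegalArgumentException("no domain in: " + rcpt);
            }
            // Domains compare case-insensitively, so normalize the key.
            String domain = rcpt.substring(at + 1).toLowerCase(Locale.ROOT);
            byDomain.computeIfAbsent(domain, d -> new ArrayList<>()).add(rcpt);
        }
        return byDomain;
    }
}
```

Each map entry then corresponds to one outgoing message carrying all of the
recipients for that domain.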

In the case of VERP, each message delivery would be unique because of the
MAIL FROM.  However, that does not mean that we could not optimize message
delivery.  Doing VERP with the current code would mean a unique Mail spooled
per recipient, each with its own unique sender.  I believe that we can
optimize that process.

The link I usually use for VERP is http://cr.yp.to/proto/verp.txt.

Unless I'm missing something, it could be done similarly to how we take a
message in LocalDelivery, and then for each mailbox add a unique
Delivered-To header to a clone of the message being streamed into the
mailbox.  For VERP, we would use a unique sender (an envelope attribute, not
a Sender: header).
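
As a rough sketch of what the unique sender looks like, following the
encoding in the verp.txt document linked above (owner local part, "-",
recipient local part, "=", recipient domain, "@", owner domain).  The class
and method names are hypothetical, the example addresses are made up, and no
escaping is attempted for unusual local parts:

```java
final class VerpAddress {

    /** Builds the per-recipient envelope sender for a list owner address. */
    static String envelopeSender(String listOwner, String recipient) {
        int ownerAt = listOwner.lastIndexOf('@');
        int rcptAt = recipient.lastIndexOf('@');
        if (ownerAt <= 0 || rcptAt <= 0) {
            throw new IllegalArgumentException("expected local@domain addresses");
        }
        return listOwner.substring(0, ownerAt)      // e.g. dev-owner
                + "-" + recipient.substring(0, rcptAt)
                + "=" + recipient.substring(rcptAt + 1)
                + listOwner.substring(ownerAt);     // includes the '@'
    }

    /** Recovers the failing recipient from the address a bounce was sent to. */
    static String recipientFromBounce(String bounceAddress, String listOwner) {
        String prefix = listOwner.substring(0, listOwner.lastIndexOf('@')) + "-";
        String local = bounceAddress.substring(0, bounceAddress.lastIndexOf('@'));
        if (!local.startsWith(prefix)) {
            throw new IllegalArgumentException("not a VERP address for this list");
        }
        String encoded = local.substring(prefix.length());
        int eq = encoded.lastIndexOf('=');          // last '=' separates the domain
        if (eq <= 0) {
            throw new IllegalArgumentException("no encoded recipient");
        }
        return encoded.substring(0, eq) + "@" + encoded.substring(eq + 1);
    }
}
```

For example, owner dev-owner@lists.example.org and recipient
alice@example.com give dev-owner-alice=example.com@lists.example.org, and a
bounce to that address decodes back to alice@example.com.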

> With VERP, the MAIL FROM line is different for every recipient, and
> thus every recipient requires a separate SMTP connection.  100K AOL
> users on a list?  100K connections.

Yes, for each recipient we need a unique MAIL FROM.  But both RFC 821 & 2821
permit more than one MAIL FROM per connection (for that matter, my mail
client expects it).  We wouldn't save anything on the data transfer, but why
not reuse the connection, and save the connection establishment delay, when
the remote MTA permits it and there are multiple mail messages to transfer?
What am I missing?
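
To make the idea concrete, here is a sketch of the client command sequence
for several single-recipient transactions on one connection.  It only builds
the list of commands; there is no network I/O and no reply handling, and the
class name, the EHLO greeting and the map of recipient to envelope sender are
assumptions for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class ReusedConnectionScript {

    /**
     * Returns the commands a client would issue to send one message per
     * recipient, each with its own MAIL FROM, over a single connection.
     */
    static List<String> commands(String heloName, Map<String, String> senderByRecipient) {
        List<String> out = new ArrayList<>();
        out.add("EHLO " + heloName);                 // greeting happens once
        for (Map.Entry<String, String> e : senderByRecipient.entrySet()) {
            out.add("MAIL FROM:<" + e.getValue() + ">"); // unique sender
            out.add("RCPT TO:<" + e.getKey() + ">");
            out.add("DATA");                         // message body follows (omitted)
        }
        out.add("QUIT");                             // connection closed once
        return out;
    }
}
```

The message data is still transferred once per recipient; what is saved is
the connection setup and greeting for every transaction after the first.  A
real client would also have to check the reply to each command and fall back
to a new connection if the remote MTA closes this one.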

I've no ego investment in any particular idea; just mentally exploring what
more we could do inside the delivery engine that would reduce transfer time
and cycles.  If I've missed something, e.g., if I had overlooked that VERP
means that we cannot send one copy to AOL with a list of RCPT TO commands,
then I'm wrong.  It has been known to happen.  Occasionally.  ;-)

> Is there some web-based means for admins to cruise through the
> pending delivery queue?

Not a one.  Something to do.  What operations would you like to see
supported?  The probable solution will use JMX.

	--- Noel

-----Original Message-----
From: Noel J. Bergman [mailto:noel@devtech.com]
Sent: Monday, June 09, 2003 2:22
To: Brian Behlendorf
Cc: James-PMC Mailing List
Subject: RE: James delivery volume


Brian,

Don't get too carried away by the performance. :-)  A key point that you
might not have caught was that I was using an SMTP sink, which means that
instead of opening a connection per recipient, there was one connection per
message.  Opening a connection per recipient would degrade performance.
But from these few measurements, it does appear that the key
issue is going to be optimizing those outgoing connections.  Everything else
appears to be well within the performance that James is capable of
providing.

For really high volume, Craig Mattson provided some thoughts based upon
supporting lists larger than all but a few ISPs (not with James :-)):

  http://nagoya.apache.org/wiki/apachewiki.cgi?JamesV3/HighVolume

but I don't think that we need to do all of those things (e.g., clustering)
before we can handle the ASF load.

What I tried to describe in my previous message is to change the list
delivery.  Instead of taking a message, attaching the recipient list, and
spooling it for the standard delivery engine, I was thinking that the list
delivery engine could be (at least conceptually) a subclass of the standard
delivery engine.  Not having to spool the message with the recipient list
attached would provide some savings (James does create only one queue entry,
but it has the expanded recipient list attached).

But the major improvement, or so goes my hypothesis, would come from the
delivery engine having a more tightly coupled relationship with the delivery
list.  The list delivery engine could pre-sort addresses by MX record based
upon recipient domains, for example, and just run through this pre-sorted
delivery data for each message, at least for normal list delivery
(individual retries might be kicked out to the standard delivery process).
That is what I meant by applying the message to the list, instead of the
list to the message.

The test system runs RedHat 6.2 (latest linux 2.2 kernel), soon to be RedHat
8.0 (latest updates).  Both the mailet pipeline spool and the outgoing
gateway spool were in the file system.  The mailing list roster was in
MySQL, though.

A quick look in the apmail hierarchy was heartening.  Although I just looked
at a few lists, my guess is that the biggest list is tomcat-user, which has
just under 2300 subscribers.  The user list for httpd is 60% that size, and
other lists were running in the 100s of users.  I didn't play with any of
the tools, since I don't know them well enough to be sure of not
screwing anything up.  Don't laugh, but I just cat'd (cat * >>
~noel/tmp/<list>) the subscriber list files, and then counted the number of
@ signs.  :-)

The new Mailing List Manager that Mark wrote is extensible.  Basically, it
provides a structure for adding new commands by adding new command classes.
Right now it just provides confirmed subscribe, confirmed unsubscribe and
info.  There are features missing that you've already expressed as wanting:

  http://nagoya.apache.org/wiki/apachewiki.cgi?HostApacheOnJames

so we've got some "marching orders."  :-)  Mark's new MLM just provides a
framework for implementing them.

I have no idea what the volume of James installations might be, nor a good
way to count it.  This page:
http://nagoya.apache.org/wiki/apachewiki.cgi?JamesUsers, has comments from
some of our users, and there is a similar page for volunteers.  There are
businesses using James as their primary mail server.  That's about all I can
tell you.  I could post another request or add something to the front page
asking people to tell us how they use James.

One other data point, actually.  "Alice K" started a survey last November.
Most of the entries in the survey pre-date v2.1.0, but I still find it
informative, e.g., the percentages of people using file vs SQL or Windows vs
linux:

  http://infopoll.net/live/surveys.dll/r?sid=19892&r=29845

People can still contribute (http://infopoll.net/live/surveys/s19892.htm),
but their responses will be mixed with those related to old versions.

If nothing that we've discussed in the past few messages is sensitive, I'll
post it up to james-dev.

	--- Noel


-----Original Message-----
From: Brian Behlendorf [mailto:brian@collab.net]
Sent: Sunday, June 08, 2003 21:27
To: Noel J. Bergman
Cc: James-PMC Mailing List
Subject: Re: James delivery volume


On Sat, 7 Jun 2003, Noel J. Bergman wrote:
> I have adjusted my load testing.  Focusing on other usage scenarios, my most
> recent load test had been lots of short simultaneous messages with a mix of
> local and remote mailboxes.  My revised load test cuts way back to only 100
> 15K messages per minute, incoming, of which roughly 1/3 are relayed to 1111
> remote users, 1/3 are relayed individually, and 1/3 are local.  That works
> out to ~5.3 million messages per day.  Mind you, since I don't have 1111
> individual target hosts, I'm using a single SMTP sink, so the performance
> isn't really representative.
>
> On a 400mhz Celeron (current test server), the CPU averages at least 1/3
> idle, with a range from just below 20% to ~50%.  That is after the JVM has
> had time to run the jitter (Hotspot Server).

Wow, that's terrific.  What OS?  Also, is the delivery queue stored on
disk or in mysql?

> I'm pleasantly surprised.  We have really not focused at all on optimizing
> mailing list performance, although there was some work done last week to
> resolve complaints from a user who was testing lists of over 10000
> recipients.  We have a new mailing list manager coming imminently from Mark
> Imel, which adds a lot of new features, and some of our users have custom
> modifications to James for doing high volume delivery.

I wonder how hard it would be to put together a comparison between the
feature sets in the new MLM from Mark, and EZMLM.  If Mark could send me a
doc on features I'd put some time into seeing what's missing to make it
suitable for the ASF.

Sounds stable - how many live JAMES installations do you think are out
there?  Any way to sample?

> No action item, but I do consider this to be good news.
>
> To more accurately simulate the target environment, it would be helpful to
> have a statistical snapshot of the ASF lists.  Just the number of lists,
> although I can estimate from the eyebrowse archives, and the number of
> recipients on each.  Is that accessible on daedalus?

The best thing I could do is allow you to sudo to the apmail user, so you
can sniff around ~apmail/lists/.  Should be easy to careen through the
lists using foreach and ezmlm-list.  OK, done.

> I hypothesize that one optimization would be to more tightly couple a remote
> delivery engine with the list manager, so that we don't have to queue the
> recipient list with each message.  We could optimize the recipient
> information in the list, and then apply the message to the list, rather than
> apply the list to the message.

I don't exactly follow, but if you're suggesting that you optimize by
creating only one delivery queue entry for a message to the list,
versus a queue entry per recipient on the list, that makes sense to me.
That's how all the ones I'm aware of do it.

> P.S. I didn't know if the details about daedalus you'd mentioned would be
> considered sensitive, so I didn't CC the James developer list, but I did
> want the other folks on the James PMC to stay informed.

Basic stats are fine to share widely.

	Brian


-----Original Message-----
From: Noel J. Bergman [mailto:noel@devtech.com]
Sent: Saturday, June 07, 2003 13:07
To: Brian Behlendorf
Cc: James-PMC Mailing List
Subject: James delivery volume


> > what is the approx volume of incoming (unique) messages?  When
> > you talk about 1 million per day, that is the output from the
> > list server?

> That's in SMTP delivery attempts, both remote and local, though the vast
> majority are remote.  Daedalus has been up for 74 days and made as of just
> now 95,950,272 delivery attempts.  Incoming, I have no specific estimate,
> but running a "tail -f /var/log/qmail/smtpd/current" can be fun.

I have adjusted my load testing.  Focusing on other usage scenarios, my most
recent load test had been lots of short simultaneous messages with a mix of
local and remote mailboxes.  My revised load test cuts way back to only 100
15K messages per minute, incoming, of which roughly 1/3 are relayed to 1111
remote users, 1/3 are relayed individually, and 1/3 are local.  That works
out to ~5.3 million messages per day.  Mind you, since I don't have 1111
individual target hosts, I'm using a single SMTP sink, so the performance
isn't really representative.

On a 400mhz Celeron (current test server), the CPU averages at least 1/3
idle, with a range from just below 20% to ~50%.  That is after the JVM has
had time to run the jitter (Hotspot Server).

I'm pleasantly surprised.  We have really not focused at all on optimizing
mailing list performance, although there was some work done last week to
resolve complaints from a user who was testing lists of over 10000
recipients.  We have a new mailing list manager coming imminently from Mark
Imel, which adds a lot of new features, and some of our users have custom
modifications to James for doing high volume delivery.

No action item, but I do consider this to be good news.

To more accurately simulate the target environment, it would be helpful to
have a statistical snapshot of the ASF lists.  Just the number of lists,
although I can estimate from the eyebrowse archives, and the number of
recipients on each.  Is that accessible on daedalus?

I hypothesize that one optimization would be to more tightly couple a remote
delivery engine with the list manager, so that we don't have to queue the
recipient list with each message.  We could optimize the recipient
information in the list, and then apply the message to the list, rather than
apply the list to the message.

	--- Noel

P.S. I didn't know if the details about daedalus you'd mentioned would be
considered sensitive, so I didn't CC the James developer list, but I did
want the other folks on the James PMC to stay informed.


---------------------------------------------------------------------
To unsubscribe, e-mail: james-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: james-dev-help@jakarta.apache.org


Re: James as a Mailing List Delivery Agent

Posted by Brian Behlendorf <br...@collab.net>.
I'm not on james-dev@jakarta, so please CC me if you want me to see the
response.

On Mon, 9 Jun 2003, Noel J. Bergman wrote:
> > Also, is the queue database transactional?  Are we sure if the system
> > goes down that we won't lose mail?
>
> We may not be able to guarantee that the message won't be sent twice, if the
> system happens to go down at the wrong time, but the message is not removed
> from the queue until after it is confirmed to have been sent.

That's the failsafe position, that's fine.  Better to have a failure mode
of twice-sent than never-sent.

[VERP]
> Yes, for each recipient we need a unique MAIL FROM.  But both RFC 821 & 2821
> permit more than one MAIL FROM per connection (for that matter, my mail
> client expects it).  We wouldn't save anything on the data transfer, but why
> not reuse the connection, and save the connection establishment delay, when
> the remote MTA permits it and there are multiple mail messages to transfer?
> What am I missing?

You mean reusing connections?  Yeah, that should save a little bit of
bandwidth and timing.  DJB's position has long been that persistent
connections are bad because they favor the busy senders, reducing the
chances that a host with just a few emails will find an open slot to get
through (the same way that non-real-time OS's usually don't allow a
process to exclusively lock system resources for very long).  E.g., if
apache.org has 100K emails to send to aol.com, and consumes all 50 of
AOL's allowed incoming connections, it becomes an exclusive lock for
a while.  Dunno if that is a realistic concern.

Anyone looked at QMTP?

> > Is there some web-based means for admins to cruise through the
> > pending delivery queue?
>
> Not a one.  Something to do.  What operations would you like to see
> supported?  The probable solution will use JMX.

I'm thinking of a web-based way to inspect the list of messages
not yet delivered, the equivalent of running qmail-qread.  Maybe web-based
isn't needed, but about once a week I hear "I've not seen any messages
from the lists, where's my mail" in which case I run qmail-qread and find
messages for that user queued up because their MX host is down or
something.  Command-line tools are OK, web-based would be better,
especially if the delivery queue were in the DB, and I could see whether
there are any pending messages for a given address, for example.

I should avoid getting carried away here.

	Brian

