You are viewing a plain text version of this content. The canonical link for it is here.
Posted to server-dev@james.apache.org by "Peter M. Goldstein" <pe...@yahoo.com> on 2002/08/08 02:24:16 UTC

Synchronization Fix

All,

Attached is a modified version of a performance enhancement to the
LinearProcessor class.  It is a modified version of a fix submitted by
Steve Short roughly a month ago.

The fix has two parts:

i) The synchronization of the LinearProcessor service method has been
removed, and the class has been adjusted to be thread-safe in this
situation.

ii) The mail is not renamed on each pass of the LinearProcessor.
Rather, the JamesSpoolManager is modified so that it only removes mail
from the spool when the processing state is finished or when the
recipient list is empty.  (Thanks to Shilpa Dalmia)

Note that the diffs also include the String=>StringBuffer changes.

As users of this relatively minor fix have experienced a 30% increase in
throughput, I'd like to get this into the code base ASAP.  So if I don't
hear any objections I'll be submitting these sometime tomorrow.

Please feel free to respond with comments, questions, or critiques.

--Peter


> -----Original Message-----
> From: Peter M. Goldstein [mailto:peter_m_goldstein@yahoo.com]
> Sent: Wednesday, August 07, 2002 2:05 PM
> To: 'James Developers List'
> Subject: RE: Feedback IMAP
> 
> 
> Noel,
> 
> I've certainly got no objection, but I'm new to the committer role and
> don't know all the rules.  Wouldn't want to do it without the ok of
> someone with more experience.
> 
> --Peter
> 
> > -----Original Message-----
> > From: Noel J. Bergman [mailto:noel@devtech.com]
> > Sent: Wednesday, August 07, 2002 10:17 AM
> > To: James Developers List
> > Subject: RE: Feedback IMAP
> >
> > Peter & Charles,
> >
> > I don't know the convention, but can't his code be committed into
the
> > proposal tree (where the old version is now), so that other
interested
> > parties can play with it, without impacting the mainstream code?
> >
> > 	--- Noel
> >
> >
> > --
> > To unsubscribe, e-mail:   <mailto:james-dev-
> > unsubscribe@jakarta.apache.org>
> > For additional commands, e-mail: <mailto:james-dev-
> > help@jakarta.apache.org>
> 
> 
> 
> --
> To unsubscribe, e-mail:   <mailto:james-dev-
> unsubscribe@jakarta.apache.org>
> For additional commands, e-mail: <mailto:james-dev-
> help@jakarta.apache.org>

RE: Synchronization Fix

Posted by "Noel J. Bergman" <no...@devtech.com>.
> What cleans up that message?  I don't see the path by
> which the message is removed from the spool.

> Furthermore, this: unprocessed[i + 1].add(mail);
> will end up throwing an out of bounds exception.

OK, I had missed this: unprocessed[unprocessed.length - 1].clear(); at the
top of the loop, but I still don't see the path for removing the message
from the spool.

	--- Noel


--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


[PATCH] Synchronization Fix v.3

Posted by "Peter M. Goldstein" <pe...@yahoo.com>.
All,

Noel and I have chatted offline and come to an agreement as to the
structure of the code.  The modifications are a little more substantial.
Thanks to Noel for all his contributions.  Comments, questions,
criticism still welcome.  :)

If there are no objections I'll submit this on Saturday.

--Peter

> -----Original Message-----
> From: Peter M. Goldstein [mailto:peter_m_goldstein@yahoo.com]
> Sent: Thursday, August 08, 2002 11:01 AM
> To: 'James Developers List'
> Subject: [PATCH] Synchronization Fix v.2
> 
> 
> All,
> 
> Here's rev 2 of the Synchronization fix.  Noel's objections have been
> addressed.  Comments, questions, criticisms welcome.
> 
> If I don't hear any objections to this modified version I'll commit it
> tomorrow.
> 
> --Peter


[PATCH] Synchronization Fix v.2

Posted by "Peter M. Goldstein" <pe...@yahoo.com>.
All,

Here's rev 2 of the Synchronization fix.  Noel's objections have been
addressed.  Comments, questions, criticisms welcome.

If I don't hear any objections to this modified version I'll commit it
tomorrow.

--Peter


RE: Synchronization Fix

Posted by "Peter M. Goldstein" <pe...@yahoo.com>.
Noel,

> So I'll go you one further.  Why not establish that calling
> LinearProcessor.add(Matcher,Mailet) after calling
> LinearProcessor.service()
> results in an exception?

Because we have to track (and deal with the synchronization of)
additional state.  That is, we'd need to add a volatile boolean that
service updates every time it gets called.  We'd have to have add check
that boolean on every call and throw an exception if its true.  This is
all to produce a class with less functionality than synchronizing the
add method.  In short, it's a lot of effort for no real gain.

--Peter 



--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


RE: Synchronization Fix

Posted by "Noel J. Bergman" <no...@devtech.com>.
Peter,

> The fact that LinearProcessor matcher/mailets are added before the
> service method is called is not something in the contract of the class.

Maybe it should be; see below.

> synchronizing the add call [costs] essentially nothing

Agreed.  The overhead is measurable but insignificant compared to the rest
of the service method.

> Declaring the variables final ...

Go ahead and do it.  As I said earlier today, after I thought through the
process of reloading the configuration, it appears to me that the way to do
this is to create a new instance of JamesSpoolManager with the new
configuration, and to gracefully shutdown the old one.  Essentially, the old
one would finish up existing message processing, but wouldn't pull new
messages from the spool.

So I'll go you one further.  Why not establish that calling
LinearProcessor.add(Matcher,Mailet) after calling LinearProcessor.service()
results in an exception?

	--- Noel


--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


RE: Synchronization Fix

Posted by "Peter M. Goldstein" <pe...@yahoo.com>.
Noel,

> As for this fix ...
> 
> Why is LinearProcessor.add(Matcher, Mailet) synchronized now?  From
where
> does that re-entrancy issue arise?  The previous scheme had service()
> synchronized because there are multiple threads that could potentially
> want
> to use the processor, and the service method wasn't re-entrant.  You
have
> endevoured to make it re-entrant by making the unprocessed list local,
so
> I
> don't see where the add(Matcher, Mailet) synchronization becomes
> necessary.
> The processors are all initialized serially by JamesSpoolManager
before
> the
> threads are started.  In fact, the current code even adds the
processors
> to
> the Map before they are prepared, which ought to indicate that it
doesn't
> expect to use them yet.

The fact that LinearProcessor matcher/mailets are added before the
service method is called is not something in the contract of the class.
There is nothing save the current implementation of the code for
JamesSpoolManager that guarantees the behavior you describe.  Coding
defensively requires synchronizing the add call.  It costs essentially
nothing (in the current case it amounts to a single thread acquiring an
uncontested lock, very cheap) but makes the class vastly more robust.

> There is no option to remove matchers/mailets at them moment, simply
to
> add
> them, so shortening the chain isn't an issue.  At first glance, it
appears
> that the way to do this would be to initialize a JamesSpoolManager
with
> the
> new configuration, and ask the old spool manager to start shutting
down
> gracefully.  Existing messages already being processed would not see
the
> changes.
> 
> Agreed that this is beyond the scope of the fix.  I was simply raising
the
> topic, and I think that the above might be a reasonable way to
implement
> it.

Again, this is largely a defensive strategy.  Declaring the variables
final means that when somebody alters the class to add a remove or reset
method they'll be more inclined to be careful about the threading.
Alteration of those variables should be done only with the greatest of
care.

--Peter




--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


RE: Synchronization Fix

Posted by "Noel J. Bergman" <no...@devtech.com>.
> The synchronization calls are designed to protect
> against additional matchers being added.  Consider
> the case of a set of mailets/matchers that are
> smaller than the original mailets/matchers
> list when service was invoked.

There is no option to remove matchers/mailets at them moment, simply to add
them, so shortening the chain isn't an issue.  At first glance, it appears
that the way to do this would be to initialize a JamesSpoolManager with the
new configuration, and ask the old spool manager to start shutting down
gracefully.  Existing messages already being processed would not see the
changes.

Agreed that this is beyond the scope of the fix.  I was simply raising the
topic, and I think that the above might be a reasonable way to implement it.

As for this fix ...

Why is LinearProcessor.add(Matcher, Mailet) synchronized now?  From where
does that re-entrancy issue arise?  The previous scheme had service()
synchronized because there are multiple threads that could potentially want
to use the processor, and the service method wasn't re-entrant.  You have
endevoured to make it re-entrant by making the unprocessed list local, so I
don't see where the add(Matcher, Mailet) synchronization becomes necessary.
The processors are all initialized serially by JamesSpoolManager before the
threads are started.  In fact, the current code even adds the processors to
the Map before they are prepared, which ought to indicate that it doesn't
expect to use them yet.

	--- Noel


--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


RE: Synchronization Fix

Posted by "Peter M. Goldstein" <pe...@yahoo.com>.
Noel,

> Why shouldn't it be possible to reset the mailets and matchers?  The
code
> appears to protect itself from that during the processing of an
individual
> message, and even has comments discussing the possibilities.  James
> doesn't
> do it right now, but I'd like to be able to reload the config without
> having
> to restart James.

The code most certainly doesn't protect itself from resets of the
matchers/mailets.  The synchronization calls are designed to protect
against additional matchers being added.  Consider the case of a set of
mailets/matchers that are smaller than the original mailets/matchers
list when service was invoked.  This could lead to spurious
out-of-bounds exceptions, as the service method attempts to retrieve a
matcher with an index beyond the end of the new list.  Ugh.

Rewriting this code so it handles the case of arbitrary matcher/mailet
changes without slowing it down to molasses (even the current code  with
its coarse synchronization doesn't handle this situation properly) is
tricky.  Not impossible, but tricky.  And while certainly desirable, I
think it's beyond the scope of what we want to accomplish right now.

--Peter 



--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


RE: Synchronization Fix

Posted by "Noel J. Bergman" <no...@devtech.com>.
> Well, I guess.  I wanted to use final variables to
> ensure that no instance could reset the mailets and
> matchers variable.  But if you want to put them in
> initialize as per an Avalon paradigm, I'm willing to
> go that way.

Why shouldn't it be possible to reset the mailets and matchers?  The code
appears to protect itself from that during the processing of an individual
message, and even has comments discussing the possibilities.  James doesn't
do it right now, but I'd like to be able to reload the config without having
to restart James.

> I don't think you'll ever see an out of
> bounds exception

I had realized that after my initial post.  My "retraction" and your note
crossed in the ether.

> Before clearing [the last bucket] we can iterate
> through the bucket and confirm that for each message
> either the state is set to GHOST or the # of
> recipients is zero.

JamesSpoolManager does this:

                if (processorName.equals(Mail.ERROR)) {
                    // We got an error on the error processor...
                    // kill the message
                    mail.setState(Mail.GHOST);
                    mail.setErrorMessage(e.getMessage());
                } else {
                    //We got an error... send it to the error processor
                    mail.setState(Mail.ERROR);
                    mail.setErrorMessage(e.getMessage());
                }
            }

if LinearProcessor throws an exception.  I'm not advocating throwing an
exception, since it would be the original message that got handled (which
might have been fine), and we might have multiple messages in our
unprocessed queue.  But the basic pattern (with a log message) sounds like a
plan.

	--- Noel


--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


RE: Synchronization Fix

Posted by "Peter M. Goldstein" <pe...@yahoo.com>.
Noel,

> 
> This is wrong, AIUI:
> 
> -    private List mailets;
> -    private List matchers;
> +    private final List mailets;  // The list of
> mailets for this processor
> +    private final List matchers; // The list of
> matchers for this processor
> 
> +    public LinearProcessor() {
> +        matchers = new ArrayList();
> +        mailets = new ArrayList();
> +    }
> 
> The reason I think it is wrong is that I believe
> that the approach preferred
> by the Avalon developers is for those assignments to
> be in initialize();
> that is what it means to be Initializable.

Well, I guess.  I wanted to use final variables to
ensure that no instance could reset the mailets and
matchers variable.  But if you want to put them in
initialize as per an Avalon paradigm, I'm willing to
go that way.

Just need to remove the final modifiers and move the
declarations to initialize()
 
> Anyhow, I do think that I am seeing an error ...
> what happens if there is an
> error in either configuration or implementation, and
> the last matcher/mailet
> neither consumes all recipients nor sets the state
> to someplace new?  What
> cleans up that message?  I don't see the path by
> which the message is
> removed from the spool.  Furthermore, this:
> 
>     unprocessed[i + 1].add(mail);
> 
> will end up throwing an out of bounds exception.  Am
> I missing something, or
> do we need to handle the boundary case for the end
> of the processor list?

Good point.  I missed this.  You're basically right
here, but I think this is actually really easy to
handle.  I don't think you'll ever see an out of
bounds exception (the inner loop guarantees that i has
a maximum length of #matchers), but you will see mail
left on the spool.

At the top of the while(true) loop we see that the
last bucket in the processor is cleared.  Before
clearing it we can iterate through the bucket and
confirm that for each message either the state is set
to GHOST or the # of recipients is zero.  If neither
is true, that signals (as you mention above) a serious
error in configuration/implementation.  So we just
need to figure out what to do in the error case. 
Probably a well chosen exception and a log message
would be appropriate.  How does that sound?

--Peter

  

__________________________________________________
Do You Yahoo!?
HotJobs - Search Thousands of New Jobs
http://www.hotjobs.com

--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


RE: Synchronization Fix

Posted by "Noel J. Bergman" <no...@devtech.com>.
LinearProcessor.java:

This is wrong, AIUI:

-    private List mailets;
-    private List matchers;
+    private final List mailets;  // The list of mailets for this processor
+    private final List matchers; // The list of matchers for this processor

+    public LinearProcessor() {
+        matchers = new ArrayList();
+        mailets = new ArrayList();
+    }

The reason I think it is wrong is that I believe that the approach preferred
by the Avalon developers is for those assignments to be in initialize();
that is what it means to be Initializable.

It would be nice to be able to tell James to reload the configuration, so
that we don't have to stop and restart when changing config.xml.  I suspect
that this is something that we'd need to take up with the Avalon developers.

Anyhow, I do think that I am seeing an error ... what happens if there is an
error in either configuration or implementation, and the last matcher/mailet
neither consumes all recipients nor sets the state to someplace new?  What
cleans up that message?  I don't see the path by which the message is
removed from the spool.  Furthermore, this:

    unprocessed[i + 1].add(mail);

will end up throwing an out of bounds exception.  Am I missing something, or
do we need to handle the boundary case for the end of the processor list?
It used to work because JamesSpoolManager always removed the message.

JamesSpoolManager.java:
-                spool.remove(key);
-                getLogger().info("==== Removed from spool mail "
-                                 + mail.getName() + " ====");
+                // Only remove an email from the spool is processing is
+                // complete, or if it has no recipients
+                if ((Mail.GHOST.equals(mail.getState())) ||
+                    (mail.getRecipients() == null) ||
+                    (mail.getRecipients().size() == 0)) {
+                    spool.remove(key);
+                    if (infoEnabled) {
+                        StringBuffer infoBuffer =
+                            new StringBuffer(64)
+                                    .append("==== Removed from spool mail
")
+                                    .append(mail.getName())
+                                    .append("====");
+                        getLogger().info(infoBuffer.toString());
+                    }
+                }

The new approach is more efficient, but by removing the side-effect, we've
exposed something that needs to be handled.

Thoughts?

	--- Noel

-----Original Message-----
From: Peter M. Goldstein [mailto:peter_m_goldstein@yahoo.com]
Sent: Wednesday, August 07, 2002 20:24
To: 'James Developers List'
Subject: Synchronization Fix



All,

Attached is a modified version of a performance enhancement to the
LinearProcessor class.  It is a modified version of a fix submitted by
Steve Short roughly a month ago.

The fix has two parts:

i) The synchronization of the LinearProcessor service method has been
removed, and the class has been adjusted to be thread-safe in this
situation.

ii) The mail is not renamed on each pass of the LinearProcessor.
Rather, the JamesSpoolManager is modified so that it only removes mail
from the spool when the processing state is finished or when the
recipient list is empty.  (Thanks to Shilpa Dalmia)

Note that the diffs also include the String=>StringBuffer changes.

As users of this relatively minor fix have experienced a 30% increase in
throughput, I'd like to get this into the code base ASAP.  So if I don't
hear any objections I'll be submitting these sometime tomorrow.

Please feel free to respond with comments, questions, or critiques.

--Peter


--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>