Posted to server-user@james.apache.org by Harmeet <ha...@kodemuse.com> on 2001/08/08 10:01:40 UTC

Proposal to fix 'Can 2 instances of JAMES share the same database'

I think what would be best is to have locking not tied to a process. This
would allow scalability.

This could be done in several ways.
a) For File System based repositories, it would be a matter of changing the
extension of the message a spool thread is working on, and renaming it back to
the original if spool processing fails.

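for example
File spoolFile = ....
File inProcessFile = <spooled file with status in-process>;
boolean reset = false;
try {
   // claim the message; renameTo() returns false if the rename did not happen
   if ( spoolFile.renameTo(inProcessFile) ) {
      // ... process the spooled message here ...
   }
} catch ( Throwable t ) {
  // processing failed: remember to put the message back
  reset = true;
  throw t;
} finally {
  if ( reset )
      inProcessFile.renameTo(spoolFile);
}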
The same thing could be done in the db, by setting an 'in-process' flag for the
spool message.
The spool threads would pick up and process only messages that are not already
being processed.
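
A minimal Java sketch of approach (a), for illustration only. It assumes the
rename is done with java.nio.file.Files.move and ATOMIC_MOVE rather than
File.renameTo, so that a failed claim surfaces as an exception. The class name
SpoolFileClaim, the ".inprocess" suffix and the Processor callback are
hypothetical and not part of James. Deleting the file after successful
processing is also just an assumption of the sketch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: claims a spool file by renaming it, as in approach (a). */
final class SpoolFileClaim {
    /** Suffix marking a message as being processed; the name is an assumption. */
    static final String IN_PROCESS_SUFFIX = ".inprocess";

    /** Work to run against a claimed message file. */
    interface Processor {
        void process(Path claimedFile) throws Exception;
    }

    /**
     * Tries to claim and process one spool file.
     * Returns false if the file could not be claimed (for example because
     * another process renamed it first), true if it was claimed and processed.
     */
    static boolean claimAndProcess(Path spoolFile, Processor processor) throws Exception {
        Path inProcess = spoolFile.resolveSibling(spoolFile.getFileName() + IN_PROCESS_SUFFIX);
        try {
            // The rename is the lock: only one caller can move the file away.
            Files.move(spoolFile, inProcess, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            return false; // someone else holds the message, or it is gone
        }
        try {
            processor.process(inProcess);
            Files.delete(inProcess); // done: take the message out of the spool
            return true;
        } catch (Exception e) {
            // Processing failed: rename back so another thread or process can retry.
            Files.move(inProcess, spoolFile, StandardCopyOption.ATOMIC_MOVE);
            throw e;
        }
    }
}
```

Each James instance would call claimAndProcess for the files it finds in the
spool directory and simply skip those for which it returns false.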

Another way could be to
b) have a lock-server process that controls object locking and lifetime.
Basically the lock facility could be leased out for some time, and then
renewed, or a success/failed status returned. The Lock Server solution is
nice and general but it may be overkill. It may however be a good Avalon
Block to have. It is a nice Server Piece to have when you need it.
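
The leasing idea in (b) can be sketched as a small in-memory lease table in
Java. This is only an illustration: the class LeaseLockServer, its method
names and the explicit nowMillis parameter (passed in to keep the example
deterministic) are assumptions, and a real lock server would sit behind some
remote interface.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory lease table illustrating approach (b). */
final class LeaseLockServer {
    /** One granted lease: who holds it and when it runs out. */
    private static final class Lease {
        final String owner;
        final long expiresAtMillis;

        Lease(String owner, long expiresAtMillis) {
            this.owner = owner;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Lease> leases = new HashMap<>();

    /**
     * Grants a lease on key to owner, or renews it if owner already holds it.
     * Returns false if a different owner holds a lease that has not expired.
     */
    synchronized boolean acquire(String key, String owner, long leaseMillis, long nowMillis) {
        Lease current = leases.get(key);
        boolean free = current == null
                || current.expiresAtMillis <= nowMillis
                || current.owner.equals(owner);
        if (!free) {
            return false;
        }
        leases.put(key, new Lease(owner, nowMillis + leaseMillis));
        return true;
    }

    /** Gives the lock back; only the owner recorded for the key may release it. */
    synchronized boolean release(String key, String owner) {
        Lease current = leases.get(key);
        if (current == null || !current.owner.equals(owner)) {
            return false;
        }
        leases.remove(key);
        return true;
    }
}
```

Because a lease expires on its own, a lock held by an instance that stops
renewing it becomes available again once the lease time has passed.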

This would allow multiple processes to process the spool messages.
This would not be as fast as the current single-process locking, but I
think the spool processing does not need to be fast; it does need to be
scalable and correct (i.e. one email for one message).
This can be very scalable: one could have multiple instances of James behind
a load-balancer/virtual address to service high volume.

Here is a proposal for your vote.
Let us implement method (a) to allow multiple processes of James to process
the same spool db.

If you like the idea, I can do the File Repository part of it over the
weekend.

What do you think? Does this make sense?
Harmeet



---------------------------------------------------------------------
To unsubscribe, e-mail: james-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: james-user-help@jakarta.apache.org


Re: Proposal to fix 'Can 2 instances of JAMES share the same database'

Posted by Serge Knystautas <se...@lokitech.com>.
Problems I see with a) are:
1. When a process that holds a lock crashes, how does that message get
unlocked? and
2. How do other processes get notified when a message is in the spool?
(There's no callback mechanism.)  This means that instead of a wait() or some
callback, we have to check rather frequently to see whether any messages are
in the spool.  It does solve multithreading though.

I like b).  Certainly performance might suffer, but if you're running
multiple instances, that's to be expected (like why dual processor isn't
twice as fast as a single).  Sounds like an interesting concept though.  I'm
not sure how you'd determine which was the central authority for the locking
or how callbacks would work (for spools to get notified of unlocks).

Also I wanted to note again that spool != locking.  There might be issues
with POP3 mailboxes, and there certainly are with IMAP, where you need the
repository to be threadsafe (have locking) even though it's not a spool.

Serge Knystautas
Loki Technologies
http://www.lokitech.com/


Re: Proposal to fix 'Can 2 instances of JAMES share the same database'

Posted by Darrell DeBoer <dd...@bigdaz.com>.
On Thu,  9 Aug 2001 01:08, Serge Knystautas wrote:
> I started writing something, but now that I think about this, locking and
> spool really aren't related.  A standard mail repository needs locking...
> if you have multiple threads potentially modifying messages in a
> repository, you need locking, regardless of whether it's being used as a
> spool.

I'm sure you're more familiar with this than I, Serge. As I said, I was just 
thinking (out loud, I guess). I hope it didn't create too much noise :-)

>
> I like the pluggable locking mechanisms idea (if someone else has the
> time/interest to write such stuff), so what about having some conf setting
> for any mail or spool repository to use an alternate locking mechanism
> rather than the simple in-memory one?


Re: Proposal to fix 'Can 2 instances of JAMES share the same database'

Posted by Oki DZ <ok...@pindad.com>.
On Wed, 8 Aug 2001, Serge Knystautas wrote:

> I started writing something, but now that I think about this, locking and
> spool really aren't related.  A standard mail repository needs locking... if
> you have multiple threads potentially modifying messages in a repository,
> you need locking, regardless of whether it's being used as a spool.

If you have started to write something, could it be about overhauling how
the mailets and the spool work?
 
> I like the pluggable locking mechanisms idea (if someone else has the
> time/interest to write such stuff), so what about having some conf setting
> for any mail or spool repository to use an alternate locking mechanism
> rather than the simple in-memory one?

Yes, the lock server might work. As I understand it, James starts working
on a "trigger" generated by the spool; to be exact, by a new message that
was received by the SMTP handler and stored in the spool. When a message
arrives, the spool's store() method notifies all the waiting processor
threads to start working and process the message that was just stored.
Those processor threads were started earlier by the spool manager(s) and
wait for any message to arrive in the spool (they block in the wait() call
in the spool's accept()). So, as long as you have a spool repository for
each instance of James, you'd have no problem.

The processor threads will still get notified by the spool when it is time
to start. The processors and the spool are in the same JVM, so notifyAll()
will work as usual no matter how unlock() gets done. In the mail repository,
you may have lockServer.unlock(message) instead of just unlock(message), so
you wouldn't have the callback problem.
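
The store()/accept() interplay described above can be sketched in Java as
follows. This is a simplified stand-in and not the actual James spool: the
names SketchSpool and MessageLock are hypothetical, messages are reduced to
their names, and the lock is an interface so that an in-memory lock or a
lock-server client could be plugged in.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.Queue;

/** Hypothetical lock abstraction: in-memory, or a client of a lock server. */
interface MessageLock {
    boolean lock(String messageName);
    void unlock(String messageName);
}

/** Simplified stand-in for a spool; not the actual James class. */
final class SketchSpool {
    private final Queue<String> pending = new ArrayDeque<>();
    private final MessageLock locks;

    SketchSpool(MessageLock locks) {
        this.locks = locks;
    }

    /** Stores a message name and wakes every waiting processor thread. */
    synchronized void store(String messageName) {
        pending.add(messageName);
        notifyAll();
    }

    /** Blocks until some pending message can be locked, then returns its name. */
    synchronized String accept() throws InterruptedException {
        while (true) {
            for (Iterator<String> it = pending.iterator(); it.hasNext();) {
                String name = it.next();
                if (locks.lock(name)) { // could be lockServer.lock(name)
                    it.remove();
                    return name;
                }
            }
            wait(); // released by notifyAll() in store(), same JVM only
        }
    }
}
```

Note that notifyAll() only reaches threads in the same JVM, so this works as
long as each James instance has its own spool object in front of the lock.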

BTW, I think I'm done with my spool repository implementation. I'm going
to send it to you (and Charles) so you can take a look. The temporary
message names cache is there, so you can see whether it would be good to
implement the feature in the main branch. It's heavily logged; I log
everything, including each entry to and exit from the methods.

Judging by the logs (../logs/mailstore.log), having a message cache
would be good for the outgoing spool; yes, I separate the incoming and the
outgoing spools (even though they end up in the same db table). I have to
have both, so that each lock cache stores the incoming and outgoing
message names separately. The message cache could have worked better if
James had decoupled the spool manager and the processor threads.

I think, in the next release of James, the message processing would be
better if it were triggered not by the incoming messages, but by time. You
can have threads that start running periodically. With them, you can
separate how the spool manager and the processor threads work. That could be
beneficial: you can have a spool that receives messages at any time and
puts them in the outgoing spool according to their destinations. If you
have more than one spool manager, that would be all right; deciding
whether to store a message in the local repository or the outgoing spool
could be done as fast as the messages arrive. The processor threads,
waking up periodically, would check whether any messages exist in the
outgoing spool and, if there are any, try to send them. To have this, I
think only some changes are needed in the RemoteDelivery mailet (or in the
LinearProcessor).
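
A rough Java sketch of time-triggered processing, under the assumption that
a java.util.concurrent.ScheduledExecutorService is an acceptable way to wake
the delivery side periodically. TimedOutgoingSpool and its methods are
hypothetical names, and messages are again reduced to their names.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical outgoing spool that is drained on a timer, not by notifyAll(). */
final class TimedOutgoingSpool {
    private final Queue<String> outgoing = new ConcurrentLinkedQueue<>();
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();

    /** Called by spool managers at any time; never waits for delivery. */
    void store(String messageName) {
        outgoing.add(messageName);
    }

    /** Starts a periodic task that tries to deliver whatever is queued. */
    void start(long periodMillis, Consumer<String> deliver) {
        timer.scheduleWithFixedDelay(() -> {
            try {
                drain(deliver);
            } catch (RuntimeException e) {
                // Swallowed so the timer keeps running; real code would log
                // the failure and requeue the message for a later retry.
            }
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    /** Hands every currently queued message to deliver; returns how many. */
    int drain(Consumer<String> deliver) {
        int count = 0;
        String name;
        while ((name = outgoing.poll()) != null) {
            deliver.accept(name);
            count++;
        }
        return count;
    }

    /** Stops the periodic task. */
    void stop() {
        timer.shutdownNow();
    }
}
```

With this shape, the number of spool manager threads calling store() and the
number of delivery threads behind the timer can be chosen independently.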

I believe it would be good to have the (incoming) spool and the remote
delivery threads work separately (not bound by the notifyAll()), so that
you can set the number of spool manager threads and the number of remote
delivery threads differently. Currently, if you have, say, 10 spool managers
running, and you have 10 mail destinations that can't be reached (so
James waits to retry sending them), basically there's no processing going
on in the spool (eg: moving messages from incoming to outgoing), and no
more mail gets sent out. Well, I may be wrong, but that's what I can see by
looking into the logs.

Oki







Re: Proposal to fix 'Can 2 instances of JAMES share the same database'

Posted by Serge Knystautas <se...@lokitech.com>.
I started writing something, but now that I think about this, locking and
spool really aren't related.  A standard mail repository needs locking... if
you have multiple threads potentially modifying messages in a repository,
you need locking, regardless of whether it's being used as a spool.

I like the pluggable locking mechanisms idea (if someone else has the
time/interest to write such stuff), so what about having some conf setting
for any mail or spool repository to use an alternate locking mechanism
rather than the simple in-memory one?

Serge Knystautas
Loki Technologies
http://www.lokitech.com/


Re: Proposal to fix 'Can 2 instances of JAMES share the same database'

Posted by Darrell DeBoer <dd...@bigdaz.com>.
On Wed,  8 Aug 2001 18:01, Harmeet wrote:
> I think what would be best is to have locking not tied to a process. This
> would allow scalability.

Agreed - the locking mechanism should probably be implemented independently 
of java thread locking.

>
> This could be done in several ways.
> a) For File System based repositories, it would be a matter of changing the
> extension of the message a spool thread is working on, and renaming it back
> to the original if spool processing fails.
>
> for example
> File spoolFile = ....
> File inProcessFile = <spooled file with status in-process>;
> boolean reset = false;
> try {
>    // claim the message; renameTo() returns false if the rename did not happen
>    if ( spoolFile.renameTo(inProcessFile) ) {
>       // ... process the spooled message here ...
>    }
> } catch ( Throwable t ) {
>   // processing failed: remember to put the message back
>   reset = true;
>   throw t;
> } finally {
>   if ( reset )
>       inProcessFile.renameTo(spoolFile);
> }
> The same thing could be done in the db, by setting an 'in-process' flag for
> the spool message.
> The spool threads would pick up and process only messages that are not
> already being processed.

Just thinking...
Would it be possible/useful to separate the SpoolRepository implementation
from the MailRepository, so that a SpoolRepository delegates calls to a
MailRepository contained within? The SpoolRepository implementation would
handle locking/respooling of message ids, but not the physical storage of the
mail.

Possible benefits: 
a) keep the MailRepository cleaner, since it wouldn't have to handle locking, 
timeouts etc.
b) allow us to mix-and-match Spool implementations with MailRepository 
implementations (eg - use a db for spool locking, but a file-based 
MailRepository)
c) allow us to use a locking mechanism like the current one (using a Lock 
object) for single machine implementations (if this was more performant).
d) later we can switch to a dedicated lock-server component more easily.
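
The split suggested above could look roughly like this in Java. Everything
here is a hypothetical, cut-down illustration: SketchMailRepository and
SketchLock are not the real James interfaces, messages are plain strings, and
the in-memory classes exist only to make the example self-contained.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical, cut-down storage interface: physical storage only. */
interface SketchMailRepository {
    void store(String key, String message);
    String retrieve(String key);
    void remove(String key);
    Collection<String> keys();
}

/** Hypothetical pluggable lock: in-memory, db-backed or a lock server. */
interface SketchLock {
    boolean lock(String key);
    void unlock(String key);
}

/** Spool layer: handles locking only and delegates storage. */
final class DelegatingSpool {
    private final SketchMailRepository storage;
    private final SketchLock locks;

    DelegatingSpool(SketchMailRepository storage, SketchLock locks) {
        this.storage = storage;
        this.locks = locks;
    }

    void store(String key, String message) { storage.store(key, message); }

    String retrieve(String key) { return storage.retrieve(key); }

    /** Returns the key of a message the caller now holds locked, or null. */
    String acceptKey() {
        for (String key : storage.keys()) {
            if (locks.lock(key)) {
                return key;
            }
        }
        return null;
    }

    /** Removes a processed message and releases its lock. */
    void remove(String key) { storage.remove(key); locks.unlock(key); }

    /** Puts a message back for another attempt by releasing its lock. */
    void respool(String key) { locks.unlock(key); }
}

/** In-memory storage, for the example only. */
final class InMemorySketchRepository implements SketchMailRepository {
    private final Map<String, String> messages = new LinkedHashMap<>();
    public synchronized void store(String key, String message) { messages.put(key, message); }
    public synchronized String retrieve(String key) { return messages.get(key); }
    public synchronized void remove(String key) { messages.remove(key); }
    public synchronized Collection<String> keys() { return new ArrayList<>(messages.keySet()); }
}

/** In-memory lock, comparable in spirit to a single-machine Lock object. */
final class InMemorySketchLock implements SketchLock {
    private final Set<String> held = new HashSet<>();
    public synchronized boolean lock(String key) { return held.add(key); }
    public synchronized void unlock(String key) { held.remove(key); }
}
```

Swapping InMemorySketchLock for a db-backed or lock-server-backed
implementation would not touch the storage side, which is the point of
benefits b) and d).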

>
> Another way could be to
> b) have a lock-server process that controls object locking and lifetime.
> Basically the lock facility could be leased out for some time, and then
> renewed, or a success/failed status returned. The Lock Server solution is
> nice and general but it may be overkill. It may however be a good Avalon
> Block to have. It is a nice Server Piece to have when you need it.
>
> This would allow multiple processes to process the spool messages.
> This would not be as fast as the current single-process locking, but I
> think the spool processing does not need to be fast; it does need to be
> scalable and correct (i.e. one email for one message).
> This can be very scalable: one could have multiple instances of James behind
> a load-balancer/virtual address to service high volume.
>
> Here is a proposal for your vote.
> Let us implement method (a) to allow multiple processes of James to process
> the same spool db.

+1 
- I reckon this is the right direction, whether we split the SpoolRepository 
and MailRepository implementations or not.

>
> If you like the idea, I can do the File Repository part of it over the
> weekend.
>
> What do you think ? Does this make sense ?
> Harmeet


Re: Proposal to fix 'Can 2 instances of JAMES share the same database'

Posted by Oki DZ <ok...@pindad.com>.
On Wed, 8 Aug 2001, Harmeet wrote:
....
> Another way could be to
> b) have a lock-server process that controls object locking and lifetime.
> Basically the lock facility could be leased out for some time, and then
> renewed, or a success/failed status returned. The Lock Server solution is
> nice and general but it may be overkill. It may however be a good Avalon
> Block to have. It is a nice Server Piece to have when you need it.

This one is better, I think; once you have the lock server running, it is
applicable to all the spool repositories. All you have to do is to have
the right version of "if (lock(message))" in the spool repositories. It
could become "if (lockServer.lock(message))", in which lockServer is an
instance of an RMI-based Avalon block lock server. I don't know whether
Avalon blocks can be RMI'zed, but if they can, I think it's a nice
solution.

In this configuration, you'd have a James server (the one which does
the loading of the lock server block), and some other James "clients" which
utilize the lock server.
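
As a sketch of what an RMI-based lockServer might look like in Java, leaving
aside the open question of whether it can be packaged as an Avalon block. The
interface RemoteLockServer, the class SimpleLockServer, the publish helper and
the registry name "lockServer" are all hypothetical. Only standard java.rmi
calls are used.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical remote interface that the James "clients" would call. */
interface RemoteLockServer extends Remote {
    /** Returns true if the caller obtained the lock for the message. */
    boolean lock(String messageName) throws RemoteException;

    /** Returns true if the message was locked and is now released. */
    boolean unlock(String messageName) throws RemoteException;
}

/** Hypothetical implementation, hosted by the one James "server" instance. */
final class SimpleLockServer implements RemoteLockServer {
    private final Set<String> held = ConcurrentHashMap.newKeySet();

    @Override
    public boolean lock(String messageName) {
        return held.add(messageName); // false if someone already holds it
    }

    @Override
    public boolean unlock(String messageName) {
        return held.remove(messageName);
    }

    /** Exports the server and binds it in a new RMI registry on the given port. */
    static Registry publish(RemoteLockServer server, int port) throws RemoteException {
        RemoteLockServer stub =
                (RemoteLockServer) UnicastRemoteObject.exportObject(server, 0);
        Registry registry = LocateRegistry.createRegistry(port);
        registry.rebind("lockServer", stub);
        return registry;
    }
}
```

A client instance would look the stub up in the registry and then write
"if (lockServer.lock(message))" as described above. This sketch does not
track which client holds a lock, so any client can unlock any message.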
 
> This would allow multiple processes to process the spool messages.
> This would not be as fast as the current single-process locking, but I
> think the spool processing does not need to be fast; it does need to be
> scalable and correct (i.e. one email for one message).
> This can be very scalable: one could have multiple instances of James behind
> a load-balancer/virtual address to service high volume.
>
> Here is a proposal for your vote.
> Let us implement method (a) to allow multiple processes of James to process
> the same spool db.

-1
(b) is better.

> If you like the idea, I can do the File Repository part of it over the
> weekend.

I think it's important to be able to switch the kind of spool repositories
without breaking the configuration.
 
> What do you think ? Does this make sense ?
> Harmeet

(b)? Yes.

Oki



