Posted to server-user@james.apache.org by Robert Munn <ro...@gmail.com> on 2016/01/02 04:06:17 UTC

Re: James new event system

Benoit,

This is very interesting work, thank you for contributing. I am interested in learning more about the event system.

Robert


> On Nov 28, 2015, at 10:09 AM, Tellier Benoit <bt...@apache.org> wrote:
> 
> Hi,
> 
> I just wanted to present my work on the James event system.
> 
> ## What is the James event system?
> 
> The mailbox event system conveys notifications about changes to the
> state of mailboxes and messages. You can register listeners with it so
> that you are notified of these changes.
> 
> ## What is it used for?
> 
> It is used for:
> 
> - IMAP IDLE: lets a client subscribe to a specific mailbox and get
> notified about changes without having to poll the mailbox.
> 
> - Quota system: updates to stored quotas are made outside the
> MailboxManager, as they may involve large quota calculations.
> 
> - Indexing of messages for the Search feature (ElasticSearch and Lucene
> implementations)
> 
> - IMAP Sequence Number handling.
> 
> - Cache invalidation (caching project, not yet exposed to configuration)
> 
> - Many others
> 
> ## Why do we need it to be distributed?
> 
> I want to see this feature distributed because I personally really love
> the IDLE feature. I want my Thunderbird to be able to use it in a
> distributed environment.
> 
> I also think people might want to run several James servers in
> parallel with any kind of architecture (quotas, message search indexes).
> 
> ## What are the different configuration options?
> 
> I reviewed the event system.
> 
> The first thing is to explicitly specify a listener's distribution
> status. A listener can be one of:
> 
> - Registered per mailbox
> - Notified about all local events only
> - Notified about all events in your James cluster
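
The three listener scopes listed above could be sketched roughly as
follows. This is an illustration only: ListenerScope, MailboxEvent,
ScopedListener and ScopeDispatcher are hypothetical names, not the
actual James API, and the "local" flag is an assumed stand-in for
however the real system tells local events from remote ones.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// All names here are hypothetical; this is not the James API.
enum ListenerScope { PER_MAILBOX, LOCAL_NODE, WHOLE_CLUSTER }

class MailboxEvent {
    final String mailboxPath; // mailbox the change happened in
    final boolean local;      // true if the event was produced on this node

    MailboxEvent(String mailboxPath, boolean local) {
        this.mailboxPath = mailboxPath;
        this.local = local;
    }
}

class ScopedListener {
    final ListenerScope scope;
    final String mailboxPath; // only used when scope is PER_MAILBOX
    final Consumer<MailboxEvent> handler;

    ScopedListener(ListenerScope scope, String mailboxPath,
                   Consumer<MailboxEvent> handler) {
        this.scope = scope;
        this.mailboxPath = mailboxPath;
        this.handler = handler;
    }
}

class ScopeDispatcher {
    private final List<ScopedListener> listeners = new ArrayList<>();

    void register(ScopedListener listener) {
        listeners.add(listener);
    }

    // Delivers the event to every interested listener and returns
    // how many listeners were notified.
    int dispatch(MailboxEvent event) {
        int notified = 0;
        for (ScopedListener listener : listeners) {
            if (wants(listener, event)) {
                listener.handler.accept(event);
                notified++;
            }
        }
        return notified;
    }

    // Decides, from the listener's declared scope, whether it should
    // see this event.
    private static boolean wants(ScopedListener listener, MailboxEvent event) {
        switch (listener.scope) {
            case PER_MAILBOX:
                return event.mailboxPath.equals(listener.mailboxPath);
            case LOCAL_NODE:
                return event.local;
            case WHOLE_CLUSTER:
                return true;
            default:
                return false;
        }
    }
}
```

In this sketch a remote event on "INBOX" reaches the per-mailbox
listener registered on "INBOX" and the cluster-wide listener, but not
the listener that only asked for local events.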
> 
> Then, we keep the default in-memory implementation (slightly reworked
> using Guava), and I added two other architectures for the event system.
> 
> #### Registration-based event system
> 
> With this implementation, events are exchanged over the network, and a
> James server is only notified about events it has explicitly registered
> for. Because of that:
> 
> - This approach is designed for architectures with a large number of
> James servers.
> - It does not support event listeners that need to be notified of all
> events in the cluster.
> 
> Each server listens on a message queue, and a registration mechanism is
> used to identify which servers the events need to be sent to. Of course,
> this involves event serialization / deserialization.
> 
> Today:
> - Kafka is used for the messaging
> - Cassandra is used for registration management
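
The routing idea described above (one queue per server, plus a registry
saying which servers care about which mailbox) can be sketched in memory
as follows. This is only a sketch under stated assumptions: the class
RegistrationRouter and its methods are hypothetical, the in-memory map
stands in for the Cassandra registration store, and the in-memory queues
stand in for Kafka. No real Kafka or Cassandra API is used.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical in-memory model of registration-based routing.
class RegistrationRouter {
    // mailbox path -> ids of the servers registered on that mailbox
    // (stand-in for the registration store)
    private final Map<String, Set<String>> registrations = new ConcurrentHashMap<>();
    // server id -> queue of serialized events (stand-in for the message queue)
    private final Map<String, Queue<String>> queues = new ConcurrentHashMap<>();

    void register(String mailboxPath, String serverId) {
        registrations
            .computeIfAbsent(mailboxPath, k -> ConcurrentHashMap.newKeySet())
            .add(serverId);
    }

    void unregister(String mailboxPath, String serverId) {
        Set<String> servers = registrations.get(mailboxPath);
        if (servers != null) {
            servers.remove(serverId);
        }
    }

    // Returns the queue a given server listens on, creating it on first use.
    Queue<String> queueOf(String serverId) {
        return queues.computeIfAbsent(serverId, k -> new ConcurrentLinkedQueue<>());
    }

    // Sends the serialized event only to servers registered on the mailbox
    // and returns the number of servers it was sent to.
    int publish(String mailboxPath, String serializedEvent) {
        Set<String> targets =
            registrations.getOrDefault(mailboxPath, Collections.emptySet());
        int sent = 0;
        for (String serverId : targets) {
            queueOf(serverId).add(serializedEvent);
            sent++;
        }
        return sent;
    }
}
```

A server that never registered on a mailbox receives nothing for it,
which is why this design cannot serve listeners that need every event
in the cluster.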
> 
> This solution was presented at the Paris Cassandra Meet-up.
> 
> #### Broadcast event system
> 
> With this implementation, you have several James servers working
> together, but you rely on Mailbox Listeners that need to be notified
> about every event in your data center.
> 
> These listeners could be:
> 
> - Lucene document indexing
> - In-memory quotas
> - In-memory cache
> 
> The idea here is to naively broadcast the events to all your James
> servers. They are notified about every event (so scalability will be
> limited).
> 
> You also have to be aware that events can be duplicated or not emitted
> (James server crash, network partitions), so local data might be
> inconsistent. This seems acceptable for quota calculation, for instance.
> 
> ## What do I need to know as an administrator?
> 
> Distributed use of Message Sequence Numbers (which demands a high degree
> of coordination) is risky. The inconsistency window between servers may
> be large, and the correspondence between UIDs and message sequence
> numbers is not eventually consistent. This topic is under discussion on
> the dev mailing list.
> 
> I corrected an issue I had spotted previously: a faulty mailbox
> listener might stop the event delivery chain and make the IMAP service
> unavailable. I added a commit so that errors raised inside mailbox
> listeners are not propagated.
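
The fix described above amounts to isolating each listener call so that
one failure does not break the delivery chain. A minimal sketch of that
idea, assuming a hypothetical IsolatingDispatcher class and plain String
events (this is not the actual James commit):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical dispatcher: a failing listener is logged and skipped,
// and the remaining listeners still receive the event.
class IsolatingDispatcher {
    private final List<Consumer<String>> listeners = new ArrayList<>();

    void register(Consumer<String> listener) {
        listeners.add(listener);
    }

    // Delivers the event to all listeners and returns how many of them failed.
    int dispatch(String event) {
        int failures = 0;
        for (Consumer<String> listener : listeners) {
            try {
                listener.accept(event);
            } catch (RuntimeException e) {
                // Do not propagate: report the error and continue with
                // the next listener.
                failures++;
                System.err.println("Listener failed on event " + event + ": " + e);
            }
        }
        return failures;
    }
}
```

With this shape, a broken indexing listener, for example, would no
longer prevent an IDLE listener registered after it from being
notified.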
> 
> I want to finish this section by talking about event serialization. You
> can choose either:
> 
> - JSON
> - MessagePack
> 
> The first one is faster to compute but larger, so it lets you trade
> compute power against network usage.
> 
> ## Event delivery modes
> 
> As you might have noticed, Mailbox Listeners can take a long time to
> execute, and some of them can safely be executed asynchronously (IDLE,
> indexing and even quotas).
> 
> I added an Event Delivery abstraction. Thanks to this, you can configure
> your James to:
> 
> - Deliver events synchronously (today's behavior)
> - Deliver events asynchronously (returns before the events have been
> delivered; Mailbox Listeners are notified in parallel in a thread pool)
> - Mixed mode: every Mailbox Listener indicates whether it should be
> executed synchronously or asynchronously.
> 
> The asynchronous option can be considered risky. The mixed one is
> safe, and significantly reduces latency if you rely on document indexing.
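
The three delivery modes above can be sketched with a standard Java
thread pool. This is an illustration under assumptions, not the James
implementation: DeliveryMode, DeliveryListener and EventDelivery are
hypothetical names, and events are plain Strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// All names here are hypothetical; this is not the James API.
enum DeliveryMode { SYNCHRONOUS, ASYNCHRONOUS, MIXED }

class DeliveryListener {
    final boolean async; // the listener's own preference, used in MIXED mode
    final Consumer<String> handler;

    DeliveryListener(boolean async, Consumer<String> handler) {
        this.async = async;
        this.handler = handler;
    }
}

class EventDelivery {
    private final DeliveryMode mode;
    private final ExecutorService pool;
    private final List<DeliveryListener> listeners = new ArrayList<>();

    EventDelivery(DeliveryMode mode, int threads) {
        this.mode = mode;
        this.pool = Executors.newFixedThreadPool(threads);
    }

    void register(DeliveryListener listener) {
        listeners.add(listener);
    }

    // Runs synchronous listeners on the calling thread and submits the
    // others to the pool. Returns the futures of the asynchronous runs,
    // so the caller may return immediately or wait on them.
    List<Future<?>> deliver(String event) {
        List<Future<?>> pending = new ArrayList<>();
        for (DeliveryListener listener : listeners) {
            boolean runAsync = mode == DeliveryMode.ASYNCHRONOUS
                || (mode == DeliveryMode.MIXED && listener.async);
            if (runAsync) {
                pending.add(pool.submit(() -> listener.handler.accept(event)));
            } else {
                listener.handler.accept(event);
            }
        }
        return pending;
    }

    void shutdown() {
        pool.shutdown();
    }
}
```

In MIXED mode, a listener that must complete before the operation
returns can declare itself synchronous, while a slow one such as
document indexing can declare itself asynchronous and stay off the
caller's critical path.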
> 
> ## Re-indexers
> 
> I also added the ability to re-index documents in a Message Search
> index using the CLI:
> 
> - per mailbox: the event system is used to track changes made to the
> given mailbox, which significantly reduces the concurrent-changes window.
> - all your James mailboxes: the event system is used to keep track
> of deleted mailboxes.
> 
> ## My future work on the event system
> 
> Finish the work on MAILBOX-257: one should be able to recalculate quotas.
> 
> Unfortunately it is not yet on my todo list...
> 
> Benoit
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: server-user-unsubscribe@james.apache.org
> For additional commands, e-mail: server-user-help@james.apache.org
> 




Re: James new event system

Posted by Benoit Tellier <be...@minet.net>.
Hi Robert,

While traveling back from holidays, I took some time to write a quick
blog post on the topic:

http://blog.btellier.com/article/72

If you are interested...

And if you have any questions, don't hesitate to ask me directly.

Benoit




