Posted to server-user@james.apache.org by Bernd Waibel <BW...@intarsys.de> on 2015/03/13 17:36:51 UTC

AW: Tracking Mail After Folder Moves [unsigned]

Hello Jerry,

A very good question. I would like to offer my opinion, though I am not sure it will help.
We use James v2.3.2 and develop with it, although we do not currently use the mailboxes.

Some time ago we migrated our old mail system (Postfix-based) to a new one (MS Exchange).
Because every user wanted to keep their old emails (gigabytes of them), we used a tool to move the mail over IMAP from one system to the other.
The tool was called IMAPSync, I think. Its author supports many different mail systems and has a lot of experience.

As far as I remember, an email does not have to carry an "ID". It could be the "Message-ID", but this header is optional.
The code in IMAPSync for syncing mail did a lot of "identity handling".
The software tried to sync only the missing mails, so a mail present in both systems had to be recognized as identical in order not to be transferred again on a second sync.
You may have the same problem. The author of the software wrote something about this and provided a lot of options to handle it.

As far as I remember, the software tried to identify identical mails by their headers, and if the headers were missing, it fell back to hash values (or something like that).
This worked fine, with some exceptions: some mails were changed by MS Exchange on arrival. They seemed to be calendar events, which Exchange servers process in order to store them in the Outlook calendar.
These mails changed on every sync, so some mails were duplicated with every sync. We simply accepted that. It was a one-way sync.
So you may use the "Message-ID" and some other headers to identify identical mails. But I think this is risky.
I think it could also be possible to identify a mail by its content.
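
A minimal Python sketch of that kind of identity handling: use the Message-ID when it is present, otherwise fall back to a hash. This is not IMAPSync's actual algorithm; the function name `identity_key` and the choice of fallback headers are assumptions made for illustration only.

```python
import hashlib
from email import message_from_bytes

# Headers mixed into the fallback hash. This selection is an assumption.
FALLBACK_HEADERS = ("Date", "From", "To", "Subject")


def identity_key(raw: bytes) -> str:
    """Return a key intended to be equal for two copies of the same message."""
    msg = message_from_bytes(raw)
    message_id = str(msg.get("Message-ID") or "").strip()
    if message_id:
        return "mid:" + message_id
    # No Message-ID: hash a few headers plus the body.
    digest = hashlib.sha256()
    for name in FALLBACK_HEADERS:
        value = str(msg.get(name) or "")
        digest.update(name.encode("ascii") + b":")
        digest.update(value.encode("utf-8", "replace") + b"\n")
    # For multipart messages get_payload(decode=True) returns None,
    # so in this sketch only the headers contribute to the hash there.
    digest.update(msg.get_payload(decode=True) or b"")
    return "sha256:" + digest.hexdigest()
```

A server that rewrites a message on arrival, as described above for calendar events, would change the fallback hash, which is why such mails are seen as new on every sync.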

The IMAP folder structure is a "virtual" structure; it does not have to match how the IMAP server stores the mail. Even the folder names shown in the client do not have to be the same as on the server.
If you look at James, the mail storage may be file storage, but it could also be database storage or something else. James supports that.

So what happens if you store the mails in a database engine and represent the folder structure in the database schema?
Every mail is an object. The folder structure is nothing more than tables or something like that.
Because most databases keep an ID or a hash value for each object, the object identity should simply be a database field.

I am not well versed in IMAP. Is there a "move" operation?
If the "move" operation is implemented as a "delete" and "create" operation, the identity will be lost.
Is it possible to implement the "move" operation as a database rename operation, to keep the identity?

Another option: you could set a header (a UUID) every time a mail arrives.
This just needs a "set header" action in James. Then you have a reliably trackable ID.
But you may need to implement something like a "trash" inside the database, to cover the delete and insert actions.
Would this help?
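
A small Python sketch of the header idea, independent of James: stamp each incoming message once with a UUID header. The header name `X-Tracking-ID` and the function `stamp_tracking_id` are hypothetical; in James itself this step would have to be done at delivery time by whatever "set header" mechanism the installed version offers.

```python
import uuid
from email import message_from_bytes

TRACKING_HEADER = "X-Tracking-ID"  # hypothetical header name


def stamp_tracking_id(raw: bytes) -> bytes:
    """Add a UUID tracking header unless the message already has one."""
    msg = message_from_bytes(raw)
    if msg.get(TRACKING_HEADER) is None:
        msg[TRACKING_HEADER] = str(uuid.uuid4())
    return msg.as_bytes()
```

Because the ID lives in the message itself, it should survive a "copy and delete" move, as long as the copy preserves the headers.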


Regards
Bernd Waibel

-----Original Message-----
From: Jerry Malcolm [mailto:techstuff@malcolms.com] 
Sent: Friday, 13 March 2015 16:50
To: James Users List
Subject: Tracking Mail After Folder Moves

This is somewhat an IMAP question.  But also a JAMES implementation question.  My client has a massive amount of mail that must be kept and accessed.  They use Thunderbird and Outlook to do the normal mail handling stuff.  No problems at all on the client side.  But on the back end, I need to sort and organize and keep track of emails and be able to pull them up using a web interface on demand, completely independent of folders that they may currently be in.  In other words, I need to keep track of 'email x' and be able to find it at a later time no matter how many times the user moves it from folder to folder.

I believe I understand the philosophy of IMAP for the client is to find a folder, display the contents, refresh periodically and add/remove mail from its records for that folder as contents change.  Basically if the user moves a mail item from one folder to another, the first folder recognizes it's no longer there, and is done with it.  The other folder subsequently realizes it has a new email item and displays it.  But there is no knowledge that this is the same email.  Have I got it pretty much correct?

So... I realize I may be stretching/bending the intent of IMAP.  But that doesn't diminish the fact that I have the requirement.  I've dug through all of the database table schemas for JAMES and have a pretty good handle on how mail is stored and tracked internally. But I may have missed something.  So my main question is.... is there a way for me to permanently track an email item and be able to locate it at some point down the road even if it's been moved around folders several times?  
Basically, is there a globally unique ID for every email stored?  BTW, I'm not bound by having to use only IMAP.  I have no problem at all back-dooring to the JAMES database and writing code to use SQL to track through the database tables to find the email.  I just don't think there is anything unique/unchangeable that will allow me to permanently track a particular email.

Am I totally off the wall in considering something like this?  Seems a complete waste to have to duplicate a hundred gigs of mail data for my own archive when JAMES has a perfectly good copy of everything.

Suggestions?

Thanks.

Jerry

---------------------------------------------------------------------
To unsubscribe, e-mail: server-user-unsubscribe@james.apache.org
For additional commands, e-mail: server-user-help@james.apache.org


Re: AW: Tracking Mail After Folder Moves [unsigned]

Posted by Benoit Tellier <bt...@linagora.com>.
On 13/03/2015 17:36, Bernd Waibel wrote:
> I am not well versed in IMAP. Is there a "move" operation?
> If the "move" operation is implemented as a "delete" and "create" operation, the identity will be lost.
> Is it possible to implement the "move" operation as a "database renaming operation", to keep the identity?


The IMAP MOVE operation is not implemented in James:

 - the processor for the IMAP command is incomplete
 - many mailbox implementations do not implement this operation.

But yes, you could imagine simply updating the mail entry by setting a
new mailbox, a new UID and a new ModSeq.

The current behaviour is "copy and delete".
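
A rough Python/SQLite sketch of such an in-place move, where the row keeps its primary key and only the mailbox, UID and ModSeq change. The schema and the function names are hypothetical and are not the James database layout.

```python
import sqlite3

# Hypothetical schema for illustration, not the James tables.
SCHEMA = """
CREATE TABLE mailbox (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL,
    next_uid INTEGER NOT NULL DEFAULT 1,
    highest_modseq INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE mail (
    id INTEGER PRIMARY KEY,  -- stable identity, untouched by a move
    mailbox_id INTEGER NOT NULL REFERENCES mailbox(id),
    uid INTEGER NOT NULL,
    modseq INTEGER NOT NULL,
    content BLOB NOT NULL,
    UNIQUE (mailbox_id, uid)
);
"""


def _allocate(conn, mailbox_id):
    """Reserve the next UID and ModSeq of a mailbox and return them."""
    uid, modseq = conn.execute(
        "SELECT next_uid, highest_modseq + 1 FROM mailbox WHERE id = ?",
        (mailbox_id,),
    ).fetchone()
    conn.execute(
        "UPDATE mailbox SET next_uid = ?, highest_modseq = ? WHERE id = ?",
        (uid + 1, modseq, mailbox_id),
    )
    return uid, modseq


def append_mail(conn, mailbox_id, content):
    """Store a new mail and return its stable row id."""
    with conn:  # one transaction
        uid, modseq = _allocate(conn, mailbox_id)
        cur = conn.execute(
            "INSERT INTO mail (mailbox_id, uid, modseq, content) "
            "VALUES (?, ?, ?, ?)",
            (mailbox_id, uid, modseq, content),
        )
        return cur.lastrowid


def move_mail(conn, mail_id, target_mailbox_id):
    """Move by updating the existing row instead of copy and delete."""
    with conn:  # one transaction
        uid, modseq = _allocate(conn, target_mailbox_id)
        conn.execute(
            "UPDATE mail SET mailbox_id = ?, uid = ?, modseq = ? WHERE id = ?",
            (target_mailbox_id, uid, modseq, mail_id),
        )
```

The sketch leaves out what a real server also has to do, such as notifying clients that the message disappeared from the source mailbox.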

On 13/03/2015 17:36, Bernd Waibel wrote:
> But you may need to implement something like a "trash" inside the
> database? To cover the delete and insert action.
> Would this help?

You can do this by logging add, copy and delete operations, but you
would still have to modify James to achieve this, and you would need to
look through these logs each time you want the history of an e-mail. I
think this could be expensive.

If I had this problem, I would add to the database schema a value that
identifies a mail and its copies...
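
A minimal Python/SQLite sketch of that idea: a `tracking_id` column is assigned once on first storage and carried over on every copy, so it survives a "copy and delete" move. The table layout, the column name and the functions are hypothetical, not part of the James schema.

```python
import sqlite3
import uuid

# Hypothetical table for illustration, not the James schema.
SCHEMA = """
CREATE TABLE mail (
    id INTEGER PRIMARY KEY,
    mailbox TEXT NOT NULL,
    content BLOB NOT NULL,
    tracking_id TEXT NOT NULL  -- shared by a mail and all of its copies
);
CREATE INDEX mail_tracking ON mail (tracking_id);
"""


def store_new(conn, mailbox, content):
    """Store a newly arrived mail; return (row id, new tracking id)."""
    tracking_id = str(uuid.uuid4())
    cur = conn.execute(
        "INSERT INTO mail (mailbox, content, tracking_id) VALUES (?, ?, ?)",
        (mailbox, content, tracking_id),
    )
    return cur.lastrowid, tracking_id


def copy_mail(conn, mail_id, target_mailbox):
    """Copy a mail into another mailbox, keeping its tracking id."""
    cur = conn.execute(
        "INSERT INTO mail (mailbox, content, tracking_id) "
        "SELECT ?, content, tracking_id FROM mail WHERE id = ?",
        (target_mailbox, mail_id),
    )
    if cur.rowcount == 0:
        raise KeyError(mail_id)
    return cur.lastrowid


def delete_mail(conn, mail_id):
    conn.execute("DELETE FROM mail WHERE id = ?", (mail_id,))


def locate(conn, tracking_id):
    """Return (row id, mailbox) for every stored copy of a tracked mail."""
    return conn.execute(
        "SELECT id, mailbox FROM mail WHERE tracking_id = ? ORDER BY id",
        (tracking_id,),
    ).fetchall()
```

A web interface could then look a mail up by `tracking_id` alone, whatever folder it currently sits in.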

Regards,

Benoit
