Posted to users@spamassassin.apache.org by Sanford Whiteman <sw...@cypressintegrated.com> on 2006/06/10 06:49:24 UTC

Re[2]: The Future of Email is SQL

> If we are talking about making a SQL application that is usable for
> a multitude of people then why lock them into something. That's the
> easiest way to drive them away from supporting it.

Word. Perl can play nice with plenty of RDBMSs. If this discussion
belongs here at all, I can't see how RDBMS partisanship is going to
take it anywhere good.

FTR, there are several (commercial) spam quarantine applications, and
at least three very big compliance/archival services, that take a
SQL-based back-end as a given. Their traffic and access patterns have
clearly been taken into account here, but nonetheless these are proof
that the concept already has real-world purchase, depending on budget
and application.

--Sandy



Re: The Future of Email is SQL

Posted by Marc Perkel <ma...@perkel.com>.

John Rudd wrote:
>
> On Jun 13, 2006, at 7:52 PM, Marc Perkel wrote:
>
>>
>> John Rudd wrote:
>>>
>>> and maybe a decent perl MTA to put in front of it too (something 
>>> that will work with sendmail milters...).
>>>
>>
>> I think that a local delivery program could be written fairly easily 
>> that Exim or any other existing MTA could pipe messages into for 
>> delivery. So one wouldn't have to rewrite the MTA but just use 
>> existing MTAs and just change the delivery mechanism.
>
>
> It's not a matter of have to.  It's a matter of want to.
>

Well - I'm a member of the Exim cult - but if something better comes 
along I might convert. :)


Re: The Future of Email is SQL

Posted by John Rudd <jr...@ucsc.edu>.
On Jun 13, 2006, at 7:52 PM, Marc Perkel wrote:

>
> John Rudd wrote:
>>
>> and maybe a decent perl MTA to put in front of it too (something that 
>> will work with sendmail milters...).
>>
>
> I think that a local delivery program could be written fairly easily 
> that Exim or any other existing MTA could pipe messages into for 
> delivery. So one wouldn't have to rewrite the MTA but just use 
> existing MTAs and just change the delivery mechanism.


It's not a matter of have to.  It's a matter of want to.



Re: The Future of Email is SQL

Posted by Marc Perkel <ma...@perkel.com>.

Kenneth Porter wrote:
> On Tuesday, June 13, 2006 8:52 PM -0700 kbaker <kb...@missionvi.com> 
> wrote:
>
>> It is visionary in that it is not the "norm", but again DBMail does 
>> all of
>> this very well and has been production quality for quite some time.
>
> I asked on the Dovecot list about how Dovecot compares to DBMail and 
> got this reply from Dovecot's author:
>

I think Timo will eventually add a MySQL backend to Dovecot.

Re: The Future of Email is SQL

Posted by Kenneth Porter <sh...@sewingwitch.com>.
On Tuesday, June 13, 2006 8:52 PM -0700 kbaker <kb...@missionvi.com> wrote:

> It is visionary in that it is not the "norm", but again DBMail does all of
> this very well and has been production quality for quite some time.

I asked on the Dovecot list about how Dovecot compares to DBMail and got 
this reply from Dovecot's author:

------------ Forwarded Message ------------
Date: Tuesday, June 13, 2006 9:43 AM +0300
From: Timo Sirainen <ts...@iki.fi>
To: dovecot@dovecot.org
Subject: Re: [Dovecot] DBMail versus Dovecot (was: Using MySQL to store 
email?)

On Mon, 2006-06-12 at 18:12 -0700, Kenneth Porter wrote:
> On Saturday, June 10, 2006 10:07 AM -0400 Charles Marcus
> <CM...@Media-Brokers.com> wrote:
>
> > A reference to DBMail was among the first responses, and there have been
> > others.
>
> Has anyone compiled a comparison of Dovecot to DBMail? Why would I choose
> one over the other?

I think their goals are quite different. Don't know if any such
comparisons would be all that useful.

Or I guess I can give you one difference: Dovecot tries very hard to be
secure. DBMail, on the other hand, seems to keep adding SQL injection
security holes. I told them about this a few years ago and they fixed
them, but when I looked at the code a few months ago they had added more.
---------- End Forwarded Message ----------
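
For anyone unfamiliar with the class of bug Timo describes, here is a minimal sketch of SQL injection and the usual fix. It is not DBMail's code; the table, column, and function names are made up for illustration, and SQLite stands in for MySQL so the example runs without a server.

```python
import sqlite3

# Hypothetical schema, for illustration only (not DBMail's real tables).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, mailbox TEXT)")
conn.executemany(
    "INSERT INTO users (name, mailbox) VALUES (?, ?)",
    [("alice", "INBOX-a"), ("bob", "INBOX-b")],
)


def find_mailboxes_unsafe(conn, name):
    # Vulnerable: the caller's input becomes part of the SQL text itself.
    query = "SELECT mailbox FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def find_mailboxes_safe(conn, name):
    # Parameterized: the driver passes the value separately from the SQL text.
    query = "SELECT mailbox FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]


hostile = "nobody' OR '1'='1"
unsafe_rows = find_mailboxes_unsafe(conn, hostile)  # matches every row
safe_rows = find_mailboxes_safe(conn, hostile)      # matches nothing
print(sorted(unsafe_rows), safe_rows)
```

The same input leaks every mailbox through the concatenated query and matches nothing through the parameterized one, which is why placeholders are the usual defence regardless of which database sits behind the mail store.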

Re: The Future of Email is SQL

Posted by kbaker <kb...@missionvi.com>.
Thank you for a very well thought out *open* message. I would guess that most of 
these reasons are why DBMail was started 5 years ago ;)

I'm gonna respond with some pro-DBMail stuff... just because it's in my head 
and pretty much addresses all of Marc's comments below.

Marc Perkel wrote:
> This is still visionary so take it for what it's worth. People are more 
> familiar with MAILDIR and MBOX because they are files. You can read them 
> with VI and PICO and FGREP and all the stuff that we are familiar with. 
> MySQL is also easy but might require new tools and some learning. Once 
> you become familiar with it then everything is just as easy.
> 
> One could export and import to and from maildir and mbox, so that 
> doesn't go away.
DBMail has both a maildir import and export.


> With MySQL there are a lot of problems that go away. MySQL is a magic 
> port that does everything for you. It doesn't care about what filesystem 
> you're using, what OS you are running, what kinds of file locks or NFS 
> mounts, or if you're using Reiser for maildir speed or if you have 
> enough inodes. All that stuff goes away.
One of the great aspects of DBMail... SQL clustering and replication independent 
of the OS. Cyrus and other great IMAP servers have only recently gotten this 
working in alpha versions. Otherwise it would require very expensive storage 
solutions to get any kind of failover or "realtime" replication. With 
DBMail/MySQL, just set up another cheap server and configure replication. Done.


> MBOX and MAILDIR have no indexing. You can add indexes externally but 
> there are no standards for that. With MySQL you can index anything and 
> everything. You can add fields to the message, any fields, as many as you 
> want, and they too can be keys and indexes. With maildir and mbox you 
> can't really do that.
Many of the filesystem storage solutions do have indexing, but in file hashes. 
Zimbra has gone so far as to have a filesystem store with MySQL indexes for 
speed. As you point out these are not "standard" and don't scale unless they are 
on an expensive storage solution.


> With MySQL you can access the data with any MySQL application. And the 
> access is consistent no matter what programming language you use, what 
> OS you use, anything. It's all SQL. So if you want a web interface you 
> just write a PHP app.
There are a number of PHP and other scripts that access SQL directly for 
everything from webmail to administration... works great and very easy to work with.


> SpamAssassin for example has migrated from DB files to MySQL for the AWL 
> and bayes and we all can see how this has improved performance and ease 
> of implementation. Before SQL, having 5 servers sharing the same bayes was 
> difficult. With SQL it's trivial. The SQL does it all for you. They do 
> the magic so you don't have to.
> 
> The indexing is a real key feature. If I have a key based on the sending 
> host or index all the received lines, I could delete all messages that 
> had an IP in any received line almost instantly. I can do it thousands 
> of times faster than mbox or maildir because it's indexed. Indexing 
> gives you incredible power and the SQL engine does all that for you. 
> That SA and the IMAP and the MTA and the Web GUI - everything - all 
> talking to a standard database - all integrated - all compatible.
> 
> So - like I said - this is visionary stuff. Think SQL - think outside 
> the box.

It is visionary in that it is not the "norm", but again DBMail does all of
this very well and has been production quality for quite some time.

This is a great thread, but as far as starting a new project I'd just go with 
what is already working. DBMail is built in C, so very speedy.

Someone mentioned rewriting an IMAP server in perl... not sure if I'd go that 
way from a speed standpoint, but would certainly be interesting.

If you are looking for another approach, Zimbra is written entirely in Java and 
is open source. It uses MySQL only for indexes, but it would be very 
straightforward to replace its existing MailStore class with one that writes to 
MySQL rather than the filesystem.



-- 
Kevin Baker


Re: The Future of Email is SQL

Posted by Ramprasad <ra...@netcore.co.in>.
On Wed, 2006-06-14 at 11:50 -0700, Steve Thomas wrote:
> > So - like I said - this is visionary stuff. Think SQL - think outside
> > the box.
> 
> It's not all that visionary. Microsoft's been working on WinFS - a
> SQL-based system for storing files - for years. It's supposed to have been
> released as a part of Longhorn (Vista), but they're pushing it back.

Oracle has OCS, which consists of a
mail/calendar/LDAP/fileserver/webserver/... blah blah, all with SQL
storage. And the database is... no points for guessing that.
But OCS is a terrible resource hog (understatement). I don't think there
are many users of OCS.

IMHO SQL storage is definitely going to be there.
The common indexing mechanism is what makes such storage interesting. I
agree it is slow now, but hardware and software will get better, and then
resources will not be an issue.

Ram


Re: The Future of Email is SQL

Posted by Steve Thomas <li...@sthomas.net>.
> So - like I said - this is visionary stuff. Think SQL - think outside
> the box.

It's not all that visionary. Microsoft's been working on WinFS - a
SQL-based system for storing files - for years. It's supposed to have been
released as a part of Longhorn (Vista), but they're pushing it back.

I'm still confused as to why this is even being discussed on this list,
though. SA is just a system for identifying and labeling certain types of
messages. It has nothing whatsoever to do with where or how those messages
are stored.

St-



Re: The Future of Email is SQL

Posted by Marc Perkel <ma...@perkel.com>.
This is still visionary so take it for what it's worth. People are more 
familiar with MAILDIR and MBOX because they are files. You can read them 
with VI and PICO and FGREP and all the stuff that we are familiar with. 
MySQL is also easy but might require new tools and some learning. Once 
you become familiar with it then everything is just as easy.

One could export and import to and from maildir and mbox, so that 
doesn't go away.

With MySQL there are a lot of problems that go away. MySQL is a magic 
port that does everything for you. It doesn't care about what filesystem 
you're using, what OS you are running, what kinds of file locks or NFS 
mounts, or if you're using Reiser for maildir speed or if you have 
enough inodes. All that stuff goes away.

MBOX and MAILDIR have no indexing. You can add indexes externally but 
there are no standards for that. With MySQL you can index anything and 
everything. You can add fields to the message, any fields, as many as you 
want, and they too can be keys and indexes. With maildir and mbox you 
can't really do that.
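
A minimal sketch of what "add any field and index it" could look like. SQLite stands in for MySQL so it runs without a server, and the table layout and the `spam_score` column are assumptions made up for this example, not an existing mail schema.

```python
import sqlite3

# Hypothetical message table; SQLite is used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages ("
    " id INTEGER PRIMARY KEY, sender TEXT, subject TEXT, raw TEXT)"
)
conn.executemany(
    "INSERT INTO messages (sender, subject, raw) VALUES (?, ?, ?)",
    [
        ("alice@example.com", "Lunch?", "(raw message text)"),
        ("bob@example.com", "Cheap pills", "(raw message text)"),
    ],
)

# Add a new field after the fact, then index it (column name is illustrative).
conn.execute("ALTER TABLE messages ADD COLUMN spam_score REAL")
conn.execute("CREATE INDEX idx_messages_spam_score ON messages (spam_score)")

conn.execute(
    "UPDATE messages SET spam_score = ? WHERE sender = ?",
    (0.1, "alice@example.com"),
)
conn.execute(
    "UPDATE messages SET spam_score = ? WHERE sender = ?",
    (9.7, "bob@example.com"),
)

# The new field can be queried like any other column.
high_scoring = [
    row[0]
    for row in conn.execute(
        "SELECT subject FROM messages WHERE spam_score > ? ORDER BY spam_score DESC",
        (5.0,),
    )
]
print(high_scoring)
```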

With MySQL you can access the data with any MySQL application. And the 
access is consistent no matter what programming language you use, what 
OS you use, anything. It's all SQL. So if you want a web interface you 
just write a PHP app.

SpamAssassin for example has migrated from DB files to MySQL for the AWL 
and bayes and we all can see how this has improved performance and ease 
of implementation. Before SQL, having 5 servers sharing the same bayes was 
difficult. With SQL it's trivial. The SQL does it all for you. They do 
the magic so you don't have to.

The indexing is a real key feature. If I have a key based on the sending 
host or index all the received lines, I could delete all messages that 
had an IP in any received line almost instantly. I can do it thousands 
of times faster than mbox or maildir because it's indexed. Indexing 
gives you incredible power and the SQL engine does all that for you. 
That SA and the IMAP and the MTA and the Web GUI - everything - all 
talking to a standard database - all integrated - all compatible.
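
As a rough sketch of the Received-line idea, assuming each hop's IP is stored in its own indexed row: the `messages` and `received_hops` tables and the helper name below are hypothetical, and SQLite stands in for MySQL so the example is self-contained.

```python
import sqlite3

# Hypothetical layout: one row per message, one row per Received hop.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE messages (id INTEGER PRIMARY KEY, subject TEXT);
    CREATE TABLE received_hops (
        message_id INTEGER NOT NULL REFERENCES messages(id),
        ip TEXT NOT NULL
    );
    -- The index is what lets the lookup by IP avoid scanning every message.
    CREATE INDEX idx_received_hops_ip ON received_hops (ip);
    """
)
conn.executemany(
    "INSERT INTO messages (id, subject) VALUES (?, ?)",
    [(1, "Hello"), (2, "Buy now"), (3, "Re: Hello")],
)
conn.executemany(
    "INSERT INTO received_hops (message_id, ip) VALUES (?, ?)",
    [(1, "192.0.2.10"), (2, "203.0.113.7"), (2, "192.0.2.10"), (3, "198.51.100.4")],
)


def delete_messages_by_ip(conn, ip):
    """Delete every message that has `ip` in any of its Received hops."""
    ids = [
        row[0]
        for row in conn.execute(
            "SELECT DISTINCT message_id FROM received_hops WHERE ip = ?", (ip,)
        )
    ]
    params = [(message_id,) for message_id in ids]
    conn.executemany("DELETE FROM received_hops WHERE message_id = ?", params)
    conn.executemany("DELETE FROM messages WHERE id = ?", params)
    conn.commit()
    return len(ids)


deleted = delete_messages_by_ip(conn, "203.0.113.7")
remaining = [row[0] for row in conn.execute("SELECT id FROM messages ORDER BY id")]
print(deleted, remaining)
```

How much faster this is than scanning mbox or maildir depends on the data and the engine; the sketch only shows the shape of the query, not a benchmark.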

So - like I said - this is visionary stuff. Think SQL - think outside 
the box.


Re: The Future of Email is SQL

Posted by Marc Perkel <ma...@perkel.com>.

John Rudd wrote:
>
> I had been thinking about how feasible it would be to re-implement 
> dbmail in perl..
>
> and maybe a decent perl MTA to put in front of it too (something that 
> will work with sendmail milters...).
>
> Then you could be pretty database agnostic.  Just whatever perl wants 
> to put back there.
>
>

I think that a local delivery program could be written fairly easily 
that Exim or any other existing MTA could pipe messages into for 
delivery. So one wouldn't have to rewrite the MTA but just use existing 
MTAs and just change the delivery mechanism. Eventually I think that 
MTAs would integrate MySQL delivery. My guess is that it's easier to 
deliver to MySQL than MBOX or MAILDIR because MySQL does all the work 
for you. You just pass the data and let MySQL do that magic.
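
A minimal sketch of such a delivery program, under stated assumptions: the `messages` table and the `deliver` function are hypothetical, SQLite stands in for MySQL, and a sample message is used in place of the MTA's pipe so the example runs on its own.

```python
import sqlite3
from email import message_from_bytes, policy


def deliver(conn, raw_bytes):
    """Parse one RFC 822 message and store it; returns the new row id."""
    msg = message_from_bytes(raw_bytes, policy=policy.default)
    cursor = conn.execute(
        "INSERT INTO messages (sender, recipient, subject, raw) VALUES (?, ?, ?, ?)",
        (
            str(msg["From"] or ""),
            str(msg["To"] or ""),
            str(msg["Subject"] or ""),
            raw_bytes,  # keep the original bytes so nothing is lost
        ),
    )
    conn.commit()
    return cursor.lastrowid


# Hypothetical schema for the sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages ("
    " id INTEGER PRIMARY KEY, sender TEXT, recipient TEXT, subject TEXT, raw BLOB)"
)

# When run from an MTA pipe, raw_bytes would come from sys.stdin.buffer.read().
SAMPLE = (
    b"From: alice@example.com\r\n"
    b"To: bob@example.com\r\n"
    b"Subject: Test delivery\r\n"
    b"\r\n"
    b"Hello Bob.\r\n"
)
new_id = deliver(conn, SAMPLE)
stored = conn.execute(
    "SELECT sender, recipient, subject FROM messages WHERE id = ?", (new_id,)
).fetchone()
print(stored)
```

A real delivery agent would also need error handling and exit codes the MTA understands, so that a failed insert causes the message to be deferred rather than lost.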

I'm also thinking that if SQL is used for mail storage, the SQL 
folks will evolve their databases to handle the needs of the email 
community. So to those who point to Exchange as a disaster: I look at it as 
a first step, something to take the good ideas from and improve on.


Re: Re[2]: The Future of Email is SQL

Posted by John Rudd <jr...@ucsc.edu>.
On Jun 9, 2006, at 9:49 PM, Sanford Whiteman wrote:

>> If we are talking about making a SQL application that is usable for
>> a multitude of people then why lock them into something. That's the
>> easiest way to drive them away from supporting it.
>
> Word. Perl can play nice with plenty of RDBMSs. If this discussion
> belongs here at all, I can't see how RDBMS partisanship is going to
> take it anywhere good.


I had been thinking about how feasible it would be to re-implement 
dbmail in perl..

and maybe a decent perl MTA to put in front of it too (something that 
will work with sendmail milters...).

Then you could be pretty database agnostic.  Just whatever perl wants 
to put back there.