Posted to users@spamassassin.apache.org by Michael Scheidell <sc...@secnap.net> on 2006/10/10 15:14:06 UTC

Auto_increment vs SERIAL key types

I am experimenting with mysql replication, and have done some research
on key collisions in the case of a 'load balancing' situation (live sql
servers running on each amavisd server), using either same mx weight, or
VRRP/CARP, heartbeat, virtual ip type setups.  'random' smtp connections
could hit each server, and each server has a local mysql DB, in a dual
master/slave replication setup. (updates to either db propagate to the
other, works fine, creates lots of traffic, so maybe use a second nic
and an xover cable..)

My concern is over use of SERIAL keys in amavisd-new tables, vs
AUTO_INCREMENT keys.
(are SERIAL keys an alias for AUTO_INCREMENT? Are SERIAL keys safe in
replication situations?)

I have seen documentation saying that 'auto_increment' works as expected
in replication situations, but can't find any information on SERIAL
keys.

http://www.weberdev.com/Manuals/MySQL3.X_4.X/replication.html#replication-features

Another issue may be AWL files (I suppose a spamassassin question
also?).  Every 'new' ip/email incoming will create a new PRIMARY KEY
(username,email,ip).  If two connections from the same new sender come
in, one on each box, the first one wins, replication stops, and you need
to manually issue a bunch of commands to skip (two?) transactions and
restart the slave.

 --slave-skip-errors=[err_code1,err_code2,... | all]

Normally, replication stops when an error occurs, which gives you the
opportunity to resolve the inconsistency in the data manually. This
option tells the slave SQL thread to continue replication when a
statement returns any of the errors listed in the option value.

Do not use this option unless you fully understand why you are getting
errors. If there are no bugs in your replication setup and client
programs, and no bugs in MySQL itself, an error that stops replication
should never occur. Indiscriminate use of this option results in slaves
becoming hopelessly out of sync with the master, with you having no idea
why this has occurred.
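
The AWL collision described above can be reproduced in miniature. This is only a sketch: it uses two in-memory SQLite databases from Python's standard library as stand-ins for the two MySQL masters, and the table layout (key columns plus totscore) and the sample sender are assumptions made for the example. Each "server" inserts the same new (username,email,ip) key locally, and replaying one server's insert on the other is then rejected, which is the kind of error that stops the slave SQL thread.

```python
import sqlite3

# Simplified AWL table: only the composite key and one score column.
SCHEMA = """
CREATE TABLE awl (
    username TEXT, email TEXT, ip TEXT, totscore REAL,
    PRIMARY KEY (username, email, ip)
)
"""
INSERT = "INSERT INTO awl (username, email, ip, totscore) VALUES (?, ?, ?, ?)"


def make_server():
    """Create one in-memory database standing in for one master."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


server_a, server_b = make_server(), make_server()

# The same new sender hits both boxes before replication catches up.
server_a.execute(INSERT, ("u", "sender@example.com", "192.0.2", 1.2))
server_b.execute(INSERT, ("u", "sender@example.com", "192.0.2", 0.8))

# Replaying A's insert on B now violates the primary key.
try:
    server_b.execute(INSERT, ("u", "sender@example.com", "192.0.2", 1.2))
    replicated = True
except sqlite3.IntegrityError as exc:
    replicated = False
    print("replicated insert rejected:", exc)
```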

I am using InnoDB tables on FreeBSD 5, with MySQL 4.1.20ish.


-- 
Michael Scheidell, CTO
561-999-5000, ext 1131
SECNAP Network Security Corporation
Keep up to date with latest information on IT security: Real time
security alerts: http://www.secnap.com/news


Re: Auto_increment vs SERIAL key types

Posted by SM <sm...@resistor.net>.
At 06:14 10-10-2006, Michael Scheidell wrote:
>I am experimenting with mysql replication, and have done some research
>on key collisions in the case of a 'load balancing' situation (live sql

[snip]


>My concern is over use of SERIAL keys in amavisd-new tables, vs
>AUTO_INCREMENT keys.
>(are SERIAL keys an alias for AUTO_INCREMENT? Are SERIAL keys safe in
>replication situations?)

It's an alias for BIGINT UNSIGNED NOT NULL AUTO_INCREMENT UNIQUE.

See auto_increment_increment and auto_increment_offset (MySQL 5.x).
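
As a rough illustration of what those two settings do (a simulation in Python, not MySQL itself): if auto_increment_increment is set to the number of masters and each master gets its own auto_increment_offset, the servers hand out interleaved ids that cannot collide. The function name and the two-server values below are assumptions made for the example, and the sketch only covers inserts into an initially empty table.

```python
from itertools import islice


def auto_increment_values(increment, offset):
    """Yield the ids one server would generate: offset, offset+increment, ..."""
    value = offset
    while True:
        yield value
        value += increment


# Two masters: same increment, different offsets.
server_a = list(islice(auto_increment_values(increment=2, offset=1), 5))
server_b = list(islice(auto_increment_values(increment=2, offset=2), 5))

print(server_a)  # [1, 3, 5, 7, 9]
print(server_b)  # [2, 4, 6, 8, 10]
```

This only helps for generated keys; it does nothing for a natural key such as the AWL (username,email,ip) one.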

Regards,
-sm 


Re: Auto_increment vs SERIAL key types

Posted by Giampaolo Tomassoni <g....@libero.it>.
>
> [snip]
>
> it did does for, say, one year. It may have reached a very high 

Of course, "high" is instead "low"...


> totscore and count. Well, now suppose your reliable source 
> started sending a lot of spam. Would you like to have to wait a 
> month or so before its whitelistening score would start to lower 

Of course, "lower" is instead "increase".


> enough to allow the spam detector not to pass that stuff? Well, 
> no. One may, in example, have a sql script run, say, hourly from 
> a cron job which deletes awl entries older than, say, three months.


Re: Auto_increment vs SERIAL key types

Posted by Giampaolo Tomassoni <g....@libero.it>.
> Another issue may be AWL files (I suppose a spamassassin question
> also?).  Every 'new' ip/email incoming will create a new PRIMARY KEY
> (username,email,ip).  If two connections from the same new sender come
> in, one on each box, the first one wins, replication stops, and you need
> to manually issue a bunch of commands to skip (two?) transactions and
> restart the slave.

In my opinion, the best way to implement awl is to have one table per server, each of which is basically one-way replicated (from its only originating server to the others in the cluster). The table is made up of the fields timestamp, username, email, ip, and score. Please note I said just "score", not "count" + "totscore".

Then the database may offer a view which merges the tables replicated from the various servers (the one "managed" by the local server and the ones managed by the other servers) in such a way that spamassassin may simply access it like a "standard" awl table. I.e., something like:

	select username, email, ip, count(*) as count, sum(score) as totscore
	from (
		select username, email, ip, score from awl0
		union all select username, email, ip, score from awl1
		...
		union all select username, email, ip, score from awlN
	) as merged
	group by username, email, ip

The view should be made in such a way that an insert or an update into it would automatically trigger an insert in the awl table managed by the server.
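
A minimal sketch of the idea, under loud assumptions: it uses SQLite (through Python's standard library) rather than MySQL or PostgreSQL, only two per-server tables, and an INSTEAD OF INSERT trigger to route writes on the view into the locally managed table. The trigger name, the choice of awl0 as the local table, and the sample data are all made up for the example; updates to the view are not handled here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- awl0 is the table this server owns; awl1 is replicated from the other one.
CREATE TABLE awl0 (ts TEXT DEFAULT CURRENT_TIMESTAMP,
                   username TEXT, email TEXT, ip TEXT, score REAL);
CREATE TABLE awl1 (ts TEXT DEFAULT CURRENT_TIMESTAMP,
                   username TEXT, email TEXT, ip TEXT, score REAL);

-- The merged view looks like a "standard" awl table.
CREATE VIEW awl AS
    SELECT username, email, ip,
           COUNT(*) AS "count", SUM(score) AS totscore
    FROM (
        SELECT username, email, ip, score FROM awl0
        UNION ALL
        SELECT username, email, ip, score FROM awl1
    ) AS merged
    GROUP BY username, email, ip;

-- An insert on the view becomes one new score row in the local table.
CREATE TRIGGER awl_insert INSTEAD OF INSERT ON awl
BEGIN
    INSERT INTO awl0 (username, email, ip, score)
    VALUES (NEW.username, NEW.email, NEW.ip, NEW.totscore);
END;
""")

# A local write goes through the view and lands in awl0.
conn.execute(
    "INSERT INTO awl (username, email, ip, totscore) VALUES (?, ?, ?, ?)",
    ("u", "sender@example.com", "192.0.2", 1.5),
)
# A row that arrived by replication from the other server.
conn.execute(
    "INSERT INTO awl1 (username, email, ip, score) VALUES (?, ?, ?, ?)",
    ("u", "sender@example.com", "192.0.2", 2.5),
)

merged = conn.execute("SELECT * FROM awl").fetchone()
print(merged)  # ('u', 'sender@example.com', '192.0.2', 2, 4.0)
```

Since each server only ever writes to its own table, the two masters never insert the same key into the same table, which is the point of the design.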

Of course, the underlying sql engine has to support views and, most importantly, updates to a view. Maybe I'm wrong, but this is something that mysql doesn't do. Besides, that's one of the reasons why I much prefer postgresql.

You may notice that the timestamp field is defined but never used. It is meant to record the time at which a new entry entered the database, so that one may also implement some method to delete "stale" entries. I.e.: suppose a source (email+ip pair) was used to send mostly ham and did so for, say, one year. It may have reached a very high totscore and count. Well, now suppose your reliable source started sending a lot of spam. Would you like to have to wait a month or so before its whitelisting score would start to lower enough to allow the spam detector not to pass that stuff? Well, no. One may, for example, have a sql script run, say, hourly from a cron job which deletes awl entries older than, say, three months.
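
The cleanup job could look roughly like the following. Again this is only a sketch run against SQLite from Python: the function name purge_stale, the column name ts, the 90-day cutoff standing in for "three months", and the sample rows are assumptions for the example, and a real cron job would connect to the actual awl database instead.

```python
import sqlite3
from datetime import datetime, timedelta


def purge_stale(conn, now, max_age=timedelta(days=90)):
    """Delete awl0 rows whose timestamp is older than max_age; return the count."""
    cutoff = (now - max_age).isoformat(sep=" ")
    cur = conn.execute("DELETE FROM awl0 WHERE ts < ?", (cutoff,))
    conn.commit()
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE awl0 (ts TEXT, username TEXT, email TEXT, ip TEXT, score REAL)"
)

now = datetime(2006, 10, 10, 15, 0, 0)
rows = [
    # One entry well past the cutoff, one recent entry.
    ((now - timedelta(days=200)).isoformat(sep=" "), "u", "old@example.com", "192.0.2", 1.0),
    ((now - timedelta(days=5)).isoformat(sep=" "), "u", "new@example.com", "192.0.2", 1.0),
]
conn.executemany("INSERT INTO awl0 VALUES (?, ?, ?, ?, ?)", rows)

deleted = purge_stale(conn, now)
remaining = [r[0] for r in conn.execute("SELECT email FROM awl0")]
print(deleted, remaining)  # 1 ['new@example.com']
```

Timestamps are stored as fixed-width "YYYY-MM-DD HH:MM:SS" text here, so a plain string comparison orders them correctly.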

Do you like it?

-----------------------------------
Giampaolo Tomassoni - IT Consultant
Piazza VIII Aprile 1948, 4
I-53044 Chiusi (SI) - Italy
Ph: +39-0578-21100