You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2013/08/07 11:35:38 UTC
[Bug 6965] New: BayesStore/Redis: dealing with server restarts and
out-of-lockstep protocol
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6965
Bug ID: 6965
Summary: BayesStore/Redis: dealing with server restarts and
out-of-lockstep protocol
Product: Spamassassin
Version: SVN Trunk (Latest Devel Version)
Hardware: PC
OS: All
Status: NEW
Severity: normal
Priority: P2
Component: Libraries
Assignee: dev@spamassassin.apache.org
Reporter: Mark.Martinec@ijs.si
Noticed some problems with the Redis backend for bayes:
1. About once in a week I can see in our logs that Redis goes out of sync
on the protocol with a server, and doesn't recover. This happens when
some redis operation times out: the result comes back from the server
but is not read by SpamAssassin because of a timeout. The next mail check
task in the same child process then tries to make its query, but receives
remains of a previous reply. This situation does not correct itself
for the lifetime of the child process.
As the Redis CPAN module does not offer any method to flush input buffers,
the cleanest way to recover is to just drop the TCP session, and re-connect
on the next request. The current operation still fails, but at least all
subsequent operations will be able to work on a fresh session.
To implement this, I propose to fully decouple the redis server session's
connect/disconnect from the SpamAssassin's notion of a "logical connection",
i.e. the tie_db_readonly / tie_db_writable / untie_db.
2. To be able to detect such out-of-sequence results in the future,
more detailed checks are added, including passing a random 'nonce'
to a multi_hmget_script and expecting the same value to come back
with a reply - and dropping a session if it doesn't match.
3. In view of the Redis module's problem report 38:
| https://github.com/melo/perl-redis/issues/38
| Select new database doesn't survive after reconnect.
| If you select to a different database, and then loose the connection
| and reconnect, you'll end back into the default database. We should
| keep track of the current database and select to it after reconnect.
we are currently in trouble if one choses a non-zero database index
and enables automatic reconnects. After a reconnect, SpamAssassin
ends up connected to a database index zero.
The solution is in providing a method on_connect() to the Redis module.
This callback is called on every connect to a server, initial and
automatic re-connects. In this routine we can select() the database
or do whatever is necessary to ensure a consistent state.
Even if the issue 38 is eventually resolved in Redis.pm, it does
no harm to call select() twice on a connect.
The only drawback is that the on_connect callback is only available
since Redis.pm version 1.956, so we need to bump the minimal requirement
for this module from 1.954 to 1.956. Btw, the current version is 1.961
from January 2013.
Attached is a patch to implement this. It is a bit larger then necessary
as it also renames the {is_really_open} to {connected} and swaps some
code sections for better readability (shorter 'if' branch first, longer
branch after an 'else').
--
You are receiving this mail because:
You are the assignee for the bug.
[Bug 6965] BayesStore/Redis: dealing with server restarts and
out-of-lockstep protocol
Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6965
--- Comment #1 from Mark Martinec <Ma...@ijs.si> ---
Created attachment 5166
--> https://issues.apache.org/SpamAssassin/attachment.cgi?id=5166&action=edit
The patch implementing the above changes
--
You are receiving this mail because:
You are the assignee for the bug.
[Bug 6965] BayesStore/Redis: dealing with server restarts and
out-of-lockstep protocol
Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6965
Mark Martinec <Ma...@ijs.si> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution|--- |FIXED
--- Comment #3 from Mark Martinec <Ma...@ijs.si> ---
Now with the Bug 6972 proposed changes, this PR becomes mostly irrelevant
(but served as a foundation for Bug 6972 rewrites).
Closing.
--
You are receiving this mail because:
You are the assignee for the bug.
[Bug 6965] BayesStore/Redis: dealing with server restarts and
out-of-lockstep protocol
Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6965
Mark Martinec <Ma...@ijs.si> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|Undefined |3.4.0
--- Comment #2 from Mark Martinec <Ma...@ijs.si> ---
trunk:
Bug 6965: BayesStore/Redis: dealing with server restarts
and out-of-lockstep protocol
Sending lib/Mail/SpamAssassin/BayesStore/Redis.pm
Sending lib/Mail/SpamAssassin/Util/DependencyInfo.pm
Committed revision 1511228.
--
You are receiving this mail because:
You are the assignee for the bug.